D47crunch

Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements

Process and standardize carbonate and/or CO2 clumped-isotope analyses, from low-level data out of a dual-inlet mass spectrometer to final, “absolute” Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates (Daëron, 2021).

The tutorial section takes you through a series of simple steps to import/process data and print out the results. The how-to section provides instructions applicable to various specific tasks.

1. Tutorial

1.1 Installation

The easy option is to use pip; open a shell terminal and simply type:

python -m pip install D47crunch

Those wishing to experiment with the bleeding-edge development version can do so through the following steps:

  1. Download the dev branch source code here and rename it to D47crunch.py.
  2. Do any of the following:
    • copy D47crunch.py to somewhere in your Python path
    • copy D47crunch.py to a working directory (import D47crunch will only work if called within that directory)
    • copy D47crunch.py to any other location (e.g., /foo/bar) and then use the following code snippet in your own code to import D47crunch:
import sys
sys.path.append('/foo/bar')
import D47crunch

Documentation for the development version can be downloaded here (save html file and open it locally).
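
To check which version ended up on your path (the package defines a __version__ string, visible in the source below):

python -c "import D47crunch; print(D47crunch.__version__)"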

1.2 Usage

Start by creating a file named rawdata.csv with the following contents:

UID,  Sample,           d45,       d46,        d47,        d48,       d49
A01,  ETH-1,        5.79502,  11.62767,   16.89351,   24.56708,   0.79486
A02,  MYSAMPLE-1,   6.21907,  11.49107,   17.27749,   24.58270,   1.56318
A03,  ETH-2,       -6.05868,  -4.81718,  -11.63506,  -10.32578,   0.61352
A04,  MYSAMPLE-2,  -3.86184,   4.94184,    0.60612,   10.52732,   0.57118
A05,  ETH-3,        5.54365,  12.05228,   17.40555,   25.96919,   0.74608
A06,  ETH-2,       -6.06706,  -4.87710,  -11.69927,  -10.64421,   1.61234
A07,  ETH-1,        5.78821,  11.55910,   16.80191,   24.56423,   1.47963
A08,  MYSAMPLE-2,  -3.87692,   4.86889,    0.52185,   10.40390,   1.07032

Then instantiate a D47data object which will store and process this data:

import D47crunch
mydata = D47crunch.D47data()

For now, this object is empty:

>>> print(mydata)
[]

To load the analyses saved in rawdata.csv into our D47data object and process the data:

mydata.read('rawdata.csv')

# compute δ13C, δ18O of working gas:
mydata.wg()

# compute δ13C, δ18O, raw Δ47 values for each analysis:
mydata.crunch()

# compute absolute Δ47 values for each analysis
# as well as average Δ47 values for each sample:
mydata.standardize()

We can now print a summary of the data processing:

>>> mydata.summary(verbose = True, save_to_file = False)
[summary]        
–––––––––––––––––––––––––––––––  –––––––––
N samples (anchors + unknowns)   5 (3 + 2)
N analyses (anchors + unknowns)  8 (5 + 3)
Repeatability of δ13C_VPDB         4.2 ppm
Repeatability of δ18O_VSMOW       47.5 ppm
Repeatability of Δ47 (anchors)    13.4 ppm
Repeatability of Δ47 (unknowns)    2.5 ppm
Repeatability of Δ47 (all)         9.6 ppm
Model degrees of freedom                 3
Student's 95% t-factor                3.18
Standardization method              pooled
–––––––––––––––––––––––––––––––  –––––––––

This tells us that our data set contains 5 different samples: 3 anchors (ETH-1, ETH-2, ETH-3) and 2 unknowns (MYSAMPLE-1, MYSAMPLE-2). The total number of analyses is 8, with 5 anchor analyses and 3 unknown analyses. We get an estimate of the analytical repeatability (i.e. the overall, pooled standard deviation) for δ13C, δ18O and Δ47, as well as the number of degrees of freedom (here, 3) that these estimated standard deviations are based on, along with the corresponding Student's t-factor (here, 3.18) for 95 % confidence limits. Finally, the summary indicates that we used a “pooled” standardization approach (see [Daëron, 2021]).
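
The same repeatability estimates can also be read back programmatically; a minimal sketch, assuming the repeatability dictionary populated by standardize(), with keys such as 'r_D47':

>>> mydata.repeatability['r_D47'] # pooled Δ47 repeatability, in ‰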

To see the actual results:

>>> mydata.table_of_samples(verbose = True, save_to_file = False)
[table_of_samples] 
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
Sample      N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
ETH-1       2       2.01       37.01  0.2052                    0.0131          
ETH-2       2     -10.17       19.88  0.2085                    0.0026          
ETH-3       1       1.73       37.49  0.6132                                    
MYSAMPLE-1  1       2.48       36.90  0.2996  0.0091  ± 0.0291                  
MYSAMPLE-2  2      -8.17       30.05  0.6600  0.0115  ± 0.0366  0.0025          
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––

This table lists, for each sample, the number of analytical replicates, the average δ13C and δ18O values (for the analyte CO2, not for the carbonate itself), the average Δ47 value, and the SD of Δ47 for all replicates of this sample. For unknown samples, the SE and 95 % confidence limits of the mean Δ47 are also listed. These 95 % CL take into account the number of degrees of freedom of the regression model, so that in large data sets the 95 % CL will tend to 1.96 times the SE, whereas in this case the applicable t-factor is much larger.
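
As a quick sanity check, each 95 % CL above is just the corresponding SE multiplied by the Student's t-factor reported in the summary (3.18 for 3 degrees of freedom):

>>> print(3.18 * 0.0091) # 0.0289, vs ± 0.0291 above (difference due to rounding)
>>> print(3.18 * 0.0115) # 0.0366, matching ± 0.0366 above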

We can also generate a table of all analyses in the data set (again, note that d18O_VSMOW is the composition of the CO2 analyte):

>>> mydata.table_of_analyses(verbose = True, save_to_file = False)
[table_of_analyses] 
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
UID    Session      Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48       d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw      D49raw       D47
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
A01  mySession       ETH-1       -3.807        24.921   5.795020  11.627670   16.893510   24.567080  0.794860    2.014086   37.041843  -0.574686   1.149684  -27.690250  0.214454
A02  mySession  MYSAMPLE-1       -3.807        24.921   6.219070  11.491070   17.277490   24.582700  1.563180    2.476827   36.898281  -0.499264   1.435380  -27.122614  0.299589
A03  mySession       ETH-2       -3.807        24.921  -6.058680  -4.817180  -11.635060  -10.325780  0.613520  -10.166796   19.907706  -0.685979  -0.721617   16.716901  0.206693
A04  mySession  MYSAMPLE-2       -3.807        24.921  -3.861840   4.941840    0.606120   10.527320  0.571180   -8.159927   30.087230  -0.248531   0.613099   -4.979413  0.658270
A05  mySession       ETH-3       -3.807        24.921   5.543650  12.052280   17.405550   25.969190  0.746080    1.727029   37.485567  -0.226150   1.678699  -28.280301  0.613200
A06  mySession       ETH-2       -3.807        24.921  -6.067060  -4.877100  -11.699270  -10.644210  1.612340  -10.173599   19.845192  -0.683054  -0.922832   17.861363  0.210328
A07  mySession       ETH-1       -3.807        24.921   5.788210  11.559100   16.801910   24.564230  1.479630    2.009281   36.970298  -0.591129   1.282632  -26.888335  0.195926
A08  mySession  MYSAMPLE-2       -3.807        24.921  -3.876920   4.868890    0.521850   10.403900  1.070320   -8.173486   30.011134  -0.245768   0.636159   -4.324964  0.661803
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
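
By default (save_to_file = True), these table methods also write csv files to a directory named output; a sketch, assuming the dir and filename arguments documented in the API section below:

mydata.table_of_samples(dir = 'results', filename = 'samples.csv')
mydata.table_of_analyses(dir = 'results', filename = 'analyses.csv')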

2. How-to

2.1 Simulate a virtual data set to play with

It is sometimes convenient to quickly build a virtual data set of analyses, for instance to assess the final analytical precision achievable for a given combination of anchor and unknown analyses (see also Fig. 6 of Daëron, 2021).

This can be achieved with virtual_data(). The example below creates a dataset with four sessions, each of which comprises three analyses of anchor ETH-1, three of ETH-2, three of ETH-3, and three analyses each of two unknown samples named FOO and BAR with an arbitrarily defined isotopic composition. Analytical repeatabilities for Δ47 and Δ48 are also specified arbitrarily. See the virtual_data() documentation for additional configuration parameters.

from D47crunch import virtual_data, D47data

args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 3,
            d13C_VPDB = -15., d18O_VPDB = -2.,
            D47 = 0.6, D48 = 0.2),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)

D = D47data(session1 + session2 + session3 + session4)

D.crunch()
D.standardize()

D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)
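
To quantify the final precision achievable with such a design, you may then print out the corresponding summary, using the same call as in the tutorial:

D.summary(verbose = True, save_to_file = False)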

2.2 Control data quality

D47crunch offers several tools to visualize processed data. The examples below use the same virtual data set, generated with:

from D47crunch import *
from random import shuffle

# generate virtual data:
args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 8),
        dict(Sample = 'ETH-2', N = 8),
        dict(Sample = 'ETH-3', N = 8),
        dict(Sample = 'FOO', N = 4,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 4,
            d13C_VPDB = -15., d18O_VPDB = -15.,
            D47 = 0.5, D48 = 0.2),
        ])

sessions = [
    virtual_data(session = f'Session_{k+1:02.0f}', seed = 123456+k, **args)
    for k in range(10)]

# shuffle the data:
data = [r for s in sessions for r in s]
shuffle(data)
data = sorted(data, key = lambda r: r['Session'])

# create D47data instance:
data47 = D47data(data)

# process D47data instance:
data47.crunch()
data47.standardize()

2.2.1 Plotting the distribution of analyses through time

data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf')

[Figure: time_distribution.png]

The plot above shows the succession of analyses as if they were all distributed at regular time intervals. See D4xdata.plot_distribution_of_analyses() for how to plot analyses as a function of “true” time (based on the TimeTag for each analysis).
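
For instance, a minimal sketch, assuming that each analysis record carries a TimeTag field and that the method accepts a vs_time argument (see its API documentation):

data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf', vs_time = True)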

2.2.2 Generating session plots

data47.plot_sessions()

Below is one of the resulting session plots. Each cross marker is an analysis. Anchors are in red and unknowns in blue. Short horizontal lines show the nominal Δ47 value for anchors, in red, or the average Δ47 value for unknowns, in blue (overall average for all sessions). Curved grey contours correspond to Δ47 standardization errors in this session.

[Figure: D47_plot_Session_03.png]

2.2.3 Plotting Δ47 or Δ48 residuals

data47.plot_residuals(filename = 'residuals.pdf', kde = True)

[Figure: residuals.png]

Again, note that this plot only shows the succession of analyses as if they were all distributed at regular time intervals.

2.2.4 Checking δ13C and δ18O dispersion

mydata = D47data(virtual_data(
    session = 'mysession',
    samples = [
        dict(Sample = 'ETH-1', N = 4),
        dict(Sample = 'ETH-2', N = 4),
        dict(Sample = 'ETH-3', N = 4),
        dict(Sample = 'MYSAMPLE', N = 8, D47 = 0.6, D48 = 0.1, d13C_VPDB = -4.0, d18O_VPDB = -12.0),
    ], seed = 123))

mydata.refresh()
mydata.wg()
mydata.crunch()
mydata.plot_bulk_compositions()

D4xdata.plot_bulk_compositions() produces a series of plots, one for each sample, and an additional plot with all samples together. For example, here is the plot for sample MYSAMPLE:

[Figure: bulk_compositions.png]

2.3 Use a different set of anchors, change anchor nominal values, and/or change oxygen-17 correction parameters

Nominal values for various carbonate standards are defined in four places:

  • D4xdata.Nominal_d13C_VPDB
  • D4xdata.Nominal_d18O_VPDB
  • D47data.Nominal_D4x (accessible through the D47data.Nominal_D47 alias)
  • D48data.Nominal_D4x (accessible through the D48data.Nominal_D48 alias)

17O correction parameters are defined by:

  • D4xdata.R13_VPDB
  • D4xdata.R17_VSMOW
  • D4xdata.R18_VSMOW
  • D4xdata.LAMBDA_17
  • D4xdata.R18_VPDB
  • D4xdata.R17_VPDB (derived from the values above)

When creating a new instance of D47data or D48data, the current values of these variables are copied as properties of the new object. Applying custom values for, e.g., R17_VSMOW and Nominal_D47 can thus be done in several ways:

Option 1: by redefining D4xdata.R17_VSMOW and D47data.Nominal_D47 _before_ creating a D47data object:

from D47crunch import D4xdata, D47data

# redefine R17_VSMOW:
D4xdata.R17_VSMOW = 0.00037 # new value

# redefine R17_VPDB for consistency:
D4xdata.R17_VPDB = D4xdata.R17_VSMOW * (D4xdata.R18_VPDB/D4xdata.R18_VSMOW) ** D4xdata.LAMBDA_17

# edit Nominal_D47 to only include ETH-1/2/3:
D47data.Nominal_D4x = {
    a: D47data.Nominal_D4x[a]
    for a in ['ETH-1', 'ETH-2', 'ETH-3']
    }
# redefine ETH-3:
D47data.Nominal_D4x['ETH-3'] = 0.600

# only now create D47data object:
mydata = D47data()

# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)
# NB: mydata.Nominal_D47 is just an alias for mydata.Nominal_D4x

# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}

Option 2: by redefining R17_VSMOW and Nominal_D47 _after_ creating a D47data object:

from D47crunch import D47data

# first create D47data object:
mydata = D47data()

# redefine R17_VSMOW:
mydata.R17_VSMOW = 0.00037 # new value

# redefine R17_VPDB for consistency:
mydata.R17_VPDB = mydata.R17_VSMOW * (mydata.R18_VPDB/mydata.R18_VSMOW) ** mydata.LAMBDA_17

# edit Nominal_D47 to only include ETH-1/2/3:
mydata.Nominal_D47 = {
    a: mydata.Nominal_D47[a]
    for a in ['ETH-1', 'ETH-2', 'ETH-3']
    }
# redefine ETH-3:
mydata.Nominal_D47['ETH-3'] = 0.600

# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)

# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}

The two options above are equivalent, but the latter provides a simple way to compare different data processing choices:

from D47crunch import D47data

# create two D47data objects:
foo = D47data()
bar = D47data()

# modify foo in various ways:
foo.LAMBDA_17 = 0.52
foo.R17_VSMOW = 0.00037 # new value
foo.R17_VPDB = foo.R17_VSMOW * (foo.R18_VPDB/foo.R18_VSMOW) ** foo.LAMBDA_17
foo.Nominal_D47 = {
    'ETH-1': foo.Nominal_D47['ETH-1'],
    'ETH-2': foo.Nominal_D47['ETH-2'],
    'IAEA-C2': foo.Nominal_D47['IAEA-C2'],
    'INLAB_REF_MATERIAL': 0.666,
    }

# now import the same raw data into foo and bar:
foo.read('rawdata.csv')
foo.wg()          # compute δ13C, δ18O of working gas
foo.crunch()      # compute all δ13C, δ18O and raw Δ47 values
foo.standardize() # compute absolute Δ47 values

bar.read('rawdata.csv')
bar.wg()          # compute δ13C, δ18O of working gas
bar.crunch()      # compute all δ13C, δ18O and raw Δ47 values
bar.standardize() # compute absolute Δ47 values

# and compare the final results:
foo.table_of_samples(verbose = True, save_to_file = False)
bar.table_of_samples(verbose = True, save_to_file = False)

2.4 Process paired Δ47 and Δ48 values

Purely in terms of data processing, it is not obvious why Δ47 and Δ48 data should not be handled separately. For now, D47crunch uses two independent classes — D47data and D48data — which crunch numbers and deal with standardization in very similar ways. The following example demonstrates how to print out combined outputs for D47data and D48data.

from D47crunch import *

# generate virtual data:
args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args)
session2 = virtual_data(session = 'Session_02', **args)

# create D47data instance:
data47 = D47data(session1 + session2)

# process D47data instance:
data47.crunch()
data47.standardize()

# create D48data instance:
data48 = D48data(data47) # alternatively: data48 = D48data(session1 + session2)

# process D48data instance:
data48.crunch()
data48.standardize()

# output combined results:
table_of_sessions(data47, data48)
table_of_samples(data47, data48)
table_of_analyses(data47, data48)

Expected output:

––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––
Session     Na  Nu  d13Cwg_VPDB  d18Owg_VSMOW  r_d13C  r_d18O   r_D47      a_47 ± SE  1e3 x b_47 ± SE       c_47 ± SE   r_D48      a_48 ± SE  1e3 x b_48 ± SE       c_48 ± SE
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––
Session_01   9   3       -4.000        26.000  0.0000  0.0000  0.0098  1.021 ± 0.019   -0.398 ± 0.260  -0.903 ± 0.006  0.0486  0.540 ± 0.151    1.235 ± 0.607  -0.390 ± 0.025
Session_02   9   3       -4.000        26.000  0.0000  0.0000  0.0090  1.015 ± 0.019    0.376 ± 0.260  -0.905 ± 0.006  0.0186  1.350 ± 0.156   -0.871 ± 0.608  -0.504 ± 0.027
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––


––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
Sample  N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene     D48      SE    95% CL      SD  p_Levene
––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
ETH-1   6       2.02       37.02  0.2052                    0.0078            0.1380                    0.0223          
ETH-2   6     -10.17       19.88  0.2085                    0.0036            0.1380                    0.0482          
ETH-3   6       1.71       37.45  0.6132                    0.0080            0.2700                    0.0176          
FOO     6      -5.00       28.91  0.3026  0.0044  ± 0.0093  0.0121     0.164  0.1397  0.0121  ± 0.0255  0.0267     0.127
––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––


–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––
UID     Session  Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48         d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw     D49raw       D47       D48
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––
1    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.120787   21.286237   27.780042    2.020000   37.024281  -0.708176  -0.316435  -0.000013  0.197297  0.087763
2    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.132240   21.307795   27.780042    2.020000   37.024281  -0.696913  -0.295333  -0.000013  0.208328  0.126791
3    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.132438   21.313884   27.780042    2.020000   37.024281  -0.696718  -0.289374  -0.000013  0.208519  0.137813
4    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.700300  -12.210735  -18.023381  -10.170000   19.875825  -0.683938  -0.297902  -0.000002  0.209785  0.198705
5    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.707421  -12.270781  -18.023381  -10.170000   19.875825  -0.691145  -0.358673  -0.000002  0.202726  0.086308
6    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.700061  -12.278310  -18.023381  -10.170000   19.875825  -0.683696  -0.366292  -0.000002  0.210022  0.072215
7    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.684379   22.225827   28.306614    1.710000   37.450394  -0.273094  -0.216392  -0.000014  0.623472  0.270873
8    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.660163   22.233729   28.306614    1.710000   37.450394  -0.296906  -0.208664  -0.000014  0.600150  0.285167
9    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.675191   22.215632   28.306614    1.710000   37.450394  -0.282128  -0.226363  -0.000014  0.614623  0.252432
10   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.328380    5.374933    4.665655   -5.000000   28.907344  -0.582131  -0.288924  -0.000006  0.314928  0.175105
11   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.302220    5.384454    4.665655   -5.000000   28.907344  -0.608241  -0.279457  -0.000006  0.289356  0.192614
12   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.322530    5.372841    4.665655   -5.000000   28.907344  -0.587970  -0.291004  -0.000006  0.309209  0.171257
13   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.140853   21.267202   27.780042    2.020000   37.024281  -0.688442  -0.335067  -0.000013  0.207730  0.138730
14   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.127087   21.256983   27.780042    2.020000   37.024281  -0.701980  -0.345071  -0.000013  0.194396  0.131311
15   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.148253   21.287779   27.780042    2.020000   37.024281  -0.681165  -0.314926  -0.000013  0.214898  0.153668
16   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.715859  -12.204791  -18.023381  -10.170000   19.875825  -0.699685  -0.291887  -0.000002  0.207349  0.149128
17   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.709763  -12.188685  -18.023381  -10.170000   19.875825  -0.693516  -0.275587  -0.000002  0.213426  0.161217
18   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.715427  -12.253049  -18.023381  -10.170000   19.875825  -0.699249  -0.340727  -0.000002  0.207780  0.112907
19   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.685994   22.249463   28.306614    1.710000   37.450394  -0.271506  -0.193275  -0.000014  0.618328  0.244431
20   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.681351   22.298166   28.306614    1.710000   37.450394  -0.276071  -0.145641  -0.000014  0.613831  0.279758
21   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.676169   22.306848   28.306614    1.710000   37.450394  -0.281167  -0.137150  -0.000014  0.608813  0.286056
22   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.324359    5.339497    4.665655   -5.000000   28.907344  -0.586144  -0.324160  -0.000006  0.314015  0.136535
23   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.297658    5.325854    4.665655   -5.000000   28.907344  -0.612794  -0.337727  -0.000006  0.287767  0.126473
24   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.310185    5.339898    4.665655   -5.000000   28.907344  -0.600291  -0.323761  -0.000006  0.300082  0.136830
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––
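
The module-level table_of_*() functions used above accept the same dir, filename, save_to_file, and print_out arguments as the corresponding methods (see their signatures in the API documentation below), so the combined tables may be saved to csv files instead of printed out:

table_of_samples(data47, data48, dir = 'output', filename = 'D47D48_samples.csv', print_out = False)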

3. Command-Line Interface (CLI)

Instead of writing Python code, you may directly use the CLI to process raw Δ47 and Δ48 data using reasonable defaults. The simplest way is to call:

D47crunch rawdata.csv

This will create a directory named output and populate it by calling the following methods:

  • D4xdata.wg()
  • D4xdata.crunch()
  • D4xdata.standardize()
  • D4xdata.summary()
  • D4xdata.table_of_sessions()
  • D4xdata.table_of_samples()
  • D4xdata.table_of_analyses()
  • D4xdata.plot_sessions()
  • D4xdata.plot_residuals()
  • D4xdata.plot_distribution_of_analyses()

You may specify a custom set of anchors instead of the default ones using the --anchors or -a option:

D47crunch -a anchors.csv rawdata.csv

In this case, the anchors.csv file (you may use any other file name) must have the following format:

Sample, d13C_VPDB, d18O_VPDB,    D47
 ETH-1,      2.02,     -2.19, 0.2052
 ETH-2,    -10.17,    -18.69, 0.2085
 ETH-3,      1.71,     -1.78, 0.6132
 ETH-4,          ,          , 0.4511

The samples with non-empty d13C_VPDB, d18O_VPDB, and D47 values are used to standardize δ13C, δ18O, and Δ47 values respectively.

You may also provide a list of analyses and/or samples to exclude from the input. This is done with the --exclude or -e option:

D47crunch -e badbatch.csv rawdata.csv

In this case, the badbatch.csv file (again, you may use a different file name) must have the following format:

UID, Sample
A03
A09
B06
   , MYBADSAMPLE-1
   , MYBADSAMPLE-2

This will exclude (ignore) analyses with the UIDs A03, A09, and B06, and all analyses of samples MYBADSAMPLE-1 and MYBADSAMPLE-2. It is possible to have an exclude file with only the UID column, or only the Sample column, or both, in any order.

The --output-dir or -o option may be used to specify a custom directory name for the output. For example, in unix-like shells the following command will create a time-stamped output directory:

D47crunch -o `date "+%Y-%m-%d-%Hh%M"` rawdata.csv

To process Δ48 as well as Δ47, just add the --D48 option.
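
For example:

D47crunch --D48 rawdata.csv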

API Documentation

'''
Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements

Process and standardize carbonate and/or CO2 clumped-isotope analyses,
from low-level data out of a dual-inlet mass spectrometer to final, “absolute”
Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates
([Daëron, 2021](https://doi.org/10.1029/2020GC009592)).

The **tutorial** section takes you through a series of simple steps to import/process data and print out the results.
The **how-to** section provides instructions applicable to various specific tasks.

.. include:: ../../docpages/tutorial.md
.. include:: ../../docpages/howto.md
.. include:: ../../docpages/cli.md

<h1>API Documentation</h1>
'''

__docformat__ = "restructuredtext"
__author__    = 'Mathieu Daëron'
__contact__   = 'daeron@lsce.ipsl.fr'
__copyright__ = 'Copyright (c) Mathieu Daëron'
__license__   = 'MIT License - https://opensource.org/licenses/MIT'
__date__      = '2024-11-17'
__version__   = '2.4.2'

import os
import numpy as np
import typer
from typing_extensions import Annotated
from statistics import stdev
from scipy.stats import t as tstudent
from scipy.stats import levene
from scipy.interpolate import interp1d
from numpy import linalg
from lmfit import Minimizer, Parameters, report_fit
from matplotlib import pyplot as ppl
from datetime import datetime as dt
from functools import wraps
from colorsys import hls_to_rgb
from matplotlib import rcParams

typer.rich_utils.STYLE_HELPTEXT = ''

rcParams['font.family'] = 'sans-serif'
rcParams['font.sans-serif'] = 'Helvetica'
rcParams['font.size'] = 10
rcParams['mathtext.fontset'] = 'custom'
rcParams['mathtext.rm'] = 'sans'
rcParams['mathtext.bf'] = 'sans:bold'
rcParams['mathtext.it'] = 'sans:italic'
rcParams['mathtext.cal'] = 'sans:italic'
rcParams['mathtext.default'] = 'rm'
rcParams['xtick.major.size'] = 4
rcParams['xtick.major.width'] = 1
rcParams['ytick.major.size'] = 4
rcParams['ytick.major.width'] = 1
rcParams['axes.grid'] = False
rcParams['axes.linewidth'] = 1
rcParams['grid.linewidth'] = .75
rcParams['grid.linestyle'] = '-'
rcParams['grid.alpha'] = .15
rcParams['savefig.dpi'] = 150

Petersen_etal_CO2eqD47 = np.array([[-12, 1.147113572], [-11, 1.139961218], [-10, 1.132872856], [-9, 1.125847677], [-8, 1.118884889], [-7, 1.111983708], [-6, 1.105143366], [-5, 1.098363105], [-4, 1.091642182], [-3, 1.084979862], [-2, 1.078375423], [-1, 1.071828156], [0, 1.065337360], [1, 1.058902349], [2, 1.052522443], [3, 1.046196976], [4, 1.039925291], [5, 1.033706741], [6, 1.027540690], [7, 1.021426510], [8, 1.015363585], [9, 1.009351306], [10, 1.003389075], [11, 0.997476303], [12, 0.991612409], [13, 0.985796821], [14, 0.980028975], [15, 0.974308318], [16, 0.968634304], [17, 0.963006392], [18, 0.957424055], [19, 0.951886769], [20, 0.946394020], [21, 0.940945302], [22, 0.935540114], [23, 0.930177964], [24, 0.924858369], [25, 0.919580851], [26, 0.914344938], [27, 0.909150167], [28, 0.903996080], [29, 0.898882228], [30, 0.893808167], [31, 0.888773459], [32, 0.883777672], [33, 0.878820382], [34, 0.873901170], [35, 0.869019623], [36, 0.864175334], [37, 0.859367901], [38, 0.854596929], [39, 0.849862028], [40, 0.845162813], [41, 0.840498905], [42, 0.835869931], [43, 0.831275522], [44, 0.826715314], [45, 0.822188950], [46, 0.817696075], [47, 0.813236341], [48, 0.808809404], [49, 0.804414926], [50, 0.800052572], [51, 0.795722012], [52, 0.791422922], [53, 0.787154979], [54, 0.782917869], [55, 0.778711277], [56, 0.774534898], [57, 0.770388426], [58, 0.766271562], [59, 0.762184010], [60, 0.758125479], [61, 0.754095680], [62, 0.750094329], [63, 0.746121147], [64, 0.742175856], [65, 0.738258184], [66, 0.734367860], [67, 0.730504620], [68, 0.726668201], [69, 0.722858343], [70, 0.719074792], [71, 0.715317295], [72, 0.711585602], [73, 0.707879469], [74, 0.704198652], [75, 0.700542912], [76, 0.696912012], [77, 0.693305719], [78, 0.689723802], [79, 0.686166034], [80, 0.682632189], [81, 0.679122047], [82, 0.675635387], [83, 0.672171994], [84, 0.668731654], [85, 0.665314156], [86, 0.661919291], [87, 0.658546854], [88, 0.655196641], [89, 0.651868451], [90, 0.648562087], [91, 0.645277352], [92, 0.642014054], [93, 0.638771999], [94, 0.635551001], [95, 0.632350872], [96, 0.629171428], [97, 0.626012487], [98, 0.622873870], [99, 0.619755397], [100, 0.616656895], [102, 0.610519107], [104, 0.604459143], [106, 0.598475670], [108, 0.592567388], [110, 0.586733026], [112, 0.580971342], [114, 0.575281125], [116, 0.569661187], [118, 0.564110371], [120, 0.558627545], [122, 0.553211600], [124, 0.547861454], [126, 0.542576048], [128, 0.537354347], [130, 0.532195337], [132, 0.527098028], [134, 0.522061450], [136, 0.517084654], [138, 0.512166711], [140, 0.507306712], [142, 0.502503768], [144, 0.497757006], [146, 0.493065573], [148, 0.488428634], [150, 0.483845370], [152, 0.479314980], [154, 0.474836677], [156, 0.470409692], [158, 0.466033271], [160, 0.461706674], [162, 0.457429176], [164, 0.453200067], [166, 0.449018650], [168, 0.444884242], [170, 0.440796174], [172, 0.436753787], [174, 0.432756438], [176, 0.428803494], [178, 0.424894334], [180, 0.421028350], [182, 0.417204944], [184, 0.413423530], [186, 0.409683531], [188, 0.405984383], [190, 0.402325531], [192, 0.398706429], [194, 0.395126543], [196, 0.391585347], [198, 0.388082324], [200, 0.384616967], [202, 0.381188778], [204, 0.377797268], [206, 0.374441954], [208, 0.371122364], [210, 0.367838033], [212, 0.364588505], [214, 0.361373329], [216, 0.358192065], [218, 0.355044277], [220, 0.351929540], [222, 0.348847432], [224, 0.345797540], [226, 0.342779460], [228, 0.339792789], [230, 0.336837136], [232, 0.333912113], [234, 0.331017339], [236, 0.328152439], [238, 
0.325317046], [240, 0.322510795], [242, 0.319733329], [244, 0.316984297], [246, 0.314263352], [248, 0.311570153], [250, 0.308904364], [252, 0.306265654], [254, 0.303653699], [256, 0.301068176], [258, 0.298508771], [260, 0.295975171], [262, 0.293467070], [264, 0.290984167], [266, 0.288526163], [268, 0.286092765], [270, 0.283683684], [272, 0.281298636], [274, 0.278937339], [276, 0.276599517], [278, 0.274284898], [280, 0.271993211], [282, 0.269724193], [284, 0.267477582], [286, 0.265253121], [288, 0.263050554], [290, 0.260869633], [292, 0.258710110], [294, 0.256571741], [296, 0.254454286], [298, 0.252357508], [300, 0.250281174], [302, 0.248225053], [304, 0.246188917], [306, 0.244172542], [308, 0.242175707], [310, 0.240198194], [312, 0.238239786], [314, 0.236300272], [316, 0.234379441], [318, 0.232477087], [320, 0.230593005], [322, 0.228726993], [324, 0.226878853], [326, 0.225048388], [328, 0.223235405], [330, 0.221439711], [332, 0.219661118], [334, 0.217899439], [336, 0.216154491], [338, 0.214426091], [340, 0.212714060], [342, 0.211018220], [344, 0.209338398], [346, 0.207674420], [348, 0.206026115], [350, 0.204393315], [355, 0.200378063], [360, 0.196456139], [365, 0.192625077], [370, 0.188882487], [375, 0.185226048], [380, 0.181653511], [385, 0.178162694], [390, 0.174751478], [395, 0.171417807], [400, 0.168159686], [405, 0.164975177], [410, 0.161862398], [415, 0.158819521], [420, 0.155844772], [425, 0.152936426], [430, 0.150092806], [435, 0.147312286], [440, 0.144593281], [445, 0.141934254], [450, 0.139333710], [455, 0.136790195], [460, 0.134302294], [465, 0.131868634], [470, 0.129487876], [475, 0.127158722], [480, 0.124879906], [485, 0.122650197], [490, 0.120468398], [495, 0.118333345], [500, 0.116243903], [505, 0.114198970], [510, 0.112197471], [515, 0.110238362], [520, 0.108320625], [525, 0.106443271], [530, 0.104605335], [535, 0.102805877], [540, 0.101043985], [545, 0.099318768], [550, 0.097629359], [555, 0.095974915], [560, 0.094354612], [565, 0.092767650], [570, 0.091213248], [575, 0.089690648], [580, 0.088199108], [585, 0.086737906], [590, 0.085306341], [595, 0.083903726], [600, 0.082529395], [605, 0.081182697], [610, 0.079862998], [615, 0.078569680], [620, 0.077302141], [625, 0.076059794], [630, 0.074842066], [635, 0.073648400], [640, 0.072478251], [645, 0.071331090], [650, 0.070206399], [655, 0.069103674], [660, 0.068022424], [665, 0.066962168], [670, 0.065922439], [675, 0.064902780], [680, 0.063902748], [685, 0.062921909], [690, 0.061959837], [695, 0.061016122], [700, 0.060090360], [705, 0.059182157], [710, 0.058291131], [715, 0.057416907], [720, 0.056559120], [725, 0.055717414], [730, 0.054891440], [735, 0.054080860], [740, 0.053285343], [745, 0.052504565], [750, 0.051738210], [755, 0.050985971], [760, 0.050247546], [765, 0.049522643], [770, 0.048810974], [775, 0.048112260], [780, 0.047426227], [785, 0.046752609], [790, 0.046091145], [795, 0.045441581], [800, 0.044803668], [805, 0.044177164], [810, 0.043561831], [815, 0.042957438], [820, 0.042363759], [825, 0.041780573], [830, 0.041207664], [835, 0.040644822], [840, 0.040091839], [845, 0.039548516], [850, 0.039014654], [855, 0.038490063], [860, 0.037974554], [865, 0.037467944], [870, 0.036970054], [875, 0.036480707], [880, 0.035999734], [885, 0.035526965], [890, 0.035062238], [895, 0.034605393], [900, 0.034156272], [905, 0.033714724], [910, 0.033280598], [915, 0.032853749], [920, 0.032434032], [925, 0.032021309], [930, 0.031615443], [935, 0.031216300], [940, 0.030823749], [945, 0.030437663], [950, 0.030057915], [955, 0.029684385], 
[960, 0.029316951], [965, 0.028955498], [970, 0.028599910], [975, 0.028250075], [980, 0.027905884], [985, 0.027567229], [990, 0.027234006], [995, 0.026906112], [1000, 0.026583445], [1005, 0.026265908], [1010, 0.025953405], [1015, 0.025645841], [1020, 0.025343124], [1025, 0.025045163], [1030, 0.024751871], [1035, 0.024463160], [1040, 0.024178947], [1045, 0.023899147], [1050, 0.023623680], [1055, 0.023352467], [1060, 0.023085429], [1065, 0.022822491], [1070, 0.022563577], [1075, 0.022308615], [1080, 0.022057533], [1085, 0.021810260], [1090, 0.021566729], [1095, 0.021326872], [1100, 0.021090622]])
_fCO2eqD47_Petersen = interp1d(Petersen_etal_CO2eqD47[:,0], Petersen_etal_CO2eqD47[:,1])
def fCO2eqD47_Petersen(T):
	'''
	CO2 equilibrium Δ47 value as a function of T (in degrees C)
	according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).

	'''
	return float(_fCO2eqD47_Petersen(T))


Wang_etal_CO2eqD47 = np.array([[-83., 1.8954], [-73., 1.7530], [-63., 1.6261], [-53., 1.5126], [-43., 1.4104], [-33., 1.3182], [-23., 1.2345], [-13., 1.1584], [-3., 1.0888], [7., 1.0251], [17., 0.9665], [27., 0.9125], [37., 0.8626], [47., 0.8164], [57., 0.7734], [67., 0.7334], [87., 0.6612], [97., 0.6286], [107., 0.5980], [117., 0.5693], [127., 0.5423], [137., 0.5169], [147., 0.4930], [157., 0.4704], [167., 0.4491], [177., 0.4289], [187., 0.4098], [197., 0.3918], [207., 0.3747], [217., 0.3585], [227., 0.3431], [237., 0.3285], [247., 0.3147], [257., 0.3015], [267., 0.2890], [277., 0.2771], [287., 0.2657], [297., 0.2550], [307., 0.2447], [317., 0.2349], [327., 0.2256], [337., 0.2167], [347., 0.2083], [357., 0.2002], [367., 0.1925], [377., 0.1851], [387., 0.1781], [397., 0.1714], [407., 0.1650], [417., 0.1589], [427., 0.1530], [437., 0.1474], [447., 0.1421], [457., 0.1370], [467., 0.1321], [477., 0.1274], [487., 0.1229], [497., 0.1186], [507., 0.1145], [517., 0.1105], [527., 0.1068], [537., 0.1031], [547., 0.0997], [557., 0.0963], [567., 0.0931], [577., 0.0901], [587., 0.0871], [597., 0.0843], [607., 0.0816], [617., 0.0790], [627., 0.0765], [637., 0.0741], [647., 0.0718], [657., 0.0695], [667., 0.0674], [677., 0.0654], [687., 0.0634], [697., 0.0615], [707., 0.0597], [717., 0.0579], [727., 0.0562], [737., 0.0546], [747., 0.0530], [757., 0.0515], [767., 0.0500], [777., 0.0486], [787., 0.0472], [797., 0.0459], [807., 0.0447], [817., 0.0435], [827., 0.0423], [837., 0.0411], [847., 0.0400], [857., 0.0390], [867., 0.0380], [877., 0.0370], [887., 0.0360], [897., 0.0351], [907., 0.0342], [917., 0.0333], [927., 0.0325], [937., 0.0317], [947., 0.0309], [957., 0.0302], [967., 0.0294], [977., 0.0287], [987., 0.0281], [997., 0.0274], [1007., 0.0268], [1017., 0.0261], [1027., 0.0255], [1037., 0.0249], [1047., 0.0244], [1057., 0.0238], [1067., 0.0233], [1077., 0.0228], [1087., 0.0223], [1097., 0.0218]])
_fCO2eqD47_Wang = interp1d(Wang_etal_CO2eqD47[:,0] - 0.15, Wang_etal_CO2eqD47[:,1])
def fCO2eqD47_Wang(T):
	'''
	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
	according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
	(supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
	'''
	return float(_fCO2eqD47_Wang(T))


def correlated_sum(X, C, w = None):
	'''
	Compute covariance-aware linear combinations

	**Parameters**

	+ `X`: list or 1-D array of values to sum
	+ `C`: covariance matrix for the elements of `X`
	+ `w`: list or 1-D array of weights to apply to the elements of `X`
	       (all equal to 1 by default)

	Return the sum (and its SE) of the elements of `X`, with optional weights equal
	to the elements of `w`, accounting for covariances between the elements of `X`.
	'''
	if w is None:
		w = [1 for x in X]
	return np.dot(w,X), (np.dot(w,np.dot(C,w)))**.5


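# Example: with default unit weights, the off-diagonal covariance terms
# contribute to the variance of the sum (0.010 + 0.010 + 2 * 0.005 = 0.030):
#
#     correlated_sum([1.0, 2.0], np.array([[0.010, 0.005], [0.005, 0.010]]))
#     # -> (3.0, 0.1732...), i.e. SE = 0.030 ** 0.5
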
def make_csv(x, hsep = ',', vsep = '\n'):
	'''
	Formats a list of lists of strings as a CSV

	**Parameters**

	+ `x`: the list of lists of strings to format
	+ `hsep`: the field separator (`,` by default)
	+ `vsep`: the line-ending convention to use (`\\n` by default)

	**Example**

	```py
	print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
	```

	outputs:

	```py
	a,b,c
	d,e,f
	```
	'''
	return vsep.join([hsep.join(l) for l in x])


def pf(txt):
	'''
	Modify string `txt` to follow `lmfit.Parameter()` naming rules.
	'''
	return txt.replace('-','_').replace('.','_').replace(' ','_')


def smart_type(x):
	'''
	Tries to convert string `x` to a float if it includes a decimal point, or
	to an integer if it does not. If the conversion fails, return the original
	string unchanged.
	'''
	try:
		y = float(x)
	except ValueError:
		return x
	if '.' not in x:
		return int(y)
	return y

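# Examples: smart_type('5.7') -> 5.7 (float); smart_type('42') -> 42 (int);
# smart_type('ETH-1') -> 'ETH-1' (returned unchanged).
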
class _Defaults():
	def __init__(self):
		pass

D47crunch_defaults = _Defaults()
D47crunch_defaults.PRETTY_TABLE_VSEP = '—'

def pretty_table(x, header = 1, hsep = '  ', vsep = None, align = '<'):
	'''
	Reads a list of lists of strings and outputs an ascii table

	**Parameters**

	+ `x`: a list of lists of strings
	+ `header`: the number of lines to treat as header lines
	+ `hsep`: the horizontal separator between columns
	+ `vsep`: the character to use as vertical separator
	+ `align`: string of left (`<`) or right (`>`) alignment characters.

	**Example**

	```py
	print(pretty_table([
		['A', 'B', 'C'],
		['1', '1.9999', 'foo'],
		['10', 'x', 'bar'],
	]))
	```
	yields:
	```
	——  ——————  ———
	A        B    C
	——  ——————  ———
	1   1.9999  foo
	10       x  bar
	——  ——————  ———
	```

	To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`:

	```py
	D47crunch_defaults.PRETTY_TABLE_VSEP = '='
	print(pretty_table([
		['A', 'B', 'C'],
		['1', '1.9999', 'foo'],
		['10', 'x', 'bar'],
	]))
	```
	yields:
	```
	==  ======  ===
	A        B    C
	==  ======  ===
	1   1.9999  foo
	10       x  bar
	==  ======  ===
	```
	'''

	if vsep is None:
		vsep = D47crunch_defaults.PRETTY_TABLE_VSEP

	txt = []
	widths = [np.max([len(e) for e in c]) for c in zip(*x)]

	if len(widths) > len(align):
		align += '>' * (len(widths)-len(align))
	sepline = hsep.join([vsep*w for w in widths])
	txt += [sepline]
	for k,l in enumerate(x):
		if k and k == header:
			txt += [sepline]
		txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
	txt += [sepline]
	txt += ['']
	return '\n'.join(txt)


def transpose_table(x):
	'''
	Transpose a list of lists

	**Parameters**

	+ `x`: a list of lists

	**Example**

	```py
	x = [[1, 2], [3, 4]]
	print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
	```
	'''
	return [[e for e in c] for c in zip(*x)]


def w_avg(X, sX) :
	'''
	Compute variance-weighted average

	Returns the value and SE of the weighted average of the elements of `X`,
	with relative weights equal to their inverse variances (`1/sX**2`).

	**Parameters**

	+ `X`: array-like of elements to average
	+ `sX`: array-like of the corresponding SE values

	**Tip**

	If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
	they may be rearranged using `zip()`:

	```python
	foo = [(0, 1), (1, 0.5), (2, 0.5)]
	print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
	```
	'''
	X = [ x for x in X ]
	sX = [ sx for sx in sX ]
	W = [ sx**-2 for sx in sX ]
	W = [ w/sum(W) for w in W ]
	Xavg = sum([ w*x for w,x in zip(W,X) ])
	sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5
	return Xavg, sXavg


def read_csv(filename, sep = ''):
	'''
	Read contents of `filename` in csv format and return a list of dictionaries.

	In the csv string, spaces before and after field separators (`','` by default)
	are optional.

	**Parameters**

	+ `filename`: the csv file to read
	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
	whichever appears most often in the contents of `filename`.
	'''
	with open(filename) as fid:
		txt = fid.read()

	if sep == '':
		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
	return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]]


def simulate_single_analysis(
	sample = 'MYSAMPLE',
	d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
	d13C_VPDB = None, d18O_VPDB = None,
	D47 = None, D48 = None, D49 = 0., D17O = 0.,
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	Nominal_D47 = None,
	Nominal_D48 = None,
	Nominal_d13C_VPDB = None,
	Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	):
	'''
	Compute working-gas delta values for a single analysis, assuming a stochastic working
	gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

	**Parameters**

	+ `sample`: sample name
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(respectively –4 and +26 ‰ by default)
	+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
		of the carbonate sample
	+ `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and
		Δ48 values if `D47` or `D48` are not specified
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `D4xdata` default values)

	Returns a dictionary with fields
	`['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
	'''

	if Nominal_d13C_VPDB is None:
		Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB

	if Nominal_d18O_VPDB is None:
		Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB

	if ALPHA_18O_ACID_REACTION is None:
		ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION

	if R13_VPDB is None:
		R13_VPDB = D4xdata().R13_VPDB

	if R17_VSMOW is None:
		R17_VSMOW = D4xdata().R17_VSMOW

	if R18_VSMOW is None:
		R18_VSMOW = D4xdata().R18_VSMOW

	if LAMBDA_17 is None:
		LAMBDA_17 = D4xdata().LAMBDA_17

	if R18_VPDB is None:
		R18_VPDB = D4xdata().R18_VPDB

	R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17

	if Nominal_D47 is None:
		Nominal_D47 = D47data().Nominal_D47

	if Nominal_D48 is None:
		Nominal_D48 = D48data().Nominal_D48

	if d13C_VPDB is None:
		if sample in Nominal_d13C_VPDB:
			d13C_VPDB = Nominal_d13C_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")

	if d18O_VPDB is None:
		if sample in Nominal_d18O_VPDB:
			d18O_VPDB = Nominal_d18O_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")

	if D47 is None:
		if sample in Nominal_D47:
			D47 = Nominal_D47[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")

	if D48 is None:
		if sample in Nominal_D48:
			D48 = Nominal_D48[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")

	X = D4xdata()
	X.R13_VPDB = R13_VPDB
	X.R17_VSMOW = R17_VSMOW
	X.R18_VSMOW = R18_VSMOW
	X.LAMBDA_17 = LAMBDA_17
	X.R18_VPDB = R18_VPDB
	X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17

	R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000),
		R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
		)
	R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O=D17O, D47=D47, D48=D48, D49=D49,
		)
	R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O=D17O,
		)

	d45 = 1000 * (R45/R45wg - 1)
	d46 = 1000 * (R46/R46wg - 1)
	d47 = 1000 * (R47/R47wg - 1)
	d48 = 1000 * (R48/R48wg - 1)
	d49 = 1000 * (R49/R49wg - 1)

	for k in range(3): # dumb iteration to adjust for small changes in d47
		R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
		R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch
		d47 = 1000 * (R47raw/R47wg - 1)
		d48 = 1000 * (R48raw/R48wg - 1)

	return dict(
		Sample = sample,
		D17O = D17O,
		d13Cwg_VPDB = d13Cwg_VPDB,
		d18Owg_VSMOW = d18Owg_VSMOW,
		d45 = d45,
		d46 = d46,
		d47 = d47,
		d48 = d48,
		d49 = d49,
		)


def virtual_data(
	samples = [],
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	rd45 = 0.020, rd46 = 0.060,
	rD47 = 0.015, rD48 = 0.045,
	d13Cwg_VPDB = None, d18Owg_VSMOW = None,
	session = None,
	Nominal_D47 = None, Nominal_D48 = None,
	Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	seed = 0,
	shuffle = True,
	):
	'''
	Return list with simulated analyses from a single session.

	**Parameters**

	+ `samples`: a list of entries; each entry is a dictionary with the following fields:
	    * `Sample`: the name of the sample
	    * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	    * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
	    * `N`: how many analyses to generate for this sample
	+ `a47`: scrambling factor for Δ47
	+ `b47`: compositional nonlinearity for Δ47
	+ `c47`: working gas offset for Δ47
	+ `a48`: scrambling factor for Δ48
	+ `b48`: compositional nonlinearity for Δ48
	+ `c48`: working gas offset for Δ48
	+ `rd45`: analytical repeatability of δ45
	+ `rd46`: analytical repeatability of δ46
	+ `rD47`: analytical repeatability of Δ47
	+ `rD48`: analytical repeatability of Δ48
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(by default equal to the `simulate_single_analysis` default values)
	+ `session`: name of the session (no name by default)
	+ `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and Δ48 values
		if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
		(by default equal to the `simulate_single_analysis` defaults)
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
		(by default equal to the `simulate_single_analysis` defaults)
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `simulate_single_analysis` default)
	+ `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
	+ `shuffle`: randomly reorder the sequence of analyses


	Here is an example of using this method to generate an arbitrary combination of
	anchors and unknowns for a bunch of sessions:

	```py
	.. include:: ../../code_examples/virtual_data/example.py
	```

	This should output something like:

	```
	.. include:: ../../code_examples/virtual_data/output.txt
	```
	'''

	kwargs = locals().copy()

	from numpy import random as nprandom
	if seed:
		rng = nprandom.default_rng(seed)
	else:
		rng = nprandom.default_rng()

	N = sum([s['N'] for s in samples])
	errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors45 *= rd45 / stdev(errors45) # scale errors to rd45
	errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors46 *= rd46 / stdev(errors46) # scale errors to rd46
	errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors47 *= rD47 / stdev(errors47) # scale errors to rD47
	errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors48 *= rD48 / stdev(errors48) # scale errors to rD48

	k = 0
	out = []
	for s in samples:
		kw = {}
		kw['sample'] = s['Sample']
		kw = {
			**kw,
			**{var: kwargs[var]
				for var in [
					'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
					'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
					'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
					'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
					]
				if kwargs[var] is not None},
			**{var: s[var]
				for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
				if var in s},
			}

		sN = s['N']
		while sN:
			out.append(simulate_single_analysis(**kw))
			out[-1]['d45'] += errors45[k]
			out[-1]['d46'] += errors46[k]
			out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
			out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
			sN -= 1
			k += 1

		if session is not None:
			for r in out:
				r['Session'] = session

		if shuffle:
			nprandom.shuffle(out)

	return out

def table_of_samples(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of samples
	for a pair of `D47data` and `D48data` objects.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of list of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_samples.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)


 639def table_of_sessions(
 640	data47 = None,
 641	data48 = None,
 642	dir = 'output',
 643	filename = None,
 644	save_to_file = True,
 645	print_out = True,
 646	output = None,
 647	):
 648	'''
 649	Print out, save to disk and/or return a combined table of sessions
 650	for a pair of `D47data` and `D48data` objects.
 651	***Only applicable if the sessions in `data47` and those in `data48`
 652	consist of the exact same sets of analyses.***
 653
 654	**Parameters**
 655
 656	+ `data47`: `D47data` instance
 657	+ `data48`: `D48data` instance
 658	+ `dir`: the directory in which to save the table
 659	+ `filename`: the name to the csv file to write to
 660	+ `save_to_file`: whether to save the table to disk
 661	+ `print_out`: whether to print out the table
 662	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
663		if set to `'raw'`: return a list of lists of strings
 664		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
 665	'''
 666	if data47 is None:
 667		if data48 is None:
 668			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
 669		else:
 670			return data48.table_of_sessions(
 671				dir = dir,
 672				filename = filename,
 673				save_to_file = save_to_file,
 674				print_out = print_out,
 675				output = output
 676				)
 677	else:
 678		if data48 is None:
 679			return data47.table_of_sessions(
 680				dir = dir,
 681				filename = filename,
 682				save_to_file = save_to_file,
 683				print_out = print_out,
 684				output = output
 685				)
 686		else:
 687			out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
 688			out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
 689			for k,x in enumerate(out47[0]):
 690				if k>7:
 691					out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
 692					out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
 693			out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])
 694
 695			if save_to_file:
 696				if not os.path.exists(dir):
 697					os.makedirs(dir)
 698				if filename is None:
 699					filename = f'D47D48_sessions.csv'
 700				with open(f'{dir}/{filename}', 'w') as fid:
 701					fid.write(make_csv(out))
 702			if print_out:
 703				print('\n'+pretty_table(out))
 704			if output == 'raw':
 705				return out
 706			elif output == 'pretty':
 707				return pretty_table(out)
 708
 709
 710def table_of_analyses(
 711	data47 = None,
 712	data48 = None,
 713	dir = 'output',
 714	filename = None,
 715	save_to_file = True,
 716	print_out = True,
 717	output = None,
 718	):
 719	'''
 720	Print out, save to disk and/or return a combined table of analyses
 721	for a pair of `D47data` and `D48data` objects.
 722
 723	If the sessions in `data47` and those in `data48` do not consist of
 724	the exact same sets of analyses, the table will have two columns
 725	`Session_47` and `Session_48` instead of a single `Session` column.
 726
 727	**Parameters**
 728
 729	+ `data47`: `D47data` instance
 730	+ `data48`: `D48data` instance
 731	+ `dir`: the directory in which to save the table
732	+ `filename`: the name of the csv file to write to
 733	+ `save_to_file`: whether to save the table to disk
 734	+ `print_out`: whether to print out the table
 735	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
736		if set to `'raw'`: return a list of lists of strings
 737		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
 738	'''
 739	if data47 is None:
 740		if data48 is None:
 741			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
 742		else:
 743			return data48.table_of_analyses(
 744				dir = dir,
 745				filename = filename,
 746				save_to_file = save_to_file,
 747				print_out = print_out,
 748				output = output
 749				)
 750	else:
 751		if data48 is None:
 752			return data47.table_of_analyses(
 753				dir = dir,
 754				filename = filename,
 755				save_to_file = save_to_file,
 756				print_out = print_out,
 757				output = output
 758				)
 759		else:
 760			out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
 761			out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
 762			
 763			if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
 764				out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
 765			else:
 766				out47[0][1] = 'Session_47'
 767				out48[0][1] = 'Session_48'
 768				out47 = transpose_table(out47)
 769				out48 = transpose_table(out48)
 770				out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])
 771
 772			if save_to_file:
 773				if not os.path.exists(dir):
 774					os.makedirs(dir)
 775				if filename is None:
776					filename = f'D47D48_analyses.csv'
 777				with open(f'{dir}/{filename}', 'w') as fid:
 778					fid.write(make_csv(out))
 779			if print_out:
 780				print('\n'+pretty_table(out))
 781			if output == 'raw':
 782				return out
 783			elif output == 'pretty':
 784				return pretty_table(out)
 785
 786
 787def _fullcovar(minresult, epsilon = 0.01, named = False):
 788	'''
 789	Construct full covariance matrix in the case of constrained parameters
 790	'''
 791	
 792	import asteval
 793	
 794	def f(values):
 795		interp = asteval.Interpreter()
 796		for n,v in zip(minresult.var_names, values):
 797			interp(f'{n} = {v}')
 798		for q in minresult.params:
 799			if minresult.params[q].expr:
 800				interp(f'{q} = {minresult.params[q].expr}')
 801		return np.array([interp.symtable[q] for q in minresult.params])
 802
 803	# construct Jacobian
 804	J = np.zeros((minresult.nvarys, len(minresult.params)))
 805	X = np.array([minresult.params[p].value for p in minresult.var_names])
 806	sX = np.array([minresult.params[p].stderr for p in minresult.var_names])
 807
 808	for j in range(minresult.nvarys):
 809		x1 = [_ for _ in X]
 810		x1[j] += epsilon * sX[j]
 811		x2 = [_ for _ in X]
 812		x2[j] -= epsilon * sX[j]
 813		J[j,:] = (f(x1) - f(x2)) / (2 * epsilon * sX[j])
 814
 815	_names = [q for q in minresult.params]
 816	_covar = J.T @ minresult.covar @ J
 817	_se = np.diag(_covar)**.5
 818	_correl = _covar.copy()
 819	for k,s in enumerate(_se):
 820		if s:
 821			_correl[k,:] /= s
 822			_correl[:,k] /= s
 823
 824	if named:
825		_covar = {i: {j: _covar[_names.index(i), _names.index(j)] for j in _names} for i in _names}
826		_se = {i: _se[_names.index(i)] for i in _names}
827		_correl = {i: {j: _correl[_names.index(i), _names.index(j)] for j in _names} for i in _names}
 828
 829	return _names, _covar, _se, _correl
 830
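# Editorial note: the central-difference Jacobian above implements standard
# first-order error propagation. With J[i,j] = ∂(param j)/∂(varied param i),
# estimated by stepping each varied parameter by ±epsilon times its standard
# error, the full covariance follows as J.T @ covar @ J, exactly as computed
# in `_fullcovar()`.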
 831
 832class D4xdata(list):
 833	'''
 834	Store and process data for a large set of Δ47 and/or Δ48
 835	analyses, usually comprising more than one analytical session.
 836	'''
 837
 838	### 17O CORRECTION PARAMETERS
 839	R13_VPDB = 0.01118  # (Chang & Li, 1990)
 840	'''
 841	Absolute (13C/12C) ratio of VPDB.
 842	By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
 843	'''
 844
 845	R18_VSMOW = 0.0020052  # (Baertschi, 1976)
 846	'''
847	Absolute (18O/16O) ratio of VSMOW.
 848	By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
 849	'''
 850
 851	LAMBDA_17 = 0.528  # (Barkan & Luz, 2005)
 852	'''
 853	Mass-dependent exponent for triple oxygen isotopes.
 854	By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
 855	'''
 856
 857	R17_VSMOW = 0.00038475  # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
 858	'''
859	Absolute (17O/16O) ratio of VSMOW.
 860	By default equal to 0.00038475
 861	([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
 862	rescaled to `R13_VPDB`)
 863	'''
 864
 865	R18_VPDB = R18_VSMOW * 1.03092
 866	'''
867	Absolute (18O/16O) ratio of VPDB.
 868	By definition equal to `R18_VSMOW * 1.03092`.
 869	'''
 870
 871	R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
 872	'''
873	Absolute (17O/16O) ratio of VPDB.
 874	By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
 875	'''
 876
 877	LEVENE_REF_SAMPLE = 'ETH-3'
 878	'''
 879	After the Δ4x standardization step, each sample is tested to
 880	assess whether the Δ4x variance within all analyses for that
 881	sample differs significantly from that observed for a given reference
 882	sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
 883	which yields a p-value corresponding to the null hypothesis that the
 884	underlying variances are equal).
 885
 886	`LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
 887	sample should be used as a reference for this test.
 888	'''
 889
 890	ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6)  # (Kim et al., 2007, calcite)
 891	'''
 892	Specifies the 18O/16O fractionation factor generally applicable
 893	to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
 894	`D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`.
 895
 896	By default equal to 1.008129 (calcite reacted at 90 °C,
 897	[Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
 898	'''
 899
 900	Nominal_d13C_VPDB = {
 901		'ETH-1': 2.02,
 902		'ETH-2': -10.17,
 903		'ETH-3': 1.71,
 904		}	# (Bernasconi et al., 2018)
 905	'''
 906	Nominal δ13C_VPDB values assigned to carbonate standards, used by
 907	`D4xdata.standardize_d13C()`.
 908
 909	By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
 910	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 911	'''
 912
 913	Nominal_d18O_VPDB = {
 914		'ETH-1': -2.19,
 915		'ETH-2': -18.69,
 916		'ETH-3': -1.78,
 917		}	# (Bernasconi et al., 2018)
 918	'''
 919	Nominal δ18O_VPDB values assigned to carbonate standards, used by
 920	`D4xdata.standardize_d18O()`.
 921
 922	By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
 923	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 924	'''
 925
 926	d13C_STANDARDIZATION_METHOD = '2pt'
 927	'''
 928	Method by which to standardize δ13C values:
 929	
930	+ `'none'`: do not apply any δ13C standardization.
 931	+ `'1pt'`: within each session, offset all initial δ13C values so as to
 932	minimize the difference between final δ13C_VPDB values and
 933	`Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
934	+ `'2pt'`: within each session, apply an affine transformation to all δ13C
 935	values so as to minimize the difference between final δ13C_VPDB
 936	values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
 937	is defined).
 938	'''
 939
 940	d18O_STANDARDIZATION_METHOD = '2pt'
 941	'''
 942	Method by which to standardize δ18O values:
 943	
944	+ `'none'`: do not apply any δ18O standardization.
 945	+ `'1pt'`: within each session, offset all initial δ18O values so as to
 946	minimize the difference between final δ18O_VPDB values and
 947	`Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
948	+ `'2pt'`: within each session, apply an affine transformation to all δ18O
 949	values so as to minimize the difference between final δ18O_VPDB
 950	values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
 951	is defined).
 952	'''
 953
 954	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
 955		'''
 956		**Parameters**
 957
 958		+ `l`: a list of dictionaries, with each dictionary including at least the keys
 959		`Sample`, `d45`, `d46`, and `d47` or `d48`.
 960		+ `mass`: `'47'` or `'48'`
 961		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
 962		+ `session`: define session name for analyses without a `Session` key
 963		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
 964
 965		Returns a `D4xdata` object derived from `list`.
 966		'''
 967		self._4x = mass
 968		self.verbose = verbose
 969		self.prefix = 'D4xdata'
 970		self.logfile = logfile
 971		list.__init__(self, l)
 972		self.Nf = None
 973		self.repeatability = {}
 974		self.refresh(session = session)
 975
 976
 977	def make_verbal(oldfun):
 978		'''
 979		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
 980		'''
 981		@wraps(oldfun)
 982		def newfun(*args, verbose = '', **kwargs):
 983			myself = args[0]
 984			oldprefix = myself.prefix
 985			myself.prefix = oldfun.__name__
 986			if verbose != '':
 987				oldverbose = myself.verbose
 988				myself.verbose = verbose
 989			out = oldfun(*args, **kwargs)
 990			myself.prefix = oldprefix
 991			if verbose != '':
 992				myself.verbose = oldverbose
 993			return out
 994		return newfun
 995
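	# Editorial note: any method decorated with `make_verbal` accepts a transient
	# `verbose` keyword overriding `self.verbose` for that call only, e.g.
	# (hypothetical sketch):
	#
	#     mydata.crunch(verbose = True)   # print detailed logs for this call only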
 996
 997	def msg(self, txt):
 998		'''
 999		Log a message to `self.logfile`, and print it out if `verbose = True`
1000		'''
1001		self.log(txt)
1002		if self.verbose:
1003			print(f'{f"[{self.prefix}]":<16} {txt}')
1004
1005
1006	def vmsg(self, txt):
1007		'''
1008		Log a message to `self.logfile` and print it out
1009		'''
1010		self.log(txt)
1011		print(txt)
1012
1013
1014	def log(self, *txts):
1015		'''
1016		Log a message to `self.logfile`
1017		'''
1018		if self.logfile:
1019			with open(self.logfile, 'a') as fid:
1020				for txt in txts:
1021					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
1022
1023
1024	def refresh(self, session = 'mySession'):
1025		'''
1026		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
1027		'''
1028		self.fill_in_missing_info(session = session)
1029		self.refresh_sessions()
1030		self.refresh_samples()
1031
1032
1033	def refresh_sessions(self):
1034		'''
1035		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1036		to `False` for all sessions.
1037		'''
1038		self.sessions = {
1039			s: {'data': [r for r in self if r['Session'] == s]}
1040			for s in sorted({r['Session'] for r in self})
1041			}
1042		for s in self.sessions:
1043			self.sessions[s]['scrambling_drift'] = False
1044			self.sessions[s]['slope_drift'] = False
1045			self.sessions[s]['wg_drift'] = False
1046			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1047			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
1048
1049
1050	def refresh_samples(self):
1051		'''
1052		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1053		'''
1054		self.samples = {
1055			s: {'data': [r for r in self if r['Sample'] == s]}
1056			for s in sorted({r['Sample'] for r in self})
1057			}
1058		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1059		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
1060
1061
1062	def read(self, filename, sep = '', session = ''):
1063		'''
1064		Read file in csv format to load data into a `D47data` object.
1065
1066		In the csv file, spaces before and after field separators (`','` by default)
1067		are optional. Each line corresponds to a single analysis.
1068
1069		The required fields are:
1070
1071		+ `UID`: a unique identifier
1072		+ `Session`: an identifier for the analytical session
1073		+ `Sample`: a sample identifier
1074		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1075
1076	Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1077	VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Any of the working-gas
1078	deltas `d47`, `d48`, `d49` not required above are optional, and set to NaN by default.
1079
1080		**Parameters**
1081
1082	+ `filename`: the path of the file to read
1083		+ `sep`: csv separator delimiting the fields
1084		+ `session`: set `Session` field to this string for all analyses
1085		'''
1086		with open(filename) as fid:
1087			self.input(fid.read(), sep = sep, session = session)
1088
1089
1090	def input(self, txt, sep = '', session = ''):
1091		'''
1092		Read `txt` string in csv format to load analysis data into a `D47data` object.
1093
1094		In the csv string, spaces before and after field separators (`','` by default)
1095		are optional. Each line corresponds to a single analysis.
1096
1097		The required fields are:
1098
1099		+ `UID`: a unique identifier
1100		+ `Session`: an identifier for the analytical session
1101		+ `Sample`: a sample identifier
1102		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1103
1104	Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1105	VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Any of the working-gas
1106	deltas `d47`, `d48`, `d49` not required above are optional, and set to NaN by default.
1107
1108		**Parameters**
1109
1110		+ `txt`: the csv string to read
1111		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1112	whichever appears most often in `txt`.
1113		+ `session`: set `Session` field to this string for all analyses
1114		'''
1115		if sep == '':
1116			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1117		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1118		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1119
1120		if session != '':
1121			for r in data:
1122				r['Session'] = session
1123
1124		self += data
1125		self.refresh()
1126
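	# A hypothetical usage sketch: load two analyses directly from a string,
	# letting the `session` argument assign them to a common session:
	#
	#     mydata.input('UID,Sample,d45,d46,d47\n'
	#                  'A01,ETH-1,5.795,11.627,16.893\n'
	#                  'A02,ETH-2,-6.059,-4.817,-11.635', session = 'Session01')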
1127
1128	@make_verbal
1129	def wg(self, samples = None, a18_acid = None):
1130		'''
1131		Compute bulk composition of the working gas for each session based on
1132		the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1133		`self.Nominal_d18O_VPDB`.
1134		'''
1135
1136		self.msg('Computing WG composition:')
1137
1138		if a18_acid is None:
1139			a18_acid = self.ALPHA_18O_ACID_REACTION
1140		if samples is None:
1141			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1142
1143		assert a18_acid, f'Acid fractionation factor should not be zero.'
1144
1145		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1146		R45R46_standards = {}
1147		for sample in samples:
1148			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1149			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1150			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1151			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1152			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1153
1154			C12_s = 1 / (1 + R13_s)
1155			C13_s = R13_s / (1 + R13_s)
1156			C16_s = 1 / (1 + R17_s + R18_s)
1157			C17_s = R17_s / (1 + R17_s + R18_s)
1158			C18_s = R18_s / (1 + R17_s + R18_s)
1159
1160			C626_s = C12_s * C16_s ** 2
1161			C627_s = 2 * C12_s * C16_s * C17_s
1162			C628_s = 2 * C12_s * C16_s * C18_s
1163			C636_s = C13_s * C16_s ** 2
1164			C637_s = 2 * C13_s * C16_s * C17_s
1165			C727_s = C12_s * C17_s ** 2
1166
1167			R45_s = (C627_s + C636_s) / C626_s
1168			R46_s = (C628_s + C637_s + C727_s) / C626_s
1169			R45R46_standards[sample] = (R45_s, R46_s)
1170		
1171		for s in self.sessions:
1172			db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
1173			assert db, f'No sample from {samples} found in session "{s}".'
1174# 			dbsamples = sorted({r['Sample'] for r in db})
1175
1176			X = [r['d45'] for r in db]
1177			Y = [R45R46_standards[r['Sample']][0] for r in db]
1178			x1, x2 = np.min(X), np.max(X)
1179
1180			if x1 < x2:
1181				wgcoord = x1/(x1-x2)
1182			else:
1183				wgcoord = 999
1184
1185			if wgcoord < -.5 or wgcoord > 1.5:
1186				# unreasonable to extrapolate to d45 = 0
1187				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1188			else :
1189				# d45 = 0 is reasonably well bracketed
1190				R45_wg = np.polyfit(X, Y, 1)[1]
1191
1192			X = [r['d46'] for r in db]
1193			Y = [R45R46_standards[r['Sample']][1] for r in db]
1194			x1, x2 = np.min(X), np.max(X)
1195
1196			if x1 < x2:
1197				wgcoord = x1/(x1-x2)
1198			else:
1199				wgcoord = 999
1200
1201			if wgcoord < -.5 or wgcoord > 1.5:
1202				# unreasonable to extrapolate to d46 = 0
1203				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1204			else :
1205				# d46 = 0 is reasonably well bracketed
1206				R46_wg = np.polyfit(X, Y, 1)[1]
1207
1208			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1209
1210			self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1211
1212			self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1213			self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1214			for r in self.sessions[s]['data']:
1215				r['d13Cwg_VPDB'] = d13Cwg_VPDB
1216				r['d18Owg_VSMOW'] = d18Owg_VSMOW
1217
1218
1219	def compute_bulk_delta(self, R45, R46, D17O = 0):
1220		'''
1221		Compute δ13C_VPDB and δ18O_VSMOW,
1222		by solving the generalized form of equation (17) from
1223		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1224		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
1225		solving the corresponding second-order Taylor polynomial.
1226		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
1227		'''
1228
1229		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1230
1231		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1232		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1233		C = 2 * self.R18_VSMOW
1234		D = -R46
1235
1236		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1237		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1238		cc = A + B + C + D
1239
1240		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1241
1242		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1243		R17 = K * R18 ** self.LAMBDA_17
1244		R13 = R45 - 2 * R17
1245
1246		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1247
1248		return d13C_VPDB, d18O_VSMOW
1249
1250
1251	@make_verbal
1252	def crunch(self, verbose = ''):
1253		'''
1254		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1255		'''
1256		for r in self:
1257			self.compute_bulk_and_clumping_deltas(r)
1258		self.standardize_d13C()
1259		self.standardize_d18O()
1260		self.msg(f"Crunched {len(self)} analyses.")
1261
1262
1263	def fill_in_missing_info(self, session = 'mySession'):
1264		'''
1265		Fill in optional fields with default values
1266		'''
1267		for i,r in enumerate(self):
1268			if 'D17O' not in r:
1269				r['D17O'] = 0.
1270			if 'UID' not in r:
1271				r['UID'] = f'{i+1}'
1272			if 'Session' not in r:
1273				r['Session'] = session
1274			for k in ['d47', 'd48', 'd49']:
1275				if k not in r:
1276					r[k] = np.nan
1277
1278
1279	def standardize_d13C(self):
1280		'''
1281	Perform δ13C standardization within each session `s` according to
1282	`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1283	by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1284	may be redefined arbitrarily at a later stage.
1285		'''
1286		for s in self.sessions:
1287			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1288				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1289				X,Y = zip(*XY)
1290				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1291					offset = np.mean(Y) - np.mean(X)
1292					for r in self.sessions[s]['data']:
1293						r['d13C_VPDB'] += offset				
1294				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1295					a,b = np.polyfit(X,Y,1)
1296					for r in self.sessions[s]['data']:
1297						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b
1298
1299	def standardize_d18O(self):
1300		'''
1301	Perform δ18O standardization within each session `s` according to
1302	`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1303	which is defined by default by `D47data.refresh_sessions()` as equal to
1304	`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1305		'''
1306		for s in self.sessions:
1307			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1308				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1309				X,Y = zip(*XY)
1310				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1311				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1312					offset = np.mean(Y) - np.mean(X)
1313					for r in self.sessions[s]['data']:
1314						r['d18O_VSMOW'] += offset				
1315				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1316					a,b = np.polyfit(X,Y,1)
1317					for r in self.sessions[s]['data']:
1318						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
1319	
1320
1321	def compute_bulk_and_clumping_deltas(self, r):
1322		'''
1323		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1324		'''
1325
1326		# Compute working gas R13, R18, and isobar ratios
1327		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1328		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1329		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1330
1331		# Compute analyte isobar ratios
1332		R45 = (1 + r['d45'] / 1000) * R45_wg
1333		R46 = (1 + r['d46'] / 1000) * R46_wg
1334		R47 = (1 + r['d47'] / 1000) * R47_wg
1335		R48 = (1 + r['d48'] / 1000) * R48_wg
1336		R49 = (1 + r['d49'] / 1000) * R49_wg
1337
1338		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1339		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1340		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1341
1342		# Compute stochastic isobar ratios of the analyte
1343		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1344			R13, R18, D17O = r['D17O']
1345		)
1346
1347		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1348		# and raise a warning if either anomaly exceeds 0.05 ppm (i.e. 5e-8).
1349		if (R45 / R45stoch - 1) > 5e-8:
1350			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1351		if (R46 / R46stoch - 1) > 5e-8:
1352			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1353
1354		# Compute raw clumped isotope anomalies
1355		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1356		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1357		r['D49raw'] = 1000 * (R49 / R49stoch - 1)
1358
1359
1360	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1361		'''
1362		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1363		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1364		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1365		'''
1366
1367		# Compute R17
1368		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1369
1370		# Compute isotope concentrations
1371		C12 = (1 + R13) ** -1
1372		C13 = C12 * R13
1373		C16 = (1 + R17 + R18) ** -1
1374		C17 = C16 * R17
1375		C18 = C16 * R18
1376
1377		# Compute stochastic isotopologue concentrations
1378		C626 = C16 * C12 * C16
1379		C627 = C16 * C12 * C17 * 2
1380		C628 = C16 * C12 * C18 * 2
1381		C636 = C16 * C13 * C16
1382		C637 = C16 * C13 * C17 * 2
1383		C638 = C16 * C13 * C18 * 2
1384		C727 = C17 * C12 * C17
1385		C728 = C17 * C12 * C18 * 2
1386		C737 = C17 * C13 * C17
1387		C738 = C17 * C13 * C18 * 2
1388		C828 = C18 * C12 * C18
1389		C838 = C18 * C13 * C18
1390
1391		# Compute stochastic isobar ratios
1392		R45 = (C636 + C627) / C626
1393		R46 = (C628 + C637 + C727) / C626
1394		R47 = (C638 + C728 + C737) / C626
1395		R48 = (C738 + C828) / C626
1396		R49 = C838 / C626
1397
1398		# Account for stochastic anomalies
1399		R47 *= 1 + D47 / 1000
1400		R48 *= 1 + D48 / 1000
1401		R49 *= 1 + D49 / 1000
1402
1403		# Return isobar ratios
1404		return R45, R46, R47, R48, R49
1405
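	# A hypothetical sketch: stochastic isobar ratios for a gas of VPDB-like
	# bulk composition, then the same gas with a clumped anomaly of +0.6 ‰ on
	# mass 47 (which scales R47 by exactly 1.0006):
	#
	#     d = D47data()
	#     R45, R46, R47, R48, R49 = d.compute_isobar_ratios(d.R13_VPDB, d.R18_VPDB)
	#     R47_clumped = d.compute_isobar_ratios(d.R13_VPDB, d.R18_VPDB, D47 = 0.6)[2]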
1406
1407	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1408		'''
1409		Split unknown samples by UID (treat all analyses as different samples)
1410		or by session (treat analyses of a given sample in different sessions as
1411		different samples).
1412
1413		**Parameters**
1414
1415		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
1416		+ `grouping`: `by_uid` | `by_session`
1417		'''
1418		if samples_to_split == 'all':
1419			samples_to_split = [s for s in self.unknowns]
1420		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1421		self.grouping = grouping.lower()
1422		if self.grouping in gkeys:
1423			gkey = gkeys[self.grouping]
1424		for r in self:
1425			if r['Sample'] in samples_to_split:
1426				r['Sample_original'] = r['Sample']
1427				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1428			elif r['Sample'] in self.unknowns:
1429				r['Sample_original'] = r['Sample']
1430		self.refresh_samples()
1431
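	# A hypothetical workflow sketch: check the between-session consistency of
	# one unknown by treating each session's analyses as a separate sample,
	# standardizing, then recombining the splits:
	#
	#     mydata.split_samples(['IAEA-C1'], grouping = 'by_session')
	#     mydata.standardize()
	#     mydata.unsplit_samples()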
1432
1433	def unsplit_samples(self, tables = False):
1434		'''
1435		Reverse the effects of `D47data.split_samples()`.
1436		
1437		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1438		
1439		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1440		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1441		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1442		effects of `D47data.split_samples()` with `grouping='by_sessions'` (because in
1443	effects of `D47data.split_samples()` with `grouping='by_session'` (because in
1444		'''
1445		unknowns_old = sorted({s for s in self.unknowns})
1446		CM_old = self.standardization.covar[:,:]
1447		VD_old = self.standardization.params.valuesdict().copy()
1448		vars_old = self.standardization.var_names
1449
1450		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1451
1452		Ns = len(vars_old) - len(unknowns_old)
1453		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1454		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1455
1456		W = np.zeros((len(vars_new), len(vars_old)))
1457		W[:Ns,:Ns] = np.eye(Ns)
1458		for u in unknowns_new:
1459			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1460			if self.grouping == 'by_session':
1461				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1462			elif self.grouping == 'by_uid':
1463				weights = [1 for s in splits]
1464			sw = sum(weights)
1465			weights = [w/sw for w in weights]
1466			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1467
1468		CM_new = W @ CM_old @ W.T
1469		V = W @ np.array([[VD_old[k]] for k in vars_old])
1470		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1471
1472		self.standardization.covar = CM_new
1473		self.standardization.params.valuesdict = lambda : VD_new
1474		self.standardization.var_names = vars_new
1475
1476		for r in self:
1477			if r['Sample'] in self.unknowns:
1478				r['Sample_split'] = r['Sample']
1479				r['Sample'] = r['Sample_original']
1480
1481		self.refresh_samples()
1482		self.consolidate_samples()
1483		self.repeatabilities()
1484
1485		if tables:
1486			self.table_of_analyses()
1487			self.table_of_samples()
1488
1489	def assign_timestamps(self):
1490		'''
1491		Assign a time field `t` of type `float` to each analysis.
1492
1493		If `TimeTag` is one of the data fields, `t` is equal within a given session
1494		to `TimeTag` minus the mean value of `TimeTag` for that session.
1495	Otherwise, `TimeTag` defaults to the index of each analysis within its
1496	session, and `t` is defined as above.
1497		'''
1498		for session in self.sessions:
1499			sdata = self.sessions[session]['data']
1500			try:
1501				t0 = np.mean([r['TimeTag'] for r in sdata])
1502				for r in sdata:
1503					r['t'] = r['TimeTag'] - t0
1504			except KeyError:
1505				t0 = (len(sdata)-1)/2
1506				for t,r in enumerate(sdata):
1507					r['t'] = t - t0
1508
1509
1510	def report(self):
1511		'''
1512		Prints a report on the standardization fit.
1513		Only applicable after `D4xdata.standardize(method='pooled')`.
1514		'''
1515		report_fit(self.standardization)
1516
1517
1518	def combine_samples(self, sample_groups):
1519		'''
1520		Combine analyses of different samples to compute weighted average Δ4x
1521		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1522		dictionary.
1523		
1524		Caution: samples are weighted by number of replicate analyses, which is a
1525		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1526		correlated analytical errors for one or more samples).
1527		
1528	Returns a tuple of:
1529		
1530		+ the list of group names
1531		+ an array of the corresponding Δ4x values
1532		+ the corresponding (co)variance matrix
1533		
1534		**Parameters**
1535
1536		+ `sample_groups`: a dictionary of the form:
1537		```py
1538		{'group1': ['sample_1', 'sample_2'],
1539		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1540		```
1541		'''
1542		
1543		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1544		groups = sorted(sample_groups.keys())
1545		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1546		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1547		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1548		W = np.array([
1549			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1550			for j in groups])
1551		D4x_new = W @ D4x_old
1552		CM_new = W @ CM_old @ W.T
1553
1554		return groups, D4x_new[:,0], CM_new
1555		
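	# A hypothetical usage sketch (after standardization): merge two aliquots of
	# the same material into one group while keeping a third sample on its own:
	#
	#     groups, D47_combined, CM = mydata.combine_samples({
	#         'FOO': ['FOO-1', 'FOO-2'],
	#         'BAR': ['BAR-1'],
	#         })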
1556
1557	@make_verbal
1558	def standardize(self,
1559		method = 'pooled',
1560		weighted_sessions = [],
1561		consolidate = True,
1562		consolidate_tables = False,
1563		consolidate_plots = False,
1564		constraints = {},
1565		):
1566		'''
1567	Compute absolute Δ4x values for all replicate analyses and for sample averages.
1568	If the `method` argument is set to `'pooled'`, the standardization processes all sessions
1569	in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1570	i.e. that their true Δ4x value does not change between sessions
1571	([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If the `method` argument is set to
1572	`'indep_sessions'`, the standardization processes each session independently, based only
1573	on anchor analyses.
1574		'''
1575
1576		self.standardization_method = method
1577		self.assign_timestamps()
1578
1579		if method == 'pooled':
1580			if weighted_sessions:
1581				for session_group in weighted_sessions:
1582					if self._4x == '47':
1583						X = D47data([r for r in self if r['Session'] in session_group])
1584					elif self._4x == '48':
1585						X = D48data([r for r in self if r['Session'] in session_group])
1586					X.Nominal_D4x = self.Nominal_D4x.copy()
1587					X.refresh()
1588					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1589					w = np.sqrt(result.redchi)
1590					self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
1591					for r in X:
1592						r[f'wD{self._4x}raw'] *= w
1593			else:
1594				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1595				for r in self:
1596					r[f'wD{self._4x}raw'] = 1.
1597
1598			params = Parameters()
1599			for k,session in enumerate(self.sessions):
1600				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1601				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1602				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1603				s = pf(session)
1604				params.add(f'a_{s}', value = 0.9)
1605				params.add(f'b_{s}', value = 0.)
1606				params.add(f'c_{s}', value = -0.9)
1607				params.add(f'a2_{s}', value = 0.,
1608# 					vary = self.sessions[session]['scrambling_drift'],
1609					)
1610				params.add(f'b2_{s}', value = 0.,
1611# 					vary = self.sessions[session]['slope_drift'],
1612					)
1613				params.add(f'c2_{s}', value = 0.,
1614# 					vary = self.sessions[session]['wg_drift'],
1615					)
1616				if not self.sessions[session]['scrambling_drift']:
1617					params[f'a2_{s}'].expr = '0'
1618				if not self.sessions[session]['slope_drift']:
1619					params[f'b2_{s}'].expr = '0'
1620				if not self.sessions[session]['wg_drift']:
1621					params[f'c2_{s}'].expr = '0'
1622
1623			for sample in self.unknowns:
1624				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1625
1626			for k in constraints:
1627				params[k].expr = constraints[k]
1628
1629			def residuals(p):
1630				R = []
1631				for r in self:
1632					session = pf(r['Session'])
1633					sample = pf(r['Sample'])
1634					if r['Sample'] in self.Nominal_D4x:
1635						R += [ (
1636							r[f'D{self._4x}raw'] - (
1637								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1638								+ p[f'b_{session}'] * r[f'd{self._4x}']
1639								+	p[f'c_{session}']
1640								+ r['t'] * (
1641									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1642									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1643									+	p[f'c2_{session}']
1644									)
1645								)
1646							) / r[f'wD{self._4x}raw'] ]
1647					else:
1648						R += [ (
1649							r[f'D{self._4x}raw'] - (
1650								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1651								+ p[f'b_{session}'] * r[f'd{self._4x}']
1652								+	p[f'c_{session}']
1653								+ r['t'] * (
1654									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1655									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1656									+	p[f'c2_{session}']
1657									)
1658								)
1659							) / r[f'wD{self._4x}raw'] ]
1660				return R
1661
1662			M = Minimizer(residuals, params)
1663			result = M.least_squares()
1664			self.Nf = result.nfree
1665			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1666			new_names, new_covar, new_se = _fullcovar(result)[:3]
1667			result.var_names = new_names
1668			result.covar = new_covar
1669
1670			for r in self:
1671				s = pf(r["Session"])
1672				a = result.params.valuesdict()[f'a_{s}']
1673				b = result.params.valuesdict()[f'b_{s}']
1674				c = result.params.valuesdict()[f'c_{s}']
1675				a2 = result.params.valuesdict()[f'a2_{s}']
1676				b2 = result.params.valuesdict()[f'b2_{s}']
1677				c2 = result.params.valuesdict()[f'c2_{s}']
1678				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1679				
1680
1681			self.standardization = result
1682
1683			for session in self.sessions:
1684				self.sessions[session]['Np'] = 3
1685				for k in ['scrambling', 'slope', 'wg']:
1686					if self.sessions[session][f'{k}_drift']:
1687						self.sessions[session]['Np'] += 1
1688
1689			if consolidate:
1690				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1691			return result
1692
1693
1694		elif method == 'indep_sessions':
1695
1696			if weighted_sessions:
1697				for session_group in weighted_sessions:
1698					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1699					X.Nominal_D4x = self.Nominal_D4x.copy()
1700					X.refresh()
1701					# This is only done to assign r['wD47raw'] for r in X:
1702					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1703					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1704			else:
1705				self.msg('All weights set to 1 ‰')
1706				for r in self:
1707					r[f'wD{self._4x}raw'] = 1
1708
1709			for session in self.sessions:
1710				s = self.sessions[session]
1711				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1712				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1713				s['Np'] = sum(p_active)
1714				sdata = s['data']
1715
1716				A = np.array([
1717					[
1718						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1719						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1720						1 / r[f'wD{self._4x}raw'],
1721						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1722						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1723						r['t'] / r[f'wD{self._4x}raw']
1724						]
1725					for r in sdata if r['Sample'] in self.anchors
1726					])[:,p_active] # only keep columns for the active parameters
1727				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1728				s['Na'] = Y.size
1729				CM = linalg.inv(A.T @ A)
1730				bf = (CM @ A.T @ Y).T[0,:]
1731				k = 0
1732				for n,a in zip(p_names, p_active):
1733					if a:
1734						s[n] = bf[k]
1735# 						self.msg(f'{n} = {bf[k]}')
1736						k += 1
1737					else:
1738						s[n] = 0.
1739# 						self.msg(f'{n} = 0.0')
1740
1741				for r in sdata :
1742					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1743					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1744					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1745
1746				s['CM'] = np.zeros((6,6))
1747				i = 0
1748				k_active = [j for j,a in enumerate(p_active) if a]
1749				for j,a in enumerate(p_active):
1750					if a:
1751						s['CM'][j,k_active] = CM[i,:]
1752						i += 1
1753
1754			if not weighted_sessions:
1755				w = self.rmswd()['rmswd']
1756				for r in self:
1757				r[f'wD{self._4x}'] *= w
1758				r[f'wD{self._4x}raw'] *= w
1759				for session in self.sessions:
1760					self.sessions[session]['CM'] *= w**2
1761
1762			for session in self.sessions:
1763				s = self.sessions[session]
1764				s['SE_a'] = s['CM'][0,0]**.5
1765				s['SE_b'] = s['CM'][1,1]**.5
1766				s['SE_c'] = s['CM'][2,2]**.5
1767				s['SE_a2'] = s['CM'][3,3]**.5
1768				s['SE_b2'] = s['CM'][4,4]**.5
1769				s['SE_c2'] = s['CM'][5,5]**.5
1770
1771			if not weighted_sessions:
1772				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1773			else:
1774				self.Nf = 0
1775				for sg in weighted_sessions:
1776					self.Nf += self.rmswd(sessions = sg)['Nf']
1777
1778			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1779
1780			avgD4x = {
1781				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1782				for sample in self.samples
1783				}
1784			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1785			rD4x = (chi2/self.Nf)**.5
1786			self.repeatability[f'sigma_{self._4x}'] = rD4x
1787
1788			if consolidate:
1789				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1790
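	# A hypothetical usage sketch: allow a linear temporal drift of the WG
	# offset (parameter `c2`) in one session, then run a single pooled fit:
	#
	#     mydata.sessions['Session01']['wg_drift'] = True
	#     mydata.standardize(method = 'pooled')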
1791
1792	def standardization_error(self, session, d4x, D4x, t = 0):
1793		'''
1794		Compute standardization error for a given session and
1795		(δ47, Δ47) composition.
1796		'''
1797		a = self.sessions[session]['a']
1798		b = self.sessions[session]['b']
1799		c = self.sessions[session]['c']
1800		a2 = self.sessions[session]['a2']
1801		b2 = self.sessions[session]['b2']
1802		c2 = self.sessions[session]['c2']
1803		CM = self.sessions[session]['CM']
1804
1805		x, y = D4x, d4x
1806		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1807# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1808		dxdy = -(b+b2*t) / (a+a2*t)
1809		dxdz = 1. / (a+a2*t)
1810		dxda = -x / (a+a2*t)
1811		dxdb = -y / (a+a2*t)
1812		dxdc = -1. / (a+a2*t)
1813		dxda2 = -x * t / (a+a2*t)
1814		dxdb2 = -y * t / (a+a2*t)
1815		dxdc2 = -t / (a+a2*t)
1816		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1817		sx = (V @ CM @ V.T) ** .5
1818		return sx
1819
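	# Editorial note: `sx` above is the first-order propagation of the session
	# parameter uncertainties onto the standardized value: with V the vector of
	# partial derivatives of Δ4x with respect to (a, b, c, a2, b2, c2) and CM
	# their covariance matrix, sx = (V @ CM @ V.T)**.5.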
1820
1821	@make_verbal
1822	def summary(self,
1823		dir = 'output',
1824		filename = None,
1825		save_to_file = True,
1826		print_out = True,
1827		):
1828		'''
1829		Print out and/or save to disk a summary of the standardization results.
1830
1831		**Parameters**
1832
1833		+ `dir`: the directory in which to save the table
1834		+ `filename`: the name of the csv file to write to
1835		+ `save_to_file`: whether to save the table to disk
1836		+ `print_out`: whether to print out the table
1837		'''
1838
1839		out = []
1840		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1841		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1842		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1843		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1844		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1845		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1846		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1847		out += [['Model degrees of freedom', f"{self.Nf}"]]
1848		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1849		out += [['Standardization method', self.standardization_method]]
1850
1851		if save_to_file:
1852			if not os.path.exists(dir):
1853				os.makedirs(dir)
1854			if filename is None:
1855				filename = f'D{self._4x}_summary.csv'
1856			with open(f'{dir}/{filename}', 'w') as fid:
1857				fid.write(make_csv(out))
1858		if print_out:
1859			self.msg('\n' + pretty_table(out, header = 0))
1860
1861
1862	@make_verbal
1863	def table_of_sessions(self,
1864		dir = 'output',
1865		filename = None,
1866		save_to_file = True,
1867		print_out = True,
1868		output = None,
1869		):
1870		'''
1871		Print out and/or save to disk a table of sessions.
1872
1873		**Parameters**
1874
1875		+ `dir`: the directory in which to save the table
1876		+ `filename`: the name of the csv file to write to
1877		+ `save_to_file`: whether to save the table to disk
1878		+ `print_out`: whether to print out the table
1879		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1880		    if set to `'raw'`: return a list of lists of strings
1881		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1882		'''
1883		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1884		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1885		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1886
1887		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1888		if include_a2:
1889			out[-1] += ['a2 ± SE']
1890		if include_b2:
1891			out[-1] += ['b2 ± SE']
1892		if include_c2:
1893			out[-1] += ['c2 ± SE']
1894		for session in self.sessions:
1895			out += [[
1896				session,
1897				f"{self.sessions[session]['Na']}",
1898				f"{self.sessions[session]['Nu']}",
1899				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1900				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1901				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1902				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1903				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1904				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1905				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1906				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1907				]]
1908			if include_a2:
1909				if self.sessions[session]['scrambling_drift']:
1910					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1911				else:
1912					out[-1] += ['']
1913			if include_b2:
1914				if self.sessions[session]['slope_drift']:
1915					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1916				else:
1917					out[-1] += ['']
1918			if include_c2:
1919				if self.sessions[session]['wg_drift']:
1920					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1921				else:
1922					out[-1] += ['']
1923
1924		if save_to_file:
1925			if not os.path.exists(dir):
1926				os.makedirs(dir)
1927			if filename is None:
1928				filename = f'D{self._4x}_sessions.csv'
1929			with open(f'{dir}/{filename}', 'w') as fid:
1930				fid.write(make_csv(out))
1931		if print_out:
1932			self.msg('\n' + pretty_table(out))
1933		if output == 'raw':
1934			return out
1935		elif output == 'pretty':
1936			return pretty_table(out)
1937
1938
1939	@make_verbal
1940	def table_of_analyses(
1941		self,
1942		dir = 'output',
1943		filename = None,
1944		save_to_file = True,
1945		print_out = True,
1946		output = None,
1947		):
1948		'''
1949		Print out and/or save to disk a table of analyses.
1950
1951		**Parameters**
1952
1953		+ `dir`: the directory in which to save the table
1954		+ `filename`: the name of the csv file to write to
1955		+ `save_to_file`: whether to save the table to disk
1956		+ `print_out`: whether to print out the table
1957		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1958		    if set to `'raw'`: return a list of lists of strings
1959		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1960		'''
1961
1962		out = [['UID','Session','Sample']]
1963		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1964		for f in extra_fields:
1965			out[-1] += [f[0]]
1966		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1967		for r in self:
1968			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
1969			for f in extra_fields:
1970				out[-1] += [f"{r[f[0]]:{f[1]}}"]
1971			out[-1] += [
1972				f"{r['d13Cwg_VPDB']:.3f}",
1973				f"{r['d18Owg_VSMOW']:.3f}",
1974				f"{r['d45']:.6f}",
1975				f"{r['d46']:.6f}",
1976				f"{r['d47']:.6f}",
1977				f"{r['d48']:.6f}",
1978				f"{r['d49']:.6f}",
1979				f"{r['d13C_VPDB']:.6f}",
1980				f"{r['d18O_VSMOW']:.6f}",
1981				f"{r['D47raw']:.6f}",
1982				f"{r['D48raw']:.6f}",
1983				f"{r['D49raw']:.6f}",
1984				f"{r[f'D{self._4x}']:.6f}"
1985				]
1986		if save_to_file:
1987			if not os.path.exists(dir):
1988				os.makedirs(dir)
1989			if filename is None:
1990				filename = f'D{self._4x}_analyses.csv'
1991			with open(f'{dir}/{filename}', 'w') as fid:
1992				fid.write(make_csv(out))
1993		if print_out:
1994			self.msg('\n' + pretty_table(out))
1995		return pretty_table(out) if output == 'pretty' else out
1996
1997	@make_verbal
1998	def covar_table(
1999		self,
2000		correl = False,
2001		dir = 'output',
2002		filename = None,
2003		save_to_file = True,
2004		print_out = True,
2005		output = None,
2006		):
2007		'''
2008		Print out, save to disk and/or return the variance-covariance matrix of D4x
2009		for all unknown samples.
2010
2011		**Parameters**
2012
2013		+ `dir`: the directory in which to save the csv
2014		+ `filename`: the name of the csv file to write to
2015		+ `save_to_file`: whether to save the csv
2016		+ `print_out`: whether to print out the matrix
2017		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
2018		    if set to `'raw'`: return a list of lists of strings
2019		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2020		'''
2021		samples = sorted([u for u in self.unknowns])
2022		out = [[''] + samples]
2023		for s1 in samples:
2024			out.append([s1])
2025			for s2 in samples:
2026				if correl:
2027					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
2028				else:
2029					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
2030
2031		if save_to_file:
2032			if not os.path.exists(dir):
2033				os.makedirs(dir)
2034			if filename is None:
2035				if correl:
2036					filename = f'D{self._4x}_correl.csv'
2037				else:
2038					filename = f'D{self._4x}_covar.csv'
2039			with open(f'{dir}/{filename}', 'w') as fid:
2040				fid.write(make_csv(out))
2041		if print_out:
2042			self.msg('\n'+pretty_table(out))
2043		if output == 'raw':
2044			return out
2045		elif output == 'pretty':
2046			return pretty_table(out)
2047
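	# A hypothetical usage sketch: print the correlation matrix of all unknown
	# samples without writing anything to disk:
	#
	#     mydata.covar_table(correl = True, save_to_file = False)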
2048	@make_verbal
2049	def table_of_samples(
2050		self,
2051		dir = 'output',
2052		filename = None,
2053		save_to_file = True,
2054		print_out = True,
2055		output = None,
2056		):
2057		'''
2058		Print out, save to disk and/or return a table of samples.
2059
2060		**Parameters**
2061
2062		+ `dir`: the directory in which to save the csv
2063		+ `filename`: the name of the csv file to write to
2064		+ `save_to_file`: whether to save the csv
2065		+ `print_out`: whether to print out the table
2066		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2067		    if set to `'raw'`: return a list of lists of strings
2068		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2069		'''
2070
2071		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2072		for sample in self.anchors:
2073			out += [[
2074				f"{sample}",
2075				f"{self.samples[sample]['N']}",
2076				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2077				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2078				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2079				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2080				]]
2081		for sample in self.unknowns:
2082			out += [[
2083				f"{sample}",
2084				f"{self.samples[sample]['N']}",
2085				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2086				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2087				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2088				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2089				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2090				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2091				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2092				]]
2093		if save_to_file:
2094			if not os.path.exists(dir):
2095				os.makedirs(dir)
2096			if filename is None:
2097				filename = f'D{self._4x}_samples.csv'
2098			with open(f'{dir}/{filename}', 'w') as fid:
2099				fid.write(make_csv(out))
2100		if print_out:
2101			self.msg('\n'+pretty_table(out))
2102		if output == 'raw':
2103			return out
2104		elif output == 'pretty':
2105			return pretty_table(out)
2106
2107
2108	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2109		'''
2110		Generate session plots and save them to disk.
2111
2112		**Parameters**
2113
2114		+ `dir`: the directory in which to save the plots
2115		+ `figsize`: the width and height (in inches) of each plot
2116		+ `filetype`: 'pdf' or 'png'
2117		+ `dpi`: resolution for PNG output
2118		'''
2119		if not os.path.exists(dir):
2120			os.makedirs(dir)
2121
2122		for session in self.sessions:
2123			sp = self.plot_single_session(session, xylimits = 'constant')
2124			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2125			ppl.close(sp.fig)
2126			
2127
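	# A hypothetical usage sketch: save one PNG per session at higher resolution:
	#
	#     mydata.plot_sessions(dir = 'sessionplots', filetype = 'png', dpi = 200)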
2128
2129	@make_verbal
2130	def consolidate_samples(self):
2131		'''
2132		Compile various statistics for each sample.
2133
2134		For each anchor sample:
2135
2136		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2137		+ `SE_D47` or `SE_D48`: set to zero by definition
2138
2139		For each unknown sample:
2140
2141		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2142		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2143
2144		For each anchor and unknown:
2145
2146		+ `N`: the total number of analyses of this sample
2147		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2148		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2149		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2150		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2151		variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2152		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
2153		'''
2154		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2155		for sample in self.samples:
2156			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2157			if self.samples[sample]['N'] > 1:
2158				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2159
2160			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2161			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2162
2163			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2164			if len(D4x_pop) > 2:
2165				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2166			
2167		if self.standardization_method == 'pooled':
2168			for sample in self.anchors:
2169				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2170				self.samples[sample][f'SE_D{self._4x}'] = 0.
2171			for sample in self.unknowns:
2172				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2173				try:
2174					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2175				except ValueError:
2176					# when `sample` is constrained by self.standardize(constraints = {...}),
2177					# it is no longer listed in self.standardization.var_names.
2178					# Temporary fix: define SE as zero for now
2179				self.samples[sample][f'SE_D{self._4x}'] = 0.
2180
2181		elif self.standardization_method == 'indep_sessions':
2182			for sample in self.anchors:
2183				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2184				self.samples[sample][f'SE_D{self._4x}'] = 0.
2185			for sample in self.unknowns:
2186				self.msg(f'Consolidating sample {sample}')
2187				self.unknowns[sample][f'session_D{self._4x}'] = {}
2188				session_avg = []
2189				for session in self.sessions:
2190					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2191					if sdata:
2192						self.msg(f'{sample} found in session {session}')
2193						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2194						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2195						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2196						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2197						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2198						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2199						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2200				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2201				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2202				wsum = sum([weights[s] for s in weights])
2203				for s in weights:
2204					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2205
2206		for r in self:
2207			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
2208
2209
2210
2211	def consolidate_sessions(self):
2212		'''
2213		Compute various statistics for each session.
2214
2215		+ `Na`: Number of anchor analyses in the session
2216		+ `Nu`: Number of unknown analyses in the session
2217		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2218		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2219		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2220		+ `a`: scrambling factor
2221		+ `b`: compositional slope
2222		+ `c`: WG offset
2223	+ `SE_a`: Model standard error of `a`
2224	+ `SE_b`: Model standard error of `b`
2225	+ `SE_c`: Model standard error of `c`
2226		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2227		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2228		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2229		+ `a2`: scrambling factor drift
2230		+ `b2`: compositional slope drift
2231		+ `c2`: WG offset drift
2232		+ `Np`: Number of standardization parameters to fit
2233		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2234		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2235		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
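
		After standardization (with the default `pooled` method), these
		session-level statistics may be inspected directly, e.g. (a sketch
		assuming a standardized `D47data` instance named `mydata`):

		```py
		for s in mydata.sessions:
			print(s, mydata.sessions[s]['a'], mydata.sessions[s]['SE_a'])
		```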
2236		'''
2237		for session in self.sessions:
2238			if 'd13Cwg_VPDB' not in self.sessions[session]:
2239				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2240			if 'd18Owg_VSMOW' not in self.sessions[session]:
2241				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2242			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2243			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2244
2245			self.msg(f'Computing repeatabilities for session {session}')
2246			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2247			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2248			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2249
2250		if self.standardization_method == 'pooled':
2251			for session in self.sessions:
2252
2253				# different (better?) computation of D4x repeatability for each session:
2254				sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
2255				self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5
2256
2257				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2258				i = self.standardization.var_names.index(f'a_{pf(session)}')
2259				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2260
2261				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2262				i = self.standardization.var_names.index(f'b_{pf(session)}')
2263				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2264
2265				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2266				i = self.standardization.var_names.index(f'c_{pf(session)}')
2267				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2268
2269				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2270				if self.sessions[session]['scrambling_drift']:
2271					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2272					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2273				else:
2274					self.sessions[session]['SE_a2'] = 0.
2275
2276				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2277				if self.sessions[session]['slope_drift']:
2278					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2279					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2280				else:
2281					self.sessions[session]['SE_b2'] = 0.
2282
2283				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2284				if self.sessions[session]['wg_drift']:
2285					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2286					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2287				else:
2288					self.sessions[session]['SE_c2'] = 0.
2289
2290				i = self.standardization.var_names.index(f'a_{pf(session)}')
2291				j = self.standardization.var_names.index(f'b_{pf(session)}')
2292				k = self.standardization.var_names.index(f'c_{pf(session)}')
2293				CM = np.zeros((6,6))
2294				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2295				try:
2296					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2297					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2298					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2299					try:
2300						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2301						CM[3,4] = self.standardization.covar[i2,j2]
2302						CM[4,3] = self.standardization.covar[j2,i2]
2303					except ValueError:
2304						pass
2305					try:
2306						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2307						CM[3,5] = self.standardization.covar[i2,k2]
2308						CM[5,3] = self.standardization.covar[k2,i2]
2309					except ValueError:
2310						pass
2311				except ValueError:
2312					pass
2313				try:
2314					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2315					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2316					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2317					try:
2318						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2319						CM[4,5] = self.standardization.covar[j2,k2]
2320						CM[5,4] = self.standardization.covar[k2,j2]
2321					except ValueError:
2322						pass
2323				except ValueError:
2324					pass
2325				try:
2326					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2327					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2328					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2329				except ValueError:
2330					pass
2331
2332				self.sessions[session]['CM'] = CM
2333
2334		elif self.standardization_method == 'indep_sessions':
2335			pass # Not implemented yet
2336
2337
2338	@make_verbal
2339	def repeatabilities(self):
2340		'''
2341		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2342		(for all samples, for anchors, and for unknowns).
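
		Results are stored in the `self.repeatability` dictionary, e.g.
		(a sketch assuming a standardized `D47data` instance named `mydata`):

		```py
		mydata.repeatabilities()
		print(mydata.repeatability['r_D47'])   # pooled Δ47 repeatability (all samples)
		print(mydata.repeatability['r_D47a'])  # Δ47 repeatability of anchors only
		```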
2343		'''
2344		self.msg('Computing repeatabilities for all sessions')
2345
2346		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2347		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2348		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2349		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2350		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
2351
2352
2353	@make_verbal
2354	def consolidate(self, tables = True, plots = True):
2355		'''
2356		Collect information about samples, sessions and repeatabilities.
2357		'''
2358		self.consolidate_samples()
2359		self.consolidate_sessions()
2360		self.repeatabilities()
2361
2362		if tables:
2363			self.summary()
2364			self.table_of_sessions()
2365			self.table_of_analyses()
2366			self.table_of_samples()
2367
2368		if plots:
2369			self.plot_sessions()
2370
2371
2372	@make_verbal
2373	def rmswd(self,
2374		samples = 'all samples',
2375		sessions = 'all sessions',
2376		):
2377		'''
2378		Compute the χ2, root mean squared weighted deviation
2379		(i.e. reduced χ2), and corresponding degrees of freedom of the
2380		Δ4x values for samples in `samples` and sessions in `sessions`.
2381		
2382		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
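
		A usage sketch (assuming a `D47data` instance named `mydata`,
		standardized with `method = 'indep_sessions'`):

		```py
		stats = mydata.rmswd(samples = 'anchors')
		print(stats['rmswd'], stats['chisq'], stats['Nf'])
		```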
2383		'''
2384		if samples == 'all samples':
2385			mysamples = [k for k in self.samples]
2386		elif samples == 'anchors':
2387			mysamples = [k for k in self.anchors]
2388		elif samples == 'unknowns':
2389			mysamples = [k for k in self.unknowns]
2390		else:
2391			mysamples = samples
2392
2393		if sessions == 'all sessions':
2394			sessions = [k for k in self.sessions]
2395
2396		chisq, Nf = 0, 0
2397		for sample in mysamples :
2398			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2399			if len(G) > 1 :
2400				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2401				Nf += (len(G) - 1)
2402				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2403		r = (chisq / Nf)**.5 if Nf > 0 else 0
2404		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2405		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
2406
2407	
2408	@make_verbal
2409	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2410		'''
2411		Compute the repeatability of `[r[key] for r in self]`
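
		A usage sketch (assuming a standardized `D47data` instance named `mydata`):

		```py
		r47_anchors = mydata.compute_r('D47', samples = 'anchors')
		r13C = mydata.compute_r('d13C_VPDB')
		```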
2412		'''
2413
2414		if samples == 'all samples':
2415			mysamples = [k for k in self.samples]
2416		elif samples == 'anchors':
2417			mysamples = [k for k in self.anchors]
2418		elif samples == 'unknowns':
2419			mysamples = [k for k in self.unknowns]
2420		else:
2421			mysamples = samples
2422
2423		if sessions == 'all sessions':
2424			sessions = [k for k in self.sessions]
2425
2426		if key in ['D47', 'D48']:
2427			# Full disclosure: the definition of Nf is tricky/debatable
2428			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2429			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2430			Nf = len(G)
2431# 			print(f'len(G) = {Nf}')
2432			Nf -= len([s for s in mysamples if s in self.unknowns])
2433# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2434			for session in sessions:
2435				Np = len([
2436					_ for _ in self.standardization.params
2437					if (
2438						self.standardization.params[_].expr is not None
2439						and (
2440							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2441							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2442							)
2443						)
2444					])
2445# 				print(f'session {session}: {Np} parameters to consider')
2446				Na = len({
2447					r['Sample'] for r in self.sessions[session]['data']
2448					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2449					})
2450# 				print(f'session {session}: {Na} different anchors in that session')
2451				Nf -= min(Np, Na)
2452# 			print(f'Nf = {Nf}')
2453
2454# 			for sample in mysamples :
2455# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2456# 				if len(X) > 1 :
2457# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2458# 					if sample in self.unknowns:
2459# 						Nf += len(X) - 1
2460# 					else:
2461# 						Nf += len(X)
2462# 			if samples in ['anchors', 'all samples']:
2463# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2464			r = (chisq / Nf)**.5 if Nf > 0 else 0
2465
2466		else: # if key not in ['D47', 'D48']
2467			chisq, Nf = 0, 0
2468			for sample in mysamples :
2469				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2470				if len(X) > 1 :
2471					Nf += len(X) - 1
2472					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2473			r = (chisq / Nf)**.5 if Nf > 0 else 0
2474
2475		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2476		return r
2477
2478	def sample_average(self, samples, weights = 'equal', normalize = True):
2479		'''
2480		Weighted average Δ4x value of a group of samples, accounting for covariance.
2481
2482		Returns the weighted average Δ4x value and associated SE
2483		of a group of samples. Weights are equal by default. If `normalize` is
2484		true, `weights` will be rescaled so that their sum equals 1.
2485
2486		**Examples**
2487
2488		```python
2489		self.sample_average(['X','Y'], [1, 2])
2490		```
2491
2492		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2493		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2494		values of samples X and Y, respectively.
2495
2496		```python
2497		self.sample_average(['X','Y'], [1, -1], normalize = False)
2498		```
2499
2500		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2501		'''
2502		if weights == 'equal':
2503			weights = [1/len(samples)] * len(samples)
2504
2505		if normalize:
2506			s = sum(weights)
2507			if s:
2508				weights = [w/s for w in weights]
2509
2510		try:
2511# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2512# 			C = self.standardization.covar[indices,:][:,indices]
2513			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2514			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2515			return correlated_sum(X, C, weights)
2516		except ValueError:
2517			return (0., 0.)
2518
2519
2520	def sample_D4x_covar(self, sample1, sample2 = None):
2521		'''
2522		Covariance between Δ4x values of samples
2523
2524		Returns the error covariance between the average Δ4x values of two
2525		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2526		returns the Δ4x variance for that sample.
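
		A usage sketch (assuming a standardized `D47data` instance named `mydata`
		with unknowns `MYSAMPLE-1` and `MYSAMPLE-2`):

		```py
		var1 = mydata.sample_D4x_covar('MYSAMPLE-1')                 # Δ47 variance of MYSAMPLE-1
		cov12 = mydata.sample_D4x_covar('MYSAMPLE-1', 'MYSAMPLE-2')  # Δ47 error covariance
		```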
2527		'''
2528		if sample2 is None:
2529			sample2 = sample1
2530		if self.standardization_method == 'pooled':
2531			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2532			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2533			return self.standardization.covar[i, j]
2534		elif self.standardization_method == 'indep_sessions':
2535			if sample1 == sample2:
2536				return self.samples[sample1][f'SE_D{self._4x}']**2
2537			else:
2538				c = 0
2539				for session in self.sessions:
2540					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2541					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2542					if sdata1 and sdata2:
2543						a = self.sessions[session]['a']
2544						# !! TODO: CM below does not account for temporal changes in standardization parameters
2545						CM = self.sessions[session]['CM'][:3,:3]
2546						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2547						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2548						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2549						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2550						c += (
2551							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2552							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2553							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2554							@ CM
2555							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2556							) / a**2
2557				return float(c)
2558
2559	def sample_D4x_correl(self, sample1, sample2 = None):
2560		'''
2561		Correlation between Δ4x errors of samples
2562
2563		Returns the error correlation between the average Δ4x values of two samples.
2564		'''
2565		if sample2 is None or sample2 == sample1:
2566			return 1.
2567		return (
2568			self.sample_D4x_covar(sample1, sample2)
2569			/ self.unknowns[sample1][f'SE_D{self._4x}']
2570			/ self.unknowns[sample2][f'SE_D{self._4x}']
2571			)
2572
2573	def plot_single_session(self,
2574		session,
2575		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2576		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2577		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2578		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2579		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2580		xylimits = 'free', # | 'constant'
2581		x_label = None,
2582		y_label = None,
2583		error_contour_interval = 'auto',
2584		fig = 'new',
2585		):
2586		'''
2587		Generate plot for a single session
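
		A usage sketch (assuming a standardized `D47data` instance named `mydata`
		comprising a session named `Session01`):

		```py
		sp = mydata.plot_single_session('Session01', xylimits = 'constant')
		```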
2588		'''
2589		if x_label is None:
2590			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2591		if y_label is None:
2592			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2593
2594		out = _SessionPlot()
2595		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2596		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2597		anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2598		anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2599		unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2600		unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2601		anchor_avg = (np.array([ np.array([
2602				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2603				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2604				]) for sample in anchors]).T,
2605			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
2606		unknown_avg = (np.array([ np.array([
2607				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2608				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2609				]) for sample in unknowns]).T,
2610			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)
2611		
2612		
2613		if fig == 'new':
2614			out.fig = ppl.figure(figsize = (6,6))
2615			ppl.subplots_adjust(.1,.1,.9,.9)
2616
2617		out.anchor_analyses, = ppl.plot(
2618			anchors_d,
2619			anchors_D,
2620			**kw_plot_anchors)
2621		out.unknown_analyses, = ppl.plot(
2622			unknowns_d,
2623			unknowns_D,
2624			**kw_plot_unknowns)
2625		out.anchor_avg = ppl.plot(
2626			*anchor_avg,
2627			**kw_plot_anchor_avg)
2628		out.unknown_avg = ppl.plot(
2629			*unknown_avg,
2630			**kw_plot_unknown_avg)
2631		if xylimits == 'constant':
2632			x = [r[f'd{self._4x}'] for r in self]
2633			y = [r[f'D{self._4x}'] for r in self]
2634			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2635			w, h = x2-x1, y2-y1
2636			x1 -= w/20
2637			x2 += w/20
2638			y1 -= h/20
2639			y2 += h/20
2640			ppl.axis([x1, x2, y1, y2])
2641		elif xylimits == 'free':
2642			x1, x2, y1, y2 = ppl.axis()
2643		else:
2644			x1, x2, y1, y2 = ppl.axis(xylimits)
2645				
2646		if error_contour_interval != 'none':
2647			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2648			XI,YI = np.meshgrid(xi, yi)
2649			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2650			if error_contour_interval == 'auto':
2651				rng = np.max(SI) - np.min(SI)
2652				if rng <= 0.01:
2653					cinterval = 0.001
2654				elif rng <= 0.03:
2655					cinterval = 0.004
2656				elif rng <= 0.1:
2657					cinterval = 0.01
2658				elif rng <= 0.3:
2659					cinterval = 0.03
2660				elif rng <= 1.:
2661					cinterval = 0.1
2662				else:
2663					cinterval = 0.5
2664			else:
2665				cinterval = error_contour_interval
2666
2667			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2668			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2669			out.clabel = ppl.clabel(out.contour)
2670			contour = (XI, YI, SI, cval, cinterval)
2671
2672		if fig is None:
2673			return {
2674			'anchors':anchors,
2675			'unknowns':unknowns,
2676			'anchors_d':anchors_d,
2677			'anchors_D':anchors_D,
2678			'unknowns_d':unknowns_d,
2679			'unknowns_D':unknowns_D,
2680			'anchor_avg':anchor_avg,
2681			'unknown_avg':unknown_avg,
2682			'contour':contour,
2683			}
2684
2685		ppl.xlabel(x_label)
2686		ppl.ylabel(y_label)
2687		ppl.title(session, weight = 'bold')
2688		ppl.grid(alpha = .2)
2689		out.ax = ppl.gca()		
2690
2691		return out
2692
2693	def plot_residuals(
2694		self,
2695		kde = False,
2696		hist = False,
2697		binwidth = 2/3,
2698		dir = 'output',
2699		filename = None,
2700		highlight = [],
2701		colors = None,
2702		figsize = None,
2703		dpi = 100,
2704		yspan = None,
2705		):
2706		'''
2707		Plot residuals of each analysis as a function of time (actually, as a function of
2708		the order of analyses in the `D4xdata` object)
2709
2710		+ `kde`: whether to add a kernel density estimate of residuals
2711		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2712		+ `binwidth`: histogram bin width, expressed as a multiple of the Δ4x repeatability (by default: 2/3)
2713		+ `dir`: the directory in which to save the plot
2714		+ `highlight`: a list of samples to highlight
2715		+ `colors`: a dict of `{<sample>: <color>}` for all samples
2716		+ `figsize`: (width, height) of figure
2717		+ `dpi`: resolution for PNG output
2718		+ `yspan`: factor controlling the range of y values shown in plot
2719		  (by default: `yspan = 1.5 if kde else 1.0`)
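
		A usage sketch (assuming a standardized `D47data` instance named `mydata`):

		```py
		mydata.plot_residuals(filename = 'residuals.pdf', kde = True)
		```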
2720		'''
2721		
2722		from matplotlib import ticker
2723
2724		if yspan is None:
2725			if kde:
2726				yspan = 1.5
2727			else:
2728				yspan = 1.0
2729		
2730		# Layout
2731		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2732		if hist or kde:
2733			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2734			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2735		else:
2736			ppl.subplots_adjust(.08,.05,.78,.8)
2737			ax1 = ppl.subplot(111)
2738		
2739		# Colors
2740		N = len(self.anchors)
2741		if colors is None:
2742			if len(highlight) > 0:
2743				Nh = len(highlight)
2744				if Nh == 1:
2745					colors = {highlight[0]: (0,0,0)}
2746				elif Nh == 3:
2747					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2748				elif Nh == 4:
2749					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2750				else:
2751					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2752			else:
2753				if N == 3:
2754					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2755				elif N == 4:
2756					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2757				else:
2758					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2759
2760		ppl.sca(ax1)
2761		
2762		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2763
2764		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2765
2766		session = self[0]['Session']
2767		x1 = 0
2768# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2769		x_sessions = {}
2770		one_or_more_singlets = False
2771		one_or_more_multiplets = False
2772		multiplets = set()
2773		for k,r in enumerate(self):
2774			if r['Session'] != session:
2775				x2 = k-1
2776				x_sessions[session] = (x1+x2)/2
2777				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2778				session = r['Session']
2779				x1 = k
2780			singlet = len(self.samples[r['Sample']]['data']) == 1
2781			if not singlet:
2782				multiplets.add(r['Sample'])
2783			if r['Sample'] in self.unknowns:
2784				if singlet:
2785					one_or_more_singlets = True
2786				else:
2787					one_or_more_multiplets = True
2788			kw = dict(
2789				marker = 'x' if singlet else '+',
2790				ms = 4 if singlet else 5,
2791				ls = 'None',
2792				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2793				mew = 1,
2794				alpha = 0.2 if singlet else 1,
2795				)
2796			if highlight and r['Sample'] not in highlight:
2797				kw['alpha'] = 0.2
2798			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2799		x2 = k
2800		x_sessions[session] = (x1+x2)/2
2801
2802		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2803		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2804		if not (hist or kde):
2805			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2806			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2807
2808		xmin, xmax, ymin, ymax = ppl.axis()
2809		if yspan != 1:
2810			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2811		for s in x_sessions:
2812			ppl.text(
2813				x_sessions[s],
2814				ymax +1,
2815				s,
2816				va = 'bottom',
2817				**(
2818					dict(ha = 'center')
2819					if len(self.sessions[s]['data']) > (0.15 * len(self))
2820					else dict(ha = 'left', rotation = 45)
2821					)
2822				)
2823
2824		if hist or kde:
2825			ppl.sca(ax2)
2826
2827		for s in colors:
2828			kw['marker'] = '+'
2829			kw['ms'] = 5
2830			kw['mec'] = colors[s]
2831			kw['label'] = s
2832			kw['alpha'] = 1
2833			ppl.plot([], [], **kw)
2834
2835		kw['mec'] = (0,0,0)
2836
2837		if one_or_more_singlets:
2838			kw['marker'] = 'x'
2839			kw['ms'] = 4
2840			kw['alpha'] = .2
2841			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2842			ppl.plot([], [], **kw)
2843
2844		if one_or_more_multiplets:
2845			kw['marker'] = '+'
2846			kw['ms'] = 4
2847			kw['alpha'] = 1
2848			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2849			ppl.plot([], [], **kw)
2850
2851		if hist or kde:
2852			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2853		else:
2854			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2855		leg.set_zorder(-1000)
2856
2857		ppl.sca(ax1)
2858
2859		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2860		ppl.xticks([])
2861		ppl.axis([-1, len(self), None, None])
2862
2863		if hist or kde:
2864			ppl.sca(ax2)
2865			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2866
2867			if kde:
2868				from scipy.stats import gaussian_kde
2869				yi = np.linspace(ymin, ymax, 201)
2870				xi = gaussian_kde(X).evaluate(yi)
2871				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2872# 				ppl.plot(xi, yi, 'k-', lw = 1)
2873			elif hist:
2874				ppl.hist(
2875					X,
2876					orientation = 'horizontal',
2877					histtype = 'stepfilled',
2878					ec = [.4]*3,
2879					fc = [.25]*3,
2880					alpha = .25,
2881					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2882					)
2883			ppl.text(0, 0,
2884				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2885				size = 7.5,
2886				alpha = 1,
2887				va = 'center',
2888				ha = 'left',
2889				)
2890
2891			ppl.axis([0, None, ymin, ymax])
2892			ppl.xticks([])
2893			ppl.yticks([])
2894# 			ax2.spines['left'].set_visible(False)
2895			ax2.spines['right'].set_visible(False)
2896			ax2.spines['top'].set_visible(False)
2897			ax2.spines['bottom'].set_visible(False)
2898
2899		ax1.axis([None, None, ymin, ymax])
2900
2901		if not os.path.exists(dir):
2902			os.makedirs(dir)
2903		if filename is None:
2904			return fig
2905		elif filename == '':
2906			filename = f'D{self._4x}_residuals.pdf'
2907		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2908		ppl.close(fig)
2909				
2910
2911	def simulate(self, *args, **kwargs):
2912		'''
2913		Legacy function with warning message pointing to `virtual_data()`
2914		'''
2915		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
2916
2917	def plot_distribution_of_analyses(
2918		self,
2919		dir = 'output',
2920		filename = None,
2921		vs_time = False,
2922		figsize = (6,4),
2923		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
2924		output = None,
2925		dpi = 100,
2926		):
2927		'''
2928		Plot temporal distribution of all analyses in the data set.
2929		
2930		**Parameters**
2931
2932		+ `dir`: the directory in which to save the plot
2933		+ `filename`: the name of the file to save the plot to (by default: `D4x_distribution_of_analyses.pdf`)
2934		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
2935		+ `figsize`: (width, height) of figure
2936		+ `dpi`: resolution for PNG output
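
		A usage sketch (assuming a standardized `D47data` instance named `mydata`;
		with `vs_time = True`, each analysis must carry a `TimeTag` field):

		```py
		mydata.plot_distribution_of_analyses(vs_time = True, figsize = (8,4))
		```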
2937		'''
2938
2939		asamples = [s for s in self.anchors]
2940		usamples = [s for s in self.unknowns]
2941		if output is None or output == 'fig':
2942			fig = ppl.figure(figsize = figsize)
2943			ppl.subplots_adjust(*subplots_adjust)
2944		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2945		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2946		Xmax += (Xmax-Xmin)/40
2947		Xmin -= (Xmax-Xmin)/41
2948		for k, s in enumerate(asamples + usamples):
2949			if vs_time:
2950				X = [r['TimeTag'] for r in self if r['Sample'] == s]
2951			else:
2952				X = [x for x,r in enumerate(self) if r['Sample'] == s]
2953			Y = [-k for x in X]
2954			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
2955			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
2956			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
2957		ppl.axis([Xmin, Xmax, -k-1, 1])
2958		ppl.xlabel('\ntime')
2959		ppl.gca().annotate('',
2960			xy = (0.6, -0.02),
2961			xycoords = 'axes fraction',
2962			xytext = (.4, -0.02), 
2963            arrowprops = dict(arrowstyle = "->", color = 'k'),
2964            )
2965			
2966
2967		x2 = -1
2968		for session in self.sessions:
2969			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2970			if vs_time:
2971				ppl.axvline(x1, color = 'k', lw = .75)
2972			if x2 > -1:
2973				if not vs_time:
2974					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
2975			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2976# 			from xlrd import xldate_as_datetime
2977# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
2978			if vs_time:
2979				ppl.axvline(x2, color = 'k', lw = .75)
2980				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
2981			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
2982
2983		ppl.xticks([])
2984		ppl.yticks([])
2985
2986		if output is None:
2987			if not os.path.exists(dir):
2988				os.makedirs(dir)
2989			if filename is None:
2990				filename = f'D{self._4x}_distribution_of_analyses.pdf'
2991			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2992			ppl.close(fig)
2993		elif output == 'ax':
2994			return ppl.gca()
2995		elif output == 'fig':
2996			return fig
2997
2998
2999	def plot_bulk_compositions(
3000		self,
3001		samples = None,
3002		dir = 'output/bulk_compositions',
3003		figsize = (6,6),
3004		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
3005		show = False,
3006		sample_color = (0,.5,1),
3007		analysis_color = (.7,.7,.7),
3008		labeldist = 0.3,
3009		radius = 0.05,
3010		):
3011		'''
3012		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
3013		
3014		By default, creates a directory `./output/bulk_compositions` where plots for
3015		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
3016		
3017		
3018		**Parameters**
3019
3020		+ `samples`: Only these samples are processed (by default: all samples).
3021		+ `dir`: where to save the plots
3022		+ `figsize`: (width, height) of figure
3023		+ `subplots_adjust`: passed to `subplots_adjust()`
3024		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
3025		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
3026		+ `sample_color`: color used for sample (average) markers/labels
3027		+ `analysis_color`: color used for individual analysis markers/labels
3028		+ `labeldist`: distance (in inches) from replicate markers to replicate labels
3029		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
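
		A usage sketch (assuming a standardized `D47data` instance named `mydata`):

		```py
		mydata.plot_bulk_compositions(show = True, radius = 0.05)
		```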
3030		'''
3031
3032		from matplotlib.patches import Ellipse
3033
3034		if samples is None:
3035			samples = [_ for _ in self.samples]
3036
3037		saved = {}
3038
3039		for s in samples:
3040
3041			fig = ppl.figure(figsize = figsize)
3042			fig.subplots_adjust(*subplots_adjust)
3043			ax = ppl.subplot(111)
3044			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3045			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3046			ppl.title(s)
3047
3048
3049			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
3050			UID = [_['UID'] for _ in self.samples[s]['data']]
3051			XY0 = XY.mean(0)
3052
3053			for xy in XY:
3054				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
3055				
3056			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
3057			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
3058			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3059			saved[s] = [XY, XY0]
3060			
3061			x1, x2, y1, y2 = ppl.axis()
3062			x0, dx = (x1+x2)/2, (x2-x1)/2
3063			y0, dy = (y1+y2)/2, (y2-y1)/2
3064			dx, dy = [max(max(dx, dy), radius)]*2
3065
3066			ppl.axis([
3067				x0 - 1.2*dx,
3068				x0 + 1.2*dx,
3069				y0 - 1.2*dy,
3070				y0 + 1.2*dy,
3071				])			
3072
3073			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3074
3075			for xy, uid in zip(XY, UID):
3076
3077				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3078				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3079
3080				if (vector_in_display_space**2).sum() > 0:
3081
3082					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3083					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3084					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3085					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3086
3087					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3088
3089				else:
3090
3091					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3092
3093			if radius:
3094				ax.add_artist(Ellipse(
3095					xy = XY0,
3096					width = radius*2,
3097					height = radius*2,
3098					ls = (0, (2,2)),
3099					lw = .7,
3100					ec = analysis_color,
3101					fc = 'None',
3102					))
3103				ppl.text(
3104					XY0[0],
3105					XY0[1]-radius,
3106					f'\n± {radius*1e3:.0f} ppm',
3107					color = analysis_color,
3108					va = 'top',
3109					ha = 'center',
3110					linespacing = 0.4,
3111					size = 8,
3112					)
3113
3114			if not os.path.exists(dir):
3115				os.makedirs(dir)
3116			fig.savefig(f'{dir}/{s}.pdf')
3117			ppl.close(fig)
3118
3119		fig = ppl.figure(figsize = figsize)
3120		fig.subplots_adjust(*subplots_adjust)
3121		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3122		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3123
3124		for s in saved:
3125			for xy in saved[s][0]:
3126				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3127			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3128			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3129			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3130
3131		x1, x2, y1, y2 = ppl.axis()
3132		ppl.axis([
3133			x1 - (x2-x1)/10,
3134			x2 + (x2-x1)/10,
3135			y1 - (y2-y1)/10,
3136			y2 + (y2-y1)/10,
3137			])			
3138
3139
3140		if not os.path.exists(dir):
3141			os.makedirs(dir)
3142		fig.savefig(f'{dir}/__all__.pdf')
3143		if show:
3144			ppl.show()
3145		ppl.close(fig)
3146		
3147
3148	def _save_D4x_correl(
3149		self,
3150		samples = None,
3151		dir = 'output',
3152		filename = None,
3153		D4x_precision = 4,
3154		correl_precision = 4,
3155		):
3156		'''
3157		Save D4x values along with their SE and correlation matrix.
3158
3159		**Parameters**
3160
3161		+ `samples`: Only these samples are output (by default: all samples).
3162		+ `dir`: the directory in which to save the file (by default: `output`)
3163		+ `filename`: the name of the csv file to write to (by default: `D4x_correl.csv`)
3164		+ `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4)
3165		+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
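
		A usage sketch, through the public `D47data` wrapper
		(assuming a standardized instance named `mydata`):

		```py
		mydata.save_D47_correl(dir = 'output', filename = 'D47_correl.csv')
		```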
3166		'''
3167		if samples is None:
3168			samples = sorted([s for s in self.unknowns])
3169		
3170		out = [['Sample']] + [[s] for s in samples]
3171		out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl']
3172		for k,s in enumerate(samples):
3173			out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.{D4x_precision}f}', f'{self.samples[s][f"SE_D{self._4x}"]:.{D4x_precision}f}']
3174			for s2 in samples:
3175				out[k+1] += [f'{self.sample_D4x_correl(s,s2):.{correl_precision}f}']
3176		
3177		if not os.path.exists(dir):
3178			os.makedirs(dir)
3179		if filename is None:
3180			filename = f'D{self._4x}_correl.csv'
3181		with open(f'{dir}/{filename}', 'w') as fid:
3182			fid.write(make_csv(out))
3183		
3184		
3185		
3186
3187class D47data(D4xdata):
3188	'''
3189	Store and process data for a large set of Δ47 analyses,
3190	usually comprising more than one analytical session.
3191	'''
3192
3193	Nominal_D4x = {
3194		'ETH-1':   0.2052,
3195		'ETH-2':   0.2085,
3196		'ETH-3':   0.6132,
3197		'ETH-4':   0.4511,
3198		'IAEA-C1': 0.3018,
3199		'IAEA-C2': 0.6409,
3200		'MERCK':   0.5135,
3201		} # I-CDES (Bernasconi et al., 2021)
3202	'''
3203	Nominal Δ47 values assigned to the Δ47 anchor samples, used by
3204	`D47data.standardize()` to normalize unknown samples to an absolute Δ47
3205	reference frame.
3206
3207	By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
3208	```py
3209	{
3210		'ETH-1'   : 0.2052,
3211		'ETH-2'   : 0.2085,
3212		'ETH-3'   : 0.6132,
3213		'ETH-4'   : 0.4511,
3214		'IAEA-C1' : 0.3018,
3215		'IAEA-C2' : 0.6409,
3216		'MERCK'   : 0.5135,
3217	}
3218	```
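
	These values may be overridden before standardization, e.g., to use a
	smaller anchor set (a sketch; any consistent subset of anchors works):

	```py
	mydata = D47data()
	mydata.Nominal_D47 = {
		'ETH-1': 0.2052,
		'ETH-2': 0.2085,
		'ETH-3': 0.6132,
		}
	```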
3219	'''
3220
3221
3222	@property
3223	def Nominal_D47(self):
3224		return self.Nominal_D4x
3225	
3226
3227	@Nominal_D47.setter
3228	def Nominal_D47(self, new):
3229		self.Nominal_D4x = dict(**new)
3230		self.refresh()
3231
3232
3233	def __init__(self, l = [], **kwargs):
3234		'''
3235		**Parameters:** same as `D4xdata.__init__()`
3236		'''
3237		D4xdata.__init__(self, l = l, mass = '47', **kwargs)
3238
3239
3240	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3241		'''
3242		Find all samples for which `Teq` is specified, compute the equilibrium Δ47
3243		value for that temperature, and treat these samples as additional anchors.
3244
3245		**Parameters**
3246
3247		+ `fCo2eqD47`: Which CO2 equilibrium law to use
3248		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3249		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3250		+ `priority`: if `replace`: forget old anchors and only use the new ones;
3251		if `new`: keep pre-existing anchors but update them in case of conflict
3252		between old and new Δ47 values;
3253		if `old`: keep pre-existing anchors but preserve their original Δ47
3254		values in case of conflict.
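
		A usage sketch (assuming some analyses carry a `Teq` field specifying,
		in °C, the temperature at which a CO2 sample was equilibrated):

		```py
		mydata.D47fromTeq(fCo2eqD47 = 'petersen', priority = 'new')
		mydata.standardize()
		```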
3255		'''
3256		f = {
3257			'petersen': fCO2eqD47_Petersen,
3258			'wang': fCO2eqD47_Wang,
3259			}[fCo2eqD47]
3260		foo = {}
3261		for r in self:
3262			if 'Teq' in r:
3263				if r['Sample'] in foo:
3264					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3265				else:
3266					foo[r['Sample']] = f(r['Teq'])
3267			else:
3268				assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3269
3270		if priority == 'replace':
3271			self.Nominal_D47 = {}
3272		for s in foo:
3273			if priority != 'old' or s not in self.Nominal_D47:
3274				self.Nominal_D47[s] = foo[s]
3275	
3276	def save_D47_correl(self, *args, **kwargs):
3277		return self._save_D4x_correl(*args, **kwargs)
3278
3279	save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47')
3280
3281
3282class D48data(D4xdata):
3283	'''
3284	Store and process data for a large set of Δ48 analyses,
3285	usually comprising more than one analytical session.
3286	'''
3287
3288	Nominal_D4x = {
3289		'ETH-1':  0.138,
3290		'ETH-2':  0.138,
3291		'ETH-3':  0.270,
3292		'ETH-4':  0.223,
3293		'GU-1':  -0.419,
3294		} # (Fiebig et al., 2019, 2021)
3295	'''
3296	Nominal Δ48 values assigned to the Δ48 anchor samples, used by
3297	`D48data.standardize()` to normalize unknown samples to an absolute Δ48
3298	reference frame.
3299
3300	By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
3301	[Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)):
3302
3303	```py
3304	{
3305		'ETH-1' :  0.138,
3306		'ETH-2' :  0.138,
3307		'ETH-3' :  0.270,
3308		'ETH-4' :  0.223,
3309		'GU-1'  : -0.419,
3310	}
3311	```
3312	'''
3313
3314
3315	@property
3316	def Nominal_D48(self):
3317		return self.Nominal_D4x
3318
3319	
3320	@Nominal_D48.setter
3321	def Nominal_D48(self, new):
3322		self.Nominal_D4x = dict(**new)
3323		self.refresh()
3324
3325
3326	def __init__(self, l = [], **kwargs):
3327		'''
3328		**Parameters:** same as `D4xdata.__init__()`
3329		'''
3330		D4xdata.__init__(self, l = l, mass = '48', **kwargs)
3331
3332	def save_D48_correl(self, *args, **kwargs):
3333		return self._save_D4x_correl(*args, **kwargs)
3334
3335	save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48')
3336
3337
3338class D49data(D4xdata):
3339	'''
3340	Store and process data for a large set of Δ49 analyses,
3341	usually comprising more than one analytical session.
3342	'''
3343	
3344	Nominal_D4x = {"1000C": 0.0, "25C": 2.228}  # Wang 2004
3345	'''
3346	Nominal Δ49 values assigned to the Δ49 anchor samples, used by
3347	`D49data.standardize()` to normalize unknown samples to an absolute Δ49
3348	reference frame.
3349
3350	By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)):
3351
3352	```py
3353	{
3354		"1000C": 0.0,
3355		"25C": 2.228
3356	}
3357	```
3358	'''
3359	
3360	@property
3361	def Nominal_D49(self):
3362		return self.Nominal_D4x
3363	
3364	@Nominal_D49.setter
3365	def Nominal_D49(self, new):
3366		self.Nominal_D4x = dict(**new)
3367		self.refresh()
3368	
3369	def __init__(self, l=[], **kwargs):
3370		'''
3371		**Parameters:** same as `D4xdata.__init__()`
3372		'''
3373		D4xdata.__init__(self, l=l, mass='49', **kwargs)
3374	
3375	def save_D49_correl(self, *args, **kwargs):
3376		return self._save_D4x_correl(*args, **kwargs)
3377	
3378	save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49')
3379
3380class _SessionPlot():
3381	'''
3382	Simple placeholder class
3383	'''
3384	def __init__(self):
3385		pass
3386
3387_app = typer.Typer(
3388	add_completion = False,
3389	context_settings={'help_option_names': ['-h', '--help']},
3390	rich_markup_mode = 'rich',
3391	)
3392
3393@_app.command()
3394def _cli(
3395	rawdata: Annotated[str, typer.Argument(help = "Specify the path of a rawdata input file")],
3396	exclude: Annotated[str, typer.Option('--exclude', '-e', help = 'The path of a file specifying UIDs and/or Samples to exclude')] = 'none',
3397	anchors: Annotated[str, typer.Option('--anchors', '-a', help = 'The path of a file specifying custom anchors')] = 'none',
3398	output_dir: Annotated[str, typer.Option('--output-dir', '-o', help = 'Specify the output directory')] = 'output',
3399	run_D48: Annotated[bool, typer.Option('--D48', help = 'Also standardize D48')] = False,
3400	):
3401	"""
3402	Process raw D47 data and return standardized results.
3403	
3404	See [b]https://mdaeron.github.io/D47crunch/#3-command-line-interface-cli[/b] for more details.
3405	
3406	Reads raw data from an input file, optionally excluding some samples and/or analyses, then standardizes
3407	the data based either on the default [b]d13C_VPDB[/b], [b]d18O_VPDB[/b], [b]D47[/b], and [b]D48[/b] anchors or on different
3408	user-specified anchors. A new directory (named `output` by default) is created to store the results and
3409	the following sequence is applied:
3410	
3411	* [b]D47data.wg()[/b]
3412	* [b]D47data.crunch()[/b]
3413	* [b]D47data.standardize()[/b]
3414	* [b]D47data.summary()[/b]
3415	* [b]D47data.table_of_samples()[/b]
3416	* [b]D47data.table_of_sessions()[/b]
3417	* [b]D47data.plot_sessions()[/b]
3418	* [b]D47data.plot_residuals()[/b]
3419	* [b]D47data.table_of_analyses()[/b]
3420	* [b]D47data.plot_distribution_of_analyses()[/b]
3421	* [b]D47data.plot_bulk_compositions()[/b]
3422	* [b]D47data.save_D47_correl()[/b]
3423	
3424	Optionally, also apply similar methods for [b]D48[/b].
3425	
3426	[b]Example CSV file for --anchors option:[/b]	
3427	[i]
3428	Sample,  d13C_VPDB,  d18O_VPDB,     D47,    D48
3429	ETH-1,        2.02,      -2.19,  0.2052,  0.138
3430	ETH-2,      -10.17,     -18.69,  0.2085,  0.138
3431	ETH-3,        1.71,      -1.78,  0.6132,  0.270
3432	ETH-4,            ,           ,  0.4511,  0.223
3433	[/i]
3434	Except for [i]Sample[/i], none of the columns above are mandatory.
3435
3436	[b]Example CSV file for --exclude option:[/b]	
3437	[i]
3438	Sample,  UID
3439	 FOO-1,
3440	 BAR-2,
3441	      ,  A04
3442	      ,  A17
3443	      ,  A88
3444	[/i]
3445	This will exclude all analyses of samples [i]FOO-1[/i] and [i]BAR-2[/i],
3446	and the analyses with UIDs [i]A04[/i], [i]A17[/i], and [i]A88[/i].
3447	Neither column is mandatory.
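
	[b]Example invocation[/b] (assuming the package's console script is installed as [b]D47crunch[/b]):
	[i]
	D47crunch rawdata.csv --anchors anchors.csv --exclude exclude.csv -o results --D48
	[/i]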
3448	"""
3449
3450	data = D47data()
3451	data.read(rawdata)
3452
3453	if exclude != 'none':
3454		exclude = read_csv(exclude)
3455		exclude_uid = {r['UID'] for r in exclude if 'UID' in r}
3456		exclude_sample = {r['Sample'] for r in exclude if 'Sample' in r}
3457	else:
3458		exclude_uid = []
3459		exclude_sample = []
3460	
3461	data = D47data([r for r in data if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample])
3462
3463	if anchors != 'none':
3464		anchors = read_csv(anchors)
3465		if len([_ for _ in anchors if 'd13C_VPDB' in _]):
3466			data.Nominal_d13C_VPDB = {
3467				_['Sample']: _['d13C_VPDB']
3468				for _ in anchors
3469				if 'd13C_VPDB' in _
3470				}
3471		if len([_ for _ in anchors if 'd18O_VPDB' in _]):
3472			data.Nominal_d18O_VPDB = {
3473				_['Sample']: _['d18O_VPDB']
3474				for _ in anchors
3475				if 'd18O_VPDB' in _
3476				}
3477		if len([_ for _ in anchors if 'D47' in _]):
3478			data.Nominal_D4x = {
3479				_['Sample']: _['D47']
3480				for _ in anchors
3481				if 'D47' in _
3482				}
3483
3484	data.refresh()
3485	data.wg()
3486	data.crunch()
3487	data.standardize()
3488	data.summary(dir = output_dir)
3489	data.plot_residuals(dir = output_dir, filename = 'D47_residuals.pdf', kde = True)
3490	data.plot_bulk_compositions(dir = output_dir + '/bulk_compositions')
3491	data.plot_sessions(dir = output_dir)
3492	data.save_D47_correl(dir = output_dir)
3493	
3494	if not run_D48:
3495		data.table_of_samples(dir = output_dir)
3496		data.table_of_analyses(dir = output_dir)
3497		data.table_of_sessions(dir = output_dir)
3498
3499
3500	if run_D48:
3501		data2 = D48data()
3502		print(rawdata)
3503		data2.read(rawdata)
3504
3505		data2 = D48data([r for r in data2 if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample])
3506
3507		if anchors != 'none':
3508			if len([_ for _ in anchors if 'd13C_VPDB' in _]):
3509				data2.Nominal_d13C_VPDB = {
3510					_['Sample']: _['d13C_VPDB']
3511					for _ in anchors
3512					if 'd13C_VPDB' in _
3513					}
3514			if len([_ for _ in anchors if 'd18O_VPDB' in _]):
3515				data2.Nominal_d18O_VPDB = {
3516					_['Sample']: _['d18O_VPDB']
3517					for _ in anchors
3518					if 'd18O_VPDB' in _
3519					}
3520			if len([_ for _ in anchors if 'D48' in _]):
3521				data2.Nominal_D4x = {
3522					_['Sample']: _['D48']
3523					for _ in anchors
3524					if 'D48' in _
3525					}
3526
3527		data2.refresh()
3528		data2.wg()
3529		data2.crunch()
3530		data2.standardize()
3531		data2.summary(dir = output_dir)
3532		data2.plot_sessions(dir = output_dir)
3533		data2.plot_residuals(dir = output_dir, filename = 'D48_residuals.pdf', kde = True)
3534		data2.plot_distribution_of_analyses(dir = output_dir)
3535		data2.save_D48_correl(dir = output_dir)
3536
3537		table_of_analyses(data, data2, dir = output_dir)
3538		table_of_samples(data, data2, dir = output_dir)
3539		table_of_sessions(data, data2, dir = output_dir)
3540		
3541def __cli():
3542	_app()
Petersen_etal_CO2eqD47 = array([[-1.20000000e+01, 1.14711357e+00], [-1.10000000e+01, 1.13996122e+00], [-1.00000000e+01, 1.13287286e+00], [-9.00000000e+00, 1.12584768e+00], [-8.00000000e+00, 1.11888489e+00], [-7.00000000e+00, 1.11198371e+00], [-6.00000000e+00, 1.10514337e+00], [-5.00000000e+00, 1.09836311e+00], [-4.00000000e+00, 1.09164218e+00], [-3.00000000e+00, 1.08497986e+00], [-2.00000000e+00, 1.07837542e+00], [-1.00000000e+00, 1.07182816e+00], [ 0.00000000e+00, 1.06533736e+00], [ 1.00000000e+00, 1.05890235e+00], [ 2.00000000e+00, 1.05252244e+00], [ 3.00000000e+00, 1.04619698e+00], [ 4.00000000e+00, 1.03992529e+00], [ 5.00000000e+00, 1.03370674e+00], [ 6.00000000e+00, 1.02754069e+00], [ 7.00000000e+00, 1.02142651e+00], [ 8.00000000e+00, 1.01536359e+00], [ 9.00000000e+00, 1.00935131e+00], [ 1.00000000e+01, 1.00338908e+00], [ 1.10000000e+01, 9.97476303e-01], [ 1.20000000e+01, 9.91612409e-01], [ 1.30000000e+01, 9.85796821e-01], [ 1.40000000e+01, 9.80028975e-01], [ 1.50000000e+01, 9.74308318e-01], [ 1.60000000e+01, 9.68634304e-01], [ 1.70000000e+01, 9.63006392e-01], [ 1.80000000e+01, 9.57424055e-01], [ 1.90000000e+01, 9.51886769e-01], [ 2.00000000e+01, 9.46394020e-01], [ 2.10000000e+01, 9.40945302e-01], [ 2.20000000e+01, 9.35540114e-01], [ 2.30000000e+01, 9.30177964e-01], [ 2.40000000e+01, 9.24858369e-01], [ 2.50000000e+01, 9.19580851e-01], [ 2.60000000e+01, 9.14344938e-01], [ 2.70000000e+01, 9.09150167e-01], [ 2.80000000e+01, 9.03996080e-01], [ 2.90000000e+01, 8.98882228e-01], [ 3.00000000e+01, 8.93808167e-01], [ 3.10000000e+01, 8.88773459e-01], [ 3.20000000e+01, 8.83777672e-01], [ 3.30000000e+01, 8.78820382e-01], [ 3.40000000e+01, 8.73901170e-01], [ 3.50000000e+01, 8.69019623e-01], [ 3.60000000e+01, 8.64175334e-01], [ 3.70000000e+01, 8.59367901e-01], [ 3.80000000e+01, 8.54596929e-01], [ 3.90000000e+01, 8.49862028e-01], [ 4.00000000e+01, 8.45162813e-01], [ 4.10000000e+01, 8.40498905e-01], [ 4.20000000e+01, 8.35869931e-01], [ 4.30000000e+01, 8.31275522e-01], [ 4.40000000e+01, 8.26715314e-01], [ 4.50000000e+01, 8.22188950e-01], [ 4.60000000e+01, 8.17696075e-01], [ 4.70000000e+01, 8.13236341e-01], [ 4.80000000e+01, 8.08809404e-01], [ 4.90000000e+01, 8.04414926e-01], [ 5.00000000e+01, 8.00052572e-01], [ 5.10000000e+01, 7.95722012e-01], [ 5.20000000e+01, 7.91422922e-01], [ 5.30000000e+01, 7.87154979e-01], [ 5.40000000e+01, 7.82917869e-01], [ 5.50000000e+01, 7.78711277e-01], [ 5.60000000e+01, 7.74534898e-01], [ 5.70000000e+01, 7.70388426e-01], [ 5.80000000e+01, 7.66271562e-01], [ 5.90000000e+01, 7.62184010e-01], [ 6.00000000e+01, 7.58125479e-01], [ 6.10000000e+01, 7.54095680e-01], [ 6.20000000e+01, 7.50094329e-01], [ 6.30000000e+01, 7.46121147e-01], [ 6.40000000e+01, 7.42175856e-01], [ 6.50000000e+01, 7.38258184e-01], [ 6.60000000e+01, 7.34367860e-01], [ 6.70000000e+01, 7.30504620e-01], [ 6.80000000e+01, 7.26668201e-01], [ 6.90000000e+01, 7.22858343e-01], [ 7.00000000e+01, 7.19074792e-01], [ 7.10000000e+01, 7.15317295e-01], [ 7.20000000e+01, 7.11585602e-01], [ 7.30000000e+01, 7.07879469e-01], [ 7.40000000e+01, 7.04198652e-01], [ 7.50000000e+01, 7.00542912e-01], [ 7.60000000e+01, 6.96912012e-01], [ 7.70000000e+01, 6.93305719e-01], [ 7.80000000e+01, 6.89723802e-01], [ 7.90000000e+01, 6.86166034e-01], [ 8.00000000e+01, 6.82632189e-01], [ 8.10000000e+01, 6.79122047e-01], [ 8.20000000e+01, 6.75635387e-01], [ 8.30000000e+01, 6.72171994e-01], [ 8.40000000e+01, 6.68731654e-01], [ 8.50000000e+01, 6.65314156e-01], [ 8.60000000e+01, 6.61919291e-01], [ 8.70000000e+01, 6.58546854e-01], [ 8.80000000e+01, 
6.55196641e-01], [ 8.90000000e+01, 6.51868451e-01], [ 9.00000000e+01, 6.48562087e-01], [ 9.10000000e+01, 6.45277352e-01], [ 9.20000000e+01, 6.42014054e-01], [ 9.30000000e+01, 6.38771999e-01], [ 9.40000000e+01, 6.35551001e-01], [ 9.50000000e+01, 6.32350872e-01], [ 9.60000000e+01, 6.29171428e-01], [ 9.70000000e+01, 6.26012487e-01], [ 9.80000000e+01, 6.22873870e-01], [ 9.90000000e+01, 6.19755397e-01], [ 1.00000000e+02, 6.16656895e-01], [ 1.02000000e+02, 6.10519107e-01], [ 1.04000000e+02, 6.04459143e-01], [ 1.06000000e+02, 5.98475670e-01], [ 1.08000000e+02, 5.92567388e-01], [ 1.10000000e+02, 5.86733026e-01], [ 1.12000000e+02, 5.80971342e-01], [ 1.14000000e+02, 5.75281125e-01], [ 1.16000000e+02, 5.69661187e-01], [ 1.18000000e+02, 5.64110371e-01], [ 1.20000000e+02, 5.58627545e-01], [ 1.22000000e+02, 5.53211600e-01], [ 1.24000000e+02, 5.47861454e-01], [ 1.26000000e+02, 5.42576048e-01], [ 1.28000000e+02, 5.37354347e-01], [ 1.30000000e+02, 5.32195337e-01], [ 1.32000000e+02, 5.27098028e-01], [ 1.34000000e+02, 5.22061450e-01], [ 1.36000000e+02, 5.17084654e-01], [ 1.38000000e+02, 5.12166711e-01], [ 1.40000000e+02, 5.07306712e-01], [ 1.42000000e+02, 5.02503768e-01], [ 1.44000000e+02, 4.97757006e-01], [ 1.46000000e+02, 4.93065573e-01], [ 1.48000000e+02, 4.88428634e-01], [ 1.50000000e+02, 4.83845370e-01], [ 1.52000000e+02, 4.79314980e-01], [ 1.54000000e+02, 4.74836677e-01], [ 1.56000000e+02, 4.70409692e-01], [ 1.58000000e+02, 4.66033271e-01], [ 1.60000000e+02, 4.61706674e-01], [ 1.62000000e+02, 4.57429176e-01], [ 1.64000000e+02, 4.53200067e-01], [ 1.66000000e+02, 4.49018650e-01], [ 1.68000000e+02, 4.44884242e-01], [ 1.70000000e+02, 4.40796174e-01], [ 1.72000000e+02, 4.36753787e-01], [ 1.74000000e+02, 4.32756438e-01], [ 1.76000000e+02, 4.28803494e-01], [ 1.78000000e+02, 4.24894334e-01], [ 1.80000000e+02, 4.21028350e-01], [ 1.82000000e+02, 4.17204944e-01], [ 1.84000000e+02, 4.13423530e-01], [ 1.86000000e+02, 4.09683531e-01], [ 1.88000000e+02, 4.05984383e-01], [ 1.90000000e+02, 4.02325531e-01], [ 1.92000000e+02, 3.98706429e-01], [ 1.94000000e+02, 3.95126543e-01], [ 1.96000000e+02, 3.91585347e-01], [ 1.98000000e+02, 3.88082324e-01], [ 2.00000000e+02, 3.84616967e-01], [ 2.02000000e+02, 3.81188778e-01], [ 2.04000000e+02, 3.77797268e-01], [ 2.06000000e+02, 3.74441954e-01], [ 2.08000000e+02, 3.71122364e-01], [ 2.10000000e+02, 3.67838033e-01], [ 2.12000000e+02, 3.64588505e-01], [ 2.14000000e+02, 3.61373329e-01], [ 2.16000000e+02, 3.58192065e-01], [ 2.18000000e+02, 3.55044277e-01], [ 2.20000000e+02, 3.51929540e-01], [ 2.22000000e+02, 3.48847432e-01], [ 2.24000000e+02, 3.45797540e-01], [ 2.26000000e+02, 3.42779460e-01], [ 2.28000000e+02, 3.39792789e-01], [ 2.30000000e+02, 3.36837136e-01], [ 2.32000000e+02, 3.33912113e-01], [ 2.34000000e+02, 3.31017339e-01], [ 2.36000000e+02, 3.28152439e-01], [ 2.38000000e+02, 3.25317046e-01], [ 2.40000000e+02, 3.22510795e-01], [ 2.42000000e+02, 3.19733329e-01], [ 2.44000000e+02, 3.16984297e-01], [ 2.46000000e+02, 3.14263352e-01], [ 2.48000000e+02, 3.11570153e-01], [ 2.50000000e+02, 3.08904364e-01], [ 2.52000000e+02, 3.06265654e-01], [ 2.54000000e+02, 3.03653699e-01], [ 2.56000000e+02, 3.01068176e-01], [ 2.58000000e+02, 2.98508771e-01], [ 2.60000000e+02, 2.95975171e-01], [ 2.62000000e+02, 2.93467070e-01], [ 2.64000000e+02, 2.90984167e-01], [ 2.66000000e+02, 2.88526163e-01], [ 2.68000000e+02, 2.86092765e-01], [ 2.70000000e+02, 2.83683684e-01], [ 2.72000000e+02, 2.81298636e-01], [ 2.74000000e+02, 2.78937339e-01], [ 2.76000000e+02, 2.76599517e-01], [ 2.78000000e+02, 2.74284898e-01], [ 
2.80000000e+02, 2.71993211e-01], [ 2.82000000e+02, 2.69724193e-01], [ 2.84000000e+02, 2.67477582e-01], [ 2.86000000e+02, 2.65253121e-01], [ 2.88000000e+02, 2.63050554e-01], [ 2.90000000e+02, 2.60869633e-01], [ 2.92000000e+02, 2.58710110e-01], [ 2.94000000e+02, 2.56571741e-01], [ 2.96000000e+02, 2.54454286e-01], [ 2.98000000e+02, 2.52357508e-01], [ 3.00000000e+02, 2.50281174e-01], [ 3.02000000e+02, 2.48225053e-01], [ 3.04000000e+02, 2.46188917e-01], [ 3.06000000e+02, 2.44172542e-01], [ 3.08000000e+02, 2.42175707e-01], [ 3.10000000e+02, 2.40198194e-01], [ 3.12000000e+02, 2.38239786e-01], [ 3.14000000e+02, 2.36300272e-01], [ 3.16000000e+02, 2.34379441e-01], [ 3.18000000e+02, 2.32477087e-01], [ 3.20000000e+02, 2.30593005e-01], [ 3.22000000e+02, 2.28726993e-01], [ 3.24000000e+02, 2.26878853e-01], [ 3.26000000e+02, 2.25048388e-01], [ 3.28000000e+02, 2.23235405e-01], [ 3.30000000e+02, 2.21439711e-01], [ 3.32000000e+02, 2.19661118e-01], [ 3.34000000e+02, 2.17899439e-01], [ 3.36000000e+02, 2.16154491e-01], [ 3.38000000e+02, 2.14426091e-01], [ 3.40000000e+02, 2.12714060e-01], [ 3.42000000e+02, 2.11018220e-01], [ 3.44000000e+02, 2.09338398e-01], [ 3.46000000e+02, 2.07674420e-01], [ 3.48000000e+02, 2.06026115e-01], [ 3.50000000e+02, 2.04393315e-01], [ 3.55000000e+02, 2.00378063e-01], [ 3.60000000e+02, 1.96456139e-01], [ 3.65000000e+02, 1.92625077e-01], [ 3.70000000e+02, 1.88882487e-01], [ 3.75000000e+02, 1.85226048e-01], [ 3.80000000e+02, 1.81653511e-01], [ 3.85000000e+02, 1.78162694e-01], [ 3.90000000e+02, 1.74751478e-01], [ 3.95000000e+02, 1.71417807e-01], [ 4.00000000e+02, 1.68159686e-01], [ 4.05000000e+02, 1.64975177e-01], [ 4.10000000e+02, 1.61862398e-01], [ 4.15000000e+02, 1.58819521e-01], [ 4.20000000e+02, 1.55844772e-01], [ 4.25000000e+02, 1.52936426e-01], [ 4.30000000e+02, 1.50092806e-01], [ 4.35000000e+02, 1.47312286e-01], [ 4.40000000e+02, 1.44593281e-01], [ 4.45000000e+02, 1.41934254e-01], [ 4.50000000e+02, 1.39333710e-01], [ 4.55000000e+02, 1.36790195e-01], [ 4.60000000e+02, 1.34302294e-01], [ 4.65000000e+02, 1.31868634e-01], [ 4.70000000e+02, 1.29487876e-01], [ 4.75000000e+02, 1.27158722e-01], [ 4.80000000e+02, 1.24879906e-01], [ 4.85000000e+02, 1.22650197e-01], [ 4.90000000e+02, 1.20468398e-01], [ 4.95000000e+02, 1.18333345e-01], [ 5.00000000e+02, 1.16243903e-01], [ 5.05000000e+02, 1.14198970e-01], [ 5.10000000e+02, 1.12197471e-01], [ 5.15000000e+02, 1.10238362e-01], [ 5.20000000e+02, 1.08320625e-01], [ 5.25000000e+02, 1.06443271e-01], [ 5.30000000e+02, 1.04605335e-01], [ 5.35000000e+02, 1.02805877e-01], [ 5.40000000e+02, 1.01043985e-01], [ 5.45000000e+02, 9.93187680e-02], [ 5.50000000e+02, 9.76293590e-02], [ 5.55000000e+02, 9.59749150e-02], [ 5.60000000e+02, 9.43546120e-02], [ 5.65000000e+02, 9.27676500e-02], [ 5.70000000e+02, 9.12132480e-02], [ 5.75000000e+02, 8.96906480e-02], [ 5.80000000e+02, 8.81991080e-02], [ 5.85000000e+02, 8.67379060e-02], [ 5.90000000e+02, 8.53063410e-02], [ 5.95000000e+02, 8.39037260e-02], [ 6.00000000e+02, 8.25293950e-02], [ 6.05000000e+02, 8.11826970e-02], [ 6.10000000e+02, 7.98629980e-02], [ 6.15000000e+02, 7.85696800e-02], [ 6.20000000e+02, 7.73021410e-02], [ 6.25000000e+02, 7.60597940e-02], [ 6.30000000e+02, 7.48420660e-02], [ 6.35000000e+02, 7.36484000e-02], [ 6.40000000e+02, 7.24782510e-02], [ 6.45000000e+02, 7.13310900e-02], [ 6.50000000e+02, 7.02063990e-02], [ 6.55000000e+02, 6.91036740e-02], [ 6.60000000e+02, 6.80224240e-02], [ 6.65000000e+02, 6.69621680e-02], [ 6.70000000e+02, 6.59224390e-02], [ 6.75000000e+02, 6.49027800e-02], [ 6.80000000e+02, 
6.39027480e-02], [ 6.85000000e+02, 6.29219090e-02], [ 6.90000000e+02, 6.19598370e-02], [ 6.95000000e+02, 6.10161220e-02], [ 7.00000000e+02, 6.00903600e-02], [ 7.05000000e+02, 5.91821570e-02], [ 7.10000000e+02, 5.82911310e-02], [ 7.15000000e+02, 5.74169070e-02], [ 7.20000000e+02, 5.65591200e-02], [ 7.25000000e+02, 5.57174140e-02], [ 7.30000000e+02, 5.48914400e-02], [ 7.35000000e+02, 5.40808600e-02], [ 7.40000000e+02, 5.32853430e-02], [ 7.45000000e+02, 5.25045650e-02], [ 7.50000000e+02, 5.17382100e-02], [ 7.55000000e+02, 5.09859710e-02], [ 7.60000000e+02, 5.02475460e-02], [ 7.65000000e+02, 4.95226430e-02], [ 7.70000000e+02, 4.88109740e-02], [ 7.75000000e+02, 4.81122600e-02], [ 7.80000000e+02, 4.74262270e-02], [ 7.85000000e+02, 4.67526090e-02], [ 7.90000000e+02, 4.60911450e-02], [ 7.95000000e+02, 4.54415810e-02], [ 8.00000000e+02, 4.48036680e-02], [ 8.05000000e+02, 4.41771640e-02], [ 8.10000000e+02, 4.35618310e-02], [ 8.15000000e+02, 4.29574380e-02], [ 8.20000000e+02, 4.23637590e-02], [ 8.25000000e+02, 4.17805730e-02], [ 8.30000000e+02, 4.12076640e-02], [ 8.35000000e+02, 4.06448220e-02], [ 8.40000000e+02, 4.00918390e-02], [ 8.45000000e+02, 3.95485160e-02], [ 8.50000000e+02, 3.90146540e-02], [ 8.55000000e+02, 3.84900630e-02], [ 8.60000000e+02, 3.79745540e-02], [ 8.65000000e+02, 3.74679440e-02], [ 8.70000000e+02, 3.69700540e-02], [ 8.75000000e+02, 3.64807070e-02], [ 8.80000000e+02, 3.59997340e-02], [ 8.85000000e+02, 3.55269650e-02], [ 8.90000000e+02, 3.50622380e-02], [ 8.95000000e+02, 3.46053930e-02], [ 9.00000000e+02, 3.41562720e-02], [ 9.05000000e+02, 3.37147240e-02], [ 9.10000000e+02, 3.32805980e-02], [ 9.15000000e+02, 3.28537490e-02], [ 9.20000000e+02, 3.24340320e-02], [ 9.25000000e+02, 3.20213090e-02], [ 9.30000000e+02, 3.16154430e-02], [ 9.35000000e+02, 3.12163000e-02], [ 9.40000000e+02, 3.08237490e-02], [ 9.45000000e+02, 3.04376630e-02], [ 9.50000000e+02, 3.00579150e-02], [ 9.55000000e+02, 2.96843850e-02], [ 9.60000000e+02, 2.93169510e-02], [ 9.65000000e+02, 2.89554980e-02], [ 9.70000000e+02, 2.85999100e-02], [ 9.75000000e+02, 2.82500750e-02], [ 9.80000000e+02, 2.79058840e-02], [ 9.85000000e+02, 2.75672290e-02], [ 9.90000000e+02, 2.72340060e-02], [ 9.95000000e+02, 2.69061120e-02], [ 1.00000000e+03, 2.65834450e-02], [ 1.00500000e+03, 2.62659080e-02], [ 1.01000000e+03, 2.59534050e-02], [ 1.01500000e+03, 2.56458410e-02], [ 1.02000000e+03, 2.53431240e-02], [ 1.02500000e+03, 2.50451630e-02], [ 1.03000000e+03, 2.47518710e-02], [ 1.03500000e+03, 2.44631600e-02], [ 1.04000000e+03, 2.41789470e-02], [ 1.04500000e+03, 2.38991470e-02], [ 1.05000000e+03, 2.36236800e-02], [ 1.05500000e+03, 2.33524670e-02], [ 1.06000000e+03, 2.30854290e-02], [ 1.06500000e+03, 2.28224910e-02], [ 1.07000000e+03, 2.25635770e-02], [ 1.07500000e+03, 2.23086150e-02], [ 1.08000000e+03, 2.20575330e-02], [ 1.08500000e+03, 2.18102600e-02], [ 1.09000000e+03, 2.15667290e-02], [ 1.09500000e+03, 2.13268720e-02], [ 1.10000000e+03, 2.10906220e-02]])
def fCO2eqD47_Petersen(T):
	'''
	CO2 equilibrium Δ47 value as a function of T (in degrees C)
	according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
	'''
	return float(_fCO2eqD47_Petersen(T))

CO2 equilibrium Δ47 value as a function of T (in degrees C) according to Petersen et al. (2019).

Wang_etal_CO2eqD47 = array([[-8.3000e+01, 1.8954e+00], [-7.3000e+01, 1.7530e+00], [-6.3000e+01, 1.6261e+00], [-5.3000e+01, 1.5126e+00], [-4.3000e+01, 1.4104e+00], [-3.3000e+01, 1.3182e+00], [-2.3000e+01, 1.2345e+00], [-1.3000e+01, 1.1584e+00], [-3.0000e+00, 1.0888e+00], [ 7.0000e+00, 1.0251e+00], [ 1.7000e+01, 9.6650e-01], [ 2.7000e+01, 9.1250e-01], [ 3.7000e+01, 8.6260e-01], [ 4.7000e+01, 8.1640e-01], [ 5.7000e+01, 7.7340e-01], [ 6.7000e+01, 7.3340e-01], [ 8.7000e+01, 6.6120e-01], [ 9.7000e+01, 6.2860e-01], [ 1.0700e+02, 5.9800e-01], [ 1.1700e+02, 5.6930e-01], [ 1.2700e+02, 5.4230e-01], [ 1.3700e+02, 5.1690e-01], [ 1.4700e+02, 4.9300e-01], [ 1.5700e+02, 4.7040e-01], [ 1.6700e+02, 4.4910e-01], [ 1.7700e+02, 4.2890e-01], [ 1.8700e+02, 4.0980e-01], [ 1.9700e+02, 3.9180e-01], [ 2.0700e+02, 3.7470e-01], [ 2.1700e+02, 3.5850e-01], [ 2.2700e+02, 3.4310e-01], [ 2.3700e+02, 3.2850e-01], [ 2.4700e+02, 3.1470e-01], [ 2.5700e+02, 3.0150e-01], [ 2.6700e+02, 2.8900e-01], [ 2.7700e+02, 2.7710e-01], [ 2.8700e+02, 2.6570e-01], [ 2.9700e+02, 2.5500e-01], [ 3.0700e+02, 2.4470e-01], [ 3.1700e+02, 2.3490e-01], [ 3.2700e+02, 2.2560e-01], [ 3.3700e+02, 2.1670e-01], [ 3.4700e+02, 2.0830e-01], [ 3.5700e+02, 2.0020e-01], [ 3.6700e+02, 1.9250e-01], [ 3.7700e+02, 1.8510e-01], [ 3.8700e+02, 1.7810e-01], [ 3.9700e+02, 1.7140e-01], [ 4.0700e+02, 1.6500e-01], [ 4.1700e+02, 1.5890e-01], [ 4.2700e+02, 1.5300e-01], [ 4.3700e+02, 1.4740e-01], [ 4.4700e+02, 1.4210e-01], [ 4.5700e+02, 1.3700e-01], [ 4.6700e+02, 1.3210e-01], [ 4.7700e+02, 1.2740e-01], [ 4.8700e+02, 1.2290e-01], [ 4.9700e+02, 1.1860e-01], [ 5.0700e+02, 1.1450e-01], [ 5.1700e+02, 1.1050e-01], [ 5.2700e+02, 1.0680e-01], [ 5.3700e+02, 1.0310e-01], [ 5.4700e+02, 9.9700e-02], [ 5.5700e+02, 9.6300e-02], [ 5.6700e+02, 9.3100e-02], [ 5.7700e+02, 9.0100e-02], [ 5.8700e+02, 8.7100e-02], [ 5.9700e+02, 8.4300e-02], [ 6.0700e+02, 8.1600e-02], [ 6.1700e+02, 7.9000e-02], [ 6.2700e+02, 7.6500e-02], [ 6.3700e+02, 7.4100e-02], [ 6.4700e+02, 7.1800e-02], [ 6.5700e+02, 6.9500e-02], [ 6.6700e+02, 6.7400e-02], [ 6.7700e+02, 6.5400e-02], [ 6.8700e+02, 6.3400e-02], [ 6.9700e+02, 6.1500e-02], [ 7.0700e+02, 5.9700e-02], [ 7.1700e+02, 5.7900e-02], [ 7.2700e+02, 5.6200e-02], [ 7.3700e+02, 5.4600e-02], [ 7.4700e+02, 5.3000e-02], [ 7.5700e+02, 5.1500e-02], [ 7.6700e+02, 5.0000e-02], [ 7.7700e+02, 4.8600e-02], [ 7.8700e+02, 4.7200e-02], [ 7.9700e+02, 4.5900e-02], [ 8.0700e+02, 4.4700e-02], [ 8.1700e+02, 4.3500e-02], [ 8.2700e+02, 4.2300e-02], [ 8.3700e+02, 4.1100e-02], [ 8.4700e+02, 4.0000e-02], [ 8.5700e+02, 3.9000e-02], [ 8.6700e+02, 3.8000e-02], [ 8.7700e+02, 3.7000e-02], [ 8.8700e+02, 3.6000e-02], [ 8.9700e+02, 3.5100e-02], [ 9.0700e+02, 3.4200e-02], [ 9.1700e+02, 3.3300e-02], [ 9.2700e+02, 3.2500e-02], [ 9.3700e+02, 3.1700e-02], [ 9.4700e+02, 3.0900e-02], [ 9.5700e+02, 3.0200e-02], [ 9.6700e+02, 2.9400e-02], [ 9.7700e+02, 2.8700e-02], [ 9.8700e+02, 2.8100e-02], [ 9.9700e+02, 2.7400e-02], [ 1.0070e+03, 2.6800e-02], [ 1.0170e+03, 2.6100e-02], [ 1.0270e+03, 2.5500e-02], [ 1.0370e+03, 2.4900e-02], [ 1.0470e+03, 2.4400e-02], [ 1.0570e+03, 2.3800e-02], [ 1.0670e+03, 2.3300e-02], [ 1.0770e+03, 2.2800e-02], [ 1.0870e+03, 2.2300e-02], [ 1.0970e+03, 2.1800e-02]])
def fCO2eqD47_Wang(T):
	'''
	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
	according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
	(supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
	'''
	return float(_fCO2eqD47_Wang(T))

CO2 equilibrium Δ47 value as a function of T (in degrees C) according to Wang et al. (2004) (supplementary data of Dennis et al., 2011).
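
A minimal usage sketch comparing the two calibrations (temperatures are arbitrary, and assumed to lie within the range covered by both lookup tables):

from D47crunch import fCO2eqD47_Petersen, fCO2eqD47_Wang

for T in [0, 25, 100]:
    print(f'{T:4.0f} C: Petersen = {fCO2eqD47_Petersen(T):.4f}, Wang = {fCO2eqD47_Wang(T):.4f}')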

def correlated_sum(X, C, w = None):
	'''
	Compute covariance-aware linear combinations

	**Parameters**

	+ `X`: list or 1-D array of values to sum
	+ `C`: covariance matrix for the elements of `X`
	+ `w`: list or 1-D array of weights to apply to the elements of `X`
	       (all equal to 1 by default)

	Return the sum (and its SE) of the elements of `X`, with optional weights equal
	to the elements of `w`, accounting for covariances between the elements of `X`.
	'''
	if w is None:
		w = [1 for x in X]
	return np.dot(w,X), (np.dot(w,np.dot(C,w)))**.5

Compute covariance-aware linear combinations

Parameters

  • X: list or 1-D array of values to sum
  • C: covariance matrix for the elements of X
  • w: list or 1-D array of weights to apply to the elements of X (all equal to 1 by default)

Return the sum (and its SE) of the elements of X, with optional weights equal to the elements of w, accounting for covariances between the elements of X.
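
For instance, with two hypothetical correlated values, ignoring the covariance term would underestimate the SE of the sum:

import numpy as np
from D47crunch import correlated_sum

X = [0.5, 0.3]
C = np.array([[0.010, 0.004],
              [0.004, 0.010]])  # covariance matrix of X

total, se = correlated_sum(X, C)
# total = 0.8
# se = (0.010 + 0.010 + 2*0.004)**.5 = 0.167, versus (0.020)**.5 = 0.141 if the
# off-diagonal covariance were (incorrectly) neglected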

def make_csv(x, hsep = ',', vsep = '\n'):
	'''
	Formats a list of lists of strings as a CSV

	**Parameters**

	+ `x`: the list of lists of strings to format
	+ `hsep`: the field separator (`,` by default)
	+ `vsep`: the line-ending convention to use (`\\n` by default)

	**Example**

	```py
	print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
	```

	outputs:

	```py
	a,b,c
	d,e,f
	```
	'''
	return vsep.join([hsep.join(l) for l in x])

Formats a list of lists of strings as a CSV

Parameters

  • x: the list of lists of strings to format
  • hsep: the field separator (, by default)
  • vsep: the line-ending convention to use (\n by default)

Example

print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))

outputs:

a,b,c
d,e,f

def pf(txt):
	'''
	Modify string `txt` to follow `lmfit.Parameter()` naming rules.
	'''
	return txt.replace('-','_').replace('.','_').replace(' ','_')

Modify string txt to follow lmfit.Parameter() naming rules.
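
For example:

from D47crunch import pf

pf('ETH-1')        # -> 'ETH_1'
pf('MY SAMPLE.2')  # -> 'MY_SAMPLE_2'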

def smart_type(x):
	'''
	Convert string `x` to a float if it includes a decimal point, or
	to an integer if it does not. If the conversion fails, return the
	original string unchanged.
	'''
	try:
		y = float(x)
	except ValueError:
		return x
	if '.' not in x:
		return int(y)
	return y

Convert string x to a float if it includes a decimal point, or to an integer if it does not. If the conversion fails, return the original string unchanged.
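
For example:

from D47crunch import smart_type

smart_type('42')      # -> 42 (int)
smart_type('-10.17')  # -> -10.17 (float)
smart_type('ETH-3')   # -> 'ETH-3' (unchanged string)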

D47crunch_defaults = <D47crunch._Defaults object>
def pretty_table(x, header = 1, hsep = '  ', vsep = None, align = '<'):
	'''
	Reads a list of lists of strings and outputs an ascii table

	**Parameters**

	+ `x`: a list of lists of strings
	+ `header`: the number of lines to treat as header lines
	+ `hsep`: the horizontal separator between columns
	+ `vsep`: the character to use as vertical separator
	+ `align`: string of left (`<`) or right (`>`) alignment characters.

	**Example**

	```py
	print(pretty_table([
		['A', 'B', 'C'],
		['1', '1.9999', 'foo'],
		['10', 'x', 'bar'],
	]))
	```
	yields:
	```
	——  ——————  ———
	A        B    C
	——  ——————  ———
	1   1.9999  foo
	10       x  bar
	——  ——————  ———
	```

	To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`:

	```py
	D47crunch_defaults.PRETTY_TABLE_VSEP = '='
	print(pretty_table([
		['A', 'B', 'C'],
		['1', '1.9999', 'foo'],
		['10', 'x', 'bar'],
	]))
	```
	yields:
	```
	==  ======  ===
	A        B    C
	==  ======  ===
	1   1.9999  foo
	10       x  bar
	==  ======  ===
	```
	'''

	if vsep is None:
		vsep = D47crunch_defaults.PRETTY_TABLE_VSEP

	txt = []
	widths = [np.max([len(e) for e in c]) for c in zip(*x)]

	if len(widths) > len(align):
		align += '>' * (len(widths)-len(align))
	sepline = hsep.join([vsep*w for w in widths])
	txt += [sepline]
	for k,l in enumerate(x):
		if k and k == header:
			txt += [sepline]
		txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
	txt += [sepline]
	txt += ['']
	return '\n'.join(txt)

Reads a list of lists of strings and outputs an ascii table

Parameters

  • x: a list of lists of strings
  • header: the number of lines to treat as header lines
  • hsep: the horizontal separator between columns
  • vsep: the character to use as vertical separator
  • align: string of left (<) or right (>) alignment characters.

Example

print(pretty_table([
        ['A', 'B', 'C'],
        ['1', '1.9999', 'foo'],
        ['10', 'x', 'bar'],
]))

yields:

——  ——————  ———
A        B    C
——  ——————  ———
1   1.9999  foo
10       x  bar
——  ——————  ———

To change the default vsep globally, redefine D47crunch_defaults.PRETTY_TABLE_VSEP:

D47crunch_defaults.PRETTY_TABLE_VSEP = '='
print(pretty_table([
        ['A', 'B', 'C'],
        ['1', '1.9999', 'foo'],
        ['10', 'x', 'bar'],
]))

yields:

==  ======  ===
A        B    C
==  ======  ===
1   1.9999  foo
10       x  bar
==  ======  ===

def transpose_table(x):
	'''
	Transpose a list of lists

	**Parameters**

	+ `x`: a list of lists

	**Example**

	```py
	x = [[1, 2], [3, 4]]
	print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
	```
	'''
	return [[e for e in c] for c in zip(*x)]

Transpose a list of lists

Parameters

  • x: a list of lists

Example

x = [[1, 2], [3, 4]]
print(transpose_table(x)) # yields: [[1, 3], [2, 4]]

def w_avg(X, sX):
	'''
	Compute variance-weighted average

	Returns the value and SE of the weighted average of the elements of `X`,
	with relative weights equal to their inverse variances (`1/sX**2`).

	**Parameters**

	+ `X`: array-like of elements to average
	+ `sX`: array-like of the corresponding SE values

	**Tip**

	If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
	they may be rearranged using `zip()`:

	```python
	foo = [(0, 1), (1, 0.5), (2, 0.5)]
	print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
	```
	'''
	X = [ x for x in X ]
	sX = [ sx for sx in sX ]
	W = [ sx**-2 for sx in sX ]
	W = [ w/sum(W) for w in W ]
	Xavg = sum([ w*x for w,x in zip(W,X) ])
	sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5
	return Xavg, sXavg

Compute variance-weighted average

Returns the value and SE of the weighted average of the elements of X, with relative weights equal to their inverse variances (1/sX**2).

Parameters

  • X: array-like of elements to average
  • sX: array-like of the corresponding SE values

Tip

If X and sX are initially arranged as a list of (x, sx) doublets, they may be rearranged using zip():

foo = [(0, 1), (1, 0.5), (2, 0.5)]
print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)

def read_csv(filename, sep = ''):
	'''
	Read contents of `filename` in csv format and return a list of dictionaries.

	In the csv string, spaces before and after field separators (`','` by default)
	are optional.

	**Parameters**

	+ `filename`: the csv file to read
	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
	whichever appears most often in the contents of `filename`.
	'''
	with open(filename) as fid:
		txt = fid.read()

	if sep == '':
		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
	return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]]

Read contents of filename in csv format and return a list of dictionaries.

In the csv string, spaces before and after field separators (',' by default) are optional.

Parameters

  • filename: the csv file to read
  • sep: csv separator delimiting the fields. By default, use ',', ';', or tab, whichever appears most often in the contents of filename.
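
A minimal sketch, assuming the rawdata.csv file from the tutorial section is present in the working directory (the separator is detected automatically):

from D47crunch import read_csv

data = read_csv('rawdata.csv')
print(data[0]['Sample'], data[0]['d45'])  # field values are typed by smart_type()
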
def simulate_single_analysis(
	sample = 'MYSAMPLE',
	d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
	d13C_VPDB = None, d18O_VPDB = None,
	D47 = None, D48 = None, D49 = 0., D17O = 0.,
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	Nominal_D47 = None,
	Nominal_D48 = None,
	Nominal_d13C_VPDB = None,
	Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	):
	'''
	Compute working-gas delta values for a single analysis, assuming a stochastic working
	gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

	**Parameters**

	+ `sample`: sample name
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(respectively –4 and +26 ‰ by default)
	+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
		of the carbonate sample
	+ `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and
		Δ48 values if `D47` or `D48` are not specified
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `D4xdata` default values)

	Returns a dictionary with fields
	`['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
	'''

	if Nominal_d13C_VPDB is None:
		Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB

	if Nominal_d18O_VPDB is None:
		Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB

	if ALPHA_18O_ACID_REACTION is None:
		ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION

	if R13_VPDB is None:
		R13_VPDB = D4xdata().R13_VPDB

	if R17_VSMOW is None:
		R17_VSMOW = D4xdata().R17_VSMOW

	if R18_VSMOW is None:
		R18_VSMOW = D4xdata().R18_VSMOW

	if LAMBDA_17 is None:
		LAMBDA_17 = D4xdata().LAMBDA_17

	if R18_VPDB is None:
		R18_VPDB = D4xdata().R18_VPDB

	R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17

	if Nominal_D47 is None:
		Nominal_D47 = D47data().Nominal_D47

	if Nominal_D48 is None:
		Nominal_D48 = D48data().Nominal_D48

	if d13C_VPDB is None:
		if sample in Nominal_d13C_VPDB:
			d13C_VPDB = Nominal_d13C_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")

	if d18O_VPDB is None:
		if sample in Nominal_d18O_VPDB:
			d18O_VPDB = Nominal_d18O_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")

	if D47 is None:
		if sample in Nominal_D47:
			D47 = Nominal_D47[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")

	if D48 is None:
		if sample in Nominal_D48:
			D48 = Nominal_D48[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")

	X = D4xdata()
	X.R13_VPDB = R13_VPDB
	X.R17_VSMOW = R17_VSMOW
	X.R18_VSMOW = R18_VSMOW
	X.LAMBDA_17 = LAMBDA_17
	X.R18_VPDB = R18_VPDB
	X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17

	R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000),
		R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
		)
	R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O=D17O, D47=D47, D48=D48, D49=D49,
		)
	R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O=D17O,
		)

	d45 = 1000 * (R45/R45wg - 1)
	d46 = 1000 * (R46/R46wg - 1)
	d47 = 1000 * (R47/R47wg - 1)
	d48 = 1000 * (R48/R48wg - 1)
	d49 = 1000 * (R49/R49wg - 1)

	for k in range(3): # dumb iteration to adjust for small changes in d47
		R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
		R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch
		d47 = 1000 * (R47raw/R47wg - 1)
		d48 = 1000 * (R48raw/R48wg - 1)

	return dict(
		Sample = sample,
		D17O = D17O,
		d13Cwg_VPDB = d13Cwg_VPDB,
		d18Owg_VSMOW = d18Owg_VSMOW,
		d45 = d45,
		d46 = d46,
		d47 = d47,
		d48 = d48,
		d49 = d49,
		)

Compute working-gas delta values for a single analysis, assuming a stochastic working gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

Parameters

  • sample: sample name
  • d13Cwg_VPDB, d18Owg_VSMOW: bulk composition of the working gas (respectively –4 and +26 ‰ by default)
  • d13C_VPDB, d18O_VPDB: bulk composition of the carbonate sample
  • D47, D48, D49, D17O: clumped-isotope and oxygen-17 anomalies of the carbonate sample
  • Nominal_D47, Nominal_D48: where to look up Δ47 and Δ48 values if D47 or D48 are not specified
  • Nominal_d13C_VPDB, Nominal_d18O_VPDB: where to look up δ13C and δ18O values if d13C_VPDB or d18O_VPDB are not specified
  • ALPHA_18O_ACID_REACTION: 18O/16O acid fractionation factor
  • R13_VPDB, R17_VSMOW, R18_VSMOW, LAMBDA_17, R18_VPDB: oxygen-17 correction parameters (by default equal to the D4xdata default values)

Returns a dictionary with fields ['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49'].
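
A minimal usage sketch, assuming ETH-3 (one of the default anchors) is present in the default Nominal_* dictionaries so that its bulk and clumped compositions are looked up automatically:

from D47crunch import simulate_single_analysis

a = simulate_single_analysis(sample = 'ETH-3')
print(a['Sample'], a['d45'], a['d47'])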

def virtual_data(
	samples = [],
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	rd45 = 0.020, rd46 = 0.060,
	rD47 = 0.015, rD48 = 0.045,
	d13Cwg_VPDB = None, d18Owg_VSMOW = None,
	session = None,
	Nominal_D47 = None, Nominal_D48 = None,
	Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	seed = 0,
	shuffle = True,
	):
	'''
	Return list with simulated analyses from a single session.

	**Parameters**

	+ `samples`: a list of entries; each entry is a dictionary with the following fields:
	    * `Sample`: the name of the sample
	    * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	    * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
	    * `N`: how many analyses to generate for this sample
	+ `a47`: scrambling factor for Δ47
	+ `b47`: compositional nonlinearity for Δ47
	+ `c47`: working gas offset for Δ47
	+ `a48`: scrambling factor for Δ48
	+ `b48`: compositional nonlinearity for Δ48
	+ `c48`: working gas offset for Δ48
	+ `rd45`: analytical repeatability of δ45
	+ `rd46`: analytical repeatability of δ46
	+ `rD47`: analytical repeatability of Δ47
	+ `rD48`: analytical repeatability of Δ48
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(by default equal to the `simulate_single_analysis` default values)
	+ `session`: name of the session (no name by default)
	+ `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and Δ48 values
		if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
		(by default equal to the `simulate_single_analysis` defaults)
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
		(by default equal to the `simulate_single_analysis` defaults)
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `simulate_single_analysis` defaults)
	+ `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
	+ `shuffle`: randomly reorder the sequence of analyses

	Here is an example of using this method to generate an arbitrary combination of
	anchors and unknowns for a bunch of sessions:

	```py
	.. include:: ../../code_examples/virtual_data/example.py
	```

	This should output something like:

	```
	.. include:: ../../code_examples/virtual_data/output.txt
	```
	'''

	kwargs = locals().copy()

	from numpy import random as nprandom
	if seed:
		rng = nprandom.default_rng(seed)
	else:
		rng = nprandom.default_rng()

	N = sum([s['N'] for s in samples])
	errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors45 *= rd45 / stdev(errors45) # scale errors to rd45
	errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors46 *= rd46 / stdev(errors46) # scale errors to rd46
	errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors47 *= rD47 / stdev(errors47) # scale errors to rD47
	errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors48 *= rD48 / stdev(errors48) # scale errors to rD48

	k = 0
	out = []
	for s in samples:
		kw = {}
		kw['sample'] = s['Sample']
		kw = {
			**kw,
			**{var: kwargs[var]
				for var in [
					'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
					'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
					'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
					'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
					]
				if kwargs[var] is not None},
			**{var: s[var]
				for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
				if var in s},
			}

		sN = s['N']
		while sN:
			out.append(simulate_single_analysis(**kw))
			out[-1]['d45'] += errors45[k]
			out[-1]['d46'] += errors46[k]
			out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
			out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
			sN -= 1
			k += 1

	if session is not None:
		for r in out:
			r['Session'] = session

	if shuffle:
		nprandom.shuffle(out)

	return out

Return list with simulated analyses from a single session.

Parameters

  • samples: a list of entries; each entry is a dictionary with the following fields:
    • Sample: the name of the sample
    • d13C_VPDB, d18O_VPDB: bulk composition of the carbonate sample
    • D47, D48, D49, D17O (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
    • N: how many analyses to generate for this sample
  • a47: scrambling factor for Δ47
  • b47: compositional nonlinearity for Δ47
  • c47: working gas offset for Δ47
  • a48: scrambling factor for Δ48
  • b48: compositional nonlinearity for Δ48
  • c48: working gas offset for Δ48
  • rd45: analytical repeatability of δ45
  • rd46: analytical repeatability of δ46
  • rD47: analytical repeatability of Δ47
  • rD48: analytical repeatability of Δ48
  • d13Cwg_VPDB, d18Owg_VSMOW: bulk composition of the working gas (by default equal to the simulate_single_analysis default values)
  • session: name of the session (no name by default)
  • Nominal_D47, Nominal_D48: where to look up Δ47 and Δ48 values if D47 or D48 are not specified (by default equal to the simulate_single_analysis defaults)
  • Nominal_d13C_VPDB, Nominal_d18O_VPDB: where to look up δ13C and δ18O values if d13C_VPDB or d18O_VPDB are not specified (by default equal to the simulate_single_analysis defaults)
  • ALPHA_18O_ACID_REACTION: 18O/16O acid fractionation factor (by default equal to the simulate_single_analysis defaults)
  • R13_VPDB, R17_VSMOW, R18_VSMOW, LAMBDA_17, R18_VPDB: oxygen-17 correction parameters (by default equal to the simulate_single_analysis default)
  • seed: explicitly set to a non-zero value to achieve random but repeatable simulations
  • shuffle: randomly reorder the sequence of analyses

Here is an example of using this method to generate an arbitrary combination of anchors and unknowns for a bunch of sessions:

from D47crunch import virtual_data, D47data

args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 3,
            d13C_VPDB = -15., d18O_VPDB = -2.,
            D47 = 0.6, D48 = 0.2),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)

D = D47data(session1 + session2 + session3 + session4)

D.crunch()
D.standardize()

D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)

This should output something like:

[table_of_sessions] 
——————————  ——  ——  ———————————  ————————————  ——————  ——————  ——————  —————————————  —————————————  ——————————————
Session     Na  Nu  d13Cwg_VPDB  d18Owg_VSMOW  r_d13C  r_d18O   r_D47         a ± SE   1e3 x b ± SE          c ± SE
——————————  ——  ——  ———————————  ————————————  ——————  ——————  ——————  —————————————  —————————————  ——————————————
Session_01   9   6       -4.000        26.000  0.0205  0.0633  0.0075  1.015 ± 0.015  0.427 ± 0.232  -0.909 ± 0.006
Session_02   9   6       -4.000        26.000  0.0210  0.0882  0.0082  0.990 ± 0.015  0.484 ± 0.232  -0.905 ± 0.006
Session_03   9   6       -4.000        26.000  0.0186  0.0505  0.0091  0.997 ± 0.015  0.167 ± 0.233  -0.901 ± 0.006
Session_04   9   6       -4.000        26.000  0.0192  0.0467  0.0070  1.017 ± 0.015  0.229 ± 0.232  -0.910 ± 0.006
——————————  ——  ——  ———————————  ————————————  ——————  ——————  ——————  —————————————  —————————————  ——————————————

[table_of_samples] 
——————  ——  —————————  ——————————  ——————  ——————  ————————  ——————  ————————
Sample   N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene
——————  ——  —————————  ——————————  ——————  ——————  ————————  ——————  ————————
ETH-1   12       2.02       37.01  0.2052                    0.0083          
ETH-2   12     -10.17       19.88  0.2085                    0.0090          
ETH-3   12       1.71       37.46  0.6132                    0.0083          
BAR     12     -15.02       37.22  0.6057  0.0042  ± 0.0085  0.0088     0.753
FOO     12      -5.00       28.89  0.3024  0.0031  ± 0.0062  0.0070     0.497
——————  ——  —————————  ——————————  ——————  ——————  ————————  ——————  ————————

[table_of_analyses] 
———  ——————————  ——————  ———————————  ————————————  —————————  —————————  ——————————  ——————————  ——————————  ——————————  ——————————  —————————  —————————  —————————  ————————
UID     Session  Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48         d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw     D49raw       D47
———  ——————————  ——————  ———————————  ————————————  —————————  —————————  ——————————  ——————————  ——————————  ——————————  ——————————  —————————  —————————  —————————  ————————
1    Session_01     BAR       -4.000        26.000  -9.959983  10.926995    0.053806   21.724901   10.707292  -15.041279   37.199026  -0.300066  -0.243252  -0.029371  0.599675
2    Session_01   ETH-2       -4.000        26.000  -5.974124  -5.955517  -12.668784  -12.208184  -18.023381  -10.163274   19.943159  -0.694902  -0.336672  -0.063946  0.215880
3    Session_01   ETH-1       -4.000        26.000   6.049381  10.706856   16.135579   21.196941   27.780042    2.057827   36.937067  -0.685751  -0.324384   0.045870  0.212791
4    Session_01   ETH-1       -4.000        26.000   6.010276  10.840276   16.207960   21.475150   27.780042    2.011176   37.073454  -0.704188  -0.315986  -0.172089  0.194589
5    Session_01   ETH-3       -4.000        26.000   5.727341  11.211663   16.713472   22.364770   28.306614    1.695479   37.453503  -0.278056  -0.180158  -0.082015  0.614365
6    Session_01     BAR       -4.000        26.000  -9.920507  10.903408    0.065076   21.704075   10.707292  -14.998270   37.174839  -0.307018  -0.216978  -0.026076  0.592818
7    Session_01   ETH-2       -4.000        26.000  -5.991278  -5.995054  -12.741562  -12.184075  -18.023381  -10.180122   19.902809  -0.711697  -0.232746   0.032602  0.199357
8    Session_01     FOO       -4.000        26.000  -0.838118   2.819853    1.310384    5.326005    4.665655   -5.004629   28.895933  -0.593755  -0.319861   0.014956  0.309692
9    Session_01   ETH-1       -4.000        26.000   5.995601  10.755323   16.116087   21.285428   27.780042    1.998631   36.986704  -0.696924  -0.333640   0.008600  0.201787
10   Session_01     FOO       -4.000        26.000  -0.848028   2.874679    1.346196    5.439150    4.665655   -5.017230   28.951964  -0.601502  -0.316664  -0.081898  0.302042
11   Session_01   ETH-3       -4.000        26.000   5.755174  11.255104   16.792797   22.451660   28.306614    1.723596   37.497816  -0.270825  -0.181089  -0.195908  0.621458
12   Session_01     BAR       -4.000        26.000  -9.915975  10.968470    0.153453   21.749385   10.707292  -14.995822   37.241294  -0.286638  -0.301325  -0.157376  0.612868
13   Session_01   ETH-2       -4.000        26.000  -5.982229  -6.110437  -12.827036  -12.492272  -18.023381  -10.166188   19.784916  -0.693555  -0.312598   0.251040  0.217274
14   Session_01     FOO       -4.000        26.000  -0.876454   2.906764    1.341194    5.490264    4.665655   -5.048760   28.984806  -0.608593  -0.329808  -0.114437  0.295055
15   Session_01   ETH-3       -4.000        26.000   5.734896  11.229855   16.740410   22.402091   28.306614    1.702875   37.472070  -0.276998  -0.179635  -0.125368  0.615396
16   Session_02     FOO       -4.000        26.000  -0.835046   2.870518    1.355370    5.487896    4.665655   -5.004585   28.948243  -0.601666  -0.259900  -0.087592  0.305777
17   Session_02   ETH-1       -4.000        26.000   6.019963  10.773112   16.163825   21.331060   27.780042    2.029040   37.042346  -0.692234  -0.324161  -0.051788  0.207075
18   Session_02   ETH-3       -4.000        26.000   5.719281  11.207303   16.681693   22.370886   28.306614    1.691780   37.488633  -0.296801  -0.165556  -0.065004  0.606143
19   Session_02   ETH-2       -4.000        26.000  -5.993476  -5.944866  -12.696865  -12.149754  -18.023381  -10.190430   19.913381  -0.713779  -0.298963  -0.064251  0.199436
20   Session_02   ETH-3       -4.000        26.000   5.757137  11.232751   16.744567   22.398244   28.306614    1.731295   37.514660  -0.298533  -0.189123  -0.154557  0.604363
21   Session_02   ETH-1       -4.000        26.000   6.030532  10.851030   16.245571   21.457100   27.780042    2.037466   37.122284  -0.698413  -0.354920  -0.214443  0.200795
22   Session_02     BAR       -4.000        26.000  -9.936020  10.862339    0.024660   21.563307   10.707292  -15.023836   37.171034  -0.291333  -0.273498   0.070452  0.619812
23   Session_02   ETH-2       -4.000        26.000  -5.950370  -5.959974  -12.650784  -12.197864  -18.023381  -10.143809   19.897777  -0.696916  -0.317263  -0.080604  0.216441
24   Session_02     FOO       -4.000        26.000  -0.819742   2.826793    1.317044    5.330616    4.665655   -4.986618   28.903335  -0.612871  -0.329113  -0.018244  0.294481
25   Session_02     FOO       -4.000        26.000  -0.848415   2.849823    1.308081    5.427767    4.665655   -5.018107   28.927036  -0.614791  -0.278426  -0.032784  0.292547
26   Session_02     BAR       -4.000        26.000  -9.957566  10.903888    0.031785   21.739434   10.707292  -15.048386   37.213724  -0.302139  -0.183327   0.012926  0.608897
27   Session_02   ETH-2       -4.000        26.000  -5.982371  -6.036210  -12.762399  -12.309944  -18.023381  -10.175178   19.819614  -0.701348  -0.277354   0.104418  0.212021
28   Session_02   ETH-1       -4.000        26.000   5.993918  10.617469   15.991900   21.070358   27.780042    2.006934   36.882679  -0.683329  -0.271476   0.278458  0.216152
29   Session_02   ETH-3       -4.000        26.000   5.716356  11.091821   16.582487   22.123857   28.306614    1.692901   37.370126  -0.279100  -0.178789   0.162540  0.624067
30   Session_02     BAR       -4.000        26.000  -9.963888  10.865863   -0.023549   21.615868   10.707292  -15.053743   37.174715  -0.313906  -0.229031   0.093637  0.597041
31   Session_03   ETH-1       -4.000        26.000   5.994622  10.743980   16.116098   21.243734   27.780042    1.997857   37.033567  -0.684883  -0.352014   0.031692  0.214449
32   Session_03     FOO       -4.000        26.000  -0.800284   2.851299    1.376828    5.379547    4.665655   -4.951581   28.910199  -0.597293  -0.329315  -0.087015  0.304784
33   Session_03     BAR       -4.000        26.000  -9.952115  11.034508    0.169809   21.885915   10.707292  -15.002819   37.370451  -0.296804  -0.298351  -0.246731  0.606414
34   Session_03   ETH-1       -4.000        26.000   6.004078  10.683951   16.045192   21.214355   27.780042    2.010134   36.971642  -0.705956  -0.262026   0.138399  0.193323
35   Session_03   ETH-1       -4.000        26.000   6.040566  10.786620   16.205283   21.374963   27.780042    2.045244   37.077432  -0.685706  -0.307909  -0.099869  0.213609
36   Session_03     FOO       -4.000        26.000  -0.873798   2.820799    1.272165    5.370745    4.665655   -5.028782   28.878917  -0.596008  -0.277258   0.051165  0.306090
37   Session_03   ETH-2       -4.000        26.000  -6.000290  -5.947172  -12.697463  -12.164602  -18.023381  -10.167221   19.848953  -0.705037  -0.309350  -0.052386  0.199061
38   Session_03   ETH-3       -4.000        26.000   5.718991  11.146227   16.640814   22.243185   28.306614    1.689442   37.449023  -0.277332  -0.169668   0.053997  0.623187
39   Session_03   ETH-2       -4.000        26.000  -5.997147  -5.905858  -12.655382  -12.081612  -18.023381  -10.165400   19.891551  -0.706536  -0.308464  -0.137414  0.197550
40   Session_03     FOO       -4.000        26.000  -0.823857   2.761300    1.258060    5.239992    4.665655   -4.973383   28.817444  -0.603327  -0.288652   0.114488  0.298751
41   Session_03   ETH-3       -4.000        26.000   5.748546  11.079879   16.580826   22.120063   28.306614    1.723364   37.380534  -0.302133  -0.158882   0.151641  0.598318
42   Session_03   ETH-3       -4.000        26.000   5.753467  11.206589   16.719131   22.373244   28.306614    1.723960   37.511190  -0.294350  -0.161838  -0.099835  0.606103
43   Session_03     BAR       -4.000        26.000  -9.928709  10.989665    0.148059   21.852677   10.707292  -14.976237   37.324152  -0.299358  -0.242185  -0.184835  0.603855
44   Session_03     BAR       -4.000        26.000  -9.957114  10.898997    0.044946   21.602296   10.707292  -15.003175   37.230716  -0.284699  -0.307849   0.021944  0.618578
45   Session_03   ETH-2       -4.000        26.000  -6.008525  -5.909707  -12.647727  -12.075913  -18.023381  -10.177379   19.887608  -0.683183  -0.294956  -0.117608  0.220975
46   Session_04   ETH-2       -4.000        26.000  -5.973623  -5.975018  -12.694278  -12.194472  -18.023381  -10.166297   19.828211  -0.701951  -0.283570  -0.025935  0.207135
47   Session_04   ETH-3       -4.000        26.000   5.739420  11.128582   16.641344   22.166106   28.306614    1.695046   37.399884  -0.280608  -0.210162   0.066645  0.614665
48   Session_04   ETH-3       -4.000        26.000   5.751908  11.207110   16.726741   22.380392   28.306614    1.705481   37.480657  -0.285776  -0.155878  -0.099197  0.609567
49   Session_04   ETH-2       -4.000        26.000  -5.966627  -5.893789  -12.597717  -12.120719  -18.023381  -10.161842   19.911776  -0.691757  -0.372308  -0.193986  0.217132
50   Session_04   ETH-1       -4.000        26.000   6.029937  10.766997   16.151273   21.345479   27.780042    2.018148   37.027152  -0.708855  -0.297953  -0.050465  0.193862
51   Session_04   ETH-2       -4.000        26.000  -5.986501  -5.915157  -12.656583  -12.060382  -18.023381  -10.182247   19.889836  -0.709603  -0.268277  -0.130450  0.199604
52   Session_04     FOO       -4.000        26.000  -0.791191   2.708220    1.256167    5.145784    4.665655   -4.960004   28.750896  -0.586913  -0.276505   0.183674  0.317065
53   Session_04     BAR       -4.000        26.000  -9.951025  10.951923    0.089386   21.738926   10.707292  -15.031949   37.254709  -0.298065  -0.278834  -0.087463  0.601230
54   Session_04     BAR       -4.000        26.000  -9.931741  10.819830   -0.023748   21.529372   10.707292  -15.006533   37.118743  -0.302866  -0.222623   0.148462  0.596536
55   Session_04   ETH-1       -4.000        26.000   6.023822  10.730714   16.121184   21.235757   27.780042    2.012958   36.989833  -0.696908  -0.333582   0.026555  0.205610
56   Session_04     BAR       -4.000        26.000  -9.926078  10.884823    0.060864   21.650722   10.707292  -15.002880   37.185606  -0.287358  -0.232425   0.016044  0.611760
57   Session_04     FOO       -4.000        26.000  -0.848192   2.777763    1.251297    5.280272    4.665655   -5.023358   28.822585  -0.601094  -0.281419   0.108186  0.303128
58   Session_04     FOO       -4.000        26.000  -0.853969   2.805035    1.267571    5.353907    4.665655   -5.030523   28.850660  -0.605611  -0.262571   0.060903  0.298685
59   Session_04   ETH-1       -4.000        26.000   6.017312  10.735930   16.123043   21.270597   27.780042    2.005824   36.995214  -0.693479  -0.309795   0.023309  0.208980
60   Session_04   ETH-3       -4.000        26.000   5.798016  11.254135   16.832228   22.432473   28.306614    1.752928   37.528936  -0.275047  -0.197935  -0.239408  0.620088
———  ——————————  ——————  ———————————  ————————————  —————————  —————————  ——————————  ——————————  ——————————  ——————————  ——————————  —————————  —————————  —————————  ————————


def table_of_samples(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of samples
	for a pair of `D47data` and `D48data` objects.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_samples.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)

Print out, save to disk and/or return a combined table of samples for a pair of D47data and D48data objects.

Parameters

  • data47: D47data instance
  • data48: D48data instance
  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of lists of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
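
A minimal sketch of the combined use case, reusing virtual_data() to simulate analyses that are then standardized both as Δ47 and as Δ48 (sample names, session label and seed are arbitrary, and it is assumed here that the same simulated records can be crunched by both a D47data and a D48data instance); table_of_sessions() and table_of_analyses() below follow the same calling convention:

from D47crunch import virtual_data, D47data, D48data, table_of_samples

rawdata = virtual_data(
    samples = [
        dict(Sample = 'ETH-1', N = 4),
        dict(Sample = 'ETH-2', N = 4),
        dict(Sample = 'ETH-3', N = 4),
        dict(Sample = 'FOO', N = 4,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        ],
    session = 'Session_01', seed = 1)

data47 = D47data(rawdata)
data47.crunch()
data47.standardize()

data48 = D48data(rawdata)
data48.crunch()
data48.standardize()

# combined table with both the Δ47 and Δ48 columns for each sample:
table_of_samples(data47, data48, save_to_file = False, print_out = True)
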
def table_of_sessions(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of sessions
	for a pair of `D47data` and `D48data` objects.
	***Only applicable if the sessions in `data47` and those in `data48`
	consist of the exact same sets of analyses.***

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_sessions(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_sessions(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
			for k,x in enumerate(out47[0]):
				if k>7:
					out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
					out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_sessions.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)

Print out, save to disk and/or return a combined table of sessions for a pair of D47data and D48data objects. Only applicable if the sessions in data47 and those in data48 consist of the exact same sets of analyses.

Parameters

  • data47: D47data instance
  • data48: D48data instance
  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of lists of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])

def table_of_analyses(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of analyses
	for a pair of `D47data` and `D48data` objects.

	If the sessions in `data47` and those in `data48` do not consist of
	the exact same sets of analyses, the table will have two columns
	`Session_47` and `Session_48` instead of a single `Session` column.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_analyses(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_analyses(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')

			if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
				out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
			else:
				out47[0][1] = 'Session_47'
				out48[0][1] = 'Session_48'
				out47 = transpose_table(out47)
				out48 = transpose_table(out48)
				out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_analyses.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)

Print out, save to disk and/or return a combined table of analyses for a pair of D47data and D48data objects.

If the sessions in data47 and those in data48 do not consist of the exact same sets of analyses, the table will have two columns Session_47 and Session_48 instead of a single Session column.

Parameters

  • data47: D47data instance
  • data48: D48data instance
  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of lists of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
class D4xdata(builtins.list):
 833class D4xdata(list):
 834	'''
 835	Store and process data for a large set of Δ47 and/or Δ48
 836	analyses, usually comprising more than one analytical session.
 837	'''
 838
 839	### 17O CORRECTION PARAMETERS
 840	R13_VPDB = 0.01118  # (Chang & Li, 1990)
 841	'''
 842	Absolute (13C/12C) ratio of VPDB.
 843	By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
 844	'''
 845
 846	R18_VSMOW = 0.0020052  # (Baertschi, 1976)
 847	'''
 848	Absolute (18O/16C) ratio of VSMOW.
 849	By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
 850	'''
 851
 852	LAMBDA_17 = 0.528  # (Barkan & Luz, 2005)
 853	'''
 854	Mass-dependent exponent for triple oxygen isotopes.
 855	By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
 856	'''
 857
 858	R17_VSMOW = 0.00038475  # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
 859	'''
 860	Absolute (17O/16C) ratio of VSMOW.
 861	By default equal to 0.00038475
 862	([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
 863	rescaled to `R13_VPDB`)
 864	'''
 865
 866	R18_VPDB = R18_VSMOW * 1.03092
 867	'''
 868	Absolute (18O/16O) ratio of VPDB.
 869	By definition equal to `R18_VSMOW * 1.03092`.
 870	'''
 871
 872	R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
 873	'''
 874	Absolute (17O/16O) ratio of VPDB.
 875	By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
 876	'''
 877
 878	LEVENE_REF_SAMPLE = 'ETH-3'
 879	'''
 880	After the Δ4x standardization step, each sample is tested to
 881	assess whether the Δ4x variance within all analyses for that
 882	sample differs significantly from that observed for a given reference
 883	sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
 884	which yields a p-value corresponding to the null hypothesis that the
 885	underlying variances are equal).
 886
 887	`LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
 888	sample should be used as a reference for this test.
 889	'''
 890
 891	ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6)  # (Kim et al., 2007, calcite)
 892	'''
 893	Specifies the 18O/16O fractionation factor generally applicable
 894	to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
 895	`D4xdata.standardize_d13C()`, and `D4xdata.standardize_d18O()`.
 896
 897	By default equal to 1.008129 (calcite reacted at 90 °C,
 898	[Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
 899	'''
 900
 901	Nominal_d13C_VPDB = {
 902		'ETH-1': 2.02,
 903		'ETH-2': -10.17,
 904		'ETH-3': 1.71,
 905		}	# (Bernasconi et al., 2018)
 906	'''
 907	Nominal δ13C_VPDB values assigned to carbonate standards, used by
 908	`D4xdata.standardize_d13C()`.
 909
 910	By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
 911	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 912	'''
 913
 914	Nominal_d18O_VPDB = {
 915		'ETH-1': -2.19,
 916		'ETH-2': -18.69,
 917		'ETH-3': -1.78,
 918		}	# (Bernasconi et al., 2018)
 919	'''
 920	Nominal δ18O_VPDB values assigned to carbonate standards, used by
 921	`D4xdata.standardize_d18O()`.
 922
 923	By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
 924	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 925	'''
 926
 927	d13C_STANDARDIZATION_METHOD = '2pt'
 928	'''
 929	Method by which to standardize δ13C values:
 930	
 931	+ `'none'`: do not apply any δ13C standardization.
 932	+ `'1pt'`: within each session, offset all initial δ13C values so as to
 933	minimize the difference between final δ13C_VPDB values and
 934	`Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
 935	+ `'2pt'`: within each session, apply an affine transformation to all δ13C
 936	values so as to minimize the difference between final δ13C_VPDB
 937	values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
 938	is defined).
 939	'''
 940
 941	d18O_STANDARDIZATION_METHOD = '2pt'
 942	'''
 943	Method by which to standardize δ18O values:
 944	
 945	+ `'none'`: do not apply any δ18O standardization.
 946	+ `'1pt'`: within each session, offset all initial δ18O values so as to
 947	minimize the difference between final δ18O_VPDB values and
 948	`Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
 949	+ `'2pt'`: within each session, apply an affine transformation to all δ18O
 950	values so as to minimize the difference between final δ18O_VPDB
 951	values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
 952	is defined).
 953	'''
 954
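	# Example (sketch, illustrative values): these class-level defaults may be
	# overridden on an instance before processing, e.g.:
	#
	#     mydata = D47data()
	#     mydata.LEVENE_REF_SAMPLE = 'ETH-1'          # change the Levene reference sample
	#     mydata.d13C_STANDARDIZATION_METHOD = '1pt'  # offset-only d13C standardization
	#     mydata.ALPHA_18O_ACID_REACTION = 1.008129   # calcite reacted at 90 °C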
 955	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
 956		'''
 957		**Parameters**
 958
 959		+ `l`: a list of dictionaries, with each dictionary including at least the keys
 960		`Sample`, `d45`, `d46`, and `d47` or `d48`.
 961		+ `mass`: `'47'` or `'48'`
 962		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
 963		+ `session`: define session name for analyses without a `Session` key
 964		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
 965
 966		Returns a `D4xdata` object derived from `list`.
 967		'''
 968		self._4x = mass
 969		self.verbose = verbose
 970		self.prefix = 'D4xdata'
 971		self.logfile = logfile
 972		list.__init__(self, l)
 973		self.Nf = None
 974		self.repeatability = {}
 975		self.refresh(session = session)
 976
 977
 978	def make_verbal(oldfun):
 979		'''
 980		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
 981		'''
 982		@wraps(oldfun)
 983		def newfun(*args, verbose = '', **kwargs):
 984			myself = args[0]
 985			oldprefix = myself.prefix
 986			myself.prefix = oldfun.__name__
 987			if verbose != '':
 988				oldverbose = myself.verbose
 989				myself.verbose = verbose
 990			out = oldfun(*args, **kwargs)
 991			myself.prefix = oldprefix
 992			if verbose != '':
 993				myself.verbose = oldverbose
 994			return out
 995		return newfun
 996
 997
 998	def msg(self, txt):
 999		'''
1000		Log a message to `self.logfile`, and print it out if `verbose = True`
1001		'''
1002		self.log(txt)
1003		if self.verbose:
1004			print(f'{f"[{self.prefix}]":<16} {txt}')
1005
1006
1007	def vmsg(self, txt):
1008		'''
1009		Log a message to `self.logfile` and print it out
1010		'''
1011		self.log(txt)
1012		print(txt)
1013
1014
1015	def log(self, *txts):
1016		'''
1017		Log a message to `self.logfile`
1018		'''
1019		if self.logfile:
1020			with open(self.logfile, 'a') as fid:
1021				for txt in txts:
1022					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
1023
1024
1025	def refresh(self, session = 'mySession'):
1026		'''
1027		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
1028		'''
1029		self.fill_in_missing_info(session = session)
1030		self.refresh_sessions()
1031		self.refresh_samples()
1032
1033
1034	def refresh_sessions(self):
1035		'''
1036		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1037		to `False` for all sessions.
1038		'''
1039		self.sessions = {
1040			s: {'data': [r for r in self if r['Session'] == s]}
1041			for s in sorted({r['Session'] for r in self})
1042			}
1043		for s in self.sessions:
1044			self.sessions[s]['scrambling_drift'] = False
1045			self.sessions[s]['slope_drift'] = False
1046			self.sessions[s]['wg_drift'] = False
1047			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1048			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
1049
1050
1051	def refresh_samples(self):
1052		'''
1053		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1054		'''
1055		self.samples = {
1056			s: {'data': [r for r in self if r['Sample'] == s]}
1057			for s in sorted({r['Sample'] for r in self})
1058			}
1059		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1060		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
1061
1062
1063	def read(self, filename, sep = '', session = ''):
1064		'''
1065		Read file in csv format to load data into a `D47data` object.
1066
1067		In the csv file, spaces before and after field separators (`','` by default)
1068		are optional. Each line corresponds to a single analysis.
1069
1070		The required fields are:
1071
1072		+ `UID`: a unique identifier
1073		+ `Session`: an identifier for the analytical session
1074		+ `Sample`: a sample identifier
1075		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1076
1077		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1078		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
1079		and `d49` are optional, and set to NaN by default.
1080
1081		**Parameters**
1082
1083		+ `filename`: the path of the file to read
1084		+ `sep`: csv separator delimiting the fields
1085		+ `session`: set `Session` field to this string for all analyses
1086		'''
1087		with open(filename) as fid:
1088			self.input(fid.read(), sep = sep, session = session)
1089
1090
1091	def input(self, txt, sep = '', session = ''):
1092		'''
1093		Read `txt` string in csv format to load analysis data into a `D47data` object.
1094
1095		In the csv string, spaces before and after field separators (`','` by default)
1096		are optional. Each line corresponds to a single analysis.
1097
1098		The required fields are:
1099
1100		+ `UID`: a unique identifier
1101		+ `Session`: an identifier for the analytical session
1102		+ `Sample`: a sample identifier
1103		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1104
1105		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1106		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
1107		and `d49` are optional, and set to NaN by default.
1108
1109		**Parameters**
1110
1111		+ `txt`: the csv string to read
1112		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1113		whichever appears most often in `txt`.
1114		+ `session`: set `Session` field to this string for all analyses
1115		'''
1116		if sep == '':
1117			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1118		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1119		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1120
1121		if session != '':
1122			for r in data:
1123				r['Session'] = session
1124
1125		self += data
1126		self.refresh()
1127
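	# Example (sketch, illustrative values): analyses may also be loaded from a
	# csv string rather than a file:
	#
	#     mydata = D47data()
	#     mydata.input('UID,Session,Sample,d45,d46,d47\nA01,S1,ETH-1,5.795,11.628,16.894')
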
1128
1129	@make_verbal
1130	def wg(self, samples = None, a18_acid = None):
1131		'''
1132		Compute bulk composition of the working gas for each session based on
1133		the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1134		`self.Nominal_d18O_VPDB`.
1135		'''
1136
1137		self.msg('Computing WG composition:')
1138
1139		if a18_acid is None:
1140			a18_acid = self.ALPHA_18O_ACID_REACTION
1141		if samples is None:
1142			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1143
1144		assert a18_acid, f'Acid fractionation factor should not be zero.'
1145
1146		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1147		R45R46_standards = {}
1148		for sample in samples:
1149			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1150			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1151			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1152			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1153			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1154
1155			C12_s = 1 / (1 + R13_s)
1156			C13_s = R13_s / (1 + R13_s)
1157			C16_s = 1 / (1 + R17_s + R18_s)
1158			C17_s = R17_s / (1 + R17_s + R18_s)
1159			C18_s = R18_s / (1 + R17_s + R18_s)
1160
1161			C626_s = C12_s * C16_s ** 2
1162			C627_s = 2 * C12_s * C16_s * C17_s
1163			C628_s = 2 * C12_s * C16_s * C18_s
1164			C636_s = C13_s * C16_s ** 2
1165			C637_s = 2 * C13_s * C16_s * C17_s
1166			C727_s = C12_s * C17_s ** 2
1167
1168			R45_s = (C627_s + C636_s) / C626_s
1169			R46_s = (C628_s + C637_s + C727_s) / C626_s
1170			R45R46_standards[sample] = (R45_s, R46_s)
1171		
1172		for s in self.sessions:
1173			db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
1174			assert db, f'No sample from {samples} found in session "{s}".'
1175# 			dbsamples = sorted({r['Sample'] for r in db})
1176
1177			X = [r['d45'] for r in db]
1178			Y = [R45R46_standards[r['Sample']][0] for r in db]
1179			x1, x2 = np.min(X), np.max(X)
1180
1181			if x1 < x2:
1182				wgcoord = x1/(x1-x2)
1183			else:
1184				wgcoord = 999
1185
1186			if wgcoord < -.5 or wgcoord > 1.5:
1187				# unreasonable to extrapolate to d45 = 0
1188				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1189			else :
1190				# d45 = 0 is reasonably well bracketed
1191				R45_wg = np.polyfit(X, Y, 1)[1]
1192
1193			X = [r['d46'] for r in db]
1194			Y = [R45R46_standards[r['Sample']][1] for r in db]
1195			x1, x2 = np.min(X), np.max(X)
1196
1197			if x1 < x2:
1198				wgcoord = x1/(x1-x2)
1199			else:
1200				wgcoord = 999
1201
1202			if wgcoord < -.5 or wgcoord > 1.5:
1203				# unreasonable to extrapolate to d46 = 0
1204				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1205			else :
1206				# d46 = 0 is reasonably well bracketed
1207				R46_wg = np.polyfit(X, Y, 1)[1]
1208
1209			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1210
1211			self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1212
1213			self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1214			self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1215			for r in self.sessions[s]['data']:
1216				r['d13Cwg_VPDB'] = d13Cwg_VPDB
1217				r['d18Owg_VSMOW'] = d18Owg_VSMOW
1218
1219
1220	def compute_bulk_delta(self, R45, R46, D17O = 0):
1221		'''
1222		Compute δ13C_VPDB and δ18O_VSMOW,
1223		by solving the generalized form of equation (17) from
1224		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1225		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
1226		solving the corresponding second-order Taylor polynomial.
1227		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
1228		'''
1229
1230		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1231
1232		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1233		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1234		C = 2 * self.R18_VSMOW
1235		D = -R46
1236
1237		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1238		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1239		cc = A + B + C + D
1240
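		# The coefficients above define the quadratic aa * x^2 + bb * x + cc = 0,
		# whose positive root x = d18O_VSMOW / 1000 is computed below.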
1241		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1242
1243		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1244		R17 = K * R18 ** self.LAMBDA_17
1245		R13 = R45 - 2 * R17
1246
1247		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1248
1249		return d13C_VPDB, d18O_VSMOW
1250
1251
1252	@make_verbal
1253	def crunch(self, verbose = ''):
1254		'''
1255		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1256		'''
1257		for r in self:
1258			self.compute_bulk_and_clumping_deltas(r)
1259		self.standardize_d13C()
1260		self.standardize_d18O()
1261		self.msg(f"Crunched {len(self)} analyses.")
1262
1263
1264	def fill_in_missing_info(self, session = 'mySession'):
1265		'''
1266		Fill in optional fields with default values
1267		'''
1268		for i,r in enumerate(self):
1269			if 'D17O' not in r:
1270				r['D17O'] = 0.
1271			if 'UID' not in r:
1272				r['UID'] = f'{i+1}'
1273			if 'Session' not in r:
1274				r['Session'] = session
1275			for k in ['d47', 'd48', 'd49']:
1276				if k not in r:
1277					r[k] = np.nan
1278
1279
1280	def standardize_d13C(self):
1281		'''
1282		Perform δ13C standardization within each session `s` according to
1283		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1284		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1285		may be redefined arbitrarily at a later stage.
1286		'''
1287		for s in self.sessions:
1288			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1289				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1290				X,Y = zip(*XY)
1291				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1292					offset = np.mean(Y) - np.mean(X)
1293					for r in self.sessions[s]['data']:
1294						r['d13C_VPDB'] += offset				
1295				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1296					a,b = np.polyfit(X,Y,1)
1297					for r in self.sessions[s]['data']:
1298						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b
1299
1300	def standardize_d18O(self):
1301		'''
1302		Perform δ18O standardization within each session `s` according to
1303		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1304		which is defined by default by `D47data.refresh_sessions()` as equal to
1305		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1306		'''
1307		for s in self.sessions:
1308			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1309				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1310				X,Y = zip(*XY)
1311				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1312				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1313					offset = np.mean(Y) - np.mean(X)
1314					for r in self.sessions[s]['data']:
1315						r['d18O_VSMOW'] += offset				
1316				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1317					a,b = np.polyfit(X,Y,1)
1318					for r in self.sessions[s]['data']:
1319						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
1320	
1321
1322	def compute_bulk_and_clumping_deltas(self, r):
1323		'''
1324		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1325		'''
1326
1327		# Compute working gas R13, R18, and isobar ratios
1328		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1329		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1330		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1331
1332		# Compute analyte isobar ratios
1333		R45 = (1 + r['d45'] / 1000) * R45_wg
1334		R46 = (1 + r['d46'] / 1000) * R46_wg
1335		R47 = (1 + r['d47'] / 1000) * R47_wg
1336		R48 = (1 + r['d48'] / 1000) * R48_wg
1337		R49 = (1 + r['d49'] / 1000) * R49_wg
1338
1339		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1340		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1341		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1342
1343		# Compute stochastic isobar ratios of the analyte
1344		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1345			R13, R18, D17O = r['D17O']
1346		)
1347
1348		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1349		# and raise a warning if the corresponding anomalies exceed 0.05 ppm.
1350		if (R45 / R45stoch - 1) > 5e-8:
1351			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1352		if (R46 / R46stoch - 1) > 5e-8:
1353			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1354
1355		# Compute raw clumped isotope anomalies
1356		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1357		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1358		r['D49raw'] = 1000 * (R49 / R49stoch - 1)
1359
1360
1361	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1362		'''
1363		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1364		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1365		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1366		'''
1367
1368		# Compute R17
1369		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1370
1371		# Compute isotope concentrations
1372		C12 = (1 + R13) ** -1
1373		C13 = C12 * R13
1374		C16 = (1 + R17 + R18) ** -1
1375		C17 = C16 * R17
1376		C18 = C16 * R18
1377
1378		# Compute stochastic isotopologue concentrations
1379		C626 = C16 * C12 * C16
1380		C627 = C16 * C12 * C17 * 2
1381		C628 = C16 * C12 * C18 * 2
1382		C636 = C16 * C13 * C16
1383		C637 = C16 * C13 * C17 * 2
1384		C638 = C16 * C13 * C18 * 2
1385		C727 = C17 * C12 * C17
1386		C728 = C17 * C12 * C18 * 2
1387		C737 = C17 * C13 * C17
1388		C738 = C17 * C13 * C18 * 2
1389		C828 = C18 * C12 * C18
1390		C838 = C18 * C13 * C18
1391
1392		# Compute stochastic isobar ratios
1393		R45 = (C636 + C627) / C626
1394		R46 = (C628 + C637 + C727) / C626
1395		R47 = (C638 + C728 + C737) / C626
1396		R48 = (C738 + C828) / C626
1397		R49 = C838 / C626
1398
1399		# Account for stochastic anomalies
1400		R47 *= 1 + D47 / 1000
1401		R48 *= 1 + D48 / 1000
1402		R49 *= 1 + D49 / 1000
1403
1404		# Return isobar ratios
1405		return R45, R46, R47, R48, R49
1406
1407
1408	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1409		'''
1410		Split unknown samples by UID (treat all analyses as different samples)
1411		or by session (treat analyses of a given sample in different sessions as
1412		different samples).
1413
1414		**Parameters**
1415
1416		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
1417		+ `grouping`: `by_uid` | `by_session`
1418		'''
1419		if samples_to_split == 'all':
1420			samples_to_split = [s for s in self.unknowns]
1421		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1422		self.grouping = grouping.lower()
1423		if self.grouping in gkeys:
1424			gkey = gkeys[self.grouping]
1425		for r in self:
1426			if r['Sample'] in samples_to_split:
1427				r['Sample_original'] = r['Sample']
1428				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1429			elif r['Sample'] in self.unknowns:
1430				r['Sample_original'] = r['Sample']
1431		self.refresh_samples()
1432
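	# Example (sketch): to check the between-session consistency of an unknown,
	# split it by session, standardize, then merge it back:
	#
	#     mydata.split_samples(samples_to_split = ['IAEA-C1'], grouping = 'by_session')
	#     mydata.standardize(method = 'pooled')
	#     mydata.unsplit_samples()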
1433
1434	def unsplit_samples(self, tables = False):
1435		'''
1436		Reverse the effects of `D47data.split_samples()`.
1437		
1438		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1439		
1440		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1441		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1442		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1443		effects of `D47data.split_samples()` with `grouping='by_sessions'` (because in
1444		that case session-averaged Δ4x values are statistically independent).
1445		'''
1446		unknowns_old = sorted({s for s in self.unknowns})
1447		CM_old = self.standardization.covar[:,:]
1448		VD_old = self.standardization.params.valuesdict().copy()
1449		vars_old = self.standardization.var_names
1450
1451		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1452
1453		Ns = len(vars_old) - len(unknowns_old)
1454		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1455		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1456
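		# Build the weight matrix W mapping the old (split-sample) parameter vector
		# onto the new (merged-sample) one; the first Ns (session) parameters are
		# carried over unchanged through the identity block.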
1457		W = np.zeros((len(vars_new), len(vars_old)))
1458		W[:Ns,:Ns] = np.eye(Ns)
1459		for u in unknowns_new:
1460			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1461			if self.grouping == 'by_session':
1462				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1463			elif self.grouping == 'by_uid':
1464				weights = [1 for s in splits]
1465			sw = sum(weights)
1466			weights = [w/sw for w in weights]
1467			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1468
1469		CM_new = W @ CM_old @ W.T
1470		V = W @ np.array([[VD_old[k]] for k in vars_old])
1471		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1472
1473		self.standardization.covar = CM_new
1474		self.standardization.params.valuesdict = lambda : VD_new
1475		self.standardization.var_names = vars_new
1476
1477		for r in self:
1478			if r['Sample'] in self.unknowns:
1479				r['Sample_split'] = r['Sample']
1480				r['Sample'] = r['Sample_original']
1481
1482		self.refresh_samples()
1483		self.consolidate_samples()
1484		self.repeatabilities()
1485
1486		if tables:
1487			self.table_of_analyses()
1488			self.table_of_samples()
1489
1490	def assign_timestamps(self):
1491		'''
1492		Assign a time field `t` of type `float` to each analysis.
1493
1494		If `TimeTag` is one of the data fields, `t` is equal within a given session
1495		to `TimeTag` minus the mean value of `TimeTag` for that session.
1496		Otherwise, `TimeTag` is by default equal to the index of each analysis
1497		in the dataset and `t` is defined as above.
1498		'''
1499		for session in self.sessions:
1500			sdata = self.sessions[session]['data']
1501			try:
1502				t0 = np.mean([r['TimeTag'] for r in sdata])
1503				for r in sdata:
1504					r['t'] = r['TimeTag'] - t0
1505			except KeyError:
1506				t0 = (len(sdata)-1)/2
1507				for t,r in enumerate(sdata):
1508					r['t'] = t - t0
1509
1510
1511	def report(self):
1512		'''
1513		Prints a report on the standardization fit.
1514		Only applicable after `D4xdata.standardize(method='pooled')`.
1515		'''
1516		report_fit(self.standardization)
1517
1518
1519	def combine_samples(self, sample_groups):
1520		'''
1521		Combine analyses of different samples to compute weighted average Δ4x
1522		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1523		dictionary.
1524		
1525		Caution: samples are weighted by number of replicate analyses, which is a
1526		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1527		correlated analytical errors for one or more samples).
1528		
1529		Returns a tuple of:
1530		
1531		+ the list of group names
1532		+ an array of the corresponding Δ4x values
1533		+ the corresponding (co)variance matrix
1534		
1535		**Parameters**
1536
1537		+ `sample_groups`: a dictionary of the form:
1538		```py
1539		{'group1': ['sample_1', 'sample_2'],
1540		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1541		```
1542		'''
1543		
1544		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1545		groups = sorted(sample_groups.keys())
1546		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1547		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1548		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1549		W = np.array([
1550			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1551			for j in groups])
1552		D4x_new = W @ D4x_old
1553		CM_new = W @ CM_old @ W.T
1554
1555		return groups, D4x_new[:,0], CM_new
1556		
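	# Example (sketch, hypothetical sample names): averaging two groups of unknowns:
	#
	#     groups, D4x_avg, CM = mydata.combine_samples(
	#         {'group1': ['sample_1', 'sample_2'],
	#          'group2': ['sample_3', 'sample_4', 'sample_5']})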
1557
1558	@make_verbal
1559	def standardize(self,
1560		method = 'pooled',
1561		weighted_sessions = [],
1562		consolidate = True,
1563		consolidate_tables = False,
1564		consolidate_plots = False,
1565		constraints = {},
1566		):
1567		'''
1568		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1569		If `method` argument is set to `'pooled'`, the standardization processes all sessions
1570		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1571		i.e. that their true Δ4x value does not change between sessions,
1572		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
1573		`'indep_sessions'`, the standardization processes each session independently, based only
1574		on anchors analyses.
1575		'''
1576
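		# Example (sketch): standardize each session independently instead of
		# pooling all sessions into a single regression:
		#
		#     mydata.standardize(method = 'indep_sessions')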
1577		self.standardization_method = method
1578		self.assign_timestamps()
1579
1580		if method == 'pooled':
1581			if weighted_sessions:
1582				for session_group in weighted_sessions:
1583					if self._4x == '47':
1584						X = D47data([r for r in self if r['Session'] in session_group])
1585					elif self._4x == '48':
1586						X = D48data([r for r in self if r['Session'] in session_group])
1587					X.Nominal_D4x = self.Nominal_D4x.copy()
1588					X.refresh()
1589					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1590					w = np.sqrt(result.redchi)
1591					self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
1592					for r in X:
1593						r[f'wD{self._4x}raw'] *= w
1594			else:
1595				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1596				for r in self:
1597					r[f'wD{self._4x}raw'] = 1.
1598
1599			params = Parameters()
1600			for k,session in enumerate(self.sessions):
1601				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1602				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1603				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1604				s = pf(session)
1605				params.add(f'a_{s}', value = 0.9)
1606				params.add(f'b_{s}', value = 0.)
1607				params.add(f'c_{s}', value = -0.9)
1608				params.add(f'a2_{s}', value = 0.,
1609# 					vary = self.sessions[session]['scrambling_drift'],
1610					)
1611				params.add(f'b2_{s}', value = 0.,
1612# 					vary = self.sessions[session]['slope_drift'],
1613					)
1614				params.add(f'c2_{s}', value = 0.,
1615# 					vary = self.sessions[session]['wg_drift'],
1616					)
1617				if not self.sessions[session]['scrambling_drift']:
1618					params[f'a2_{s}'].expr = '0'
1619				if not self.sessions[session]['slope_drift']:
1620					params[f'b2_{s}'].expr = '0'
1621				if not self.sessions[session]['wg_drift']:
1622					params[f'c2_{s}'].expr = '0'
1623
1624			for sample in self.unknowns:
1625				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1626
1627			for k in constraints:
1628				params[k].expr = constraints[k]
1629
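			# Each residual below implements the session model:
			#     D4xraw = a * D4x + b * d4x + c + t * (a2 * D4x + b2 * d4x + c2)
			# scaled by the analysis weight wD4xraw; D4x is the nominal value for
			# anchors and a free parameter for unknowns.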
1630			def residuals(p):
1631				R = []
1632				for r in self:
1633					session = pf(r['Session'])
1634					sample = pf(r['Sample'])
1635					if r['Sample'] in self.Nominal_D4x:
1636						R += [ (
1637							r[f'D{self._4x}raw'] - (
1638								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1639								+ p[f'b_{session}'] * r[f'd{self._4x}']
1640								+	p[f'c_{session}']
1641								+ r['t'] * (
1642									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1643									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1644									+	p[f'c2_{session}']
1645									)
1646								)
1647							) / r[f'wD{self._4x}raw'] ]
1648					else:
1649						R += [ (
1650							r[f'D{self._4x}raw'] - (
1651								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1652								+ p[f'b_{session}'] * r[f'd{self._4x}']
1653								+	p[f'c_{session}']
1654								+ r['t'] * (
1655									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1656									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1657									+	p[f'c2_{session}']
1658									)
1659								)
1660							) / r[f'wD{self._4x}raw'] ]
1661				return R
1662
1663			M = Minimizer(residuals, params)
1664			result = M.least_squares()
1665			self.Nf = result.nfree
1666			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1667			new_names, new_covar, new_se = _fullcovar(result)[:3]
1668			result.var_names = new_names
1669			result.covar = new_covar
1670
1671			for r in self:
1672				s = pf(r["Session"])
1673				a = result.params.valuesdict()[f'a_{s}']
1674				b = result.params.valuesdict()[f'b_{s}']
1675				c = result.params.valuesdict()[f'c_{s}']
1676				a2 = result.params.valuesdict()[f'a2_{s}']
1677				b2 = result.params.valuesdict()[f'b2_{s}']
1678				c2 = result.params.valuesdict()[f'c2_{s}']
1679				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1680				
1681
1682			self.standardization = result
1683
1684			for session in self.sessions:
1685				self.sessions[session]['Np'] = 3
1686				for k in ['scrambling', 'slope', 'wg']:
1687					if self.sessions[session][f'{k}_drift']:
1688						self.sessions[session]['Np'] += 1
1689
1690			if consolidate:
1691				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1692			return result
1693
1694
1695		elif method == 'indep_sessions':
1696
1697			if weighted_sessions:
1698				for session_group in weighted_sessions:
1699					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1700					X.Nominal_D4x = self.Nominal_D4x.copy()
1701					X.refresh()
1702					# This is only done to assign r['wD47raw'] for r in X:
1703					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1704					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1705			else:
1706				self.msg('All weights set to 1 ‰')
1707				for r in self:
1708					r[f'wD{self._4x}raw'] = 1
1709
1710			for session in self.sessions:
1711				s = self.sessions[session]
1712				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1713				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1714				s['Np'] = sum(p_active)
1715				sdata = s['data']
1716
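				# Weighted least-squares design matrix: one column per standardization
				# parameter (a, b, c, a2, b2, c2), keeping only the active columns.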
1717				A = np.array([
1718					[
1719						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1720						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1721						1 / r[f'wD{self._4x}raw'],
1722						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1723						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1724						r['t'] / r[f'wD{self._4x}raw']
1725						]
1726					for r in sdata if r['Sample'] in self.anchors
1727					])[:,p_active] # only keep columns for the active parameters
1728				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1729				s['Na'] = Y.size
1730				CM = linalg.inv(A.T @ A)
1731				bf = (CM @ A.T @ Y).T[0,:]
1732				k = 0
1733				for n,a in zip(p_names, p_active):
1734					if a:
1735						s[n] = bf[k]
1736# 						self.msg(f'{n} = {bf[k]}')
1737						k += 1
1738					else:
1739						s[n] = 0.
1740# 						self.msg(f'{n} = 0.0')
1741
1742				for r in sdata :
1743					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1744					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1745					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1746
1747				s['CM'] = np.zeros((6,6))
1748				i = 0
1749				k_active = [j for j,a in enumerate(p_active) if a]
1750				for j,a in enumerate(p_active):
1751					if a:
1752						s['CM'][j,k_active] = CM[i,:]
1753						i += 1
1754
1755			if not weighted_sessions:
1756				w = self.rmswd()['rmswd']
1757				for r in self:
1758					r[f'wD{self._4x}'] *= w
1759					r[f'wD{self._4x}raw'] *= w
1760				for session in self.sessions:
1761					self.sessions[session]['CM'] *= w**2
1762
1763			for session in self.sessions:
1764				s = self.sessions[session]
1765				s['SE_a'] = s['CM'][0,0]**.5
1766				s['SE_b'] = s['CM'][1,1]**.5
1767				s['SE_c'] = s['CM'][2,2]**.5
1768				s['SE_a2'] = s['CM'][3,3]**.5
1769				s['SE_b2'] = s['CM'][4,4]**.5
1770				s['SE_c2'] = s['CM'][5,5]**.5
1771
1772			if not weighted_sessions:
1773				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1774			else:
1775				self.Nf = 0
1776				for sg in weighted_sessions:
1777					self.Nf += self.rmswd(sessions = sg)['Nf']
1778
1779			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1780
1781			avgD4x = {
1782				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1783				for sample in self.samples
1784				}
1785			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1786			rD4x = (chi2/self.Nf)**.5
1787			self.repeatability[f'sigma_{self._4x}'] = rD4x
1788
1789			if consolidate:
1790				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1791
1792
1793	def standardization_error(self, session, d4x, D4x, t = 0):
1794		'''
1795		Compute standardization error for a given session and
1796		(δ4x, Δ4x) composition.
1797		'''
1798		a = self.sessions[session]['a']
1799		b = self.sessions[session]['b']
1800		c = self.sessions[session]['c']
1801		a2 = self.sessions[session]['a2']
1802		b2 = self.sessions[session]['b2']
1803		c2 = self.sessions[session]['c2']
1804		CM = self.sessions[session]['CM']
1805
1806		x, y = D4x, d4x
1807		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1808# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1809		dxdy = -(b+b2*t) / (a+a2*t)
1810		dxdz = 1. / (a+a2*t)
1811		dxda = -x / (a+a2*t)
1812		dxdb = -y / (a+a2*t)
1813		dxdc = -1. / (a+a2*t)
1814		dxda2 = -x * t / (a+a2*t)
1815		dxdb2 = -y * t / (a+a2*t)
1816		dxdc2 = -t / (a+a2*t)
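		# Gradient of x = D4x with respect to (a, b, c, a2, b2, c2), used to
		# propagate the session parameter (co)variances: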
1817		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1818		sx = (V @ CM @ V.T) ** .5
1819		return sx
1820
1821
1822	@make_verbal
1823	def summary(self,
1824		dir = 'output',
1825		filename = None,
1826		save_to_file = True,
1827		print_out = True,
1828		):
1829		'''
1830		Print out and/or save to disk a summary of the standardization results.
1831
1832		**Parameters**
1833
1834		+ `dir`: the directory in which to save the table
1835		+ `filename`: the name of the csv file to write to
1836		+ `save_to_file`: whether to save the table to disk
1837		+ `print_out`: whether to print out the table
1838		'''
1839
1840		out = []
1841		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1842		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1843		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1844		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1845		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1846		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1847		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1848		out += [['Model degrees of freedom', f"{self.Nf}"]]
1849		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1850		out += [['Standardization method', self.standardization_method]]
1851
1852		if save_to_file:
1853			if not os.path.exists(dir):
1854				os.makedirs(dir)
1855			if filename is None:
1856				filename = f'D{self._4x}_summary.csv'
1857			with open(f'{dir}/{filename}', 'w') as fid:
1858				fid.write(make_csv(out))
1859		if print_out:
1860			self.msg('\n' + pretty_table(out, header = 0))
1861
1862
1863	@make_verbal
1864	def table_of_sessions(self,
1865		dir = 'output',
1866		filename = None,
1867		save_to_file = True,
1868		print_out = True,
1869		output = None,
1870		):
1871		'''
1872		Print out and/or save to disk a table of sessions.
1873
1874		**Parameters**
1875
1876		+ `dir`: the directory in which to save the table
1877		+ `filename`: the name of the csv file to write to
1878		+ `save_to_file`: whether to save the table to disk
1879		+ `print_out`: whether to print out the table
1880		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1881		    if set to `'raw'`: return a list of lists of strings
1882		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1883		'''
1884		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1885		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1886		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1887
1888		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1889		if include_a2:
1890			out[-1] += ['a2 ± SE']
1891		if include_b2:
1892			out[-1] += ['b2 ± SE']
1893		if include_c2:
1894			out[-1] += ['c2 ± SE']
1895		for session in self.sessions:
1896			out += [[
1897				session,
1898				f"{self.sessions[session]['Na']}",
1899				f"{self.sessions[session]['Nu']}",
1900				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1901				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1902				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1903				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1904				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1905				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1906				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1907				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1908				]]
1909			if include_a2:
1910				if self.sessions[session]['scrambling_drift']:
1911					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1912				else:
1913					out[-1] += ['']
1914			if include_b2:
1915				if self.sessions[session]['slope_drift']:
1916					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1917				else:
1918					out[-1] += ['']
1919			if include_c2:
1920				if self.sessions[session]['wg_drift']:
1921					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1922				else:
1923					out[-1] += ['']
1924
1925		if save_to_file:
1926			if not os.path.exists(dir):
1927				os.makedirs(dir)
1928			if filename is None:
1929				filename = f'D{self._4x}_sessions.csv'
1930			with open(f'{dir}/{filename}', 'w') as fid:
1931				fid.write(make_csv(out))
1932		if print_out:
1933			self.msg('\n' + pretty_table(out))
1934		if output == 'raw':
1935			return out
1936		elif output == 'pretty':
1937			return pretty_table(out)
1938
1939
1940	@make_verbal
1941	def table_of_analyses(
1942		self,
1943		dir = 'output',
1944		filename = None,
1945		save_to_file = True,
1946		print_out = True,
1947		output = None,
1948		):
1949		'''
1950		Print out and/or save to disk a table of analyses.
1951
1952		**Parameters**
1953
1954		+ `dir`: the directory in which to save the table
1955		+ `filename`: the name of the csv file to write to
1956		+ `save_to_file`: whether to save the table to disk
1957		+ `print_out`: whether to print out the table
1958		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1959		    if set to `'raw'`: return a list of lists of strings
1960		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1961		'''
1962
1963		out = [['UID','Session','Sample']]
1964		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1965		for f in extra_fields:
1966			out[-1] += [f[0]]
1967		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1968		for r in self:
1969			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
1970			for f in extra_fields:
1971				out[-1] += [f"{r[f[0]]:{f[1]}}"]
1972			out[-1] += [
1973				f"{r['d13Cwg_VPDB']:.3f}",
1974				f"{r['d18Owg_VSMOW']:.3f}",
1975				f"{r['d45']:.6f}",
1976				f"{r['d46']:.6f}",
1977				f"{r['d47']:.6f}",
1978				f"{r['d48']:.6f}",
1979				f"{r['d49']:.6f}",
1980				f"{r['d13C_VPDB']:.6f}",
1981				f"{r['d18O_VSMOW']:.6f}",
1982				f"{r['D47raw']:.6f}",
1983				f"{r['D48raw']:.6f}",
1984				f"{r['D49raw']:.6f}",
1985				f"{r[f'D{self._4x}']:.6f}"
1986				]
1987		if save_to_file:
1988			if not os.path.exists(dir):
1989				os.makedirs(dir)
1990			if filename is None:
1991				filename = f'D{self._4x}_analyses.csv'
1992			with open(f'{dir}/{filename}', 'w') as fid:
1993				fid.write(make_csv(out))
1994		if print_out:
1995			self.msg('\n' + pretty_table(out))
1996		return out
1997
1998	@make_verbal
1999	def covar_table(
2000		self,
2001		correl = False,
2002		dir = 'output',
2003		filename = None,
2004		save_to_file = True,
2005		print_out = True,
2006		output = None,
2007		):
2008		'''
2009		Print out, save to disk and/or return the variance-covariance matrix of D4x
2010		for all unknown samples.
2011
2012		**Parameters**
2013
2014		+ `dir`: the directory in which to save the csv
2015		+ `filename`: the name of the csv file to write to
2016		+ `save_to_file`: whether to save the csv
2017		+ `print_out`: whether to print out the matrix
2018		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
2019		    if set to `'raw'`: return a list of lists of strings
2020		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2021		'''
2022		samples = sorted([u for u in self.unknowns])
2023		out = [[''] + samples]
2024		for s1 in samples:
2025			out.append([s1])
2026			for s2 in samples:
2027				if correl:
2028					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
2029				else:
2030					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
2031
2032		if save_to_file:
2033			if not os.path.exists(dir):
2034				os.makedirs(dir)
2035			if filename is None:
2036				if correl:
2037					filename = f'D{self._4x}_correl.csv'
2038				else:
2039					filename = f'D{self._4x}_covar.csv'
2040			with open(f'{dir}/{filename}', 'w') as fid:
2041				fid.write(make_csv(out))
2042		if print_out:
2043			self.msg('\n'+pretty_table(out))
2044		if output == 'raw':
2045			return out
2046		elif output == 'pretty':
2047			return pretty_table(out)
2048
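	# Example (sketch): print the correlation matrix between unknowns without
	# writing it to disk:
	#
	#     mydata.covar_table(correl = True, save_to_file = False)
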
2049	@make_verbal
2050	def table_of_samples(
2051		self,
2052		dir = 'output',
2053		filename = None,
2054		save_to_file = True,
2055		print_out = True,
2056		output = None,
2057		):
2058		'''
2059		Print out, save to disk and/or return a table of samples.
2060
2061		**Parameters**
2062
2063		+ `dir`: the directory in which to save the csv
2064		+ `filename`: the name of the csv file to write to
2065		+ `save_to_file`: whether to save the csv
2066		+ `print_out`: whether to print out the table
2067		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2068		    if set to `'raw'`: return a list of lists of strings
2069		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2070		'''
2071
2072		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2073		for sample in self.anchors:
2074			out += [[
2075				f"{sample}",
2076				f"{self.samples[sample]['N']}",
2077				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2078				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2079				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2080				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2081				]]
2082		for sample in self.unknowns:
2083			out += [[
2084				f"{sample}",
2085				f"{self.samples[sample]['N']}",
2086				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2087				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2088				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2089				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2090				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2091				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2092				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2093				]]
2094		if save_to_file:
2095			if not os.path.exists(dir):
2096				os.makedirs(dir)
2097			if filename is None:
2098				filename = f'D{self._4x}_samples.csv'
2099			with open(f'{dir}/{filename}', 'w') as fid:
2100				fid.write(make_csv(out))
2101		if print_out:
2102			self.msg('\n'+pretty_table(out))
2103		if output == 'raw':
2104			return out
2105		elif output == 'pretty':
2106			return pretty_table(out)
2107
2108
2109	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2110		'''
2111		Generate session plots and save them to disk.
2112
2113		**Parameters**
2114
2115		+ `dir`: the directory in which to save the plots
2116		+ `figsize`: the width and height (in inches) of each plot
2117		+ `filetype`: 'pdf' or 'png'
2118		+ `dpi`: resolution for PNG output
2119		'''
2120		if not os.path.exists(dir):
2121			os.makedirs(dir)
2122
2123		for session in self.sessions:
2124			sp = self.plot_single_session(session, xylimits = 'constant')
2125			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2126			ppl.close(sp.fig)
2127			
2128
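	# Example (sketch): save session plots as PNG files at higher resolution:
	#
	#     mydata.plot_sessions(dir = 'myplots', filetype = 'png', dpi = 200)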
2129
2130	@make_verbal
2131	def consolidate_samples(self):
2132		'''
2133		Compile various statistics for each sample.
2134
2135		For each anchor sample:
2136
2137		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2138		+ `SE_D47` or `SE_D48`: set to zero by definition
2139
2140		For each unknown sample:
2141
2142		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2143		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2144
2145		For each anchor and unknown:
2146
2147		+ `N`: the total number of analyses of this sample
2148		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2149		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2150		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2151		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2152		variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2153		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
2154		'''
2155		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2156		for sample in self.samples:
2157			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2158			if self.samples[sample]['N'] > 1:
2159				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2160
2161			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2162			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2163
2164			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2165			if len(D4x_pop) > 2:
2166				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2167			
2168		if self.standardization_method == 'pooled':
2169			for sample in self.anchors:
2170				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2171				self.samples[sample][f'SE_D{self._4x}'] = 0.
2172			for sample in self.unknowns:
2173				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2174				try:
2175					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2176				except ValueError:
2177					# when `sample` is constrained by self.standardize(constraints = {...}),
2178					# it is no longer listed in self.standardization.var_names.
2179					# Temporary fix: define SE as zero for now
2180					self.samples[sample][f'SE_D{self._4x}'] = 0.
2181
2182		elif self.standardization_method == 'indep_sessions':
2183			for sample in self.anchors:
2184				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2185				self.samples[sample][f'SE_D{self._4x}'] = 0.
2186			for sample in self.unknowns:
2187				self.msg(f'Consolidating sample {sample}')
2188				self.unknowns[sample][f'session_D{self._4x}'] = {}
2189				session_avg = []
2190				for session in self.sessions:
2191					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2192					if sdata:
2193						self.msg(f'{sample} found in session {session}')
2194						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2195						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2196						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2197						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2198						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2199						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2200						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2201				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2202				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2203				wsum = sum([weights[s] for s in weights])
2204				for s in weights:
2205					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2206
2207		for r in self:
2208			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
2209
2210
2211
2212	def consolidate_sessions(self):
2213		'''
2214		Compute various statistics for each session.
2215
2216		+ `Na`: Number of anchor analyses in the session
2217		+ `Nu`: Number of unknown analyses in the session
2218		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2219		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2220		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2221		+ `a`: scrambling factor
2222		+ `b`: compositional slope
2223		+ `c`: WG offset
2224	+ `SE_a`: Model standard error of `a`
2225	+ `SE_b`: Model standard error of `b`
2226	+ `SE_c`: Model standard error of `c`
2227		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2228		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2229		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2230		+ `a2`: scrambling factor drift
2231		+ `b2`: compositional slope drift
2232		+ `c2`: WG offset drift
2233		+ `Np`: Number of standardization parameters to fit
2234		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2235		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2236		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
2237		'''
2238		for session in self.sessions:
2239			if 'd13Cwg_VPDB' not in self.sessions[session]:
2240				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2241			if 'd18Owg_VSMOW' not in self.sessions[session]:
2242				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2243			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2244			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2245
2246			self.msg(f'Computing repeatabilities for session {session}')
2247			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2248			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2249			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2250
2251		if self.standardization_method == 'pooled':
2252			for session in self.sessions:
2253
2254				# different (better?) computation of D4x repeatability for each session:
2255				sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
2256				self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5
2257
2258				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2259				i = self.standardization.var_names.index(f'a_{pf(session)}')
2260				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2261
2262				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2263				i = self.standardization.var_names.index(f'b_{pf(session)}')
2264				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2265
2266				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2267				i = self.standardization.var_names.index(f'c_{pf(session)}')
2268				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2269
2270				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2271				if self.sessions[session]['scrambling_drift']:
2272					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2273					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2274				else:
2275					self.sessions[session]['SE_a2'] = 0.
2276
2277				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2278				if self.sessions[session]['slope_drift']:
2279					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2280					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2281				else:
2282					self.sessions[session]['SE_b2'] = 0.
2283
2284				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2285				if self.sessions[session]['wg_drift']:
2286					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2287					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2288				else:
2289					self.sessions[session]['SE_c2'] = 0.
2290
2291				i = self.standardization.var_names.index(f'a_{pf(session)}')
2292				j = self.standardization.var_names.index(f'b_{pf(session)}')
2293				k = self.standardization.var_names.index(f'c_{pf(session)}')
2294				CM = np.zeros((6,6))
2295				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2296				try:
2297					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2298					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2299					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2300					try:
2301						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2302						CM[3,4] = self.standardization.covar[i2,j2]
2303						CM[4,3] = self.standardization.covar[j2,i2]
2304					except ValueError:
2305						pass
2306					try:
2307						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2308						CM[3,5] = self.standardization.covar[i2,k2]
2309						CM[5,3] = self.standardization.covar[k2,i2]
2310					except ValueError:
2311						pass
2312				except ValueError:
2313					pass
2314				try:
2315					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2316					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2317					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2318					try:
2319						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2320						CM[4,5] = self.standardization.covar[j2,k2]
2321						CM[5,4] = self.standardization.covar[k2,j2]
2322					except ValueError:
2323						pass
2324				except ValueError:
2325					pass
2326				try:
2327					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2328					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2329					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2330				except ValueError:
2331					pass
2332
2333				self.sessions[session]['CM'] = CM
2334
2335		elif self.standardization_method == 'indep_sessions':
2336			pass # Not implemented yet
2337
2338
2339	@make_verbal
2340	def repeatabilities(self):
2341		'''
2342		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2343		(for all samples, for anchors, and for unknowns).
2344		'''
2345		self.msg('Computing repeatabilities for all sessions')
2346
2347		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2348		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2349		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2350		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2351		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
2352
2353
2354	@make_verbal
2355	def consolidate(self, tables = True, plots = True):
2356		'''
2357		Collect information about samples, sessions and repeatabilities.
2358		'''
2359		self.consolidate_samples()
2360		self.consolidate_sessions()
2361		self.repeatabilities()
2362
2363		if tables:
2364			self.summary()
2365			self.table_of_sessions()
2366			self.table_of_analyses()
2367			self.table_of_samples()
2368
2369		if plots:
2370			self.plot_sessions()
2371
2372
2373	@make_verbal
2374	def rmswd(self,
2375		samples = 'all samples',
2376		sessions = 'all sessions',
2377		):
2378		'''
2379		Compute the χ2, the root mean squared weighted deviation (i.e., the square
2380		root of the reduced χ2), and the corresponding degrees of freedom of the
2381		Δ4x values for samples in `samples` and sessions in `sessions`.
2382		
2383		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
2384		'''
2385		if samples == 'all samples':
2386			mysamples = [k for k in self.samples]
2387		elif samples == 'anchors':
2388			mysamples = [k for k in self.anchors]
2389		elif samples == 'unknowns':
2390			mysamples = [k for k in self.unknowns]
2391		else:
2392			mysamples = samples
2393
2394		if sessions == 'all sessions':
2395			sessions = [k for k in self.sessions]
2396
2397		chisq, Nf = 0, 0
2398		for sample in mysamples :
2399			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2400			if len(G) > 1 :
2401				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2402				Nf += (len(G) - 1)
2403				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2404		r = (chisq / Nf)**.5 if Nf > 0 else 0
2405		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2406		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
2407
2408	
2409	@make_verbal
2410	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2411		'''
2412		Compute the repeatability of `[r[key] for r in self]`
2413		'''
2414
2415		if samples == 'all samples':
2416			mysamples = [k for k in self.samples]
2417		elif samples == 'anchors':
2418			mysamples = [k for k in self.anchors]
2419		elif samples == 'unknowns':
2420			mysamples = [k for k in self.unknowns]
2421		else:
2422			mysamples = samples
2423
2424		if sessions == 'all sessions':
2425			sessions = [k for k in self.sessions]
2426
2427		if key in ['D47', 'D48']:
2428			# Full disclosure: the definition of Nf is tricky/debatable. Here Nf = number of analyses, minus one degree of freedom per unknown sample, minus, for each session, min(Np, Na) as computed below.
2429			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2430			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2431			Nf = len(G)
2432# 			print(f'len(G) = {Nf}')
2433			Nf -= len([s for s in mysamples if s in self.unknowns])
2434# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2435			for session in sessions:
2436				Np = len([
2437					_ for _ in self.standardization.params
2438					if (
2439						self.standardization.params[_].expr is not None
2440						and (
2441							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2442							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2443							)
2444						)
2445					])
2446# 				print(f'session {session}: {Np} parameters to consider')
2447				Na = len({
2448					r['Sample'] for r in self.sessions[session]['data']
2449					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2450					})
2451# 				print(f'session {session}: {Na} different anchors in that session')
2452				Nf -= min(Np, Na)
2453# 			print(f'Nf = {Nf}')
2454
2455# 			for sample in mysamples :
2456# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2457# 				if len(X) > 1 :
2458# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2459# 					if sample in self.unknowns:
2460# 						Nf += len(X) - 1
2461# 					else:
2462# 						Nf += len(X)
2463# 			if samples in ['anchors', 'all samples']:
2464# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2465			r = (chisq / Nf)**.5 if Nf > 0 else 0
2466
2467		else: # if key not in ['D47', 'D48']
2468			chisq, Nf = 0, 0
2469			for sample in mysamples :
2470				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2471				if len(X) > 1 :
2472					Nf += len(X) - 1
2473					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2474			r = (chisq / Nf)**.5 if Nf > 0 else 0
2475
2476		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2477		return r
2478
2479	def sample_average(self, samples, weights = 'equal', normalize = True):
2480		'''
2481		Weighted average Δ4x value of a group of samples, accounting for covariance.
2482
2483		Returns the weighted average Δ4x value and associated SE
2484		of a group of samples. Weights are equal by default. If `normalize` is
2485		true, `weights` will be rescaled so that their sum equals 1.
2486
2487		**Examples**
2488
2489		```python
2490		self.sample_average(['X','Y'], [1, 2])
2491		```
2492
2493		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2494		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2495		values of samples X and Y, respectively.
2496
2497		```python
2498		self.sample_average(['X','Y'], [1, -1], normalize = False)
2499		```
2500
2501		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2502		'''
2503		if weights == 'equal':
2504			weights = [1/len(samples)] * len(samples)
2505
2506		if normalize:
2507			s = sum(weights)
2508			if s:
2509				weights = [w/s for w in weights]
2510
2511		try:
2512# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2513# 			C = self.standardization.covar[indices,:][:,indices]
2514			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2515			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2516			return correlated_sum(X, C, weights)
2517		except ValueError:
2518			return (0., 0.)
2519
2520
2521	def sample_D4x_covar(self, sample1, sample2 = None):
2522		'''
2523		Covariance between Δ4x values of samples
2524
2525		Returns the error covariance between the average Δ4x values of two
2526		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2527		returns the Δ4x variance for that sample.
2528		'''
2529		if sample2 is None:
2530			sample2 = sample1
2531		if self.standardization_method == 'pooled':
2532			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2533			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2534			return self.standardization.covar[i, j]
2535		elif self.standardization_method == 'indep_sessions':
2536			if sample1 == sample2:
2537				return self.samples[sample1][f'SE_D{self._4x}']**2
2538			else:
2539				c = 0
2540				for session in self.sessions:
2541					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2542					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2543					if sdata1 and sdata2:
2544						a = self.sessions[session]['a']
2545						# !! TODO: CM below does not account for temporal changes in standardization parameters
2546						CM = self.sessions[session]['CM'][:3,:3]
2547						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2548						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2549						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2550						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2551						c += (
2552							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2553							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2554							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2555							@ CM
2556							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2557							) / a**2
2558				return float(c)
2559
2560	def sample_D4x_correl(self, sample1, sample2 = None):
2561		'''
2562		Correlation between Δ4x errors of samples
2563
2564		Returns the error correlation between the average Δ4x values of two samples.
2565		'''
2566		if sample2 is None or sample2 == sample1:
2567			return 1.
2568		return (
2569			self.sample_D4x_covar(sample1, sample2)
2570			/ self.unknowns[sample1][f'SE_D{self._4x}']
2571			/ self.unknowns[sample2][f'SE_D{self._4x}']
2572			)
2573
2574	def plot_single_session(self,
2575		session,
2576		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2577		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2578		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2579		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2580		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2581		xylimits = 'free', # | 'constant'
2582		x_label = None,
2583		y_label = None,
2584		error_contour_interval = 'auto',
2585		fig = 'new',
2586		):
2587		'''
2588		Generate plot for a single session
2589		'''
2590		if x_label is None:
2591			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2592		if y_label is None:
2593			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2594
2595		out = _SessionPlot()
2596		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2597		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2598		anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2599		anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2600		unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2601		unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2602		anchor_avg = (np.array([ np.array([
2603				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2604				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2605				]) for sample in anchors]).T,
2606			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
2607		unknown_avg = (np.array([ np.array([
2608				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2609				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2610				]) for sample in unknowns]).T,
2611			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)
2612		
2613		
2614		if fig == 'new':
2615			out.fig = ppl.figure(figsize = (6,6))
2616			ppl.subplots_adjust(.1,.1,.9,.9)
2617
2618		out.anchor_analyses, = ppl.plot(
2619			anchors_d,
2620			anchors_D,
2621			**kw_plot_anchors)
2622		out.unknown_analyses, = ppl.plot(
2623			unknowns_d,
2624			unknowns_D,
2625			**kw_plot_unknowns)
2626		out.anchor_avg = ppl.plot(
2627			*anchor_avg,
2628			**kw_plot_anchor_avg)
2629		out.unknown_avg = ppl.plot(
2630			*unknown_avg,
2631			**kw_plot_unknown_avg)
2632		if xylimits == 'constant':
2633			x = [r[f'd{self._4x}'] for r in self]
2634			y = [r[f'D{self._4x}'] for r in self]
2635			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2636			w, h = x2-x1, y2-y1
2637			x1 -= w/20
2638			x2 += w/20
2639			y1 -= h/20
2640			y2 += h/20
2641			ppl.axis([x1, x2, y1, y2])
2642		elif xylimits == 'free':
2643			x1, x2, y1, y2 = ppl.axis()
2644		else:
2645			x1, x2, y1, y2 = ppl.axis(xylimits)
2646		contour = None
2647		if error_contour_interval != 'none':
2648			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2649			XI,YI = np.meshgrid(xi, yi)
2650			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2651			if error_contour_interval == 'auto':
2652				rng = np.max(SI) - np.min(SI)
2653				if rng <= 0.01:
2654					cinterval = 0.001
2655				elif rng <= 0.03:
2656					cinterval = 0.004
2657				elif rng <= 0.1:
2658					cinterval = 0.01
2659				elif rng <= 0.3:
2660					cinterval = 0.03
2661				elif rng <= 1.:
2662					cinterval = 0.1
2663				else:
2664					cinterval = 0.5
2665			else:
2666				cinterval = error_contour_interval
2667
2668			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2669			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2670			out.clabel = ppl.clabel(out.contour)
2671			contour = (XI, YI, SI, cval, cinterval)
2672
2673		if fig == None:
2674			return {
2675			'anchors':anchors,
2676			'unknowns':unknowns,
2677			'anchors_d':anchors_d,
2678			'anchors_D':anchors_D,
2679			'unknowns_d':unknowns_d,
2680			'unknowns_D':unknowns_D,
2681			'anchor_avg':anchor_avg,
2682			'unknown_avg':unknown_avg,
2683			'contour':contour,
2684			}
2685
2686		ppl.xlabel(x_label)
2687		ppl.ylabel(y_label)
2688		ppl.title(session, weight = 'bold')
2689		ppl.grid(alpha = .2)
2690		out.ax = ppl.gca()		
2691
2692		return out
2693
2694	def plot_residuals(
2695		self,
2696		kde = False,
2697		hist = False,
2698		binwidth = 2/3,
2699		dir = 'output',
2700		filename = None,
2701		highlight = [],
2702		colors = None,
2703		figsize = None,
2704		dpi = 100,
2705		yspan = None,
2706		):
2707		'''
2708		Plot residuals of each analysis as a function of time (actually, as a function of
2709		the order of analyses in the `D4xdata` object)
2710
2711		+ `kde`: whether to add a kernel density estimate of residuals
2712		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2713		+ `binwidth`: width of the histogram bins, in units of the Δ4x repeatability (SD)
2714		+ `dir`: the directory in which to save the plot (the file name is set by `filename`)
2715		+ `highlight`: a list of samples to highlight
2716		+ `colors`: a dict of `{<sample>: <color>}` for all samples
2717		+ `figsize`: (width, height) of figure
2718		+ `dpi`: resolution for PNG output
2719		+ `yspan`: factor controlling the range of y values shown in plot
2720		  (by default: `yspan = 1.5 if kde else 1.0`)
2721		'''
2722		
2723		from matplotlib import ticker
2724
2725		if yspan is None:
2726			if kde:
2727				yspan = 1.5
2728			else:
2729				yspan = 1.0
2730		
2731		# Layout
2732		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2733		if hist or kde:
2734			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2735			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2736		else:
2737			ppl.subplots_adjust(.08,.05,.78,.8)
2738			ax1 = ppl.subplot(111)
2739		
2740		# Colors
2741		N = len(self.anchors)
2742		if colors is None:
2743			if len(highlight) > 0:
2744				Nh = len(highlight)
2745				if Nh == 1:
2746					colors = {highlight[0]: (0,0,0)}
2747				elif Nh == 3:
2748					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2749				elif Nh == 4:
2750					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2751				else:
2752					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2753			else:
2754				if N == 3:
2755					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2756				elif N == 4:
2757					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2758				else:
2759					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2760
2761		ppl.sca(ax1)
2762		
2763		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2764
2765		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2766
2767		session = self[0]['Session']
2768		x1 = 0
2769# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2770		x_sessions = {}
2771		one_or_more_singlets = False
2772		one_or_more_multiplets = False
2773		multiplets = set()
2774		for k,r in enumerate(self):
2775			if r['Session'] != session:
2776				x2 = k-1
2777				x_sessions[session] = (x1+x2)/2
2778				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2779				session = r['Session']
2780				x1 = k
2781			singlet = len(self.samples[r['Sample']]['data']) == 1
2782			if not singlet:
2783				multiplets.add(r['Sample'])
2784			if r['Sample'] in self.unknowns:
2785				if singlet:
2786					one_or_more_singlets = True
2787				else:
2788					one_or_more_multiplets = True
2789			kw = dict(
2790				marker = 'x' if singlet else '+',
2791				ms = 4 if singlet else 5,
2792				ls = 'None',
2793				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2794				mew = 1,
2795				alpha = 0.2 if singlet else 1,
2796				)
2797			if highlight and r['Sample'] not in highlight:
2798				kw['alpha'] = 0.2
2799			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2800		x2 = k
2801		x_sessions[session] = (x1+x2)/2
2802
2803		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2804		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2805		if not (hist or kde):
2806			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2807			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2808
2809		xmin, xmax, ymin, ymax = ppl.axis()
2810		if yspan != 1:
2811			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2812		for s in x_sessions:
2813			ppl.text(
2814				x_sessions[s],
2815				ymax +1,
2816				s,
2817				va = 'bottom',
2818				**(
2819					dict(ha = 'center')
2820					if len(self.sessions[s]['data']) > (0.15 * len(self))
2821					else dict(ha = 'left', rotation = 45)
2822					)
2823				)
2824
2825		if hist or kde:
2826			ppl.sca(ax2)
2827
2828		for s in colors:
2829			kw['marker'] = '+'
2830			kw['ms'] = 5
2831			kw['mec'] = colors[s]
2832			kw['label'] = s
2833			kw['alpha'] = 1
2834			ppl.plot([], [], **kw)
2835
2836		kw['mec'] = (0,0,0)
2837
2838		if one_or_more_singlets:
2839			kw['marker'] = 'x'
2840			kw['ms'] = 4
2841			kw['alpha'] = .2
2842			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2843			ppl.plot([], [], **kw)
2844
2845		if one_or_more_multiplets:
2846			kw['marker'] = '+'
2847			kw['ms'] = 4
2848			kw['alpha'] = 1
2849			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2850			ppl.plot([], [], **kw)
2851
2852		if hist or kde:
2853			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2854		else:
2855			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2856		leg.set_zorder(-1000)
2857
2858		ppl.sca(ax1)
2859
2860		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2861		ppl.xticks([])
2862		ppl.axis([-1, len(self), None, None])
2863
2864		if hist or kde:
2865			ppl.sca(ax2)
2866			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2867
2868			if kde:
2869				from scipy.stats import gaussian_kde
2870				yi = np.linspace(ymin, ymax, 201)
2871				xi = gaussian_kde(X).evaluate(yi)
2872				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2873# 				ppl.plot(xi, yi, 'k-', lw = 1)
2874			elif hist:
2875				ppl.hist(
2876					X,
2877					orientation = 'horizontal',
2878					histtype = 'stepfilled',
2879					ec = [.4]*3,
2880					fc = [.25]*3,
2881					alpha = .25,
2882					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2883					)
2884			ppl.text(0, 0,
2885				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2886				size = 7.5,
2887				alpha = 1,
2888				va = 'center',
2889				ha = 'left',
2890				)
2891
2892			ppl.axis([0, None, ymin, ymax])
2893			ppl.xticks([])
2894			ppl.yticks([])
2895# 			ax2.spines['left'].set_visible(False)
2896			ax2.spines['right'].set_visible(False)
2897			ax2.spines['top'].set_visible(False)
2898			ax2.spines['bottom'].set_visible(False)
2899
2900		ax1.axis([None, None, ymin, ymax])
2901
2902		if not os.path.exists(dir):
2903			os.makedirs(dir)
2904		if filename is None:
2905			return fig
2906		elif filename == '':
2907			filename = f'D{self._4x}_residuals.pdf'
2908		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2909		ppl.close(fig)
2910				
2911
2912	def simulate(self, *args, **kwargs):
2913		'''
2914		Legacy function: raises a `DeprecationWarning` pointing to `virtual_data()`
2915		'''
2916		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
2917
2918	def plot_distribution_of_analyses(
2919		self,
2920		dir = 'output',
2921		filename = None,
2922		vs_time = False,
2923		figsize = (6,4),
2924		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
2925		output = None,
2926		dpi = 100,
2927		):
2928		'''
2929		Plot temporal distribution of all analyses in the data set.
2930		
2931		**Parameters**
2932
2933		+ `dir`: the directory in which to save the plot
2934		+ `filename`: the file name to save the plot under (by default: `D4x_distribution_of_analyses.pdf`)
2935		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
2936		+ `figsize`: (width, height) of figure
2937		+ `dpi`: resolution for PNG output
2938		'''
2939
2940		asamples = [s for s in self.anchors]
2941		usamples = [s for s in self.unknowns]
2942		if output is None or output == 'fig':
2943			fig = ppl.figure(figsize = figsize)
2944			ppl.subplots_adjust(*subplots_adjust)
2945		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2946		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2947		Xmax += (Xmax-Xmin)/40
2948		Xmin -= (Xmax-Xmin)/41
2949		for k, s in enumerate(asamples + usamples):
2950			if vs_time:
2951				X = [r['TimeTag'] for r in self if r['Sample'] == s]
2952			else:
2953				X = [x for x,r in enumerate(self) if r['Sample'] == s]
2954			Y = [-k for x in X]
2955			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
2956			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
2957			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
2958		ppl.axis([Xmin, Xmax, -k-1, 1])
2959		ppl.xlabel('\ntime')
2960		ppl.gca().annotate('',
2961			xy = (0.6, -0.02),
2962			xycoords = 'axes fraction',
2963			xytext = (.4, -0.02), 
2964			arrowprops = dict(arrowstyle = "->", color = 'k'),
2965			)
2966			
2967
2968		x2 = -1
2969		for session in self.sessions:
2970			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2971			if vs_time:
2972				ppl.axvline(x1, color = 'k', lw = .75)
2973			if x2 > -1:
2974				if not vs_time:
2975					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
2976			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2977# 			from xlrd import xldate_as_datetime
2978# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
2979			if vs_time:
2980				ppl.axvline(x2, color = 'k', lw = .75)
2981				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
2982			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
2983
2984		ppl.xticks([])
2985		ppl.yticks([])
2986
2987		if output is None:
2988			if not os.path.exists(dir):
2989				os.makedirs(dir)
2990			if filename == None:
2991				filename = f'D{self._4x}_distribution_of_analyses.pdf'
2992			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2993			ppl.close(fig)
2994		elif output == 'ax':
2995			return ppl.gca()
2996		elif output == 'fig':
2997			return fig
2998
2999
3000	def plot_bulk_compositions(
3001		self,
3002		samples = None,
3003		dir = 'output/bulk_compositions',
3004		figsize = (6,6),
3005		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
3006		show = False,
3007		sample_color = (0,.5,1),
3008		analysis_color = (.7,.7,.7),
3009		labeldist = 0.3,
3010		radius = 0.05,
3011		):
3012		'''
3013		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
3014		
3015		By default, creates a directory `./output/bulk_compositions` where plots for
3016		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
3017		
3018		
3019		**Parameters**
3020
3021		+ `samples`: Only these samples are processed (by default: all samples).
3022		+ `dir`: where to save the plots
3023		+ `figsize`: (width, height) of figure
3024		+ `subplots_adjust`: passed to `subplots_adjust()`
3025		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
3026		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
3027		+ `sample_color`: color used for sample markers/labels
3028		+ `analysis_color`: color used for analysis markers/labels
3029		+ `labeldist`: distance (in inches) from analysis markers to their labels
3030		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
3031		'''
3032
3033		from matplotlib.patches import Ellipse
3034
3035		if samples is None:
3036			samples = [_ for _ in self.samples]
3037
3038		saved = {}
3039
3040		for s in samples:
3041
3042			fig = ppl.figure(figsize = figsize)
3043			fig.subplots_adjust(*subplots_adjust)
3044			ax = ppl.subplot(111)
3045			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3046			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3047			ppl.title(s)
3048
3049
3050			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
3051			UID = [_['UID'] for _ in self.samples[s]['data']]
3052			XY0 = XY.mean(0)
3053
3054			for xy in XY:
3055				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
3056				
3057			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
3058			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
3059			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3060			saved[s] = [XY, XY0]
3061			
3062			x1, x2, y1, y2 = ppl.axis()
3063			x0, dx = (x1+x2)/2, (x2-x1)/2
3064			y0, dy = (y1+y2)/2, (y2-y1)/2
3065			dx, dy = [max(max(dx, dy), radius)]*2
3066
3067			ppl.axis([
3068				x0 - 1.2*dx,
3069				x0 + 1.2*dx,
3070				y0 - 1.2*dy,
3071				y0 + 1.2*dy,
3072				])			
3073
3074			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3075
3076			for xy, uid in zip(XY, UID):
3077
3078				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3079				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3080
3081				if (vector_in_display_space**2).sum() > 0:
3082
3083					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3084					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3085					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3086					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3087
3088					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3089
3090				else:
3091
3092					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3093
3094			if radius:
3095				ax.add_artist(Ellipse(
3096					xy = XY0,
3097					width = radius*2,
3098					height = radius*2,
3099					ls = (0, (2,2)),
3100					lw = .7,
3101					ec = analysis_color,
3102					fc = 'None',
3103					))
3104				ppl.text(
3105					XY0[0],
3106					XY0[1]-radius,
3107					f'\n± {radius*1e3:.0f} ppm',
3108					color = analysis_color,
3109					va = 'top',
3110					ha = 'center',
3111					linespacing = 0.4,
3112					size = 8,
3113					)
3114
3115			if not os.path.exists(dir):
3116				os.makedirs(dir)
3117			fig.savefig(f'{dir}/{s}.pdf')
3118			ppl.close(fig)
3119
3120		fig = ppl.figure(figsize = figsize)
3121		fig.subplots_adjust(*subplots_adjust)
3122		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3123		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3124
3125		for s in saved:
3126			for xy in saved[s][0]:
3127				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3128			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3129			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3130			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3131
3132		x1, x2, y1, y2 = ppl.axis()
3133		ppl.axis([
3134			x1 - (x2-x1)/10,
3135			x2 + (x2-x1)/10,
3136			y1 - (y2-y1)/10,
3137			y2 + (y2-y1)/10,
3138			])			
3139
3140
3141		if not os.path.exists(dir):
3142			os.makedirs(dir)
3143		fig.savefig(f'{dir}/__all__.pdf')
3144		if show:
3145			ppl.show()
3146		ppl.close(fig)
3147		
3148
3149	def _save_D4x_correl(
3150		self,
3151		samples = None,
3152		dir = 'output',
3153		filename = None,
3154		D4x_precision = 4,
3155		correl_precision = 4,
3156		):
3157		'''
3158		Save D4x values along with their SE and correlation matrix.
3159
3160		**Parameters**
3161
3162		+ `samples`: Only these samples are output (by default: all samples).
3163		+ `dir`: the directory in which to save the file (by default: `output`)
3164		+ `filename`: the name of the csv file to write to (by default: `D4x_correl.csv`)
3165		+ `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4)
3166		+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
3167		'''
3168		if samples is None:
3169			samples = sorted([s for s in self.unknowns])
3170		
3171		out = [['Sample']] + [[s] for s in samples]
3172		out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl']
3173		for k,s in enumerate(samples):
3174			out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.{D4x_precision}f}', f'{self.samples[s][f"SE_D{self._4x}"]:.{D4x_precision}f}']
3175			for s2 in samples:
3176				out[k+1] += [f'{self.sample_D4x_correl(s,s2):.{correl_precision}f}']
3177		
3178		if not os.path.exists(dir):
3179			os.makedirs(dir)
3180		if filename is None:
3181			filename = f'D{self._4x}_correl.csv'
3182		with open(f'{dir}/{filename}', 'w') as fid:
3183			fid.write(make_csv(out))

Store and process data for a large set of Δ47 and/or Δ48 analyses, usually comprising more than one analytical session.

D4xdata(l=[], mass='47', logfile='', session='mySession', verbose=False)
955	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
956		'''
957		**Parameters**
958
959		+ `l`: a list of dictionaries, with each dictionary including at least the keys
960		`Sample`, `d45`, `d46`, and `d47` or `d48`.
961		+ `mass`: `'47'` or `'48'`
962		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
963		+ `session`: define session name for analyses without a `Session` key
964		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
965
966		Returns a `D4xdata` object derived from `list`.
967		'''
968		self._4x = mass
969		self.verbose = verbose
970		self.prefix = 'D4xdata'
971		self.logfile = logfile
972		list.__init__(self, l)
973		self.Nf = None
974		self.repeatability = {}
975		self.refresh(session = session)

Parameters

  • l: a list of dictionaries, with each dictionary including at least the keys Sample, d45, d46, and d47 or d48.
  • mass: '47' or '48'
  • logfile: if specified, write detailed logs to this file path when calling D4xdata methods.
  • session: define session name for analyses without a Session key
  • verbose: if True, print out detailed logs when calling D4xdata methods.

Returns a D4xdata object derived from list.
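For instance, a minimal sketch of building a data object directly from a list of dictionaries (made-up delta values; in practice one typically instantiates the D47data subclass rather than D4xdata itself):

```python
import D47crunch

# two hypothetical analyses; records lacking a 'Session' key
# are assigned the session name passed below:
mydata = D47crunch.D47data([
	{'Sample': 'ETH-1', 'd45': 5.795, 'd46': 11.628, 'd47': 16.894},
	{'Sample': 'FOO-1', 'd45': 6.219, 'd46': 11.491, 'd47': 17.277},
	], session = 'Session_1')
```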

R13_VPDB = 0.01118

Absolute (13C/12C) ratio of VPDB. By default equal to 0.01118 (Chang & Li, 1990)

R18_VSMOW = 0.0020052

Absolute (18O/16O) ratio of VSMOW. By default equal to 0.0020052 (Baertschi, 1976)

LAMBDA_17 = 0.528

Mass-dependent exponent for triple oxygen isotopes. By default equal to 0.528 (Barkan & Luz, 2005)

R17_VSMOW = 0.00038475

Absolute (17O/16O) ratio of VSMOW. By default equal to 0.00038475 (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)

R18_VPDB = 0.0020672007840000003

Absolute (18O/16O) ratio of VPDB. By definition equal to R18_VSMOW * 1.03092.

R17_VPDB = 0.0003909861828790272

Absolute (17O/16O) ratio of VPDB. By definition equal to R17_VSMOW * 1.03092 ** LAMBDA_17.

LEVENE_REF_SAMPLE = 'ETH-3'

After the Δ4x standardization step, each sample is tested to assess whether the Δ4x variance within all analyses for that sample differs significantly from that observed for a given reference sample (using Levene's test, which yields a p-value corresponding to the null hypothesis that the underlying variances are equal).

LEVENE_REF_SAMPLE (by default equal to 'ETH-3') specifies which sample should be used as a reference for this test.
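A minimal sketch of overriding this reference sample (assuming ETH-1 is well represented in the data set):

```python
import D47crunch

mydata = D47crunch.D47data()
# use ETH-1 rather than ETH-3 as the reference sample for Levene's test:
mydata.LEVENE_REF_SAMPLE = 'ETH-1'
```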

ALPHA_18O_ACID_REACTION = np.float64(1.008129)

Specifies the 18O/16O fractionation factor generally applicable to acid reactions in the dataset. Currently used by D4xdata.wg() and D4xdata.standardize_d18O().

By default equal to 1.008129 (calcite reacted at 90 °C, Kim et al., 2007).

Nominal_d13C_VPDB = {'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}

Nominal δ13C_VPDB values assigned to carbonate standards, used by D4xdata.standardize_d13C().

By default equal to {'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71} after Bernasconi et al. (2018).

Nominal_d18O_VPDB = {'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}

Nominal δ18O_VPDB values assigned to carbonate standards, used by D4xdata.standardize_d18O().

By default equal to {'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78} after Bernasconi et al. (2018).

d13C_STANDARDIZATION_METHOD = '2pt'

Method by which to standardize δ13C values:

  • 'none': do not apply any δ13C standardization.
  • '1pt': within each session, offset all initial δ13C values so as to minimize the difference between final δ13C_VPDB values and Nominal_d13C_VPDB (averaged over all analyses for which Nominal_d13C_VPDB is defined).
  • '2pt': within each session, apply an affine transformation to all δ13C values so as to minimize the difference between final δ13C_VPDB values and Nominal_d13C_VPDB (averaged over all analyses for which Nominal_d13C_VPDB is defined).

d18O_STANDARDIZATION_METHOD = '2pt'

Method by which to standardize δ18O values:

  • 'none': do not apply any δ18O standardization.
  • '1pt': within each session, offset all initial δ18O values so as to minimize the difference between final δ18O_VPDB values and Nominal_d18O_VPDB (averaged over all analyses for which Nominal_d18O_VPDB is defined).
  • '2pt': within each session, apply an affine transformation to all δ18O values so as to minimize the difference between final δ18O_VPDB values and Nominal_d18O_VPDB (averaged over all analyses for which Nominal_d18O_VPDB is defined).
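Either method may also be overridden on a per-session basis once the data is loaded. A minimal sketch, assuming a session named 'Session_1' exists in the data set:

```python
# fall back to a simple offset correction for δ13C in this session,
# and skip δ18O standardization there altogether:
mydata.sessions['Session_1']['d13C_standardization_method'] = '1pt'
mydata.sessions['Session_1']['d18O_standardization_method'] = 'none'
```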
verbose
prefix
logfile
Nf
repeatability
def make_verbal(oldfun):
978	def make_verbal(oldfun):
979		'''
980		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
981		'''
982		@wraps(oldfun)
983		def newfun(*args, verbose = '', **kwargs):
984			myself = args[0]
985			oldprefix = myself.prefix
986			myself.prefix = oldfun.__name__
987			if verbose != '':
988				oldverbose = myself.verbose
989				myself.verbose = verbose
990			out = oldfun(*args, **kwargs)
991			myself.prefix = oldprefix
992			if verbose != '':
993				myself.verbose = oldverbose
994			return out
995		return newfun

Decorator: allow temporarily changing self.prefix and overriding self.verbose.
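Any method wrapped with make_verbal thus accepts a transient verbose keyword argument. A minimal sketch:

```python
# print detailed logs for this one call only,
# without changing mydata.verbose globally:
mydata.crunch(verbose = True)
```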

def msg(self, txt):
 998	def msg(self, txt):
 999		'''
1000		Log a message to `self.logfile`, and print it out if `verbose = True`
1001		'''
1002		self.log(txt)
1003		if self.verbose:
1004			print(f'{f"[{self.prefix}]":<16} {txt}')

Log a message to self.logfile, and print it out if verbose = True

def vmsg(self, txt):
1007	def vmsg(self, txt):
1008		'''
1009		Log a message to `self.logfile` and print it out
1010		'''
1011		self.log(txt)
1012		print(txt)

Log a message to self.logfile and print it out

def log(self, *txts):
1015	def log(self, *txts):
1016		'''
1017		Log a message to `self.logfile`
1018		'''
1019		if self.logfile:
1020			with open(self.logfile, 'a') as fid:
1021				for txt in txts:
1022					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')

Log a message to self.logfile

def refresh(self, session='mySession'):
1025	def refresh(self, session = 'mySession'):
1026		'''
1027		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
1028		'''
1029		self.fill_in_missing_info(session = session)
1030		self.refresh_sessions()
1031		self.refresh_samples()

Update self.sessions, self.samples, self.anchors, and self.unknowns.

def refresh_sessions(self):
1034	def refresh_sessions(self):
1035		'''
1036		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1037		to `False` for all sessions.
1038		'''
1039		self.sessions = {
1040			s: {'data': [r for r in self if r['Session'] == s]}
1041			for s in sorted({r['Session'] for r in self})
1042			}
1043		for s in self.sessions:
1044			self.sessions[s]['scrambling_drift'] = False
1045			self.sessions[s]['slope_drift'] = False
1046			self.sessions[s]['wg_drift'] = False
1047			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1048			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD

Update self.sessions and set scrambling_drift, slope_drift, and wg_drift to False for all sessions.

def refresh_samples(self):
1051	def refresh_samples(self):
1052		'''
1053		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1054		'''
1055		self.samples = {
1056			s: {'data': [r for r in self if r['Sample'] == s]}
1057			for s in sorted({r['Sample'] for r in self})
1058			}
1059		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1060		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}

Define self.samples, self.anchors, and self.unknowns.

def read(self, filename, sep='', session=''):
1063	def read(self, filename, sep = '', session = ''):
1064		'''
1065		Read file in csv format to load data into a `D47data` object.
1066
1067		In the csv file, spaces before and after field separators (`','` by default)
1068		are optional. Each line corresponds to a single analysis.
1069
1070		The required fields are:
1071
1072		+ `UID`: a unique identifier
1073		+ `Session`: an identifier for the analytical session
1074		+ `Sample`: a sample identifier
1075		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1076
1077		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1078		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Any of the
1079		working-gas deltas `d47`, `d48` and `d49` that is not provided is set to NaN by default.
1080
1081		**Parameters**
1082
1083		+ `filename`: the path of the file to read
1084		+ `sep`: csv separator delimiting the fields
1085		+ `session`: set `Session` field to this string for all analyses
1086		'''
1087		with open(filename) as fid:
1088			self.input(fid.read(), sep = sep, session = session)

Read file in csv format to load data into a D47data object.

In the csv file, spaces before and after field separators (',' by default) are optional. Each line corresponds to a single analysis.

The required fields are:

  • UID: a unique identifier
  • Session: an identifier for the analytical session
  • Sample: a sample identifier
  • d45, d46, and at least one of d47 or d48: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as D17O (in ‰ relative to VSMOW, λ = self.LAMBDA_17), and are otherwise assumed to be zero. Any of the working-gas deltas d47, d48 and d49 that is not provided is set to NaN by default.

Parameters

  • filename: the path of the file to read
  • sep: csv separator delimiting the fields
  • session: set Session field to this string for all analyses
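A minimal sketch using the optional arguments (the file name is hypothetical):

```python
# read a semicolon-delimited file and assign all of its analyses to one session:
mydata.read('otherdata.csv', sep = ';', session = 'Session_2')
```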
def input(self, txt, sep='', session=''):
1091	def input(self, txt, sep = '', session = ''):
1092		'''
1093		Read `txt` string in csv format to load analysis data into a `D47data` object.
1094
1095		In the csv string, spaces before and after field separators (`','` by default)
1096		are optional. Each line corresponds to a single analysis.
1097
1098		The required fields are:
1099
1100		+ `UID`: a unique identifier
1101		+ `Session`: an identifier for the analytical session
1102		+ `Sample`: a sample identifier
1103		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1104
1105		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1106		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Any of the
1107		working-gas deltas `d47`, `d48` and `d49` that is not provided is set to NaN by default.
1108
1109		**Parameters**
1110
1111		+ `txt`: the csv string to read
1112		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1113		whichever appears most often in `txt`.
1114		+ `session`: set `Session` field to this string for all analyses
1115		'''
1116		if sep == '':
1117			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1118		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1119		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1120
1121		if session != '':
1122			for r in data:
1123				r['Session'] = session
1124
1125		self += data
1126		self.refresh()

Read txt string in csv format to load analysis data into a D47data object.

In the csv string, spaces before and after field separators (',' by default) are optional. Each line corresponds to a single analysis.

The required fields are:

  • UID: a unique identifier
  • Session: an identifier for the analytical session
  • Sample: a sample identifier
  • d45, d46, and at least one of d47 or d48: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as D17O (in ‰ relative to VSMOW, λ = self.LAMBDA_17), and are otherwise assumed to be zero. Any of the working-gas deltas d47, d48 and d49 that is not provided is set to NaN by default.

Parameters

  • txt: the csv string to read
  • sep: csv separator delimiting the fields. By default, use ',', ';', or a tab character, whichever appears most often in txt.
  • session: set Session field to this string for all analyses
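A minimal sketch, passing the csv data as an inline string rather than a file (made-up values):

```python
mydata.input('''UID,Session,Sample,d45,d46,d47
B01,Session_2,ETH-1,5.795,11.628,16.894
B02,Session_2,FOO-1,6.219,11.491,17.277''')
```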
@make_verbal
def wg(self, samples=None, a18_acid=None):
1129	@make_verbal
1130	def wg(self, samples = None, a18_acid = None):
1131		'''
1132		Compute bulk composition of the working gas for each session based on
1133		the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1134		`self.Nominal_d18O_VPDB`.
1135		'''
1136
1137		self.msg('Computing WG composition:')
1138
1139		if a18_acid is None:
1140			a18_acid = self.ALPHA_18O_ACID_REACTION
1141		if samples is None:
1142			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1143
1144		assert a18_acid, f'Acid fractionation factor should not be zero.'
1145
1146		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1147		R45R46_standards = {}
1148		for sample in samples:
1149			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1150			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1151			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1152			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1153			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1154
1155			C12_s = 1 / (1 + R13_s)
1156			C13_s = R13_s / (1 + R13_s)
1157			C16_s = 1 / (1 + R17_s + R18_s)
1158			C17_s = R17_s / (1 + R17_s + R18_s)
1159			C18_s = R18_s / (1 + R17_s + R18_s)
1160
1161			C626_s = C12_s * C16_s ** 2
1162			C627_s = 2 * C12_s * C16_s * C17_s
1163			C628_s = 2 * C12_s * C16_s * C18_s
1164			C636_s = C13_s * C16_s ** 2
1165			C637_s = 2 * C13_s * C16_s * C17_s
1166			C727_s = C12_s * C17_s ** 2
1167
1168			R45_s = (C627_s + C636_s) / C626_s
1169			R46_s = (C628_s + C637_s + C727_s) / C626_s
1170			R45R46_standards[sample] = (R45_s, R46_s)
1171		
1172		for s in self.sessions:
1173			db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
1174			assert db, f'No sample from {samples} found in session "{s}".'
1175# 			dbsamples = sorted({r['Sample'] for r in db})
1176
1177			X = [r['d45'] for r in db]
1178			Y = [R45R46_standards[r['Sample']][0] for r in db]
1179			x1, x2 = np.min(X), np.max(X)
1180
1181			if x1 < x2:
1182				wgcoord = x1/(x1-x2)
1183			else:
1184				wgcoord = 999
1185
1186			if wgcoord < -.5 or wgcoord > 1.5:
1187				# unreasonable to extrapolate to d45 = 0
1188				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1189			else :
1190				# d45 = 0 is reasonably well bracketed
1191				R45_wg = np.polyfit(X, Y, 1)[1]
1192
1193			X = [r['d46'] for r in db]
1194			Y = [R45R46_standards[r['Sample']][1] for r in db]
1195			x1, x2 = np.min(X), np.max(X)
1196
1197			if x1 < x2:
1198				wgcoord = x1/(x1-x2)
1199			else:
1200				wgcoord = 999
1201
1202			if wgcoord < -.5 or wgcoord > 1.5:
1203				# unreasonable to extrapolate to d46 = 0
1204				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1205			else :
1206				# d46 = 0 is reasonably well bracketed
1207				R46_wg = np.polyfit(X, Y, 1)[1]
1208
1209			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1210
1211			self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1212
1213			self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1214			self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1215			for r in self.sessions[s]['data']:
1216				r['d13Cwg_VPDB'] = d13Cwg_VPDB
1217				r['d18Owg_VSMOW'] = d18Owg_VSMOW

Compute bulk composition of the working gas for each session based on the carbonate standards defined in both self.Nominal_d13C_VPDB and self.Nominal_d18O_VPDB.
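A minimal sketch, restricting the computation to a single standard and overriding the default acid fractionation factor (the numerical value below is purely illustrative):

```python
# compute WG composition from ETH-1 analyses only,
# using a hypothetical 18O/16O acid fractionation factor:
mydata.wg(samples = ['ETH-1'], a18_acid = 1.00871)
```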

def compute_bulk_delta(self, R45, R46, D17O=0):
1220	def compute_bulk_delta(self, R45, R46, D17O = 0):
1221		'''
1222		Compute δ13C_VPDB and δ18O_VSMOW,
1223		by solving the generalized form of equation (17) from
1224		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1225		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
1226		solving the corresponding second-order Taylor polynomial.
1227		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
1228		'''
1229
1230		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1231
1232		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1233		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1234		C = 2 * self.R18_VSMOW
1235		D = -R46
1236
1237		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1238		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1239		cc = A + B + C + D
1240
1241		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1242
1243		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1244		R17 = K * R18 ** self.LAMBDA_17
1245		R13 = R45 - 2 * R17
1246
1247		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1248
1249		return d13C_VPDB, d18O_VSMOW

Compute δ13C_VPDB and δ18O_VSMOW, by solving the generalized form of equation (17) from Brand et al. (2010), assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and solving the corresponding second-order Taylor polynomial. (Appendix A of Daëron et al., 2016)
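A minimal sketch, with made-up isobar ratios roughly corresponding to a stochastic gas of VPDB-like bulk composition:

```python
# returns (d13C_VPDB, d18O_VSMOW) for the given R45, R46:
d13C_VPDB, d18O_VSMOW = mydata.compute_bulk_delta(R45 = 0.01197, R46 = 0.00414, D17O = 0.)
```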

@make_verbal
def crunch(self, verbose=''):
1252	@make_verbal
1253	def crunch(self, verbose = ''):
1254		'''
1255		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1256		'''
1257		for r in self:
1258			self.compute_bulk_and_clumping_deltas(r)
1259		self.standardize_d13C()
1260		self.standardize_d18O()
1261		self.msg(f"Crunched {len(self)} analyses.")

Compute bulk composition and raw clumped isotope anomalies for all analyses.

def fill_in_missing_info(self, session='mySession'):
1264	def fill_in_missing_info(self, session = 'mySession'):
1265		'''
1266		Fill in optional fields with default values
1267		'''
1268		for i,r in enumerate(self):
1269			if 'D17O' not in r:
1270				r['D17O'] = 0.
1271			if 'UID' not in r:
1272				r['UID'] = f'{i+1}'
1273			if 'Session' not in r:
1274				r['Session'] = session
1275			for k in ['d47', 'd48', 'd49']:
1276				if k not in r:
1277					r[k] = np.nan

Fill in optional fields with default values

def standardize_d13C(self):
1280	def standardize_d13C(self):
1281		'''
1282		Perform δ13C standardization within each session `s` according to
1283		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1284		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1285		may be redefined arbitrarily at a later stage.
1286		'''
1287		for s in self.sessions:
1288			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1289				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1290				X,Y = zip(*XY)
1291				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1292					offset = np.mean(Y) - np.mean(X)
1293					for r in self.sessions[s]['data']:
1294						r['d13C_VPDB'] += offset				
1295				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1296					a,b = np.polyfit(X,Y,1)
1297					for r in self.sessions[s]['data']:
1298						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b

Perform δ13C standardization within each session s according to self.sessions[s]['d13C_standardization_method'], which is defined by default by D47data.refresh_sessions() as equal to self.d13C_STANDARDIZATION_METHOD, but may be redefined arbitrarily at a later stage.

def standardize_d18O(self):
1300	def standardize_d18O(self):
1301		'''
1302		Perform δ18O standardization within each session `s` according to
1303		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1304		which is defined by default by `D47data.refresh_sessions()` as equal to
1305		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1306		'''
1307		for s in self.sessions:
1308			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1309				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1310				X,Y = zip(*XY)
1311				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1312				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1313					offset = np.mean(Y) - np.mean(X)
1314					for r in self.sessions[s]['data']:
1315						r['d18O_VSMOW'] += offset				
1316				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1317					a,b = np.polyfit(X,Y,1)
1318					for r in self.sessions[s]['data']:
1319						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b

Perform δ18O standardization within each session s according to self.ALPHA_18O_ACID_REACTION and self.sessions[s]['d18O_standardization_method'], which is defined by default by D47data.refresh_sessions() as equal to self.d18O_STANDARDIZATION_METHOD, but may be redefined arbitrarily at a later stage.

def compute_bulk_and_clumping_deltas(self, r):
1322	def compute_bulk_and_clumping_deltas(self, r):
1323		'''
1324		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1325		'''
1326
1327		# Compute working gas R13, R18, and isobar ratios
1328		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1329		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1330		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1331
1332		# Compute analyte isobar ratios
1333		R45 = (1 + r['d45'] / 1000) * R45_wg
1334		R46 = (1 + r['d46'] / 1000) * R46_wg
1335		R47 = (1 + r['d47'] / 1000) * R47_wg
1336		R48 = (1 + r['d48'] / 1000) * R48_wg
1337		R49 = (1 + r['d49'] / 1000) * R49_wg
1338
1339		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1340		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1341		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1342
1343		# Compute stochastic isobar ratios of the analyte
1344		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1345			R13, R18, D17O = r['D17O']
1346		)
1347
1348		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1349		# and raise a warning if the corresponding anomalies exceed 0.02 ppm.
1350		if (R45 / R45stoch - 1) > 5e-8:
1351			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1352		if (R46 / R46stoch - 1) > 5e-8:
1353			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1354
1355		# Compute raw clumped isotope anomalies
1356		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1357		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1358		r['D49raw'] = 1000 * (R49 / R49stoch - 1)

Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis r.

def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1361	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1362		'''
1363		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1364		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1365		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1366		'''
1367
1368		# Compute R17
1369		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1370
1371		# Compute isotope concentrations
1372		C12 = (1 + R13) ** -1
1373		C13 = C12 * R13
1374		C16 = (1 + R17 + R18) ** -1
1375		C17 = C16 * R17
1376		C18 = C16 * R18
1377
1378		# Compute stochastic isotopologue concentrations
1379		C626 = C16 * C12 * C16
1380		C627 = C16 * C12 * C17 * 2
1381		C628 = C16 * C12 * C18 * 2
1382		C636 = C16 * C13 * C16
1383		C637 = C16 * C13 * C17 * 2
1384		C638 = C16 * C13 * C18 * 2
1385		C727 = C17 * C12 * C17
1386		C728 = C17 * C12 * C18 * 2
1387		C737 = C17 * C13 * C17
1388		C738 = C17 * C13 * C18 * 2
1389		C828 = C18 * C12 * C18
1390		C838 = C18 * C13 * C18
1391
1392		# Compute stochastic isobar ratios
1393		R45 = (C636 + C627) / C626
1394		R46 = (C628 + C637 + C727) / C626
1395		R47 = (C638 + C728 + C737) / C626
1396		R48 = (C738 + C828) / C626
1397		R49 = C838 / C626
1398
1399		# Account for stochastic anomalies
1400		R47 *= 1 + D47 / 1000
1401		R48 *= 1 + D48 / 1000
1402		R49 *= 1 + D49 / 1000
1403
1404		# Return isobar ratios
1405		return R45, R46, R47, R48, R49

Compute isobar ratios for a sample with isotopic ratios R13 and R18, optionally accounting for non-zero values of Δ17O (D17O) and clumped isotope anomalies (D47, D48, D49), all expressed in permil.
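
For example, a minimal sketch (with hypothetical bulk composition values) computing the stochastic isobar ratios of an analyte:

```py
import D47crunch

mydata = D47crunch.D47data()
# hypothetical bulk composition: δ13C_VPDB = -4 ‰, δ18O_VSMOW = +26 ‰
R13 = mydata.R13_VPDB * (1 - 4. / 1000)
R18 = mydata.R18_VSMOW * (1 + 26. / 1000)
R45, R46, R47, R48, R49 = mydata.compute_isobar_ratios(R13, R18)
```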

def split_samples(self, samples_to_split='all', grouping='by_session'):
1408	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1409		'''
1410		Split unknown samples by UID (treat all analyses as different samples)
1411		or by session (treat analyses of a given sample in different sessions as
1412		different samples).
1413
1414		**Parameters**
1415
1416		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`, or `'all'` (default) to split all unknown samples
1417		+ `grouping`: `by_uid` | `by_session`
1418		'''
1419		if samples_to_split == 'all':
1420			samples_to_split = [s for s in self.unknowns]
1421		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1422		self.grouping = grouping.lower()
1423		if self.grouping in gkeys:
1424			gkey = gkeys[self.grouping]
1425		for r in self:
1426			if r['Sample'] in samples_to_split:
1427				r['Sample_original'] = r['Sample']
1428				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1429			elif r['Sample'] in self.unknowns:
1430				r['Sample_original'] = r['Sample']
1431		self.refresh_samples()

Split unknown samples by UID (treat all analyses as different samples) or by session (treat analyses of a given sample in different sessions as different samples).

Parameters

  • samples_to_split: a list of samples to split, e.g., ['IAEA-C1', 'IAEA-C2'], or 'all' (default) to split all unknown samples
  • grouping: by_uid | by_session
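
For example, a sketch (assuming mydata holds several sessions' worth of analyses) that splits all unknowns by session, standardizes the pooled data set, then recombines the split samples:

```py
mydata.split_samples(grouping = 'by_session')
mydata.standardize()  # method = 'pooled' by default
mydata.unsplit_samples()
```
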
def unsplit_samples(self, tables=False):
1434	def unsplit_samples(self, tables = False):
1435		'''
1436		Reverse the effects of `D47data.split_samples()`.
1437		
1438		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1439		
1440		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1441		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1442		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1443		effects of `D47data.split_samples()` with `grouping='by_session'` (because in
1444		that case session-averaged Δ4x values are statistically independent).
1445		'''
1446		unknowns_old = sorted({s for s in self.unknowns})
1447		CM_old = self.standardization.covar[:,:]
1448		VD_old = self.standardization.params.valuesdict().copy()
1449		vars_old = self.standardization.var_names
1450
1451		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1452
1453		Ns = len(vars_old) - len(unknowns_old)
1454		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1455		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1456
1457		W = np.zeros((len(vars_new), len(vars_old)))
1458		W[:Ns,:Ns] = np.eye(Ns)
1459		for u in unknowns_new:
1460			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1461			if self.grouping == 'by_session':
1462				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1463			elif self.grouping == 'by_uid':
1464				weights = [1 for s in splits]
1465			sw = sum(weights)
1466			weights = [w/sw for w in weights]
1467			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1468
1469		CM_new = W @ CM_old @ W.T
1470		V = W @ np.array([[VD_old[k]] for k in vars_old])
1471		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1472
1473		self.standardization.covar = CM_new
1474		self.standardization.params.valuesdict = lambda : VD_new
1475		self.standardization.var_names = vars_new
1476
1477		for r in self:
1478			if r['Sample'] in self.unknowns:
1479				r['Sample_split'] = r['Sample']
1480				r['Sample'] = r['Sample_original']
1481
1482		self.refresh_samples()
1483		self.consolidate_samples()
1484		self.repeatabilities()
1485
1486		if tables:
1487			self.table_of_analyses()
1488			self.table_of_samples()

Reverse the effects of D47data.split_samples().

This should only be used after D4xdata.standardize() with method='pooled'.

After D4xdata.standardize() with method='indep_sessions', one should probably use D4xdata.combine_samples() instead to reverse the effects of D47data.split_samples() with grouping='by_uid', or w_avg() to reverse the effects of D47data.split_samples() with grouping='by_session' (because in that case session-averaged Δ4x values are statistically independent).

def assign_timestamps(self):
1490	def assign_timestamps(self):
1491		'''
1492		Assign a time field `t` of type `float` to each analysis.
1493
1494		If `TimeTag` is one of the data fields, `t` is equal within a given session
1495		to `TimeTag` minus the mean value of `TimeTag` for that session.
1496		Otherwise, `TimeTag` defaults to the index of each analysis within
1497		its session and `t` is defined as above.
1498		'''
1499		for session in self.sessions:
1500			sdata = self.sessions[session]['data']
1501			try:
1502				t0 = np.mean([r['TimeTag'] for r in sdata])
1503				for r in sdata:
1504					r['t'] = r['TimeTag'] - t0
1505			except KeyError:
1506				t0 = (len(sdata)-1)/2
1507				for t,r in enumerate(sdata):
1508					r['t'] = t - t0

Assign a time field t of type float to each analysis.

If TimeTag is one of the data fields, t is equal within a given session to TimeTag minus the mean value of TimeTag for that session. Otherwise, TimeTag defaults to the index of each analysis within its session and t is defined as above.
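
As a quick illustration (a sketch assuming a session named 'mySession' containing five analyses and no TimeTag field), the assigned t values are simply the centered indices:

```py
mydata.assign_timestamps()
# with 5 analyses and no TimeTag: t0 = (5-1)/2 = 2, so t = [-2, -1, 0, 1, 2]
print([r['t'] for r in mydata.sessions['mySession']['data']])
```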

def report(self):
1511	def report(self):
1512		'''
1513		Prints a report on the standardization fit.
1514		Only applicable after `D4xdata.standardize(method='pooled')`.
1515		'''
1516		report_fit(self.standardization)

Prints a report on the standardization fit. Only applicable after D4xdata.standardize(method='pooled').

def combine_samples(self, sample_groups):
1519	def combine_samples(self, sample_groups):
1520		'''
1521		Combine analyses of different samples to compute weighted average Δ4x
1522		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1523		dictionary.
1524		
1525		Caution: samples are weighted by number of replicate analyses, which is a
1526		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1527		correlated analytical errors for one or more samples).
1528		
1529		Returns a tuple of:
1530		
1531		+ the list of group names
1532		+ an array of the corresponding Δ4x values
1533		+ the corresponding (co)variance matrix
1534		
1535		**Parameters**
1536
1537		+ `sample_groups`: a dictionary of the form:
1538		```py
1539		{'group1': ['sample_1', 'sample_2'],
1540		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1541		```
1542		'''
1543		
1544		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1545		groups = sorted(sample_groups.keys())
1546		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1547		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1548		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1549		W = np.array([
1550			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1551			for j in groups])
1552		D4x_new = W @ D4x_old
1553		CM_new = W @ CM_old @ W.T
1554
1555		return groups, D4x_new[:,0], CM_new

Combine analyses of different samples to compute weighted average Δ4x and new error (co)variances corresponding to the groups defined by the sample_groups dictionary.

Caution: samples are weighted by number of replicate analyses, which is a reasonable default behavior but is not always optimal (e.g., in the case of strongly correlated analytical errors for one or more samples).

Returns a tuple of:

  • the list of group names
  • an array of the corresponding Δ4x values
  • the corresponding (co)variance matrix

Parameters

  • sample_groups: a dictionary of the form:
{'group1': ['sample_1', 'sample_2'],
 'group2': ['sample_3', 'sample_4', 'sample_5']}
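
For example, a sketch using hypothetical unknown sample names:

```py
groups, D47avg, CM = mydata.combine_samples({
    'groupA': ['MYSAMPLE-1'],
    'groupB': ['MYSAMPLE-2'],
    })
for g, D in zip(groups, D47avg):
    print(g, D)
```
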
@make_verbal
def standardize( self, method='pooled', weighted_sessions=[], consolidate=True, consolidate_tables=False, consolidate_plots=False, constraints={}):
1558	@make_verbal
1559	def standardize(self,
1560		method = 'pooled',
1561		weighted_sessions = [],
1562		consolidate = True,
1563		consolidate_tables = False,
1564		consolidate_plots = False,
1565		constraints = {},
1566		):
1567		'''
1568		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1569		If the `method` argument is set to `'pooled'`, the standardization processes all sessions
1570		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1571		i.e. that their true Δ4x value does not change between sessions
1572		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` is set to
1573		`'indep_sessions'`, the standardization processes each session independently, based only
1574		on anchor analyses.
1575		'''
1576
1577		self.standardization_method = method
1578		self.assign_timestamps()
1579
1580		if method == 'pooled':
1581			if weighted_sessions:
1582				for session_group in weighted_sessions:
1583					if self._4x == '47':
1584						X = D47data([r for r in self if r['Session'] in session_group])
1585					elif self._4x == '48':
1586						X = D48data([r for r in self if r['Session'] in session_group])
1587					X.Nominal_D4x = self.Nominal_D4x.copy()
1588					X.refresh()
1589					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1590					w = np.sqrt(result.redchi)
1591					self.msg(f'Session group {session_group} MRSWD = {w:.4f}')
1592					for r in X:
1593						r[f'wD{self._4x}raw'] *= w
1594			else:
1595				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1596				for r in self:
1597					r[f'wD{self._4x}raw'] = 1.
1598
1599			params = Parameters()
1600			for k,session in enumerate(self.sessions):
1601				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1602				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1603				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1604				s = pf(session)
1605				params.add(f'a_{s}', value = 0.9)
1606				params.add(f'b_{s}', value = 0.)
1607				params.add(f'c_{s}', value = -0.9)
1608				params.add(f'a2_{s}', value = 0.,
1609# 					vary = self.sessions[session]['scrambling_drift'],
1610					)
1611				params.add(f'b2_{s}', value = 0.,
1612# 					vary = self.sessions[session]['slope_drift'],
1613					)
1614				params.add(f'c2_{s}', value = 0.,
1615# 					vary = self.sessions[session]['wg_drift'],
1616					)
1617				if not self.sessions[session]['scrambling_drift']:
1618					params[f'a2_{s}'].expr = '0'
1619				if not self.sessions[session]['slope_drift']:
1620					params[f'b2_{s}'].expr = '0'
1621				if not self.sessions[session]['wg_drift']:
1622					params[f'c2_{s}'].expr = '0'
1623
1624			for sample in self.unknowns:
1625				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1626
1627			for k in constraints:
1628				params[k].expr = constraints[k]
1629
1630			def residuals(p):
1631				R = []
1632				for r in self:
1633					session = pf(r['Session'])
1634					sample = pf(r['Sample'])
1635					if r['Sample'] in self.Nominal_D4x:
1636						R += [ (
1637							r[f'D{self._4x}raw'] - (
1638								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1639								+ p[f'b_{session}'] * r[f'd{self._4x}']
1640								+	p[f'c_{session}']
1641								+ r['t'] * (
1642									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1643									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1644									+	p[f'c2_{session}']
1645									)
1646								)
1647							) / r[f'wD{self._4x}raw'] ]
1648					else:
1649						R += [ (
1650							r[f'D{self._4x}raw'] - (
1651								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1652								+ p[f'b_{session}'] * r[f'd{self._4x}']
1653								+	p[f'c_{session}']
1654								+ r['t'] * (
1655									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1656									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1657									+	p[f'c2_{session}']
1658									)
1659								)
1660							) / r[f'wD{self._4x}raw'] ]
1661				return R
1662
1663			M = Minimizer(residuals, params)
1664			result = M.least_squares()
1665			self.Nf = result.nfree
1666			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1667			new_names, new_covar, new_se = _fullcovar(result)[:3]
1668			result.var_names = new_names
1669			result.covar = new_covar
1670
1671			for r in self:
1672				s = pf(r["Session"])
1673				a = result.params.valuesdict()[f'a_{s}']
1674				b = result.params.valuesdict()[f'b_{s}']
1675				c = result.params.valuesdict()[f'c_{s}']
1676				a2 = result.params.valuesdict()[f'a2_{s}']
1677				b2 = result.params.valuesdict()[f'b2_{s}']
1678				c2 = result.params.valuesdict()[f'c2_{s}']
1679				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1680				
1681
1682			self.standardization = result
1683
1684			for session in self.sessions:
1685				self.sessions[session]['Np'] = 3
1686				for k in ['scrambling', 'slope', 'wg']:
1687					if self.sessions[session][f'{k}_drift']:
1688						self.sessions[session]['Np'] += 1
1689
1690			if consolidate:
1691				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1692			return result
1693
1694
1695		elif method == 'indep_sessions':
1696
1697			if weighted_sessions:
1698				for session_group in weighted_sessions:
1699					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1700					X.Nominal_D4x = self.Nominal_D4x.copy()
1701					X.refresh()
1702					# This is only done to assign r['wD47raw'] for r in X:
1703					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1704					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1705			else:
1706				self.msg('All weights set to 1 ‰')
1707				for r in self:
1708					r[f'wD{self._4x}raw'] = 1
1709
1710			for session in self.sessions:
1711				s = self.sessions[session]
1712				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1713				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1714				s['Np'] = sum(p_active)
1715				sdata = s['data']
1716
1717				A = np.array([
1718					[
1719						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1720						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1721						1 / r[f'wD{self._4x}raw'],
1722						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1723						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1724						r['t'] / r[f'wD{self._4x}raw']
1725						]
1726					for r in sdata if r['Sample'] in self.anchors
1727					])[:,p_active] # only keep columns for the active parameters
1728				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1729				s['Na'] = Y.size
1730				CM = linalg.inv(A.T @ A)
1731				bf = (CM @ A.T @ Y).T[0,:]
1732				k = 0
1733				for n,a in zip(p_names, p_active):
1734					if a:
1735						s[n] = bf[k]
1736# 						self.msg(f'{n} = {bf[k]}')
1737						k += 1
1738					else:
1739						s[n] = 0.
1740# 						self.msg(f'{n} = 0.0')
1741
1742				for r in sdata :
1743					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1744					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1745					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1746
1747				s['CM'] = np.zeros((6,6))
1748				i = 0
1749				k_active = [j for j,a in enumerate(p_active) if a]
1750				for j,a in enumerate(p_active):
1751					if a:
1752						s['CM'][j,k_active] = CM[i,:]
1753						i += 1
1754
1755			if not weighted_sessions:
1756				w = self.rmswd()['rmswd']
1757				for r in self:
1758						r[f'wD{self._4x}'] *= w
1759						r[f'wD{self._4x}raw'] *= w
1760				for session in self.sessions:
1761					self.sessions[session]['CM'] *= w**2
1762
1763			for session in self.sessions:
1764				s = self.sessions[session]
1765				s['SE_a'] = s['CM'][0,0]**.5
1766				s['SE_b'] = s['CM'][1,1]**.5
1767				s['SE_c'] = s['CM'][2,2]**.5
1768				s['SE_a2'] = s['CM'][3,3]**.5
1769				s['SE_b2'] = s['CM'][4,4]**.5
1770				s['SE_c2'] = s['CM'][5,5]**.5
1771
1772			if not weighted_sessions:
1773				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1774			else:
1775				self.Nf = 0
1776				for sg in weighted_sessions:
1777					self.Nf += self.rmswd(sessions = sg)['Nf']
1778
1779			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1780
1781			avgD4x = {
1782				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1783				for sample in self.samples
1784				}
1785			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1786			rD4x = (chi2/self.Nf)**.5
1787			self.repeatability[f'sigma_{self._4x}'] = rD4x
1788
1789			if consolidate:
1790				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)

Compute absolute Δ4x values for all replicate analyses and for sample averages. If the method argument is set to 'pooled', the standardization processes all sessions in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous, i.e. that their true Δ4x value does not change between sessions (Daëron, 2021). If method is set to 'indep_sessions', the standardization processes each session independently, based only on anchor analyses.
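
Both approaches are invoked through the same call, for example:

```py
# pooled standardization (the default), all sessions in one step:
mydata.standardize()

# session-by-session standardization, based only on anchor analyses:
mydata.standardize(method = 'indep_sessions')
```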

def standardization_error(self, session, d4x, D4x, t=0):
1793	def standardization_error(self, session, d4x, D4x, t = 0):
1794		'''
1795		Compute standardization error for a given session and
1796		(δ47, Δ47) composition.
1797		'''
1798		a = self.sessions[session]['a']
1799		b = self.sessions[session]['b']
1800		c = self.sessions[session]['c']
1801		a2 = self.sessions[session]['a2']
1802		b2 = self.sessions[session]['b2']
1803		c2 = self.sessions[session]['c2']
1804		CM = self.sessions[session]['CM']
1805
1806		x, y = D4x, d4x
1807		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1808# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1809		dxdy = -(b+b2*t) / (a+a2*t)
1810		dxdz = 1. / (a+a2*t)
1811		dxda = -x / (a+a2*t)
1812		dxdb = -y / (a+a2*t)
1813		dxdc = -1. / (a+a2*t)
1814		dxda2 = -x * t / (a+a2*t)
1815		dxdb2 = -y * t / (a+a2*t)
1816		dxdc2 = -t / (a+a2*t)
1817		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1818		sx = (V @ CM @ V.T) ** .5
1819		return sx

Compute standardization error for a given session and (δ47, Δ47) composition.

@make_verbal
def summary(self, dir='output', filename=None, save_to_file=True, print_out=True):
1822	@make_verbal
1823	def summary(self,
1824		dir = 'output',
1825		filename = None,
1826		save_to_file = True,
1827		print_out = True,
1828		):
1829		'''
1830		Print out and/or save to disk a summary of the standardization results.
1831
1832		**Parameters**
1833
1834		+ `dir`: the directory in which to save the table
1835		+ `filename`: the name of the csv file to write to
1836		+ `save_to_file`: whether to save the table to disk
1837		+ `print_out`: whether to print out the table
1838		'''
1839
1840		out = []
1841		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1842		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1843		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1844		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1845		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1846		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1847		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1848		out += [['Model degrees of freedom', f"{self.Nf}"]]
1849		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1850		out += [['Standardization method', self.standardization_method]]
1851
1852		if save_to_file:
1853			if not os.path.exists(dir):
1854				os.makedirs(dir)
1855			if filename is None:
1856				filename = f'D{self._4x}_summary.csv'
1857			with open(f'{dir}/{filename}', 'w') as fid:
1858				fid.write(make_csv(out))
1859		if print_out:
1860			self.msg('\n' + pretty_table(out, header = 0))

Print out and/or save to disk a summary of the standardization results.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table

@make_verbal
def table_of_sessions( self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
1863	@make_verbal
1864	def table_of_sessions(self,
1865		dir = 'output',
1866		filename = None,
1867		save_to_file = True,
1868		print_out = True,
1869		output = None,
1870		):
1871		'''
1872		Print out and/or save to disk a table of sessions.
1873
1874		**Parameters**
1875
1876		+ `dir`: the directory in which to save the table
1877		+ `filename`: the name of the csv file to write to
1878		+ `save_to_file`: whether to save the table to disk
1879		+ `print_out`: whether to print out the table
1880		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1881		    if set to `'raw'`: return a list of list of strings
1882		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1883		'''
1884		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1885		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1886		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1887
1888		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1889		if include_a2:
1890			out[-1] += ['a2 ± SE']
1891		if include_b2:
1892			out[-1] += ['b2 ± SE']
1893		if include_c2:
1894			out[-1] += ['c2 ± SE']
1895		for session in self.sessions:
1896			out += [[
1897				session,
1898				f"{self.sessions[session]['Na']}",
1899				f"{self.sessions[session]['Nu']}",
1900				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1901				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1902				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1903				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1904				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1905				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1906				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1907				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1908				]]
1909			if include_a2:
1910				if self.sessions[session]['scrambling_drift']:
1911					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1912				else:
1913					out[-1] += ['']
1914			if include_b2:
1915				if self.sessions[session]['slope_drift']:
1916					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1917				else:
1918					out[-1] += ['']
1919			if include_c2:
1920				if self.sessions[session]['wg_drift']:
1921					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1922				else:
1923					out[-1] += ['']
1924
1925		if save_to_file:
1926			if not os.path.exists(dir):
1927				os.makedirs(dir)
1928			if filename is None:
1929				filename = f'D{self._4x}_sessions.csv'
1930			with open(f'{dir}/{filename}', 'w') as fid:
1931				fid.write(make_csv(out))
1932		if print_out:
1933			self.msg('\n' + pretty_table(out))
1934		if output == 'raw':
1935			return out
1936		elif output == 'pretty':
1937			return pretty_table(out)

Print out and/or save to disk a table of sessions.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
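
For example, to obtain the table as text without writing anything to disk:

```py
print(mydata.table_of_sessions(save_to_file = False, print_out = False, output = 'pretty'))
```
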
@make_verbal
def table_of_analyses( self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
1940	@make_verbal
1941	def table_of_analyses(
1942		self,
1943		dir = 'output',
1944		filename = None,
1945		save_to_file = True,
1946		print_out = True,
1947		output = None,
1948		):
1949		'''
1950		Print out and/or save to disk a table of analyses.
1951
1952		**Parameters**
1953
1954		+ `dir`: the directory in which to save the table
1955		+ `filename`: the name of the csv file to write to
1956		+ `save_to_file`: whether to save the table to disk
1957		+ `print_out`: whether to print out the table
1958		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1959		    if set to `'raw'`: return a list of list of strings
1960		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1961		'''
1962
1963		out = [['UID','Session','Sample']]
1964		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1965		for f in extra_fields:
1966			out[-1] += [f[0]]
1967		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1968		for r in self:
1969			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
1970			for f in extra_fields:
1971				out[-1] += [f"{r[f[0]]:{f[1]}}"]
1972			out[-1] += [
1973				f"{r['d13Cwg_VPDB']:.3f}",
1974				f"{r['d18Owg_VSMOW']:.3f}",
1975				f"{r['d45']:.6f}",
1976				f"{r['d46']:.6f}",
1977				f"{r['d47']:.6f}",
1978				f"{r['d48']:.6f}",
1979				f"{r['d49']:.6f}",
1980				f"{r['d13C_VPDB']:.6f}",
1981				f"{r['d18O_VSMOW']:.6f}",
1982				f"{r['D47raw']:.6f}",
1983				f"{r['D48raw']:.6f}",
1984				f"{r['D49raw']:.6f}",
1985				f"{r[f'D{self._4x}']:.6f}"
1986				]
1987		if save_to_file:
1988			if not os.path.exists(dir):
1989				os.makedirs(dir)
1990			if filename is None:
1991				filename = f'D{self._4x}_analyses.csv'
1992			with open(f'{dir}/{filename}', 'w') as fid:
1993				fid.write(make_csv(out))
1994		if print_out:
1995			self.msg('\n' + pretty_table(out))
1996		return out

Print out and/or save to disk a table of analyses.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])

@make_verbal
def covar_table( self, correl=False, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
1998	@make_verbal
1999	def covar_table(
2000		self,
2001		correl = False,
2002		dir = 'output',
2003		filename = None,
2004		save_to_file = True,
2005		print_out = True,
2006		output = None,
2007		):
2008		'''
2009		Print out, save to disk and/or return the variance-covariance matrix of D4x
2010		for all unknown samples.
2011
2012		**Parameters**
2013
2014		+ `dir`: the directory in which to save the csv
2015		+ `filename`: the name of the csv file to write to
2016		+ `save_to_file`: whether to save the csv
2017		+ `print_out`: whether to print out the matrix
2018		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
2019		    if set to `'raw'`: return a list of list of strings
2020		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2021		'''
2022		samples = sorted([u for u in self.unknowns])
2023		out = [[''] + samples]
2024		for s1 in samples:
2025			out.append([s1])
2026			for s2 in samples:
2027				if correl:
2028					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
2029				else:
2030					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
2031
2032		if save_to_file:
2033			if not os.path.exists(dir):
2034				os.makedirs(dir)
2035			if filename is None:
2036				if correl:
2037					filename = f'D{self._4x}_correl.csv'
2038				else:
2039					filename = f'D{self._4x}_covar.csv'
2040			with open(f'{dir}/{filename}', 'w') as fid:
2041				fid.write(make_csv(out))
2042		if print_out:
2043			self.msg('\n'+pretty_table(out))
2044		if output == 'raw':
2045			return out
2046		elif output == 'pretty':
2047			return pretty_table(out)

Print out, save to disk and/or return the variance-covariance matrix of D4x for all unknown samples.

Parameters

  • dir: the directory in which to save the csv
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the csv
  • print_out: whether to print out the matrix
  • output: if set to 'pretty': return a pretty text matrix (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
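
For example, to retrieve the error correlation matrix as a list of lists of strings, without saving it:

```py
correl = mydata.covar_table(correl = True, save_to_file = False, print_out = False, output = 'raw')
```
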
@make_verbal
def table_of_samples( self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
2049	@make_verbal
2050	def table_of_samples(
2051		self,
2052		dir = 'output',
2053		filename = None,
2054		save_to_file = True,
2055		print_out = True,
2056		output = None,
2057		):
2058		'''
2059		Print out, save to disk and/or return a table of samples.
2060
2061		**Parameters**
2062
2063		+ `dir`: the directory in which to save the csv
2064		+ `filename`: the name of the csv file to write to
2065		+ `save_to_file`: whether to save the csv
2066		+ `print_out`: whether to print out the table
2067		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2068		    if set to `'raw'`: return a list of list of strings
2069		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2070		'''
2071
2072		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2073		for sample in self.anchors:
2074			out += [[
2075				f"{sample}",
2076				f"{self.samples[sample]['N']}",
2077				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2078				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2079				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2080				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2081				]]
2082		for sample in self.unknowns:
2083			out += [[
2084				f"{sample}",
2085				f"{self.samples[sample]['N']}",
2086				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2087				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2088				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2089				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2090				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2091				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2092				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2093				]]
2094		if save_to_file:
2095			if not os.path.exists(dir):
2096				os.makedirs(dir)
2097			if filename is None:
2098				filename = f'D{self._4x}_samples.csv'
2099			with open(f'{dir}/{filename}', 'w') as fid:
2100				fid.write(make_csv(out))
2101		if print_out:
2102			self.msg('\n'+pretty_table(out))
2103		if output == 'raw':
2104			return out
2105		elif output == 'pretty':
2106			return pretty_table(out)

Print out, save to disk and/or return a table of samples.

Parameters

  • dir: the directory in which to save the csv
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the csv
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])

def plot_sessions(self, dir='output', figsize=(8, 8), filetype='pdf', dpi=100):
2109	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2110		'''
2111		Generate session plots and save them to disk.
2112
2113		**Parameters**
2114
2115		+ `dir`: the directory in which to save the plots
2116		+ `figsize`: the width and height (in inches) of each plot
2117		+ `filetype`: 'pdf' or 'png'
2118		+ `dpi`: resolution for PNG output
2119		'''
2120		if not os.path.exists(dir):
2121			os.makedirs(dir)
2122
2123		for session in self.sessions:
2124			sp = self.plot_single_session(session, xylimits = 'constant')
2125			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2126			ppl.close(sp.fig)

Generate session plots and save them to disk.

Parameters

  • dir: the directory in which to save the plots
  • figsize: the width and height (in inches) of each plot
  • filetype: 'pdf' or 'png'
  • dpi: resolution for PNG output
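
For example, to save the session plots as PNG files at 200 dpi (the directory name is arbitrary):

```py
mydata.plot_sessions(dir = 'myplots', filetype = 'png', dpi = 200)
```
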
@make_verbal
def consolidate_samples(self):
2130	@make_verbal
2131	def consolidate_samples(self):
2132		'''
2133		Compile various statistics for each sample.
2134
2135		For each anchor sample:
2136
2137		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2138		+ `SE_D47` or `SE_D48`: set to zero by definition
2139
2140		For each unknown sample:
2141
2142		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2143		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2144
2145		For each anchor and unknown:
2146
2147		+ `N`: the total number of analyses of this sample
2148		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2149		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2150		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2151		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2152		variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2153		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
2154		'''
2155		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2156		for sample in self.samples:
2157			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2158			if self.samples[sample]['N'] > 1:
2159				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2160
2161			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2162			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2163
2164			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2165			if len(D4x_pop) > 2:
2166				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2167			
2168		if self.standardization_method == 'pooled':
2169			for sample in self.anchors:
2170				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2171				self.samples[sample][f'SE_D{self._4x}'] = 0.
2172			for sample in self.unknowns:
2173				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2174				try:
2175					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2176				except ValueError:
2177					# when `sample` is constrained by self.standardize(constraints = {...}),
2178					# it is no longer listed in self.standardization.var_names.
2179					# Temporary fix: define SE as zero for now
2180					self.samples[sample][f'SE_D{self._4x}'] = 0.
2181
2182		elif self.standardization_method == 'indep_sessions':
2183			for sample in self.anchors:
2184				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2185				self.samples[sample][f'SE_D{self._4x}'] = 0.
2186			for sample in self.unknowns:
2187				self.msg(f'Consolidating sample {sample}')
2188				self.unknowns[sample][f'session_D{self._4x}'] = {}
2189				session_avg = []
2190				for session in self.sessions:
2191					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2192					if sdata:
2193						self.msg(f'{sample} found in session {session}')
2194						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2195						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2196						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2197						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2198						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2199						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2200						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2201				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2202				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2203				wsum = sum([weights[s] for s in weights])
2204				for s in weights:
2205					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2206
2207		for r in self:
2208			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']

Compile various statistics for each sample.

For each anchor sample:

  • D47 or D48: the nominal Δ4x value for this anchor, specified by self.Nominal_D4x
  • SE_D47 or SE_D48: set to zero by definition

For each unknown sample:

  • D47 or D48: the standardized Δ4x value for this unknown
  • SE_D47 or SE_D48: the standard error of Δ4x for this unknown

For each anchor and unknown:

  • N: the total number of analyses of this sample
  • SD_D47 or SD_D48: the “sample” (in the statistical sense) standard deviation for this sample
  • d13C_VPDB: the average δ13C_VPDB value for this sample
  • d18O_VSMOW: the average δ18O_VSMOW value for this sample (as CO2)
  • p_Levene: the p-value from a Levene test of equal variance, indicating whether the Δ4x repeatability of this sample differs significantly from that observed for the reference sample specified by self.LEVENE_REF_SAMPLE.

def consolidate_sessions(self):
2212	def consolidate_sessions(self):
2213		'''
2214		Compute various statistics for each session.
2215
2216		+ `Na`: Number of anchor analyses in the session
2217		+ `Nu`: Number of unknown analyses in the session
2218		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2219		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2220		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2221		+ `a`: scrambling factor
2222		+ `b`: compositional slope
2223		+ `c`: WG offset
2224		+ `SE_a`: Model standard error of `a`
2225		+ `SE_b`: Model standard error of `b`
2226		+ `SE_c`: Model standard error of `c`
2227		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2228		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2229		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2230		+ `a2`: scrambling factor drift
2231		+ `b2`: compositional slope drift
2232		+ `c2`: WG offset drift
2233		+ `Np`: Number of standardization parameters to fit
2234		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2235		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2236		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
2237		'''
2238		for session in self.sessions:
2239			if 'd13Cwg_VPDB' not in self.sessions[session]:
2240				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2241			if 'd18Owg_VSMOW' not in self.sessions[session]:
2242				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2243			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2244			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2245
2246			self.msg(f'Computing repeatabilities for session {session}')
2247			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2248			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2249			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2250
2251		if self.standardization_method == 'pooled':
2252			for session in self.sessions:
2253
2254				# different (better?) computation of D4x repeatability for each session:
2255				sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
2256				self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5
2257
2258				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2259				i = self.standardization.var_names.index(f'a_{pf(session)}')
2260				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2261
2262				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2263				i = self.standardization.var_names.index(f'b_{pf(session)}')
2264				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2265
2266				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2267				i = self.standardization.var_names.index(f'c_{pf(session)}')
2268				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2269
2270				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2271				if self.sessions[session]['scrambling_drift']:
2272					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2273					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2274				else:
2275					self.sessions[session]['SE_a2'] = 0.
2276
2277				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2278				if self.sessions[session]['slope_drift']:
2279					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2280					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2281				else:
2282					self.sessions[session]['SE_b2'] = 0.
2283
2284				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2285				if self.sessions[session]['wg_drift']:
2286					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2287					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2288				else:
2289					self.sessions[session]['SE_c2'] = 0.
2290
2291				i = self.standardization.var_names.index(f'a_{pf(session)}')
2292				j = self.standardization.var_names.index(f'b_{pf(session)}')
2293				k = self.standardization.var_names.index(f'c_{pf(session)}')
2294				CM = np.zeros((6,6))
2295				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2296				try:
2297					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2298					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2299					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2300					try:
2301						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2302						CM[3,4] = self.standardization.covar[i2,j2]
2303						CM[4,3] = self.standardization.covar[j2,i2]
2304					except ValueError:
2305						pass
2306					try:
2307						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2308						CM[3,5] = self.standardization.covar[i2,k2]
2309						CM[5,3] = self.standardization.covar[k2,i2]
2310					except ValueError:
2311						pass
2312				except ValueError:
2313					pass
2314				try:
2315					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2316					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2317					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2318					try:
2319						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2320						CM[4,5] = self.standardization.covar[j2,k2]
2321						CM[5,4] = self.standardization.covar[k2,j2]
2322					except ValueError:
2323						pass
2324				except ValueError:
2325					pass
2326				try:
2327					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2328					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2329					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2330				except ValueError:
2331					pass
2332
2333				self.sessions[session]['CM'] = CM
2334
2335		elif self.standardization_method == 'indep_sessions':
2336			pass # Not implemented yet

Compute various statistics for each session.

  • Na: Number of anchor analyses in the session
  • Nu: Number of unknown analyses in the session
  • r_d13C_VPDB: δ13C_VPDB repeatability of analyses within the session
  • r_d18O_VSMOW: δ18O_VSMOW repeatability of analyses within the session
  • r_D47 or r_D48: Δ4x repeatability of analyses within the session
  • a: scrambling factor
  • b: compositional slope
  • c: WG offset
  • SE_a: Model standard error of a
  • SE_b: Model standard error of b
  • SE_c: Model standard error of c
  • scrambling_drift (boolean): whether to allow a temporal drift in the scrambling factor (a)
  • slope_drift (boolean): whether to allow a temporal drift in the compositional slope (b)
  • wg_drift (boolean): whether to allow a temporal drift in the WG offset (c)
  • a2: scrambling factor drift
  • b2: compositional slope drift
  • c2: WG offset drift
  • Np: Number of standardization parameters to fit
  • CM: model covariance matrix for (a, b, c, a2, b2, c2)
  • d13Cwg_VPDB: δ13C_VPDB of WG
  • d18Owg_VSMOW: δ18O_VSMOW of WG

@make_verbal
def repeatabilities(self):
2339	@make_verbal
2340	def repeatabilities(self):
2341		'''
2342		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2343		(for all samples, for anchors, and for unknowns).
2344		'''
2345		self.msg('Computing repeatabilities for all sessions')
2346
2347		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2348		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2349		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2350		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2351		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')

Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x (for all samples, for anchors, and for unknowns).

@make_verbal
def consolidate(self, tables=True, plots=True):
2354	@make_verbal
2355	def consolidate(self, tables = True, plots = True):
2356		'''
2357		Collect information about samples, sessions and repeatabilities.
2358		'''
2359		self.consolidate_samples()
2360		self.consolidate_sessions()
2361		self.repeatabilities()
2362
2363		if tables:
2364			self.summary()
2365			self.table_of_sessions()
2366			self.table_of_analyses()
2367			self.table_of_samples()
2368
2369		if plots:
2370			self.plot_sessions()

Collect information about samples, sessions and repeatabilities.

@make_verbal
def rmswd(self, samples='all samples', sessions='all sessions'):
2373	@make_verbal
2374	def rmswd(self,
2375		samples = 'all samples',
2376		sessions = 'all sessions',
2377		):
2378		'''
2379		Compute the χ2, root mean squared weighted deviation
2380		(i.e. reduced χ2), and corresponding degrees of freedom of the
2381		Δ4x values for samples in `samples` and sessions in `sessions`.
2382		
2383		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
2384		'''
2385		if samples == 'all samples':
2386			mysamples = [k for k in self.samples]
2387		elif samples == 'anchors':
2388			mysamples = [k for k in self.anchors]
2389		elif samples == 'unknowns':
2390			mysamples = [k for k in self.unknowns]
2391		else:
2392			mysamples = samples
2393
2394		if sessions == 'all sessions':
2395			sessions = [k for k in self.sessions]
2396
2397		chisq, Nf = 0, 0
2398		for sample in mysamples :
2399			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2400			if len(G) > 1 :
2401				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2402				Nf += (len(G) - 1)
2403				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2404		r = (chisq / Nf)**.5 if Nf > 0 else 0
2405		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2406		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}

Compute the χ2, root mean squared weighted deviation (i.e. reduced χ2), and corresponding degrees of freedom of the Δ4x values for samples in samples and sessions in sessions.

Only used in D4xdata.standardize() with method='indep_sessions'.

@make_verbal
def compute_r(self, key, samples='all samples', sessions='all sessions'):
2409	@make_verbal
2410	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2411		'''
2412		Compute the repeatability of `[r[key] for r in self]`
2413		'''
2414
2415		if samples == 'all samples':
2416			mysamples = [k for k in self.samples]
2417		elif samples == 'anchors':
2418			mysamples = [k for k in self.anchors]
2419		elif samples == 'unknowns':
2420			mysamples = [k for k in self.unknowns]
2421		else:
2422			mysamples = samples
2423
2424		if sessions == 'all sessions':
2425			sessions = [k for k in self.sessions]
2426
2427		if key in ['D47', 'D48']:
2428			# Full disclosure: the definition of Nf is tricky/debatable
2429			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2430			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2431			Nf = len(G)
2432# 			print(f'len(G) = {Nf}')
2433			Nf -= len([s for s in mysamples if s in self.unknowns])
2434# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2435			for session in sessions:
2436				Np = len([
2437					_ for _ in self.standardization.params
2438					if (
2439						self.standardization.params[_].expr is not None
2440						and (
2441							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2442							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2443							)
2444						)
2445					])
2446# 				print(f'session {session}: {Np} parameters to consider')
2447				Na = len({
2448					r['Sample'] for r in self.sessions[session]['data']
2449					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2450					})
2451# 				print(f'session {session}: {Na} different anchors in that session')
2452				Nf -= min(Np, Na)
2453# 			print(f'Nf = {Nf}')
2454
2455# 			for sample in mysamples :
2456# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2457# 				if len(X) > 1 :
2458# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2459# 					if sample in self.unknowns:
2460# 						Nf += len(X) - 1
2461# 					else:
2462# 						Nf += len(X)
2463# 			if samples in ['anchors', 'all samples']:
2464# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2465			r = (chisq / Nf)**.5 if Nf > 0 else 0
2466
2467		else: # if key not in ['D47', 'D48']
2468			chisq, Nf = 0, 0
2469			for sample in mysamples :
2470				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2471				if len(X) > 1 :
2472					Nf += len(X) - 1
2473					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2474			r = (chisq / Nf)**.5 if Nf > 0 else 0
2475
2476		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2477		return r

Compute the repeatability of [r[key] for r in self]
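
For example, to compute the δ13C_VPDB repeatability based on anchor analyses only:

```py
r_d13C = mydata.compute_r('d13C_VPDB', samples = 'anchors')
```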

def sample_average(self, samples, weights='equal', normalize=True):
2479	def sample_average(self, samples, weights = 'equal', normalize = True):
2480		'''
2481		Weighted average Δ4x value of a group of samples, accounting for covariance.
2482
2483		Returns the weighted average Δ4x value and associated SE
2484		of a group of samples. Weights are equal by default. If `normalize` is
2485		true, `weights` will be rescaled so that their sum equals 1.
2486
2487		**Examples**
2488
2489		```python
2490		self.sample_average(['X','Y'], [1, 2])
2491		```
2492
2493		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2494		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2495		values of samples X and Y, respectively.
2496
2497		```python
2498		self.sample_average(['X','Y'], [1, -1], normalize = False)
2499		```
2500
2501		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2502		'''
2503		if weights == 'equal':
2504			weights = [1/len(samples)] * len(samples)
2505
2506		if normalize:
2507			s = sum(weights)
2508			if s:
2509				weights = [w/s for w in weights]
2510
2511		try:
2512# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2513# 			C = self.standardization.covar[indices,:][:,indices]
2514			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2515			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2516			return correlated_sum(X, C, weights)
2517		except ValueError:
2518			return (0., 0.)

Weighted average Δ4x value of a group of samples, accounting for covariance.

Returns the weighted average Δ4x value and associated SE of a group of samples. Weights are equal by default. If normalize is true, weights will be rescaled so that their sum equals 1.

Examples

self.sample_average(['X','Y'], [1, 2])

returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3, where Δ4x(X) and Δ4x(Y) are the average Δ4x values of samples X and Y, respectively.

self.sample_average(['X','Y'], [1, -1], normalize = False)

returns the value and SE of the difference Δ4x(X) - Δ4x(Y).

def sample_D4x_covar(self, sample1, sample2=None):
2521	def sample_D4x_covar(self, sample1, sample2 = None):
2522		'''
2523		Covariance between Δ4x values of samples
2524
2525		Returns the error covariance between the average Δ4x values of two
2526		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2527		returns the Δ4x variance for that sample.
2528		'''
2529		if sample2 is None:
2530			sample2 = sample1
2531		if self.standardization_method == 'pooled':
2532			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2533			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2534			return self.standardization.covar[i, j]
2535		elif self.standardization_method == 'indep_sessions':
2536			if sample1 == sample2:
2537				return self.samples[sample1][f'SE_D{self._4x}']**2
2538			else:
2539				c = 0
2540				for session in self.sessions:
2541					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2542					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2543					if sdata1 and sdata2:
2544						a = self.sessions[session]['a']
2545						# !! TODO: CM below does not account for temporal changes in standardization parameters
2546						CM = self.sessions[session]['CM'][:3,:3]
2547						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2548						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2549						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2550						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2551						c += (
2552							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2553							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2554							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2555							@ CM
2556							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2557							) / a**2
2558				return float(c)

Covariance between Δ4x values of samples

Returns the error covariance between the average Δ4x values of two samples. If only sample1 is specified, or if sample1 == sample2, returns the Δ4x variance for that sample.
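
For example, with hypothetical unknown sample names:

```py
covar = mydata.sample_D4x_covar('MYSAMPLE-1', 'MYSAMPLE-2')  # error covariance
var = mydata.sample_D4x_covar('MYSAMPLE-1')                  # error variance
```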

def sample_D4x_correl(self, sample1, sample2=None):
    def sample_D4x_correl(self, sample1, sample2 = None):
        '''
        Correlation between Δ4x errors of samples

        Returns the error correlation between the average Δ4x values of two samples.
        '''
        if sample2 is None or sample2 == sample1:
            return 1.
        return (
            self.sample_D4x_covar(sample1, sample2)
            / self.unknowns[sample1][f'SE_D{self._4x}']
            / self.unknowns[sample2][f'SE_D{self._4x}']
            )

Correlation between Δ4x errors of samples

Returns the error correlation between the average Δ4x values of two samples.
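
A short sketch (placeholder names for two unknown samples):

mydata.sample_D4x_correl('X', 'Y')  # error correlation between the two average Δ47 values
mydata.sample_D4x_correl('X')       # returns 1. by definition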

def plot_single_session( self, session, kw_plot_anchors={'ls': 'None', 'marker': 'x', 'mec': (0.75, 0, 0), 'mew': 0.75, 'ms': 4}, kw_plot_unknowns={'ls': 'None', 'marker': 'x', 'mec': (0, 0, 0.75), 'mew': 0.75, 'ms': 4}, kw_plot_anchor_avg={'ls': '-', 'marker': 'None', 'color': (0.75, 0, 0), 'lw': 0.75}, kw_plot_unknown_avg={'ls': '-', 'marker': 'None', 'color': (0, 0, 0.75), 'lw': 0.75}, kw_contour_error={'colors': [[0, 0, 0]], 'alpha': 0.5, 'linewidths': 0.75}, xylimits='free', x_label=None, y_label=None, error_contour_interval='auto', fig='new'):
    def plot_single_session(self,
        session,
        kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
        kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
        kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
        kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
        kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
        xylimits = 'free', # | 'constant'
        x_label = None,
        y_label = None,
        error_contour_interval = 'auto',
        fig = 'new',
        ):
        '''
        Generate plot for a single session
        '''
        if x_label is None:
            x_label = f'δ$_{{{self._4x}}}$ (‰)'
        if y_label is None:
            y_label = f'Δ$_{{{self._4x}}}$ (‰)'

        out = _SessionPlot()
        anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
        unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
        anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
        anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
        unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
        unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
        anchor_avg = (np.array([ np.array([
                np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
                np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
                ]) for sample in anchors]).T,
            np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
        unknown_avg = (np.array([ np.array([
                np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
                np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
                ]) for sample in unknowns]).T,
            np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)

        if fig == 'new':
            out.fig = ppl.figure(figsize = (6,6))
            ppl.subplots_adjust(.1,.1,.9,.9)

        out.anchor_analyses, = ppl.plot(
            anchors_d,
            anchors_D,
            **kw_plot_anchors)
        out.unknown_analyses, = ppl.plot(
            unknowns_d,
            unknowns_D,
            **kw_plot_unknowns)
        out.anchor_avg = ppl.plot(
            *anchor_avg,
            **kw_plot_anchor_avg)
        out.unknown_avg = ppl.plot(
            *unknown_avg,
            **kw_plot_unknown_avg)
        if xylimits == 'constant':
            x = [r[f'd{self._4x}'] for r in self]
            y = [r[f'D{self._4x}'] for r in self]
            x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
            w, h = x2-x1, y2-y1
            x1 -= w/20
            x2 += w/20
            y1 -= h/20
            y2 += h/20
            ppl.axis([x1, x2, y1, y2])
        elif xylimits == 'free':
            x1, x2, y1, y2 = ppl.axis()
        else:
            x1, x2, y1, y2 = ppl.axis(xylimits)

        contour = None # so that the early return below is defined even without error contours
        if error_contour_interval != 'none':
            xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
            XI,YI = np.meshgrid(xi, yi)
            SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
            if error_contour_interval == 'auto':
                rng = np.max(SI) - np.min(SI)
                if rng <= 0.01:
                    cinterval = 0.001
                elif rng <= 0.03:
                    cinterval = 0.004
                elif rng <= 0.1:
                    cinterval = 0.01
                elif rng <= 0.3:
                    cinterval = 0.03
                elif rng <= 1.:
                    cinterval = 0.1
                else:
                    cinterval = 0.5
            else:
                cinterval = error_contour_interval

            cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
            out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
            out.clabel = ppl.clabel(out.contour)
            contour = (XI, YI, SI, cval, cinterval)

        if fig is None:
            return {
            'anchors':anchors,
            'unknowns':unknowns,
            'anchors_d':anchors_d,
            'anchors_D':anchors_D,
            'unknowns_d':unknowns_d,
            'unknowns_D':unknowns_D,
            'anchor_avg':anchor_avg,
            'unknown_avg':unknown_avg,
            'contour':contour,
            }

        ppl.xlabel(x_label)
        ppl.ylabel(y_label)
        ppl.title(session, weight = 'bold')
        ppl.grid(alpha = .2)
        out.ax = ppl.gca()

        return out

Generate plot for a single session
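
A minimal usage sketch (the session name is a placeholder and must match a 'Session' value present in the data; with the default fig = 'new', the returned object exposes the figure as .fig):

myplot = mydata.plot_single_session('Session01')
myplot.fig.savefig('Session01.pdf')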

def plot_residuals( self, kde=False, hist=False, binwidth=0.6666666666666666, dir='output', filename=None, highlight=[], colors=None, figsize=None, dpi=100, yspan=None):
    def plot_residuals(
        self,
        kde = False,
        hist = False,
        binwidth = 2/3,
        dir = 'output',
        filename = None,
        highlight = [],
        colors = None,
        figsize = None,
        dpi = 100,
        yspan = None,
        ):
        '''
        Plot residuals of each analysis as a function of time (actually, as a function of
        the order of analyses in the `D4xdata` object)

        + `kde`: whether to add a kernel density estimate of residuals
        + `hist`: whether to add a histogram of residuals (incompatible with `kde`)
        + `binwidth`: width of the histogram bins, in units of the Δ4x repeatability (SD)
        + `dir`: the directory in which to save the plot
        + `filename`: the name of the file to save the plot to; if `None` (default),
          return the figure without saving it; if `''`, use a default file name
        + `highlight`: a list of samples to highlight
        + `colors`: a dict of `{<sample>: <color>}` for all samples
        + `figsize`: (width, height) of figure
        + `dpi`: resolution for PNG output
        + `yspan`: factor controlling the range of y values shown in plot
          (by default: `yspan = 1.5 if kde else 1.0`)
        '''

        from matplotlib import ticker

        if yspan is None:
            if kde:
                yspan = 1.5
            else:
                yspan = 1.0

        # Layout
        fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
        if hist or kde:
            ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
            ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
        else:
            ppl.subplots_adjust(.08,.05,.78,.8)
            ax1 = ppl.subplot(111)

        # Colors
        N = len(self.anchors)
        if colors is None:
            if len(highlight) > 0:
                Nh = len(highlight)
                if Nh == 1:
                    colors = {highlight[0]: (0,0,0)}
                elif Nh == 3:
                    colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
                elif Nh == 4:
                    colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
                else:
                    colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
            else:
                if N == 3:
                    colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
                elif N == 4:
                    colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
                else:
                    colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}

        ppl.sca(ax1)

        ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)

        ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))

        session = self[0]['Session']
        x1 = 0
        x_sessions = {}
        one_or_more_singlets = False
        one_or_more_multiplets = False
        multiplets = set()
        for k,r in enumerate(self):
            if r['Session'] != session:
                x2 = k-1
                x_sessions[session] = (x1+x2)/2
                ppl.axvline(k - 0.5, color = 'k', lw = .5)
                session = r['Session']
                x1 = k
            singlet = len(self.samples[r['Sample']]['data']) == 1
            if not singlet:
                multiplets.add(r['Sample'])
            if r['Sample'] in self.unknowns:
                if singlet:
                    one_or_more_singlets = True
                else:
                    one_or_more_multiplets = True
            kw = dict(
                marker = 'x' if singlet else '+',
                ms = 4 if singlet else 5,
                ls = 'None',
                mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
                mew = 1,
                alpha = 0.2 if singlet else 1,
                )
            if highlight and r['Sample'] not in highlight:
                kw['alpha'] = 0.2
            ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
        x2 = k
        x_sessions[session] = (x1+x2)/2

        ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
        ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
        if not (hist or kde):
            ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
            ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')

        xmin, xmax, ymin, ymax = ppl.axis()
        if yspan != 1:
            ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
        for s in x_sessions:
            ppl.text(
                x_sessions[s],
                ymax +1,
                s,
                va = 'bottom',
                **(
                    dict(ha = 'center')
                    if len(self.sessions[s]['data']) > (0.15 * len(self))
                    else dict(ha = 'left', rotation = 45)
                    )
                )

        if hist or kde:
            ppl.sca(ax2)

        for s in colors:
            kw['marker'] = '+'
            kw['ms'] = 5
            kw['mec'] = colors[s]
            kw['label'] = s
            kw['alpha'] = 1
            ppl.plot([], [], **kw)

        kw['mec'] = (0,0,0)

        if one_or_more_singlets:
            kw['marker'] = 'x'
            kw['ms'] = 4
            kw['alpha'] = .2
            kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
            ppl.plot([], [], **kw)

        if one_or_more_multiplets:
            kw['marker'] = '+'
            kw['ms'] = 4
            kw['alpha'] = 1
            kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
            ppl.plot([], [], **kw)

        if hist or kde:
            leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
        else:
            leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
        leg.set_zorder(-1000)

        ppl.sca(ax1)

        ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
        ppl.xticks([])
        ppl.axis([-1, len(self), None, None])

        if hist or kde:
            ppl.sca(ax2)
            X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])

            if kde:
                from scipy.stats import gaussian_kde
                yi = np.linspace(ymin, ymax, 201)
                xi = gaussian_kde(X).evaluate(yi)
                ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
            elif hist:
                ppl.hist(
                    X,
                    orientation = 'horizontal',
                    histtype = 'stepfilled',
                    ec = [.4]*3,
                    fc = [.25]*3,
                    alpha = .25,
                    bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
                    )
            ppl.text(0, 0,
                f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
                size = 7.5,
                alpha = 1,
                va = 'center',
                ha = 'left',
                )

            ppl.axis([0, None, ymin, ymax])
            ppl.xticks([])
            ppl.yticks([])
            ax2.spines['right'].set_visible(False)
            ax2.spines['top'].set_visible(False)
            ax2.spines['bottom'].set_visible(False)

        ax1.axis([None, None, ymin, ymax])

        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            return fig
        elif filename == '':
            filename = f'D{self._4x}_residuals.pdf'
        ppl.savefig(f'{dir}/{filename}', dpi = dpi)
        ppl.close(fig)

Plot residuals of each analysis as a function of time (actually, as a function of the order of analyses in the D4xdata object)

  • kde: whether to add a kernel density estimate of residuals
  • hist: whether to add a histogram of residuals (incompatible with kde)
  • binwidth: width of the histogram bins, in units of the Δ4x repeatability (SD)
  • dir: the directory in which to save the plot
  • highlight: a list of samples to highlight
  • colors: a dict of {<sample>: <color>} for all samples
  • figsize: (width, height) of figure
  • dpi: resolution for PNG output
  • yspan: factor controlling the range of y values shown in plot (by default: yspan = 1.5 if kde else 1.0)
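
For example (a sketch, for a D47data object such as the tutorial's mydata; filename = '' saves under the default name, whereas the default filename = None returns the figure without saving it):

mydata.plot_residuals(kde = True, filename = '')  # saves to output/D47_residuals.pdf
fig = mydata.plot_residuals(hist = True)          # returns the matplotlib figure
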
def simulate(self, *args, **kwargs):
    def simulate(self, *args, **kwargs):
        '''
        Legacy function with warning message pointing to `virtual_data()`
        '''
        raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')

Legacy function with warning message pointing to virtual_data()

def plot_distribution_of_analyses( self, dir='output', filename=None, vs_time=False, figsize=(6, 4), subplots_adjust=(0.02, 0.13, 0.85, 0.8), output=None, dpi=100):
    def plot_distribution_of_analyses(
        self,
        dir = 'output',
        filename = None,
        vs_time = False,
        figsize = (6,4),
        subplots_adjust = (0.02, 0.13, 0.85, 0.8),
        output = None,
        dpi = 100,
        ):
        '''
        Plot temporal distribution of all analyses in the data set.

        **Parameters**

        + `dir`: the directory in which to save the plot
        + `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
        + `figsize`: (width, height) of figure
        + `output`: if `'ax'`, return the axes instead of saving; if `'fig'`, return the
          figure; by default (`None`), save the plot to `dir` and close it.
        + `dpi`: resolution for PNG output
        '''

        asamples = [s for s in self.anchors]
        usamples = [s for s in self.unknowns]
        if output is None or output == 'fig':
            fig = ppl.figure(figsize = figsize)
            ppl.subplots_adjust(*subplots_adjust)
        Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
        Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
        Xmax += (Xmax-Xmin)/40
        Xmin -= (Xmax-Xmin)/41
        for k, s in enumerate(asamples + usamples):
            if vs_time:
                X = [r['TimeTag'] for r in self if r['Sample'] == s]
            else:
                X = [x for x,r in enumerate(self) if r['Sample'] == s]
            Y = [-k for x in X]
            ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
            ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
            ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
        ppl.axis([Xmin, Xmax, -k-1, 1])
        ppl.xlabel('\ntime')
        ppl.gca().annotate('',
            xy = (0.6, -0.02),
            xycoords = 'axes fraction',
            xytext = (.4, -0.02),
            arrowprops = dict(arrowstyle = "->", color = 'k'),
            )

        x2 = -1
        for session in self.sessions:
            x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
            if vs_time:
                ppl.axvline(x1, color = 'k', lw = .75)
            if x2 > -1:
                if not vs_time:
                    ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
            x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
            if vs_time:
                ppl.axvline(x2, color = 'k', lw = .75)
                ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
            ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)

        ppl.xticks([])
        ppl.yticks([])

        if output is None:
            if not os.path.exists(dir):
                os.makedirs(dir)
            if filename is None:
                filename = f'D{self._4x}_distribution_of_analyses.pdf'
            ppl.savefig(f'{dir}/{filename}', dpi = dpi)
            ppl.close(fig)
        elif output == 'ax':
            return ppl.gca()
        elif output == 'fig':
            return fig

Plot temporal distribution of all analyses in the data set.

Parameters

  • dir: the directory in which to save the plot
  • vs_time: if True, plot as a function of TimeTag rather than sequentially.
  • figsize: (width, height) of figure
  • dpi: resolution for PNG output
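
A usage sketch (vs_time = True assumes each analysis carries a TimeTag field):

mydata.plot_distribution_of_analyses(filename = 'analyses.pdf')
ax = mydata.plot_distribution_of_analyses(output = 'ax')  # return the axes instead of saving
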
def plot_bulk_compositions( self, samples=None, dir='output/bulk_compositions', figsize=(6, 6), subplots_adjust=(0.15, 0.12, 0.95, 0.92), show=False, sample_color=(0, 0.5, 1), analysis_color=(0.7, 0.7, 0.7), labeldist=0.3, radius=0.05):
    def plot_bulk_compositions(
        self,
        samples = None,
        dir = 'output/bulk_compositions',
        figsize = (6,6),
        subplots_adjust = (0.15, 0.12, 0.95, 0.92),
        show = False,
        sample_color = (0,.5,1),
        analysis_color = (.7,.7,.7),
        labeldist = 0.3,
        radius = 0.05,
        ):
        '''
        Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

        By default, creates a directory `./output/bulk_compositions` where plots for
        each sample are saved. Another plot named `__all__.pdf` shows all analyses together.

        **Parameters**

        + `samples`: Only these samples are processed (by default: all samples).
        + `dir`: where to save the plots
        + `figsize`: (width, height) of figure
        + `subplots_adjust`: passed to `subplots_adjust()`
        + `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
        allowing for interactive visualization/exploration in (δ13C, δ18O) space.
        + `sample_color`: color used for sample markers/labels
        + `analysis_color`: color used for analysis markers/labels
        + `labeldist`: distance (in inches) from analysis markers to analysis labels
        + `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
        '''

        from matplotlib.patches import Ellipse

        if samples is None:
            samples = [_ for _ in self.samples]

        saved = {}

        for s in samples:

            fig = ppl.figure(figsize = figsize)
            fig.subplots_adjust(*subplots_adjust)
            ax = ppl.subplot(111)
            ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
            ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
            ppl.title(s)

            XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
            UID = [_['UID'] for _ in self.samples[s]['data']]
            XY0 = XY.mean(0)

            for xy in XY:
                ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)

            ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
            ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
            ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
            saved[s] = [XY, XY0]

            x1, x2, y1, y2 = ppl.axis()
            x0, dx = (x1+x2)/2, (x2-x1)/2
            y0, dy = (y1+y2)/2, (y2-y1)/2
            dx, dy = [max(max(dx, dy), radius)]*2

            ppl.axis([
                x0 - 1.2*dx,
                x0 + 1.2*dx,
                y0 - 1.2*dy,
                y0 + 1.2*dy,
                ])

            XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))

            for xy, uid in zip(XY, UID):

                xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
                vector_in_display_space = xy_in_display_space - XY0_in_display_space

                if (vector_in_display_space**2).sum() > 0:

                    unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
                    label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
                    label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
                    label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))

                    ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)

                else:

                    ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)

            if radius:
                ax.add_artist(Ellipse(
                    xy = XY0,
                    width = radius*2,
                    height = radius*2,
                    ls = (0, (2,2)),
                    lw = .7,
                    ec = analysis_color,
                    fc = 'None',
                    ))
                ppl.text(
                    XY0[0],
                    XY0[1]-radius,
                    f'\n± {radius*1e3:.0f} ppm',
                    color = analysis_color,
                    va = 'top',
                    ha = 'center',
                    linespacing = 0.4,
                    size = 8,
                    )

            if not os.path.exists(dir):
                os.makedirs(dir)
            fig.savefig(f'{dir}/{s}.pdf')
            ppl.close(fig)

        fig = ppl.figure(figsize = figsize)
        fig.subplots_adjust(*subplots_adjust)
        ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
        ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')

        for s in saved:
            for xy in saved[s][0]:
                ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
            ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
            ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
            ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')

        x1, x2, y1, y2 = ppl.axis()
        ppl.axis([
            x1 - (x2-x1)/10,
            x2 + (x2-x1)/10,
            y1 - (y2-y1)/10,
            y2 + (y2-y1)/10,
            ])

        if not os.path.exists(dir):
            os.makedirs(dir)
        fig.savefig(f'{dir}/__all__.pdf')
        if show:
            ppl.show()
        ppl.close(fig)

Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

By default, creates a directory ./output/bulk_compositions where plots for each sample are saved. Another plot named __all__.pdf shows all analyses together.

Parameters

  • samples: Only these samples are processed (by default: all samples).
  • dir: where to save the plots
  • figsize: (width, height) of figure
  • subplots_adjust: passed to subplots_adjust()
  • show: whether to call matplotlib.pyplot.show() on the plot with all samples, allowing for interactive visualization/exploration in (δ13C, δ18O) space.
  • sample_color: color used for sample markers/labels
  • analysis_color: color used for analysis markers/labels
  • labeldist: distance (in inches) from analysis markers to analysis labels
  • radius: radius of the dashed circle providing scale. No circle if radius = 0.
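
For example (a sketch; the sample names are placeholders for samples present in the data set):

mydata.plot_bulk_compositions(samples = ['MYSAMPLE-1', 'MYSAMPLE-2'], radius = 0.1, show = True)
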
Inherited Members

  • builtins.list: clear, copy, append, insert, extend, pop, remove, index, count, reverse, sort
class D47data(D4xdata):
class D47data(D4xdata):
    '''
    Store and process data for a large set of Δ47 analyses,
    usually comprising more than one analytical session.
    '''

    Nominal_D4x = {
        'ETH-1':   0.2052,
        'ETH-2':   0.2085,
        'ETH-3':   0.6132,
        'ETH-4':   0.4511,
        'IAEA-C1': 0.3018,
        'IAEA-C2': 0.6409,
        'MERCK':   0.5135,
        } # I-CDES (Bernasconi et al., 2021)
    '''
    Nominal Δ47 values assigned to the Δ47 anchor samples, used by
    `D47data.standardize()` to normalize unknown samples to an absolute Δ47
    reference frame.

    By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
    ```py
    {
        'ETH-1'   : 0.2052,
        'ETH-2'   : 0.2085,
        'ETH-3'   : 0.6132,
        'ETH-4'   : 0.4511,
        'IAEA-C1' : 0.3018,
        'IAEA-C2' : 0.6409,
        'MERCK'   : 0.5135,
    }
    ```
    '''


    @property
    def Nominal_D47(self):
        return self.Nominal_D4x


    @Nominal_D47.setter
    def Nominal_D47(self, new):
        self.Nominal_D4x = dict(**new)
        self.refresh()


    def __init__(self, l = [], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l = l, mass = '47', **kwargs)


    def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
        '''
        Find all samples for which `Teq` is specified, compute the equilibrium Δ47
        value for that temperature, and treat these samples as additional anchors.

        **Parameters**

        + `fCo2eqD47`: Which CO2 equilibrium law to use
        (`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
        `wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
        + `priority`: if `replace`: forget old anchors and only use the new ones;
        if `new`: keep pre-existing anchors but update them in case of conflict
        between old and new Δ47 values;
        if `old`: keep pre-existing anchors but preserve their original Δ47
        values in case of conflict.
        '''
        f = {
            'petersen': fCO2eqD47_Petersen,
            'wang': fCO2eqD47_Wang,
            }[fCo2eqD47]
        foo = {}
        for r in self:
            if 'Teq' in r:
                if r['Sample'] in foo:
                    assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
                else:
                    foo[r['Sample']] = f(r['Teq'])
            else:
                assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'

        if priority == 'replace':
            self.Nominal_D47 = {}
        for s in foo:
            if priority != 'old' or s not in self.Nominal_D47:
                self.Nominal_D47[s] = foo[s]

    def save_D47_correl(self, *args, **kwargs):
        return self._save_D4x_correl(*args, **kwargs)

    save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47')

Store and process data for a large set of Δ47 analyses, usually comprising more than one analytical session.

D47data(l=[], **kwargs)
    def __init__(self, l = [], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l = l, mass = '47', **kwargs)

Parameters: same as D4xdata.__init__()

Nominal_D4x = {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6132, 'ETH-4': 0.4511, 'IAEA-C1': 0.3018, 'IAEA-C2': 0.6409, 'MERCK': 0.5135}

Nominal Δ47 values assigned to the Δ47 anchor samples, used by D47data.standardize() to normalize unknown samples to an absolute Δ47 reference frame.

By default equal to (after Bernasconi et al. (2021)):

{
        'ETH-1'   : 0.2052,
        'ETH-2'   : 0.2085,
        'ETH-3'   : 0.6132,
        'ETH-4'   : 0.4511,
        'IAEA-C1' : 0.3018,
        'IAEA-C2' : 0.6409,
        'MERCK'   : 0.5135,
}
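
Because Nominal_D47 has a setter (which triggers refresh()), the anchor set may be redefined before calling standardize(); e.g., a sketch restricting standardization to the three ETH anchors:

mydata.Nominal_D47 = {
    'ETH-1': 0.2052,
    'ETH-2': 0.2085,
    'ETH-3': 0.6132,
}
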
Nominal_D47
    @property
    def Nominal_D47(self):
        return self.Nominal_D4x
def D47fromTeq(self, fCo2eqD47='petersen', priority='new'):
    def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
        '''
        Find all samples for which `Teq` is specified, compute the equilibrium Δ47
        value for that temperature, and treat these samples as additional anchors.

        **Parameters**

        + `fCo2eqD47`: Which CO2 equilibrium law to use
        (`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
        `wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
        + `priority`: if `replace`: forget old anchors and only use the new ones;
        if `new`: keep pre-existing anchors but update them in case of conflict
        between old and new Δ47 values;
        if `old`: keep pre-existing anchors but preserve their original Δ47
        values in case of conflict.
        '''
        f = {
            'petersen': fCO2eqD47_Petersen,
            'wang': fCO2eqD47_Wang,
            }[fCo2eqD47]
        foo = {}
        for r in self:
            if 'Teq' in r:
                if r['Sample'] in foo:
                    assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
                else:
                    foo[r['Sample']] = f(r['Teq'])
            else:
                assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'

        if priority == 'replace':
            self.Nominal_D47 = {}
        for s in foo:
            if priority != 'old' or s not in self.Nominal_D47:
                self.Nominal_D47[s] = foo[s]

Find all samples for which Teq is specified, compute the equilibrium Δ47 value for that temperature, and treat these samples as additional anchors.

Parameters

  • fCo2eqD47: Which CO2 equilibrium law to use (petersen: Petersen et al. (2019); wang: Wang et al. (2004)).
  • priority: if replace: forget old anchors and only use the new ones; if new: keep pre-existing anchors but update them in case of conflict between old and new Δ47 values; if old: keep pre-existing anchors but preserve their original Δ47 values in case of conflict.
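
A usage sketch (this assumes some analyses in the data set carry a Teq field, e.g. CO2 equilibrated at a known temperature, before standardization):

mydata.D47fromTeq(fCo2eqD47 = 'petersen', priority = 'new')
mydata.standardize()
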
def save_D47_correl(self, *args, **kwargs):
    def save_D47_correl(self, *args, **kwargs):
        return self._save_D4x_correl(*args, **kwargs)

Save D47 values along with their SE and correlation matrix.

Parameters

  • samples: Only these samples are output (by default: all samples).
  • dir: the directory in which to save the file (by default: output)
  • filename: the name of the csv file to write to (by default: D47_correl.csv)
  • D47_precision: the precision to use when writing D47 and D47_SE values (by default: 4)
  • correl_precision: the precision to use when writing correlation factor values (by default: 4)
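
For example (a sketch using the documented defaults):

mydata.save_D47_correl(dir = 'output', filename = 'D47_correl.csv')
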
class D48data(D4xdata):
class D48data(D4xdata):
    '''
    Store and process data for a large set of Δ48 analyses,
    usually comprising more than one analytical session.
    '''

    Nominal_D4x = {
        'ETH-1':  0.138,
        'ETH-2':  0.138,
        'ETH-3':  0.270,
        'ETH-4':  0.223,
        'GU-1':  -0.419,
        } # (Fiebig et al., 2019, 2021)
    '''
    Nominal Δ48 values assigned to the Δ48 anchor samples, used by
    `D48data.standardize()` to normalize unknown samples to an absolute Δ48
    reference frame.

    By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
    [Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)):

    ```py
    {
        'ETH-1' :  0.138,
        'ETH-2' :  0.138,
        'ETH-3' :  0.270,
        'ETH-4' :  0.223,
        'GU-1'  : -0.419,
    }
    ```
    '''


    @property
    def Nominal_D48(self):
        return self.Nominal_D4x


    @Nominal_D48.setter
    def Nominal_D48(self, new):
        self.Nominal_D4x = dict(**new)
        self.refresh()


    def __init__(self, l = [], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l = l, mass = '48', **kwargs)

    def save_D48_correl(self, *args, **kwargs):
        return self._save_D4x_correl(*args, **kwargs)

    save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48')

Store and process data for a large set of Δ48 analyses, usually comprising more than one analytical session.

D48data(l=[], **kwargs)
    def __init__(self, l = [], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l = l, mass = '48', **kwargs)

Parameters: same as D4xdata.__init__()

Nominal_D4x = {'ETH-1': 0.138, 'ETH-2': 0.138, 'ETH-3': 0.27, 'ETH-4': 0.223, 'GU-1': -0.419}

Nominal Δ48 values assigned to the Δ48 anchor samples, used by D48data.standardize() to normalize unknown samples to an absolute Δ48 reference frame.

By default equal to (after Fiebig et al. (2019), Fiebig et al. (2021)):

{
        'ETH-1' :  0.138,
        'ETH-2' :  0.138,
        'ETH-3' :  0.270,
        'ETH-4' :  0.223,
        'GU-1'  : -0.419,
}
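
As with Nominal_D47, the Δ48 anchor set can be customized through the Nominal_D48 setter; e.g., a sketch dropping GU-1 from the anchors:

mydata48 = D47crunch.D48data()
mydata48.Nominal_D48 = {k: v for k, v in mydata48.Nominal_D48.items() if k != 'GU-1'}
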
Nominal_D48
    @property
    def Nominal_D48(self):
        return self.Nominal_D4x
def save_D48_correl(self, *args, **kwargs):
    def save_D48_correl(self, *args, **kwargs):
        return self._save_D4x_correl(*args, **kwargs)

Save D48 values along with their SE and correlation matrix.

Parameters

  • samples: Only these samples are output (by default: all samples).
  • dir: the directory in which to save the file (by default: output)
  • filename: the name of the csv file to write to (by default: D48_correl.csv)
  • D48_precision: the precision to use when writing D48 and D48_SE values (by default: 4)
  • correl_precision: the precision to use when writing correlation factor values (by default: 4)
class D49data(D4xdata):
class D49data(D4xdata):
    '''
    Store and process data for a large set of Δ49 analyses,
    usually comprising more than one analytical session.
    '''

    Nominal_D4x = {"1000C": 0.0, "25C": 2.228}  # Wang et al. (2004)
    '''
    Nominal Δ49 values assigned to the Δ49 anchor samples, used by
    `D49data.standardize()` to normalize unknown samples to an absolute Δ49
    reference frame.

    By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)):

    ```py
    {
        "1000C": 0.0,
        "25C": 2.228
    }
    ```
    '''

    @property
    def Nominal_D49(self):
        return self.Nominal_D4x

    @Nominal_D49.setter
    def Nominal_D49(self, new):
        self.Nominal_D4x = dict(**new)
        self.refresh()

    def __init__(self, l=[], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l=l, mass='49', **kwargs)

    def save_D49_correl(self, *args, **kwargs):
        return self._save_D4x_correl(*args, **kwargs)

    save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49')

Store and process data for a large set of Δ49 analyses, usually comprising more than one analytical session.

D49data(l=[], **kwargs)
    def __init__(self, l=[], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l=l, mass='49', **kwargs)

Parameters: same as D4xdata.__init__()

Nominal_D4x = {'1000C': 0.0, '25C': 2.228}

Nominal Δ49 values assigned to the Δ49 anchor samples, used by D49data.standardize() to normalize unknown samples to an absolute Δ49 reference frame.

By default equal to (after Wang et al. (2004)):

{
        "1000C": 0.0,
        "25C": 2.228
}
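
For example (a sketch), a fresh D49data object starts out with the two gas anchors of Wang et al. (2004):

mydata49 = D47crunch.D49data()
print(mydata49.Nominal_D49)
# {'1000C': 0.0, '25C': 2.228}
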
Nominal_D49
    @property
    def Nominal_D49(self):
        return self.Nominal_D4x
def save_D49_correl(self, *args, **kwargs):
    def save_D49_correl(self, *args, **kwargs):
        return self._save_D4x_correl(*args, **kwargs)

Save D49 values along with their SE and correlation matrix.

Parameters

  • samples: Only these samples are output (by default: all samples).
  • dir: the directory in which to save the file (by default: output)
  • filename: the name of the csv file to write to (by default: D49_correl.csv)
  • D49_precision: the precision to use when writing D49 and D49_SE values (by default: 4)
  • correl_precision: the precision to use when writing correlation factor values (by default: 4)