D47crunch

Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements

Process and standardize carbonate and/or CO2 clumped-isotope analyses, from low-level data out of a dual-inlet mass spectrometer to final, “absolute” Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates (Daëron, 2021).

The tutorial section takes you through a series of simple steps to import/process data and print out the results. The how-to section provides instructions applicable to various specific tasks.

1. Tutorial

1.1 Installation

The easy option is to use pip; open a shell terminal and simply type:

python -m pip install D47crunch

Those wishing to experiment with the bleeding-edge development version can do so through the following steps:

  1. Download the dev branch source code here and rename it to D47crunch.py.
  2. Do any of the following:
    • copy D47crunch.py to somewhere in your Python path
    • copy D47crunch.py to a working directory (import D47crunch will only work if called within that directory)
    • copy D47crunch.py to any other location (e.g., /foo/bar) and then use the following code snippet in your own code to import D47crunch:
import sys
sys.path.append('/foo/bar')
import D47crunch

Documentation for the development version can be downloaded here (save the HTML file and open it locally).

1.2 Usage

Start by creating a file named rawdata.csv with the following contents:

UID,  Sample,           d45,       d46,        d47,        d48,       d49
A01,  ETH-1,        5.79502,  11.62767,   16.89351,   24.56708,   0.79486
A02,  MYSAMPLE-1,   6.21907,  11.49107,   17.27749,   24.58270,   1.56318
A03,  ETH-2,       -6.05868,  -4.81718,  -11.63506,  -10.32578,   0.61352
A04,  MYSAMPLE-2,  -3.86184,   4.94184,    0.60612,   10.52732,   0.57118
A05,  ETH-3,        5.54365,  12.05228,   17.40555,   25.96919,   0.74608
A06,  ETH-2,       -6.06706,  -4.87710,  -11.69927,  -10.64421,   1.61234
A07,  ETH-1,        5.78821,  11.55910,   16.80191,   24.56423,   1.47963
A08,  MYSAMPLE-2,  -3.87692,   4.86889,    0.52185,   10.40390,   1.07032

Then instantiate a D47data object which will store and process this data:

import D47crunch
mydata = D47crunch.D47data()

For now, this object is empty:

>>> print(mydata)
[]

To load the analyses saved in rawdata.csv into our D47data object and process the data:

mydata.read('rawdata.csv')

# compute δ13C, δ18O of working gas:
mydata.wg()

# compute δ13C, δ18O, raw Δ47 values for each analysis:
mydata.crunch()

# compute absolute Δ47 values for each analysis
# as well as average Δ47 values for each sample:
mydata.standardize()

We can now print a summary of the data processing:

>>> mydata.summary(verbose = True, save_to_file = False)
[summary]        
–––––––––––––––––––––––––––––––  –––––––––
N samples (anchors + unknowns)   5 (3 + 2)
N analyses (anchors + unknowns)  8 (5 + 3)
Repeatability of δ13C_VPDB         4.2 ppm
Repeatability of δ18O_VSMOW       47.5 ppm
Repeatability of Δ47 (anchors)    13.4 ppm
Repeatability of Δ47 (unknowns)    2.5 ppm
Repeatability of Δ47 (all)         9.6 ppm
Model degrees of freedom                 3
Student's 95% t-factor                3.18
Standardization method              pooled
–––––––––––––––––––––––––––––––  –––––––––

This tells us that our data set contains 5 different samples: 3 anchors (ETH-1, ETH-2, ETH-3) and 2 unknowns (MYSAMPLE-1, MYSAMPLE-2). The total number of analyses is 8, with 5 anchor analyses and 3 unknown analyses. We get an estimate of the analytical repeatability (i.e. the overall, pooled standard deviation) for δ13C, δ18O and Δ47, as well as the number of degrees of freedom (here, 3) that these estimated standard deviations are based on, along with the corresponding Student's t-factor (here, 3.18) for 95 % confidence limits. Finally, the summary indicates that we used a “pooled” standardization approach (see [Daëron, 2021]).
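
The t-factor above is simply the two-tailed 97.5th percentile of Student's distribution for the model's degrees of freedom; as a quick sanity check, it can be recomputed with scipy directly (independently of the D47crunch API):

from scipy.stats import t

# 95 % two-tailed t-factor for 3 degrees of freedom:
print(round(t.ppf(0.975, 3), 2))

# should print out:
# 3.18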

To see the actual results:

>>> mydata.table_of_samples(verbose = True, save_to_file = False)
[table_of_samples] 
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
Sample      N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
ETH-1       2       2.01       37.01  0.2052                    0.0131          
ETH-2       2     -10.17       19.88  0.2085                    0.0026          
ETH-3       1       1.73       37.49  0.6132                                    
MYSAMPLE-1  1       2.48       36.90  0.2996  0.0091  ± 0.0291                  
MYSAMPLE-2  2      -8.17       30.05  0.6600  0.0115  ± 0.0366  0.0025          
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––

This table lists, for each sample, the number of analytical replicates, the average δ13C and δ18O values (for the analyte CO2, not for the carbonate itself), the average Δ47 value, and the SD of Δ47 across all replicates of this sample. For unknown samples, the SE and 95 % confidence limits for the mean Δ47 are also listed. These 95 % CL take into account the number of degrees of freedom of the regression model, so that in large data sets the 95 % CL will tend to 1.96 times the SE, whereas in this small example the applicable t-factor is much larger.
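
To keep a copy of this table on disk rather than (or in addition to) printing it, the same method can write a CSV file; the dir argument below mirrors the module-level table functions documented in section 4 and is assumed to apply to the method as well:

# write the table to a CSV file in the 'output' directory
# (directory and default file name are assumptions):
mydata.table_of_samples(verbose = False, save_to_file = True, dir = 'output')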

We can also generate a table of all analyses in the data set (again, note that d18O_VSMOW is the composition of the CO2 analyte):

>>> mydata.table_of_analyses(verbose = True, save_to_file = False)
[table_of_analyses] 
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
UID    Session      Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48       d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw      D49raw       D47
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
A01  mySession       ETH-1       -3.807        24.921   5.795020  11.627670   16.893510   24.567080  0.794860    2.014086   37.041843  -0.574686   1.149684  -27.690250  0.214454
A02  mySession  MYSAMPLE-1       -3.807        24.921   6.219070  11.491070   17.277490   24.582700  1.563180    2.476827   36.898281  -0.499264   1.435380  -27.122614  0.299589
A03  mySession       ETH-2       -3.807        24.921  -6.058680  -4.817180  -11.635060  -10.325780  0.613520  -10.166796   19.907706  -0.685979  -0.721617   16.716901  0.206693
A04  mySession  MYSAMPLE-2       -3.807        24.921  -3.861840   4.941840    0.606120   10.527320  0.571180   -8.159927   30.087230  -0.248531   0.613099   -4.979413  0.658270
A05  mySession       ETH-3       -3.807        24.921   5.543650  12.052280   17.405550   25.969190  0.746080    1.727029   37.485567  -0.226150   1.678699  -28.280301  0.613200
A06  mySession       ETH-2       -3.807        24.921  -6.067060  -4.877100  -11.699270  -10.644210  1.612340  -10.173599   19.845192  -0.683054  -0.922832   17.861363  0.210328
A07  mySession       ETH-1       -3.807        24.921   5.788210  11.559100   16.801910   24.564230  1.479630    2.009281   36.970298  -0.591129   1.282632  -26.888335  0.195926
A08  mySession  MYSAMPLE-2       -3.807        24.921  -3.876920   4.868890    0.521850   10.403900  1.070320   -8.173486   30.011134  -0.245768   0.636159   -4.324964  0.661803
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
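
To post-process these results in your own code, the table methods can also return their contents instead of printing them; the output argument shown for the module-level functions in section 4 is assumed to behave identically for the methods:

# return the table as a list of lists of strings (assumed 'raw' output mode):
rows = mydata.table_of_analyses(verbose = False, save_to_file = False, output = 'raw')
print(rows[0]) # header row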

2. How-to

2.1 Simulate a virtual data set to play with

It is sometimes convenient to quickly build a virtual data set of analyses, for instance to assess the final analytical precision achievable for a given combination of anchor and unknown analyses (see also Fig. 6 of Daëron, 2021).

This can be achieved with virtual_data(). The example below creates a dataset with four sessions, each of which comprises three analyses of anchor ETH-1, three of ETH-2, three of ETH-3, and three analyses each of two unknown samples named FOO and BAR with an arbitrarily defined isotopic composition. Analytical repeatabilities for Δ47 and Δ48 are also specified arbitrarily. See the virtual_data() documentation for additional configuration parameters.

from D47crunch import virtual_data, D47data

args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 3,
            d13C_VPDB = -15., d18O_VPDB = -2.,
            D47 = 0.6, D48 = 0.2),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)

D = D47data(session1 + session2 + session3 + session4)

D.crunch()
D.standardize()

D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)
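
Since the whole point of such a virtual data set is to assess the achievable precision, it is natural to also print its summary:

# print repeatabilities and standardization information for the virtual data set:
D.summary(verbose = True, save_to_file = False)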

2.2 Control data quality

D47crunch offers several tools to visualize processed data. The examples below use the same virtual data set, generated with:

from D47crunch import *
from random import shuffle

# generate virtual data:
args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 8),
        dict(Sample = 'ETH-2', N = 8),
        dict(Sample = 'ETH-3', N = 8),
        dict(Sample = 'FOO', N = 4,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 4,
            d13C_VPDB = -15., d18O_VPDB = -15.,
            D47 = 0.5, D48 = 0.2),
        ])

sessions = [
    virtual_data(session = f'Session_{k+1:02.0f}', seed = 123456+k, **args)
    for k in range(10)]

# shuffle the data:
data = [r for s in sessions for r in s]
shuffle(data)
data = sorted(data, key = lambda r: r['Session'])

# create D47data instance:
data47 = D47data(data)

# process D47data instance:
data47.crunch()
data47.standardize()

2.2.1 Plotting the distribution of analyses through time

data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf')

[figure: time_distribution.png]

The plot above shows the succession of analyses as if they were all distributed at regular time intervals. See D4xdata.plot_distribution_of_analyses() for how to plot analyses as a function of “true” time (based on the TimeTag for each analysis).
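
A minimal sketch of the latter approach, assuming each analysis record carries a numeric TimeTag field and that plot_distribution_of_analyses() accepts a vs_time keyword (check its documentation for the exact signature):

# assign an arbitrary TimeTag (e.g., hours elapsed) to each analysis:
for k, r in enumerate(data47):
    r['TimeTag'] = 3.0 * k # hypothetical: one analysis every 3 hours

# plot against “true” time (vs_time is an assumed keyword):
data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf', vs_time = True)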

2.2.2 Generating session plots

data47.plot_sessions()

Below is one of the resulting session plots. Each cross marker is an analysis. Anchors are shown in red and unknowns in blue. Short horizontal lines show the nominal Δ47 value for anchors (red) or the average Δ47 value for unknowns (blue; overall average across all sessions). Curved grey contours correspond to Δ47 standardization errors in this session.

[figure: D47_plot_Session_03.png]

2.2.3 Plotting Δ47 or Δ48 residuals

data47.plot_residuals(filename = 'residuals.pdf', kde = True)

[figure: residuals.png]

Again, note that this plot only shows the succession of analyses as if they were all distributed at regular time intervals.

2.2.4 Checking δ13C and δ18O dispersion

mydata = D47data(virtual_data(
    session = 'mysession',
    samples = [
        dict(Sample = 'ETH-1', N = 4),
        dict(Sample = 'ETH-2', N = 4),
        dict(Sample = 'ETH-3', N = 4),
        dict(Sample = 'MYSAMPLE', N = 8, D47 = 0.6, D48 = 0.1, d13C_VPDB = -4.0, d18O_VPDB = -12.0),
    ], seed = 123))

mydata.refresh()
mydata.wg()
mydata.crunch()
mydata.plot_bulk_compositions()

D4xdata.plot_bulk_compositions() produces a series of plots, one for each sample, and an additional plot with all samples together. For example, here is the plot for sample MYSAMPLE:

[figure: bulk_compositions.png]

2.3 Use a different set of anchors, change anchor nominal values, and/or change oxygen-17 correction parameters

Nominal values for various carbonate standards are defined in four places: D4xdata.Nominal_d13C_VPDB, D4xdata.Nominal_d18O_VPDB, D47data.Nominal_D47, and D48data.Nominal_D48.

17O correction parameters are defined by: D4xdata.R17_VSMOW, D4xdata.R17_VPDB, D4xdata.R18_VSMOW, D4xdata.R18_VPDB, and D4xdata.LAMBDA_17.

When creating a new instance of D47data or D48data, the current values of these variables are copied as properties of the new object. Applying custom values for, e.g., R17_VSMOW and Nominal_D47 can thus be done in several ways:

Option 1: by redefining D4xdata.R17_VSMOW and D47data.Nominal_D47 before creating a D47data object:

from D47crunch import D4xdata, D47data

# redefine R17_VSMOW:
D4xdata.R17_VSMOW = 0.00037 # new value

# redefine R17_VPDB for consistency:
D4xdata.R17_VPDB = D4xdata.R17_VSMOW * (D4xdata.R18_VPDB/D4xdata.R18_VSMOW) ** D4xdata.LAMBDA_17

# edit Nominal_D47 to only include ETH-1/2/3:
D47data.Nominal_D4x = {
    a: D47data.Nominal_D4x[a]
    for a in ['ETH-1', 'ETH-2', 'ETH-3']
    }
# redefine ETH-3:
D47data.Nominal_D4x['ETH-3'] = 0.600

# only now create D47data object:
mydata = D47data()

# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)
# NB: mydata.Nominal_D47 is just an alias for mydata.Nominal_D4x

# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}

Option 2: by redefining R17_VSMOW and Nominal_D47 after creating a D47data object:

from D47crunch import D47data

# first create D47data object:
mydata = D47data()

# redefine R17_VSMOW:
mydata.R17_VSMOW = 0.00037 # new value

# redefine R17_VPDB for consistency:
mydata.R17_VPDB = mydata.R17_VSMOW * (mydata.R18_VPDB/mydata.R18_VSMOW) ** mydata.LAMBDA_17

# edit Nominal_D47 to only include ETH-1/2/3:
mydata.Nominal_D47 = {
    a: mydata.Nominal_D47[a]
    for a in ['ETH-1', 'ETH-2', 'ETH-3']
    }
# redefine ETH-3:
mydata.Nominal_D47['ETH-3'] = 0.600

# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)

# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}

The two options above are equivalent, but the latter provides a simple way to compare different data processing choices:

from D47crunch import D47data

# create two D47data objects:
foo = D47data()
bar = D47data()

# modify foo in various ways:
foo.LAMBDA_17 = 0.52
foo.R17_VSMOW = 0.00037 # new value
foo.R17_VPDB = foo.R17_VSMOW * (foo.R18_VPDB/foo.R18_VSMOW) ** foo.LAMBDA_17
foo.Nominal_D47 = {
    'ETH-1': foo.Nominal_D47['ETH-1'],
    'ETH-2': foo.Nominal_D47['ETH-2'],
    'IAEA-C2': foo.Nominal_D47['IAEA-C2'],
    'INLAB_REF_MATERIAL': 0.666,
    }

# now import the same raw data into foo and bar:
foo.read('rawdata.csv')
foo.wg()          # compute δ13C, δ18O of working gas
foo.crunch()      # compute all δ13C, δ18O and raw Δ47 values
foo.standardize() # compute absolute Δ47 values

bar.read('rawdata.csv')
bar.wg()          # compute δ13C, δ18O of working gas
bar.crunch()      # compute all δ13C, δ18O and raw Δ47 values
bar.standardize() # compute absolute Δ47 values

# and compare the final results:
foo.table_of_samples(verbose = True, save_to_file = False)
bar.table_of_samples(verbose = True, save_to_file = False)

2.4 Process paired Δ47 and Δ48 values

Purely in terms of data processing, it is not obvious why Δ47 and Δ48 data should not be handled separately. For now, D47crunch uses two independent classes — D47data and D48data — which crunch numbers and deal with standardization in very similar ways. The following example demonstrates how to print out combined outputs for D47data and D48data.

from D47crunch import *

# generate virtual data:
args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args)
session2 = virtual_data(session = 'Session_02', **args)

# create D47data instance:
data47 = D47data(session1 + session2)

# process D47data instance:
data47.crunch()
data47.standardize()

# create D48data instance:
data48 = D48data(data47) # alternatively: data48 = D48data(session1 + session2)

# process D48data instance:
data48.crunch()
data48.standardize()

# output combined results:
table_of_sessions(data47, data48)
table_of_samples(data47, data48)
table_of_analyses(data47, data48)

Expected output:

––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––
Session     Na  Nu  d13Cwg_VPDB  d18Owg_VSMOW  r_d13C  r_d18O   r_D47      a_47 ± SE  1e3 x b_47 ± SE       c_47 ± SE   r_D48      a_48 ± SE  1e3 x b_48 ± SE       c_48 ± SE
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––
Session_01   9   3       -4.000        26.000  0.0000  0.0000  0.0098  1.021 ± 0.019   -0.398 ± 0.260  -0.903 ± 0.006  0.0486  0.540 ± 0.151    1.235 ± 0.607  -0.390 ± 0.025
Session_02   9   3       -4.000        26.000  0.0000  0.0000  0.0090  1.015 ± 0.019    0.376 ± 0.260  -0.905 ± 0.006  0.0186  1.350 ± 0.156   -0.871 ± 0.608  -0.504 ± 0.027
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––


––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
Sample  N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene     D48      SE    95% CL      SD  p_Levene
––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
ETH-1   6       2.02       37.02  0.2052                    0.0078            0.1380                    0.0223          
ETH-2   6     -10.17       19.88  0.2085                    0.0036            0.1380                    0.0482          
ETH-3   6       1.71       37.45  0.6132                    0.0080            0.2700                    0.0176          
FOO     6      -5.00       28.91  0.3026  0.0044  ± 0.0093  0.0121     0.164  0.1397  0.0121  ± 0.0255  0.0267     0.127
––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––


–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––
UID     Session  Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48         d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw     D49raw       D47       D48
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––
1    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.120787   21.286237   27.780042    2.020000   37.024281  -0.708176  -0.316435  -0.000013  0.197297  0.087763
2    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.132240   21.307795   27.780042    2.020000   37.024281  -0.696913  -0.295333  -0.000013  0.208328  0.126791
3    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.132438   21.313884   27.780042    2.020000   37.024281  -0.696718  -0.289374  -0.000013  0.208519  0.137813
4    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.700300  -12.210735  -18.023381  -10.170000   19.875825  -0.683938  -0.297902  -0.000002  0.209785  0.198705
5    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.707421  -12.270781  -18.023381  -10.170000   19.875825  -0.691145  -0.358673  -0.000002  0.202726  0.086308
6    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.700061  -12.278310  -18.023381  -10.170000   19.875825  -0.683696  -0.366292  -0.000002  0.210022  0.072215
7    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.684379   22.225827   28.306614    1.710000   37.450394  -0.273094  -0.216392  -0.000014  0.623472  0.270873
8    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.660163   22.233729   28.306614    1.710000   37.450394  -0.296906  -0.208664  -0.000014  0.600150  0.285167
9    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.675191   22.215632   28.306614    1.710000   37.450394  -0.282128  -0.226363  -0.000014  0.614623  0.252432
10   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.328380    5.374933    4.665655   -5.000000   28.907344  -0.582131  -0.288924  -0.000006  0.314928  0.175105
11   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.302220    5.384454    4.665655   -5.000000   28.907344  -0.608241  -0.279457  -0.000006  0.289356  0.192614
12   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.322530    5.372841    4.665655   -5.000000   28.907344  -0.587970  -0.291004  -0.000006  0.309209  0.171257
13   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.140853   21.267202   27.780042    2.020000   37.024281  -0.688442  -0.335067  -0.000013  0.207730  0.138730
14   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.127087   21.256983   27.780042    2.020000   37.024281  -0.701980  -0.345071  -0.000013  0.194396  0.131311
15   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.148253   21.287779   27.780042    2.020000   37.024281  -0.681165  -0.314926  -0.000013  0.214898  0.153668
16   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.715859  -12.204791  -18.023381  -10.170000   19.875825  -0.699685  -0.291887  -0.000002  0.207349  0.149128
17   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.709763  -12.188685  -18.023381  -10.170000   19.875825  -0.693516  -0.275587  -0.000002  0.213426  0.161217
18   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.715427  -12.253049  -18.023381  -10.170000   19.875825  -0.699249  -0.340727  -0.000002  0.207780  0.112907
19   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.685994   22.249463   28.306614    1.710000   37.450394  -0.271506  -0.193275  -0.000014  0.618328  0.244431
20   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.681351   22.298166   28.306614    1.710000   37.450394  -0.276071  -0.145641  -0.000014  0.613831  0.279758
21   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.676169   22.306848   28.306614    1.710000   37.450394  -0.281167  -0.137150  -0.000014  0.608813  0.286056
22   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.324359    5.339497    4.665655   -5.000000   28.907344  -0.586144  -0.324160  -0.000006  0.314015  0.136535
23   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.297658    5.325854    4.665655   -5.000000   28.907344  -0.612794  -0.337727  -0.000006  0.287767  0.126473
24   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.310185    5.339898    4.665655   -5.000000   28.907344  -0.600291  -0.323761  -0.000006  0.300082  0.136830
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––

3. Command-Line Interface (CLI)

Instead of writing Python code, you may use the CLI directly to process raw Δ47 and Δ48 data with reasonable defaults. The simplest way is to call:

D47crunch rawdata.csv

This will create a directory named output and populate it with the standardization results, by calling methods such as wg(), crunch(), standardize(), summary(), table_of_sessions(), table_of_samples(), table_of_analyses(), and the plotting methods described above.

You may specify a custom set of anchors instead of the default ones using the --anchors or -a option:

D47crunch -a anchors.csv rawdata.csv

In this case, the anchors.csv file (you may use any other file name) must have the following format:

Sample, d13C_VPDB, d18O_VPDB,    D47
 ETH-1,      2.02,     -2.19, 0.2052
 ETH-2,    -10.17,    -18.69, 0.2085
 ETH-3,      1.71,     -1.78, 0.6132
 ETH-4,          ,          , 0.4511

The samples with non-empty d13C_VPDB, d18O_VPDB, and D47 values are used to standardize δ13C, δ18O, and Δ47 values respectively.
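
When scripting instead of using the CLI, a roughly equivalent setup is sketched below, assuming that the class-level dictionaries described in section 2.3 (D4xdata.Nominal_d13C_VPDB, D4xdata.Nominal_d18O_VPDB, D47data.Nominal_D4x) also drive δ13C, δ18O, and Δ47 standardization:

from D47crunch import D4xdata, D47data

# same anchors as in the anchors.csv example above:
D4xdata.Nominal_d13C_VPDB = {'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}
D4xdata.Nominal_d18O_VPDB = {'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}
D47data.Nominal_D4x = {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6132, 'ETH-4': 0.4511}

# only now create the D47data object:
mydata = D47data()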

You may also provide a list of analyses and/or samples to exclude from the input. This is done with the --exclude or -e option:

D47crunch -e badbatch.csv rawdata.csv

In this case, the badbatch.csv file (again, you may use a different file name) must have the following format:

UID, Sample
A03
A09
B06
   , MYBADSAMPLE-1
   , MYBADSAMPLE-2

This will exclude (ignore) analyses with the UIDs A03, A09, and B06, as well as all analyses of samples MYBADSAMPLE-1 and MYBADSAMPLE-2. It is possible to have an exclude file with only the UID column, or only the Sample column, or both, in any order.
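
When scripting, a rough equivalent (hypothetical, since the CLI handles exclusions for you) is to filter the raw data before creating the D47data object, using the module's read_csv() function:

from D47crunch import D47data, read_csv

bad_uids = {'A03', 'A09', 'B06'}
bad_samples = {'MYBADSAMPLE-1', 'MYBADSAMPLE-2'}

data = [r for r in read_csv('rawdata.csv')
    if r['UID'] not in bad_uids and r['Sample'] not in bad_samples]

mydata = D47data(data)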

The --output-dir or -o option may be used to specify a custom directory name for the output. For example, in Unix-like shells the following command will create a time-stamped output directory:

D47crunch -o `date "+%Y-%m-%d-%Hh%M"` rawdata.csv

To process Δ48 as well as Δ47, just add the --D48 option.
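
For example, using the tutorial's input file:

D47crunch --D48 rawdata.csv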

4. API Documentation

   1'''
   2Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements
   3
   4Process and standardize carbonate and/or CO2 clumped-isotope analyses,
   5from low-level data out of a dual-inlet mass spectrometer to final, “absolute”
   6Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates
   7([Daëron, 2021](https://doi.org/10.1029/2020GC009592)).
   8
   9The **tutorial** section takes you through a series of simple steps to import/process data and print out the results.
  10The **how-to** section provides instructions applicable to various specific tasks.
  11
  12.. include:: ../../docpages/tutorial.md
  13.. include:: ../../docpages/howto.md
  14.. include:: ../../docpages/cli.md
  15
  16# 4. API Documentation
  17'''
  18
  19__docformat__ = "restructuredtext"
  20__author__    = 'Mathieu Daëron'
  21__contact__   = 'daeron@lsce.ipsl.fr'
  22__copyright__ = 'Copyright (c) 2024 Mathieu Daëron'
  23__license__   = 'Modified BSD License - https://opensource.org/licenses/BSD-3-Clause'
  24__date__      = '2024-09-10'
  25__version__   = '2.4.1'
  26
  27import os
  28import numpy as np
  29import typer
  30from typing_extensions import Annotated
  31from statistics import stdev
  32from scipy.stats import t as tstudent
  33from scipy.stats import levene
  34from scipy.interpolate import interp1d
  35from numpy import linalg
  36from lmfit import Minimizer, Parameters, report_fit
  37from matplotlib import pyplot as ppl
  38from datetime import datetime as dt
  39from functools import wraps
  40from colorsys import hls_to_rgb
  41from matplotlib import rcParams
  42
  43typer.rich_utils.STYLE_HELPTEXT = ''
  44
  45rcParams['font.family'] = 'sans-serif'
  46rcParams['font.sans-serif'] = 'Helvetica'
  47rcParams['font.size'] = 10
  48rcParams['mathtext.fontset'] = 'custom'
  49rcParams['mathtext.rm'] = 'sans'
  50rcParams['mathtext.bf'] = 'sans:bold'
  51rcParams['mathtext.it'] = 'sans:italic'
  52rcParams['mathtext.cal'] = 'sans:italic'
  53rcParams['mathtext.default'] = 'rm'
  54rcParams['xtick.major.size'] = 4
  55rcParams['xtick.major.width'] = 1
  56rcParams['ytick.major.size'] = 4
  57rcParams['ytick.major.width'] = 1
  58rcParams['axes.grid'] = False
  59rcParams['axes.linewidth'] = 1
  60rcParams['grid.linewidth'] = .75
  61rcParams['grid.linestyle'] = '-'
  62rcParams['grid.alpha'] = .15
  63rcParams['savefig.dpi'] = 150
  64
  65Petersen_etal_CO2eqD47 = np.array([[-12, 1.147113572], [-11, 1.139961218], [-10, 1.132872856], [-9, 1.125847677], [-8, 1.118884889], [-7, 1.111983708], [-6, 1.105143366], [-5, 1.098363105], [-4, 1.091642182], [-3, 1.084979862], [-2, 1.078375423], [-1, 1.071828156], [0, 1.065337360], [1, 1.058902349], [2, 1.052522443], [3, 1.046196976], [4, 1.039925291], [5, 1.033706741], [6, 1.027540690], [7, 1.021426510], [8, 1.015363585], [9, 1.009351306], [10, 1.003389075], [11, 0.997476303], [12, 0.991612409], [13, 0.985796821], [14, 0.980028975], [15, 0.974308318], [16, 0.968634304], [17, 0.963006392], [18, 0.957424055], [19, 0.951886769], [20, 0.946394020], [21, 0.940945302], [22, 0.935540114], [23, 0.930177964], [24, 0.924858369], [25, 0.919580851], [26, 0.914344938], [27, 0.909150167], [28, 0.903996080], [29, 0.898882228], [30, 0.893808167], [31, 0.888773459], [32, 0.883777672], [33, 0.878820382], [34, 0.873901170], [35, 0.869019623], [36, 0.864175334], [37, 0.859367901], [38, 0.854596929], [39, 0.849862028], [40, 0.845162813], [41, 0.840498905], [42, 0.835869931], [43, 0.831275522], [44, 0.826715314], [45, 0.822188950], [46, 0.817696075], [47, 0.813236341], [48, 0.808809404], [49, 0.804414926], [50, 0.800052572], [51, 0.795722012], [52, 0.791422922], [53, 0.787154979], [54, 0.782917869], [55, 0.778711277], [56, 0.774534898], [57, 0.770388426], [58, 0.766271562], [59, 0.762184010], [60, 0.758125479], [61, 0.754095680], [62, 0.750094329], [63, 0.746121147], [64, 0.742175856], [65, 0.738258184], [66, 0.734367860], [67, 0.730504620], [68, 0.726668201], [69, 0.722858343], [70, 0.719074792], [71, 0.715317295], [72, 0.711585602], [73, 0.707879469], [74, 0.704198652], [75, 0.700542912], [76, 0.696912012], [77, 0.693305719], [78, 0.689723802], [79, 0.686166034], [80, 0.682632189], [81, 0.679122047], [82, 0.675635387], [83, 0.672171994], [84, 0.668731654], [85, 0.665314156], [86, 0.661919291], [87, 0.658546854], [88, 0.655196641], [89, 0.651868451], [90, 0.648562087], [91, 0.645277352], [92, 0.642014054], [93, 0.638771999], [94, 0.635551001], [95, 0.632350872], [96, 0.629171428], [97, 0.626012487], [98, 0.622873870], [99, 0.619755397], [100, 0.616656895], [102, 0.610519107], [104, 0.604459143], [106, 0.598475670], [108, 0.592567388], [110, 0.586733026], [112, 0.580971342], [114, 0.575281125], [116, 0.569661187], [118, 0.564110371], [120, 0.558627545], [122, 0.553211600], [124, 0.547861454], [126, 0.542576048], [128, 0.537354347], [130, 0.532195337], [132, 0.527098028], [134, 0.522061450], [136, 0.517084654], [138, 0.512166711], [140, 0.507306712], [142, 0.502503768], [144, 0.497757006], [146, 0.493065573], [148, 0.488428634], [150, 0.483845370], [152, 0.479314980], [154, 0.474836677], [156, 0.470409692], [158, 0.466033271], [160, 0.461706674], [162, 0.457429176], [164, 0.453200067], [166, 0.449018650], [168, 0.444884242], [170, 0.440796174], [172, 0.436753787], [174, 0.432756438], [176, 0.428803494], [178, 0.424894334], [180, 0.421028350], [182, 0.417204944], [184, 0.413423530], [186, 0.409683531], [188, 0.405984383], [190, 0.402325531], [192, 0.398706429], [194, 0.395126543], [196, 0.391585347], [198, 0.388082324], [200, 0.384616967], [202, 0.381188778], [204, 0.377797268], [206, 0.374441954], [208, 0.371122364], [210, 0.367838033], [212, 0.364588505], [214, 0.361373329], [216, 0.358192065], [218, 0.355044277], [220, 0.351929540], [222, 0.348847432], [224, 0.345797540], [226, 0.342779460], [228, 0.339792789], [230, 0.336837136], [232, 0.333912113], [234, 0.331017339], [236, 0.328152439], [238, 
0.325317046], [240, 0.322510795], [242, 0.319733329], [244, 0.316984297], [246, 0.314263352], [248, 0.311570153], [250, 0.308904364], [252, 0.306265654], [254, 0.303653699], [256, 0.301068176], [258, 0.298508771], [260, 0.295975171], [262, 0.293467070], [264, 0.290984167], [266, 0.288526163], [268, 0.286092765], [270, 0.283683684], [272, 0.281298636], [274, 0.278937339], [276, 0.276599517], [278, 0.274284898], [280, 0.271993211], [282, 0.269724193], [284, 0.267477582], [286, 0.265253121], [288, 0.263050554], [290, 0.260869633], [292, 0.258710110], [294, 0.256571741], [296, 0.254454286], [298, 0.252357508], [300, 0.250281174], [302, 0.248225053], [304, 0.246188917], [306, 0.244172542], [308, 0.242175707], [310, 0.240198194], [312, 0.238239786], [314, 0.236300272], [316, 0.234379441], [318, 0.232477087], [320, 0.230593005], [322, 0.228726993], [324, 0.226878853], [326, 0.225048388], [328, 0.223235405], [330, 0.221439711], [332, 0.219661118], [334, 0.217899439], [336, 0.216154491], [338, 0.214426091], [340, 0.212714060], [342, 0.211018220], [344, 0.209338398], [346, 0.207674420], [348, 0.206026115], [350, 0.204393315], [355, 0.200378063], [360, 0.196456139], [365, 0.192625077], [370, 0.188882487], [375, 0.185226048], [380, 0.181653511], [385, 0.178162694], [390, 0.174751478], [395, 0.171417807], [400, 0.168159686], [405, 0.164975177], [410, 0.161862398], [415, 0.158819521], [420, 0.155844772], [425, 0.152936426], [430, 0.150092806], [435, 0.147312286], [440, 0.144593281], [445, 0.141934254], [450, 0.139333710], [455, 0.136790195], [460, 0.134302294], [465, 0.131868634], [470, 0.129487876], [475, 0.127158722], [480, 0.124879906], [485, 0.122650197], [490, 0.120468398], [495, 0.118333345], [500, 0.116243903], [505, 0.114198970], [510, 0.112197471], [515, 0.110238362], [520, 0.108320625], [525, 0.106443271], [530, 0.104605335], [535, 0.102805877], [540, 0.101043985], [545, 0.099318768], [550, 0.097629359], [555, 0.095974915], [560, 0.094354612], [565, 0.092767650], [570, 0.091213248], [575, 0.089690648], [580, 0.088199108], [585, 0.086737906], [590, 0.085306341], [595, 0.083903726], [600, 0.082529395], [605, 0.081182697], [610, 0.079862998], [615, 0.078569680], [620, 0.077302141], [625, 0.076059794], [630, 0.074842066], [635, 0.073648400], [640, 0.072478251], [645, 0.071331090], [650, 0.070206399], [655, 0.069103674], [660, 0.068022424], [665, 0.066962168], [670, 0.065922439], [675, 0.064902780], [680, 0.063902748], [685, 0.062921909], [690, 0.061959837], [695, 0.061016122], [700, 0.060090360], [705, 0.059182157], [710, 0.058291131], [715, 0.057416907], [720, 0.056559120], [725, 0.055717414], [730, 0.054891440], [735, 0.054080860], [740, 0.053285343], [745, 0.052504565], [750, 0.051738210], [755, 0.050985971], [760, 0.050247546], [765, 0.049522643], [770, 0.048810974], [775, 0.048112260], [780, 0.047426227], [785, 0.046752609], [790, 0.046091145], [795, 0.045441581], [800, 0.044803668], [805, 0.044177164], [810, 0.043561831], [815, 0.042957438], [820, 0.042363759], [825, 0.041780573], [830, 0.041207664], [835, 0.040644822], [840, 0.040091839], [845, 0.039548516], [850, 0.039014654], [855, 0.038490063], [860, 0.037974554], [865, 0.037467944], [870, 0.036970054], [875, 0.036480707], [880, 0.035999734], [885, 0.035526965], [890, 0.035062238], [895, 0.034605393], [900, 0.034156272], [905, 0.033714724], [910, 0.033280598], [915, 0.032853749], [920, 0.032434032], [925, 0.032021309], [930, 0.031615443], [935, 0.031216300], [940, 0.030823749], [945, 0.030437663], [950, 0.030057915], [955, 0.029684385], 
[960, 0.029316951], [965, 0.028955498], [970, 0.028599910], [975, 0.028250075], [980, 0.027905884], [985, 0.027567229], [990, 0.027234006], [995, 0.026906112], [1000, 0.026583445], [1005, 0.026265908], [1010, 0.025953405], [1015, 0.025645841], [1020, 0.025343124], [1025, 0.025045163], [1030, 0.024751871], [1035, 0.024463160], [1040, 0.024178947], [1045, 0.023899147], [1050, 0.023623680], [1055, 0.023352467], [1060, 0.023085429], [1065, 0.022822491], [1070, 0.022563577], [1075, 0.022308615], [1080, 0.022057533], [1085, 0.021810260], [1090, 0.021566729], [1095, 0.021326872], [1100, 0.021090622]])
  66_fCO2eqD47_Petersen = interp1d(Petersen_etal_CO2eqD47[:,0], Petersen_etal_CO2eqD47[:,1])
  67def fCO2eqD47_Petersen(T):
  68	'''
  69	CO2 equilibrium Δ47 value as a function of T (in degrees C)
  70	according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
  71
  72	'''
  73	return float(_fCO2eqD47_Petersen(T))
  74
  75
  76Wang_etal_CO2eqD47 = np.array([[-83., 1.8954], [-73., 1.7530], [-63., 1.6261], [-53., 1.5126], [-43., 1.4104], [-33., 1.3182], [-23., 1.2345], [-13., 1.1584], [-3., 1.0888], [7., 1.0251], [17., 0.9665], [27., 0.9125], [37., 0.8626], [47., 0.8164], [57., 0.7734], [67., 0.7334], [87., 0.6612], [97., 0.6286], [107., 0.5980], [117., 0.5693], [127., 0.5423], [137., 0.5169], [147., 0.4930], [157., 0.4704], [167., 0.4491], [177., 0.4289], [187., 0.4098], [197., 0.3918], [207., 0.3747], [217., 0.3585], [227., 0.3431], [237., 0.3285], [247., 0.3147], [257., 0.3015], [267., 0.2890], [277., 0.2771], [287., 0.2657], [297., 0.2550], [307., 0.2447], [317., 0.2349], [327., 0.2256], [337., 0.2167], [347., 0.2083], [357., 0.2002], [367., 0.1925], [377., 0.1851], [387., 0.1781], [397., 0.1714], [407., 0.1650], [417., 0.1589], [427., 0.1530], [437., 0.1474], [447., 0.1421], [457., 0.1370], [467., 0.1321], [477., 0.1274], [487., 0.1229], [497., 0.1186], [507., 0.1145], [517., 0.1105], [527., 0.1068], [537., 0.1031], [547., 0.0997], [557., 0.0963], [567., 0.0931], [577., 0.0901], [587., 0.0871], [597., 0.0843], [607., 0.0816], [617., 0.0790], [627., 0.0765], [637., 0.0741], [647., 0.0718], [657., 0.0695], [667., 0.0674], [677., 0.0654], [687., 0.0634], [697., 0.0615], [707., 0.0597], [717., 0.0579], [727., 0.0562], [737., 0.0546], [747., 0.0530], [757., 0.0515], [767., 0.0500], [777., 0.0486], [787., 0.0472], [797., 0.0459], [807., 0.0447], [817., 0.0435], [827., 0.0423], [837., 0.0411], [847., 0.0400], [857., 0.0390], [867., 0.0380], [877., 0.0370], [887., 0.0360], [897., 0.0351], [907., 0.0342], [917., 0.0333], [927., 0.0325], [937., 0.0317], [947., 0.0309], [957., 0.0302], [967., 0.0294], [977., 0.0287], [987., 0.0281], [997., 0.0274], [1007., 0.0268], [1017., 0.0261], [1027., 0.0255], [1037., 0.0249], [1047., 0.0244], [1057., 0.0238], [1067., 0.0233], [1077., 0.0228], [1087., 0.0223], [1097., 0.0218]])
  77_fCO2eqD47_Wang = interp1d(Wang_etal_CO2eqD47[:,0] - 0.15, Wang_etal_CO2eqD47[:,1])
  78def fCO2eqD47_Wang(T):
  79	'''
  80	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
  81	according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
  82	(supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
  83	'''
  84	return float(_fCO2eqD47_Wang(T))
  85
  86
  87def correlated_sum(X, C, w = None):
  88	'''
  89	Compute covariance-aware linear combinations
  90
  91	**Parameters**
  92	
  93	+ `X`: list or 1-D array of values to sum
  94	+ `C`: covariance matrix for the elements of `X`
  95	+ `w`: list or 1-D array of weights to apply to the elements of `X`
  96	       (all equal to 1 by default)
  97
  98	Return the sum (and its SE) of the elements of `X`, with optional weights equal
  99	to the elements of `w`, accounting for covariances between the elements of `X`.
 100	'''
 101	if w is None:
 102		w = [1 for x in X]
 103	return np.dot(w,X), (np.dot(w,np.dot(C,w)))**.5
 104
 105
 106def make_csv(x, hsep = ',', vsep = '\n'):
 107	'''
 108	Formats a list of lists of strings as a CSV
 109
 110	**Parameters**
 111
 112	+ `x`: the list of lists of strings to format
 113	+ `hsep`: the field separator (`,` by default)
 114	+ `vsep`: the line-ending convention to use (`\\n` by default)
 115
 116	**Example**
 117
 118	```py
 119	print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
 120	```
 121
 122	outputs:
 123
 124	```py
 125	a,b,c
 126	d,e,f
 127	```
 128	'''
 129	return vsep.join([hsep.join(l) for l in x])
 130
 131
 132def pf(txt):
 133	'''
 134	Modify string `txt` to follow `lmfit.Parameter()` naming rules.
 135	'''
 136	return txt.replace('-','_').replace('.','_').replace(' ','_')
 137
 138
 139def smart_type(x):
 140	'''
 141	Tries to convert string `x` to a float if it includes a decimal point, or
 142	to an integer if it does not. If both attempts fail, return the original
 143	string unchanged.
 144	'''
 145	try:
 146		y = float(x)
 147	except ValueError:
 148		return x
 149	if '.' not in x:
 150		return int(y)
 151	return y
 152
 153
 154def pretty_table(x, header = 1, hsep = '  ', vsep = '–', align = '<'):
 155	'''
 156	Reads a list of lists of strings and outputs an ascii table
 157
 158	**Parameters**
 159
 160	+ `x`: a list of lists of strings
 161	+ `header`: the number of lines to treat as header lines
 162	+ `hsep`: the horizontal separator between columns
 163	+ `vsep`: the character to use as vertical separator
 164	+ `align`: string of left (`<`) or right (`>`) alignment characters.
 165
 166	**Example**
 167
 168	```py
 169	x = [['A', 'B', 'C'], ['1', '1.9999', 'foo'], ['10', 'x', 'bar']]
 170	print(pretty_table(x))
 171	```
 172	yields:	
 173	```
 174	--  ------  ---
 175	A        B    C
 176	--  ------  ---
 177	1   1.9999  foo
 178	10       x  bar
 179	--  ------  ---
 180	```
 181	
 182	'''
 183	txt = []
 184	widths = [np.max([len(e) for e in c]) for c in zip(*x)]
 185
 186	if len(widths) > len(align):
 187		align += '>' * (len(widths)-len(align))
 188	sepline = hsep.join([vsep*w for w in widths])
 189	txt += [sepline]
 190	for k,l in enumerate(x):
 191		if k and k == header:
 192			txt += [sepline]
 193		txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
 194	txt += [sepline]
 195	txt += ['']
 196	return '\n'.join(txt)
 197
 198
 199def transpose_table(x):
 200	'''
  201	Transpose a list of lists
 202
 203	**Parameters**
 204
 205	+ `x`: a list of lists
 206
 207	**Example**
 208
 209	```py
 210	x = [[1, 2], [3, 4]]
 211	print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
 212	```
 213	'''
 214	return [[e for e in c] for c in zip(*x)]
 215
 216
 217def w_avg(X, sX) :
 218	'''
 219	Compute variance-weighted average
 220
 221	Returns the value and SE of the weighted average of the elements of `X`,
 222	with relative weights equal to their inverse variances (`1/sX**2`).
 223
 224	**Parameters**
 225
 226	+ `X`: array-like of elements to average
 227	+ `sX`: array-like of the corresponding SE values
 228
 229	**Tip**
 230
 231	If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
 232	they may be rearranged using `zip()`:
 233
 234	```python
 235	foo = [(0, 1), (1, 0.5), (2, 0.5)]
 236	print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
 237	```
 238	'''
 239	X = [ x for x in X ]
 240	sX = [ sx for sx in sX ]
 241	W = [ sx**-2 for sx in sX ]
 242	W = [ w/sum(W) for w in W ]
 243	Xavg = sum([ w*x for w,x in zip(W,X) ])
 244	sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5
 245	return Xavg, sXavg
 246
 247
 248def read_csv(filename, sep = ''):
 249	'''
 250	Read contents of `filename` in csv format and return a list of dictionaries.
 251
 252	In the csv string, spaces before and after field separators (`','` by default)
 253	are optional.
 254
 255	**Parameters**
 256
 257	+ `filename`: the csv file to read
 258	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
  259	whichever appears most often in the contents of `filename`.
 260	'''
 261	with open(filename) as fid:
 262		txt = fid.read()
 263
 264	if sep == '':
 265		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
 266	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
 267	return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]]
 268
 269
 270def simulate_single_analysis(
 271	sample = 'MYSAMPLE',
 272	d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
 273	d13C_VPDB = None, d18O_VPDB = None,
 274	D47 = None, D48 = None, D49 = 0., D17O = 0.,
 275	a47 = 1., b47 = 0., c47 = -0.9,
 276	a48 = 1., b48 = 0., c48 = -0.45,
 277	Nominal_D47 = None,
 278	Nominal_D48 = None,
 279	Nominal_d13C_VPDB = None,
 280	Nominal_d18O_VPDB = None,
 281	ALPHA_18O_ACID_REACTION = None,
 282	R13_VPDB = None,
 283	R17_VSMOW = None,
 284	R18_VSMOW = None,
 285	LAMBDA_17 = None,
 286	R18_VPDB = None,
 287	):
 288	'''
 289	Compute working-gas delta values for a single analysis, assuming a stochastic working
 290	gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).
 291	
 292	**Parameters**
 293
 294	+ `sample`: sample name
 295	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
 296		(respectively –4 and +26 ‰ by default)
 297	+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
 298	+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
 299		of the carbonate sample
 300	+ `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and
 301		Δ48 values if `D47` or `D48` are not specified
 302	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and
 303		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
 304	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
 305	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
 306		correction parameters (by default equal to the `D4xdata` default values)
 307	
 308	Returns a dictionary with fields
 309	`['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
 310	'''
 311
 312	if Nominal_d13C_VPDB is None:
 313		Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB
 314
 315	if Nominal_d18O_VPDB is None:
 316		Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB
 317
 318	if ALPHA_18O_ACID_REACTION is None:
 319		ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION
 320
 321	if R13_VPDB is None:
 322		R13_VPDB = D4xdata().R13_VPDB
 323
 324	if R17_VSMOW is None:
 325		R17_VSMOW = D4xdata().R17_VSMOW
 326
 327	if R18_VSMOW is None:
 328		R18_VSMOW = D4xdata().R18_VSMOW
 329
 330	if LAMBDA_17 is None:
 331		LAMBDA_17 = D4xdata().LAMBDA_17
 332
 333	if R18_VPDB is None:
 334		R18_VPDB = D4xdata().R18_VPDB
 335	
 336	R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17
 337	
 338	if Nominal_D47 is None:
 339		Nominal_D47 = D47data().Nominal_D47
 340
 341	if Nominal_D48 is None:
 342		Nominal_D48 = D48data().Nominal_D48
 343	
 344	if d13C_VPDB is None:
 345		if sample in Nominal_d13C_VPDB:
 346			d13C_VPDB = Nominal_d13C_VPDB[sample]
 347		else:
  348			raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")
 349
 350	if d18O_VPDB is None:
 351		if sample in Nominal_d18O_VPDB:
 352			d18O_VPDB = Nominal_d18O_VPDB[sample]
 353		else:
 354			raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")
 355
 356	if D47 is None:
 357		if sample in Nominal_D47:
 358			D47 = Nominal_D47[sample]
 359		else:
 360			raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")
 361
 362	if D48 is None:
 363		if sample in Nominal_D48:
 364			D48 = Nominal_D48[sample]
 365		else:
 366			raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")
 367
 368	X = D4xdata()
 369	X.R13_VPDB = R13_VPDB
 370	X.R17_VSMOW = R17_VSMOW
 371	X.R18_VSMOW = R18_VSMOW
 372	X.LAMBDA_17 = LAMBDA_17
 373	X.R18_VPDB = R18_VPDB
 374	X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17
 375
 376	R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
 377		R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000),
 378		R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
 379		)
 380	R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
 381		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
 382		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
 383		D17O=D17O, D47=D47, D48=D48, D49=D49,
 384		)
 385	R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
 386		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
 387		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
 388		D17O=D17O,
 389		)
 390	
 391	d45 = 1000 * (R45/R45wg - 1)
 392	d46 = 1000 * (R46/R46wg - 1)
 393	d47 = 1000 * (R47/R47wg - 1)
 394	d48 = 1000 * (R48/R48wg - 1)
 395	d49 = 1000 * (R49/R49wg - 1)
 396
 397	for k in range(3): # dumb iteration to adjust for small changes in d47
 398		R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
 399		R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch	
 400		d47 = 1000 * (R47raw/R47wg - 1)
 401		d48 = 1000 * (R48raw/R48wg - 1)
 402
 403	return dict(
 404		Sample = sample,
 405		D17O = D17O,
 406		d13Cwg_VPDB = d13Cwg_VPDB,
 407		d18Owg_VSMOW = d18Owg_VSMOW,
 408		d45 = d45,
 409		d46 = d46,
 410		d47 = d47,
 411		d48 = d48,
 412		d49 = d49,
 413		)
 414
 415
 416def virtual_data(
 417	samples = [],
 418	a47 = 1., b47 = 0., c47 = -0.9,
 419	a48 = 1., b48 = 0., c48 = -0.45,
 420	rd45 = 0.020, rd46 = 0.060,
 421	rD47 = 0.015, rD48 = 0.045,
 422	d13Cwg_VPDB = None, d18Owg_VSMOW = None,
 423	session = None,
 424	Nominal_D47 = None, Nominal_D48 = None,
 425	Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
 426	ALPHA_18O_ACID_REACTION = None,
 427	R13_VPDB = None,
 428	R17_VSMOW = None,
 429	R18_VSMOW = None,
 430	LAMBDA_17 = None,
 431	R18_VPDB = None,
 432	seed = 0,
 433	shuffle = True,
 434	):
 435	'''
 436	Return list with simulated analyses from a single session.
 437	
 438	**Parameters**
 439	
 440	+ `samples`: a list of entries; each entry is a dictionary with the following fields:
 441	    * `Sample`: the name of the sample
 442	    * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
 443	    * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
 444	    * `N`: how many analyses to generate for this sample
 445	+ `a47`: scrambling factor for Δ47
 446	+ `b47`: compositional nonlinearity for Δ47
 447	+ `c47`: working gas offset for Δ47
 448	+ `a48`: scrambling factor for Δ48
 449	+ `b48`: compositional nonlinearity for Δ48
 450	+ `c48`: working gas offset for Δ48
 451	+ `rd45`: analytical repeatability of δ45
 452	+ `rd46`: analytical repeatability of δ46
 453	+ `rD47`: analytical repeatability of Δ47
 454	+ `rD48`: analytical repeatability of Δ48
 455	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
 456		(by default equal to the `simulate_single_analysis` default values)
 457	+ `session`: name of the session (no name by default)
 458	+ `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and Δ48 values
 459		if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
 460	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and
 461		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified 
 462		(by default equal to the `simulate_single_analysis` defaults)
 463	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
 464		(by default equal to the `simulate_single_analysis` defaults)
 465	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
 466		correction parameters (by default equal to the `simulate_single_analysis` default)
 467	+ `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
 468	+ `shuffle`: randomly reorder the sequence of analyses
 469	
 470		
 471	Here is an example of using this method to generate an arbitrary combination of
 472	anchors and unknowns for a bunch of sessions:
 473
 474	```py
 475	.. include:: ../../code_examples/virtual_data/example.py
 476	```
 477	
 478	This should output something like:
 479	
 480	```
 481	.. include:: ../../code_examples/virtual_data/output.txt
 482	```
 483	'''
 484	
 485	kwargs = locals().copy()
 486
 487	from numpy import random as nprandom
 488	if seed:
 489		rng = nprandom.default_rng(seed)
 490	else:
 491		rng = nprandom.default_rng()
 492	
 493	N = sum([s['N'] for s in samples])
 494	errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
 495	errors45 *= rd45 / stdev(errors45) # scale errors to rd45
 496	errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
 497	errors46 *= rd46 / stdev(errors46) # scale errors to rd46
 498	errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
 499	errors47 *= rD47 / stdev(errors47) # scale errors to rD47
 500	errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
 501	errors48 *= rD48 / stdev(errors48) # scale errors to rD48
 502	
 503	k = 0
 504	out = []
 505	for s in samples:
 506		kw = {}
 507		kw['sample'] = s['Sample']
 508		kw = {
 509			**kw,
 510			**{var: kwargs[var]
 511				for var in [
 512					'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
 513					'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
 514					'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
 515					'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
 516					]
 517				if kwargs[var] is not None},
 518			**{var: s[var]
 519				for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
 520				if var in s},
 521			}
 522
 523		sN = s['N']
 524		while sN:
 525			out.append(simulate_single_analysis(**kw))
 526			out[-1]['d45'] += errors45[k]
 527			out[-1]['d46'] += errors46[k]
 528			out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
 529			out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
 530			sN -= 1
 531			k += 1
 532
 533		if session is not None:
 534			for r in out:
 535				r['Session'] = session
 536
 537		if shuffle:
 538			nprandom.shuffle(out)
 539
 540	return out
 541
 542def table_of_samples(
 543	data47 = None,
 544	data48 = None,
 545	dir = 'output',
 546	filename = None,
 547	save_to_file = True,
 548	print_out = True,
 549	output = None,
 550	):
 551	'''
 552	Print out, save to disk and/or return a combined table of samples
 553	for a pair of `D47data` and `D48data` objects.
 554
 555	**Parameters**
 556
 557	+ `data47`: `D47data` instance
 558	+ `data48`: `D48data` instance
 559	+ `dir`: the directory in which to save the table
 560	+ `filename`: the name to the csv file to write to
 561	+ `save_to_file`: whether to save the table to disk
 562	+ `print_out`: whether to print out the table
 563	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
 564		if set to `'raw'`: return a list of list of strings
 565		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
 566	'''
 567	if data47 is None:
 568		if data48 is None:
 569			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
 570		else:
 571			return data48.table_of_samples(
 572				dir = dir,
 573				filename = filename,
 574				save_to_file = save_to_file,
 575				print_out = print_out,
 576				output = output
 577				)
 578	else:
 579		if data48 is None:
 580			return data47.table_of_samples(
 581				dir = dir,
 582				filename = filename,
 583				save_to_file = save_to_file,
 584				print_out = print_out,
 585				output = output
 586				)
 587		else:
 588			out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
 589			out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
 590			out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:])
 591
 592			if save_to_file:
 593				if not os.path.exists(dir):
 594					os.makedirs(dir)
 595				if filename is None:
 596					filename = f'D47D48_samples.csv'
 597				with open(f'{dir}/{filename}', 'w') as fid:
 598					fid.write(make_csv(out))
 599			if print_out:
 600				print('\n'+pretty_table(out))
 601			if output == 'raw':
 602				return out
 603			elif output == 'pretty':
 604				return pretty_table(out)
 605
 606
 607def table_of_sessions(
 608	data47 = None,
 609	data48 = None,
 610	dir = 'output',
 611	filename = None,
 612	save_to_file = True,
 613	print_out = True,
 614	output = None,
 615	):
 616	'''
 617	Print out, save to disk and/or return a combined table of sessions
 618	for a pair of `D47data` and `D48data` objects.
 619	***Only applicable if the sessions in `data47` and those in `data48`
 620	consist of the exact same sets of analyses.***
 621
 622	**Parameters**
 623
 624	+ `data47`: `D47data` instance
 625	+ `data48`: `D48data` instance
 626	+ `dir`: the directory in which to save the table
 627	+ `filename`: the name to the csv file to write to
 628	+ `save_to_file`: whether to save the table to disk
 629	+ `print_out`: whether to print out the table
 630	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
 631		if set to `'raw'`: return a list of list of strings
 632		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
 633	'''
 634	if data47 is None:
 635		if data48 is None:
 636			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
 637		else:
 638			return data48.table_of_sessions(
 639				dir = dir,
 640				filename = filename,
 641				save_to_file = save_to_file,
 642				print_out = print_out,
 643				output = output
 644				)
 645	else:
 646		if data48 is None:
 647			return data47.table_of_sessions(
 648				dir = dir,
 649				filename = filename,
 650				save_to_file = save_to_file,
 651				print_out = print_out,
 652				output = output
 653				)
 654		else:
 655			out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
 656			out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
 657			for k,x in enumerate(out47[0]):
 658				if k>7:
 659	out47[0][k] = out47[0][k].replace('a2', 'a2_47').replace('b2', 'b2_47').replace('c2', 'c2_47').replace('a ', 'a_47 ').replace('b ', 'b_47 ').replace('c ', 'c_47 ')
 660	out48[0][k] = out48[0][k].replace('a2', 'a2_48').replace('b2', 'b2_48').replace('c2', 'c2_48').replace('a ', 'a_48 ').replace('b ', 'b_48 ').replace('c ', 'c_48 ')
 661			out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])
 662
 663			if save_to_file:
 664				if not os.path.exists(dir):
 665					os.makedirs(dir)
 666				if filename is None:
 667	filename = 'D47D48_sessions.csv'
 668				with open(f'{dir}/{filename}', 'w') as fid:
 669					fid.write(make_csv(out))
 670			if print_out:
 671				print('\n'+pretty_table(out))
 672			if output == 'raw':
 673				return out
 674			elif output == 'pretty':
 675				return pretty_table(out)
 676
 677
 678def table_of_analyses(
 679	data47 = None,
 680	data48 = None,
 681	dir = 'output',
 682	filename = None,
 683	save_to_file = True,
 684	print_out = True,
 685	output = None,
 686	):
 687	'''
 688	Print out, save to disk and/or return a combined table of analyses
 689	for a pair of `D47data` and `D48data` objects.
 690
 691	If the sessions in `data47` and those in `data48` do not consist of
 692	the exact same sets of analyses, the table will have two columns
 693	`Session_47` and `Session_48` instead of a single `Session` column.
 694
 695	**Parameters**
 696
 697	+ `data47`: `D47data` instance
 698	+ `data48`: `D48data` instance
 699	+ `dir`: the directory in which to save the table
 700	+ `filename`: the name of the csv file to write to
 701	+ `save_to_file`: whether to save the table to disk
 702	+ `print_out`: whether to print out the table
 703	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
 704	if set to `'raw'`: return a list of lists of strings
 705		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
 706	'''
 707	if data47 is None:
 708		if data48 is None:
 709			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
 710		else:
 711			return data48.table_of_analyses(
 712				dir = dir,
 713				filename = filename,
 714				save_to_file = save_to_file,
 715				print_out = print_out,
 716				output = output
 717				)
 718	else:
 719		if data48 is None:
 720			return data47.table_of_analyses(
 721				dir = dir,
 722				filename = filename,
 723				save_to_file = save_to_file,
 724				print_out = print_out,
 725				output = output
 726				)
 727		else:
 728			out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
 729			out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
 730			
 731			if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
 732				out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
 733			else:
 734				out47[0][1] = 'Session_47'
 735				out48[0][1] = 'Session_48'
 736				out47 = transpose_table(out47)
 737				out48 = transpose_table(out48)
 738				out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])
 739
 740			if save_to_file:
 741				if not os.path.exists(dir):
 742					os.makedirs(dir)
 743				if filename is None:
 744	filename = 'D47D48_analyses.csv'
 745				with open(f'{dir}/{filename}', 'w') as fid:
 746					fid.write(make_csv(out))
 747			if print_out:
 748				print('\n'+pretty_table(out))
 749			if output == 'raw':
 750				return out
 751			elif output == 'pretty':
 752				return pretty_table(out)
 753
 754
 755def _fullcovar(minresult, epsilon = 0.01, named = False):
 756	'''
 757	Construct full covariance matrix in the case of constrained parameters
 758	'''
 759	
 760	import asteval
 761	
 762	def f(values):
 763		interp = asteval.Interpreter()
 764		for n,v in zip(minresult.var_names, values):
 765			interp(f'{n} = {v}')
 766		for q in minresult.params:
 767			if minresult.params[q].expr:
 768				interp(f'{q} = {minresult.params[q].expr}')
 769		return np.array([interp.symtable[q] for q in minresult.params])
 770
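	# First-order error propagation: writing the full parameter vector as P = f(X),
	# where X holds only the freely varied parameters, covar(P) ≈ Jᵀ · covar(X) · J,
	# with J[i,j] = ∂P[j]/∂X[i]. J is estimated below by central differences,
	# stepping each varied parameter by ± epsilon times its standard error.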
 771	# construct Jacobian
 772	J = np.zeros((minresult.nvarys, len(minresult.params)))
 773	X = np.array([minresult.params[p].value for p in minresult.var_names])
 774	sX = np.array([minresult.params[p].stderr for p in minresult.var_names])
 775
 776	for j in range(minresult.nvarys):
 777		x1 = [_ for _ in X]
 778		x1[j] += epsilon * sX[j]
 779		x2 = [_ for _ in X]
 780		x2[j] -= epsilon * sX[j]
 781		J[j,:] = (f(x1) - f(x2)) / (2 * epsilon * sX[j])
 782
 783	_names = [q for q in minresult.params]
 784	_covar = J.T @ minresult.covar @ J
 785	_se = np.diag(_covar)**.5
 786	_correl = _covar.copy()
 787	for k,s in enumerate(_se):
 788		if s:
 789			_correl[k,:] /= s
 790			_correl[:,k] /= s
 791
 792	if named:
 793	_covar = {i: {j: _covar[_names.index(i), _names.index(j)] for j in _names} for i in _names}
 794	_se = {i: _se[_names.index(i)] for i in _names}
 795	_correl = {i: {j: _correl[_names.index(i), _names.index(j)] for j in _names} for i in _names}
 796
 797	return _names, _covar, _se, _correl
 798
 799
 800class D4xdata(list):
 801	'''
 802	Store and process data for a large set of Δ47 and/or Δ48
 803	analyses, usually comprising more than one analytical session.
 804	'''
 805
 806	### 17O CORRECTION PARAMETERS
 807	R13_VPDB = 0.01118  # (Chang & Li, 1990)
 808	'''
 809	Absolute (13C/12C) ratio of VPDB.
 810	By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
 811	'''
 812
 813	R18_VSMOW = 0.0020052  # (Baertschi, 1976)
 814	'''
 815	Absolute (18O/16O) ratio of VSMOW.
 816	By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
 817	'''
 818
 819	LAMBDA_17 = 0.528  # (Barkan & Luz, 2005)
 820	'''
 821	Mass-dependent exponent for triple oxygen isotopes.
 822	By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
 823	'''
 824
 825	R17_VSMOW = 0.00038475  # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
 826	'''
 827	Absolute (17O/16O) ratio of VSMOW.
 828	By default equal to 0.00038475
 829	([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
 830	rescaled to `R13_VPDB`)
 831	'''
 832
 833	R18_VPDB = R18_VSMOW * 1.03092
 834	'''
 835	Absolute (18O/16O) ratio of VPDB.
 836	By definition equal to `R18_VSMOW * 1.03092`.
 837	'''
 838
 839	R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
 840	'''
 841	Absolute (17O/16O) ratio of VPDB.
 842	By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
 843	'''
 844
 845	LEVENE_REF_SAMPLE = 'ETH-3'
 846	'''
 847	After the Δ4x standardization step, each sample is tested to
 848	assess whether the Δ4x variance within all analyses for that
 849	sample differs significantly from that observed for a given reference
 850	sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
 851	which yields a p-value corresponding to the null hypothesis that the
 852	underlying variances are equal).
 853
 854	`LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
 855	sample should be used as a reference for this test.
 856	'''
 857
 858	ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6)  # (Kim et al., 2007, calcite)
 859	'''
 860	Specifies the 18O/16O fractionation factor generally applicable
 861	to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
 862	`D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`.
 863
 864	By default equal to 1.008129 (calcite reacted at 90 °C,
 865	[Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
 866	'''
 867
 868	Nominal_d13C_VPDB = {
 869		'ETH-1': 2.02,
 870		'ETH-2': -10.17,
 871		'ETH-3': 1.71,
 872		}	# (Bernasconi et al., 2018)
 873	'''
 874	Nominal δ13C_VPDB values assigned to carbonate standards, used by
 875	`D4xdata.standardize_d13C()`.
 876
 877	By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
 878	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 879	'''
 880
 881	Nominal_d18O_VPDB = {
 882		'ETH-1': -2.19,
 883		'ETH-2': -18.69,
 884		'ETH-3': -1.78,
 885		}	# (Bernasconi et al., 2018)
 886	'''
 887	Nominal δ18O_VPDB values assigned to carbonate standards, used by
 888	`D4xdata.standardize_d18O()`.
 889
 890	By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
 891	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 892	'''
 893
 894	d13C_STANDARDIZATION_METHOD = '2pt'
 895	'''
 896	Method by which to standardize δ13C values:
 897	
 898	+ `'none'`: do not apply any δ13C standardization.
 899	+ `'1pt'`: within each session, offset all initial δ13C values so as to
 900	minimize the difference between final δ13C_VPDB values and
 901	`Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
 902	+ `'2pt'`: within each session, apply an affine transformation to all δ13C
 903	values so as to minimize the difference between final δ13C_VPDB
 904	values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
 905	is defined).
 906	'''
 907
 908	d18O_STANDARDIZATION_METHOD = '2pt'
 909	'''
 910	Method by which to standardize δ18O values:
 911	
 912	+ `'none'`: do not apply any δ18O standardization.
 913	+ `'1pt'`: within each session, offset all initial δ18O values so as to
 914	minimize the difference between final δ18O_VPDB values and
 915	`Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
 916	+ `'2pt'`: within each session, apply an affine transformation to all δ18O
 917	values so as to minimize the difference between final δ18O_VPDB
 918	values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
 919	is defined).
 920	'''
 921
 922	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
 923		'''
 924		**Parameters**
 925
 926		+ `l`: a list of dictionaries, with each dictionary including at least the keys
 927		`Sample`, `d45`, `d46`, and `d47` or `d48`.
 928		+ `mass`: `'47'` or `'48'`
 929		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
 930		+ `session`: define session name for analyses without a `Session` key
 931		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
 932
 933		Returns a `D4xdata` object derived from `list`.
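		A minimal, illustrative example (the numbers are arbitrary; any extra keys
		are simply carried along):

		```py
		mydata = D4xdata([
			{'Sample': 'ETH-1', 'd45': 5.795, 'd46': 11.628, 'd47': 16.894},
			{'Sample': 'FOO-1', 'd45': 6.219, 'd46': 11.491, 'd47': 17.277},
			], mass = '47', session = 'Session01')
		```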
 934		'''
 935		self._4x = mass
 936		self.verbose = verbose
 937		self.prefix = 'D4xdata'
 938		self.logfile = logfile
 939		list.__init__(self, l)
 940		self.Nf = None
 941		self.repeatability = {}
 942		self.refresh(session = session)
 943
 944
 945	def make_verbal(oldfun):
 946		'''
 947		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
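		Any method decorated with `make_verbal` accepts an extra `verbose` keyword
		argument that overrides `self.verbose` for that call only, e.g. (illustrative):

		```py
		mydata.crunch(verbose = True)  # print detailed logs for this call only
		```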
 948		'''
 949		@wraps(oldfun)
 950		def newfun(*args, verbose = '', **kwargs):
 951			myself = args[0]
 952			oldprefix = myself.prefix
 953			myself.prefix = oldfun.__name__
 954			if verbose != '':
 955				oldverbose = myself.verbose
 956				myself.verbose = verbose
 957			out = oldfun(*args, **kwargs)
 958			myself.prefix = oldprefix
 959			if verbose != '':
 960				myself.verbose = oldverbose
 961			return out
 962		return newfun
 963
 964
 965	def msg(self, txt):
 966		'''
 967		Log a message to `self.logfile`, and print it out if `verbose = True`
 968		'''
 969		self.log(txt)
 970		if self.verbose:
 971			print(f'{f"[{self.prefix}]":<16} {txt}')
 972
 973
 974	def vmsg(self, txt):
 975		'''
 976		Log a message to `self.logfile` and print it out
 977		'''
 978		self.log(txt)
 979		print(txt)
 980
 981
 982	def log(self, *txts):
 983		'''
 984		Log a message to `self.logfile`
 985		'''
 986		if self.logfile:
 987			with open(self.logfile, 'a') as fid:
 988				for txt in txts:
 989					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
 990
 991
 992	def refresh(self, session = 'mySession'):
 993		'''
 994		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
 995		'''
 996		self.fill_in_missing_info(session = session)
 997		self.refresh_sessions()
 998		self.refresh_samples()
 999
1000
1001	def refresh_sessions(self):
1002		'''
1003		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1004		to `False` for all sessions.
1005		'''
1006		self.sessions = {
1007			s: {'data': [r for r in self if r['Session'] == s]}
1008			for s in sorted({r['Session'] for r in self})
1009			}
1010		for s in self.sessions:
1011			self.sessions[s]['scrambling_drift'] = False
1012			self.sessions[s]['slope_drift'] = False
1013			self.sessions[s]['wg_drift'] = False
1014			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1015			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
1016
1017
1018	def refresh_samples(self):
1019		'''
1020		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1021		'''
1022		self.samples = {
1023			s: {'data': [r for r in self if r['Sample'] == s]}
1024			for s in sorted({r['Sample'] for r in self})
1025			}
1026		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1027		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
1028
1029
1030	def read(self, filename, sep = '', session = ''):
1031		'''
1032		Read file in csv format to load data into a `D47data` object.
1033
1034		In the csv file, spaces before and after field separators (`','` by default)
1035		are optional. Each line corresponds to a single analysis.
1036
1037		The required fields are:
1038
1039		+ `UID`: a unique identifier
1040		+ `Session`: an identifier for the analytical session
1041		+ `Sample`: a sample identifier
1042		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1043
1044		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1045	VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Any of the working-gas
1046	deltas `d47`, `d48` and `d49` that are not provided are set to NaN by default.
1047
1048		**Parameters**
1049
1050	+ `filename`: the path of the file to read
1051		+ `sep`: csv separator delimiting the fields
1052		+ `session`: set `Session` field to this string for all analyses
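		For instance (an illustrative sketch):

		```py
		mydata = D47data()
		mydata.read('rawdata.csv', sep = ';', session = 'Session01')
		```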
1053		'''
1054		with open(filename) as fid:
1055			self.input(fid.read(), sep = sep, session = session)
1056
1057
1058	def input(self, txt, sep = '', session = ''):
1059		'''
1060		Read `txt` string in csv format to load analysis data into a `D47data` object.
1061
1062		In the csv string, spaces before and after field separators (`','` by default)
1063		are optional. Each line corresponds to a single analysis.
1064
1065		The required fields are:
1066
1067		+ `UID`: a unique identifier
1068		+ `Session`: an identifier for the analytical session
1069		+ `Sample`: a sample identifier
1070		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1071
1072		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1073	VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Any of the working-gas
1074	deltas `d47`, `d48` and `d49` that are not provided are set to NaN by default.
1075
1076		**Parameters**
1077
1078		+ `txt`: the csv string to read
1079		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1080	whichever appears most often in `txt`.
1081		+ `session`: set `Session` field to this string for all analyses
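		For instance (an illustrative sketch):

		```py
		mydata = D47data()
		mydata.input('UID,Sample,d45,d46,d47\nA01,ETH-1,5.795,11.628,16.894', session = 'Session01')
		```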
1082		'''
1083		if sep == '':
1084			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1085		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1086		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1087
1088		if session != '':
1089			for r in data:
1090				r['Session'] = session
1091
1092		self += data
1093		self.refresh()
1094
1095
1096	@make_verbal
1097	def wg(self, samples = None, a18_acid = None):
1098		'''
1099		Compute bulk composition of the working gas for each session based on
1100		the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1101		`self.Nominal_d18O_VPDB`.
1102		'''
1103
1104		self.msg('Computing WG composition:')
1105
1106		if a18_acid is None:
1107			a18_acid = self.ALPHA_18O_ACID_REACTION
1108		if samples is None:
1109			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1110
1111		assert a18_acid, f'Acid fractionation factor should not be zero.'
1112
1113		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1114		R45R46_standards = {}
1115		for sample in samples:
1116			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1117			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1118			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1119			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1120			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1121
1122			C12_s = 1 / (1 + R13_s)
1123			C13_s = R13_s / (1 + R13_s)
1124			C16_s = 1 / (1 + R17_s + R18_s)
1125			C17_s = R17_s / (1 + R17_s + R18_s)
1126			C18_s = R18_s / (1 + R17_s + R18_s)
1127
1128			C626_s = C12_s * C16_s ** 2
1129			C627_s = 2 * C12_s * C16_s * C17_s
1130			C628_s = 2 * C12_s * C16_s * C18_s
1131			C636_s = C13_s * C16_s ** 2
1132			C637_s = 2 * C13_s * C16_s * C17_s
1133			C727_s = C12_s * C17_s ** 2
1134
1135			R45_s = (C627_s + C636_s) / C626_s
1136			R46_s = (C628_s + C637_s + C727_s) / C626_s
1137			R45R46_standards[sample] = (R45_s, R46_s)
1138		
1139		for s in self.sessions:
1140			db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
1141			assert db, f'No sample from {samples} found in session "{s}".'
1142# 			dbsamples = sorted({r['Sample'] for r in db})
1143
1144			X = [r['d45'] for r in db]
1145			Y = [R45R46_standards[r['Sample']][0] for r in db]
1146			x1, x2 = np.min(X), np.max(X)
1147
1148			if x1 < x2:
1149				wgcoord = x1/(x1-x2)
1150			else:
1151				wgcoord = 999
1152
1153			if wgcoord < -.5 or wgcoord > 1.5:
1154				# unreasonable to extrapolate to d45 = 0
1155				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1156			else :
1157				# d45 = 0 is reasonably well bracketed
1158				R45_wg = np.polyfit(X, Y, 1)[1]
1159
1160			X = [r['d46'] for r in db]
1161			Y = [R45R46_standards[r['Sample']][1] for r in db]
1162			x1, x2 = np.min(X), np.max(X)
1163
1164			if x1 < x2:
1165				wgcoord = x1/(x1-x2)
1166			else:
1167				wgcoord = 999
1168
1169			if wgcoord < -.5 or wgcoord > 1.5:
1170				# unreasonable to extrapolate to d46 = 0
1171				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1172			else :
1173				# d46 = 0 is reasonably well bracketed
1174				R46_wg = np.polyfit(X, Y, 1)[1]
1175
1176			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1177
1178			self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1179
1180			self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1181			self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1182			for r in self.sessions[s]['data']:
1183				r['d13Cwg_VPDB'] = d13Cwg_VPDB
1184				r['d18Owg_VSMOW'] = d18Owg_VSMOW
1185
1186
1187	def compute_bulk_delta(self, R45, R46, D17O = 0):
1188		'''
1189		Compute δ13C_VPDB and δ18O_VSMOW,
1190		by solving the generalized form of equation (17) from
1191		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1192		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
1193		solving the corresponding second-order Taylor polynomial.
1194		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
1195		'''
1196
1197		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1198
1199		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1200		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1201		C = 2 * self.R18_VSMOW
1202		D = -R46
1203
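		# With x = d18O_VSMOW / 1000, the second-order Taylor expansion reduces to
		# the quadratic aa·x² + bb·x + cc = 0, solved below for its physical root.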
1204		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1205		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1206		cc = A + B + C + D
1207
1208		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1209
1210		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1211		R17 = K * R18 ** self.LAMBDA_17
1212		R13 = R45 - 2 * R17
1213
1214		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1215
1216		return d13C_VPDB, d18O_VSMOW
1217
1218
1219	@make_verbal
1220	def crunch(self, verbose = ''):
1221		'''
1222		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1223		'''
1224		for r in self:
1225			self.compute_bulk_and_clumping_deltas(r)
1226		self.standardize_d13C()
1227		self.standardize_d18O()
1228		self.msg(f"Crunched {len(self)} analyses.")
1229
1230
1231	def fill_in_missing_info(self, session = 'mySession'):
1232		'''
1233		Fill in optional fields with default values
1234		'''
1235		for i,r in enumerate(self):
1236			if 'D17O' not in r:
1237				r['D17O'] = 0.
1238			if 'UID' not in r:
1239				r['UID'] = f'{i+1}'
1240			if 'Session' not in r:
1241				r['Session'] = session
1242			for k in ['d47', 'd48', 'd49']:
1243				if k not in r:
1244					r[k] = np.nan
1245
1246
1247	def standardize_d13C(self):
1248		'''
1249	Perform δ13C standardization within each session `s` according to
1250	`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1251	by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1252	may be redefined arbitrarily at a later stage.
1253		'''
1254		for s in self.sessions:
1255			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1256				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1257				X,Y = zip(*XY)
1258				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1259					offset = np.mean(Y) - np.mean(X)
1260					for r in self.sessions[s]['data']:
1261						r['d13C_VPDB'] += offset				
1262				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1263					a,b = np.polyfit(X,Y,1)
1264					for r in self.sessions[s]['data']:
1265						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b
1266
1267	def standardize_d18O(self):
1268		'''
1269	Perform δ18O standardization within each session `s` according to
1270	`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1271	which is defined by default by `D47data.refresh_sessions()` as equal to
1272	`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1273		'''
1274		for s in self.sessions:
1275			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1276				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1277				X,Y = zip(*XY)
1278				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1279				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1280					offset = np.mean(Y) - np.mean(X)
1281					for r in self.sessions[s]['data']:
1282						r['d18O_VSMOW'] += offset				
1283				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1284					a,b = np.polyfit(X,Y,1)
1285					for r in self.sessions[s]['data']:
1286						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
1287	
1288
1289	def compute_bulk_and_clumping_deltas(self, r):
1290		'''
1291		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1292		'''
1293
1294		# Compute working gas R13, R18, and isobar ratios
1295		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1296		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1297		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1298
1299		# Compute analyte isobar ratios
1300		R45 = (1 + r['d45'] / 1000) * R45_wg
1301		R46 = (1 + r['d46'] / 1000) * R46_wg
1302		R47 = (1 + r['d47'] / 1000) * R47_wg
1303		R48 = (1 + r['d48'] / 1000) * R48_wg
1304		R49 = (1 + r['d49'] / 1000) * R49_wg
1305
1306		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1307		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1308		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1309
1310		# Compute stochastic isobar ratios of the analyte
1311		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1312			R13, R18, D17O = r['D17O']
1313		)
1314
1315	# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1316	# and raise a warning if the corresponding anomalies exceed 0.05 ppm.
1317		if (R45 / R45stoch - 1) > 5e-8:
1318			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1319		if (R46 / R46stoch - 1) > 5e-8:
1320			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1321
1322		# Compute raw clumped isotope anomalies
1323		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1324		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1325		r['D49raw'] = 1000 * (R49 / R49stoch - 1)
1326
1327
1328	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1329		'''
1330		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1331		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1332		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1333		'''
1334
1335		# Compute R17
1336		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1337
1338		# Compute isotope concentrations
1339		C12 = (1 + R13) ** -1
1340		C13 = C12 * R13
1341		C16 = (1 + R17 + R18) ** -1
1342		C17 = C16 * R17
1343		C18 = C16 * R18
1344
1345		# Compute stochastic isotopologue concentrations
1346		C626 = C16 * C12 * C16
1347		C627 = C16 * C12 * C17 * 2
1348		C628 = C16 * C12 * C18 * 2
1349		C636 = C16 * C13 * C16
1350		C637 = C16 * C13 * C17 * 2
1351		C638 = C16 * C13 * C18 * 2
1352		C727 = C17 * C12 * C17
1353		C728 = C17 * C12 * C18 * 2
1354		C737 = C17 * C13 * C17
1355		C738 = C17 * C13 * C18 * 2
1356		C828 = C18 * C12 * C18
1357		C838 = C18 * C13 * C18
1358
1359		# Compute stochastic isobar ratios
1360		R45 = (C636 + C627) / C626
1361		R46 = (C628 + C637 + C727) / C626
1362		R47 = (C638 + C728 + C737) / C626
1363		R48 = (C738 + C828) / C626
1364		R49 = C838 / C626
1365
1366	# Account for clumped-isotope anomalies relative to the stochastic distribution
1367		R47 *= 1 + D47 / 1000
1368		R48 *= 1 + D48 / 1000
1369		R49 *= 1 + D49 / 1000
1370
1371		# Return isobar ratios
1372		return R45, R46, R47, R48, R49
1373
1374
1375	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1376		'''
1377		Split unknown samples by UID (treat all analyses as different samples)
1378		or by session (treat analyses of a given sample in different sessions as
1379		different samples).
1380
1381		**Parameters**
1382
1383		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
1384		+ `grouping`: `by_uid` | `by_session`
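		A typical sequence, sketched here for illustration:

		```py
		mydata.split_samples(['IAEA-C1', 'IAEA-C2'], grouping = 'by_session')
		mydata.standardize(method = 'pooled')
		mydata.unsplit_samples()
		```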
1385		'''
1386		if samples_to_split == 'all':
1387			samples_to_split = [s for s in self.unknowns]
1388		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1389		self.grouping = grouping.lower()
1390		if self.grouping in gkeys:
1391			gkey = gkeys[self.grouping]
1392		for r in self:
1393			if r['Sample'] in samples_to_split:
1394				r['Sample_original'] = r['Sample']
1395				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1396			elif r['Sample'] in self.unknowns:
1397				r['Sample_original'] = r['Sample']
1398		self.refresh_samples()
1399
1400
1401	def unsplit_samples(self, tables = False):
1402		'''
1403		Reverse the effects of `D47data.split_samples()`.
1404		
1405		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1406		
1407		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1408		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1409		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1410	effects of `D47data.split_samples()` with `grouping='by_session'` (because in
1411		that case session-averaged Δ4x values are statistically independent).
1412		'''
1413		unknowns_old = sorted({s for s in self.unknowns})
1414		CM_old = self.standardization.covar[:,:]
1415		VD_old = self.standardization.params.valuesdict().copy()
1416		vars_old = self.standardization.var_names
1417
1418		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1419
1420		Ns = len(vars_old) - len(unknowns_old)
1421		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1422		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1423
1424		W = np.zeros((len(vars_new), len(vars_old)))
1425		W[:Ns,:Ns] = np.eye(Ns)
1426		for u in unknowns_new:
1427			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1428			if self.grouping == 'by_session':
1429				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1430			elif self.grouping == 'by_uid':
1431				weights = [1 for s in splits]
1432			sw = sum(weights)
1433			weights = [w/sw for w in weights]
1434			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1435
1436		CM_new = W @ CM_old @ W.T
1437		V = W @ np.array([[VD_old[k]] for k in vars_old])
1438		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1439
1440		self.standardization.covar = CM_new
1441		self.standardization.params.valuesdict = lambda : VD_new
1442		self.standardization.var_names = vars_new
1443
1444		for r in self:
1445			if r['Sample'] in self.unknowns:
1446				r['Sample_split'] = r['Sample']
1447				r['Sample'] = r['Sample_original']
1448
1449		self.refresh_samples()
1450		self.consolidate_samples()
1451		self.repeatabilities()
1452
1453		if tables:
1454			self.table_of_analyses()
1455			self.table_of_samples()
1456
1457	def assign_timestamps(self):
1458		'''
1459		Assign a time field `t` of type `float` to each analysis.
1460
1461		If `TimeTag` is one of the data fields, `t` is equal within a given session
1462		to `TimeTag` minus the mean value of `TimeTag` for that session.
1463	Otherwise, `TimeTag` is by default equal to the index of each analysis
1464	within its session, and `t` is defined as above.
1465		'''
1466		for session in self.sessions:
1467			sdata = self.sessions[session]['data']
1468			try:
1469				t0 = np.mean([r['TimeTag'] for r in sdata])
1470				for r in sdata:
1471					r['t'] = r['TimeTag'] - t0
1472			except KeyError:
1473				t0 = (len(sdata)-1)/2
1474				for t,r in enumerate(sdata):
1475					r['t'] = t - t0
1476
1477
1478	def report(self):
1479		'''
1480		Prints a report on the standardization fit.
1481		Only applicable after `D4xdata.standardize(method='pooled')`.
1482		'''
1483		report_fit(self.standardization)
1484
1485
1486	def combine_samples(self, sample_groups):
1487		'''
1488		Combine analyses of different samples to compute weighted average Δ4x
1489		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1490		dictionary.
1491		
1492		Caution: samples are weighted by number of replicate analyses, which is a
1493		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1494		correlated analytical errors for one or more samples).
1495		
1496	Returns a tuple of:
1497		
1498		+ the list of group names
1499		+ an array of the corresponding Δ4x values
1500		+ the corresponding (co)variance matrix
1501		
1502		**Parameters**
1503
1504		+ `sample_groups`: a dictionary of the form:
1505		```py
1506		{'group1': ['sample_1', 'sample_2'],
1507		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1508		```
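		For instance (an illustrative sketch):

		```py
		groups, D4x_avg, CM = mydata.combine_samples({
			'group1': ['sample_1', 'sample_2'],
			'group2': ['sample_3', 'sample_4', 'sample_5'],
			})
		```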
1509		'''
1510		
1511		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1512		groups = sorted(sample_groups.keys())
1513		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1514		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1515		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1516		W = np.array([
1517			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1518			for j in groups])
1519		D4x_new = W @ D4x_old
1520		CM_new = W @ CM_old @ W.T
1521
1522		return groups, D4x_new[:,0], CM_new
1523		
1524
1525	@make_verbal
1526	def standardize(self,
1527		method = 'pooled',
1528		weighted_sessions = [],
1529		consolidate = True,
1530		consolidate_tables = False,
1531		consolidate_plots = False,
1532		constraints = {},
1533		):
1534		'''
1535		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1536	If the `method` argument is set to `'pooled'`, the standardization processes all sessions
1537	in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1538	i.e. that their true Δ4x value does not change between sessions
1539	([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If the `method` argument is set to
1540	`'indep_sessions'`, the standardization processes each session independently, based only
1541	on anchor analyses.
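		For instance, to standardize each session independently (illustrative):

		```py
		mydata.standardize(method = 'indep_sessions', consolidate_tables = True)
		```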
1542		'''
1543
1544		self.standardization_method = method
1545		self.assign_timestamps()
1546
1547		if method == 'pooled':
1548			if weighted_sessions:
1549				for session_group in weighted_sessions:
1550					if self._4x == '47':
1551						X = D47data([r for r in self if r['Session'] in session_group])
1552					elif self._4x == '48':
1553						X = D48data([r for r in self if r['Session'] in session_group])
1554					X.Nominal_D4x = self.Nominal_D4x.copy()
1555					X.refresh()
1556					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1557					w = np.sqrt(result.redchi)
1558	self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
1559					for r in X:
1560						r[f'wD{self._4x}raw'] *= w
1561			else:
1562				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1563				for r in self:
1564					r[f'wD{self._4x}raw'] = 1.
1565
1566			params = Parameters()
1567			for k,session in enumerate(self.sessions):
1568				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1569				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1570				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1571				s = pf(session)
1572				params.add(f'a_{s}', value = 0.9)
1573				params.add(f'b_{s}', value = 0.)
1574				params.add(f'c_{s}', value = -0.9)
1575				params.add(f'a2_{s}', value = 0.,
1576# 					vary = self.sessions[session]['scrambling_drift'],
1577					)
1578				params.add(f'b2_{s}', value = 0.,
1579# 					vary = self.sessions[session]['slope_drift'],
1580					)
1581				params.add(f'c2_{s}', value = 0.,
1582# 					vary = self.sessions[session]['wg_drift'],
1583					)
1584				if not self.sessions[session]['scrambling_drift']:
1585					params[f'a2_{s}'].expr = '0'
1586				if not self.sessions[session]['slope_drift']:
1587					params[f'b2_{s}'].expr = '0'
1588				if not self.sessions[session]['wg_drift']:
1589					params[f'c2_{s}'].expr = '0'
1590
1591			for sample in self.unknowns:
1592				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1593
1594			for k in constraints:
1595				params[k].expr = constraints[k]
1596
1597			def residuals(p):
1598				R = []
1599				for r in self:
1600					session = pf(r['Session'])
1601					sample = pf(r['Sample'])
1602					if r['Sample'] in self.Nominal_D4x:
1603						R += [ (
1604							r[f'D{self._4x}raw'] - (
1605								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1606								+ p[f'b_{session}'] * r[f'd{self._4x}']
1607								+	p[f'c_{session}']
1608								+ r['t'] * (
1609									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1610									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1611									+	p[f'c2_{session}']
1612									)
1613								)
1614							) / r[f'wD{self._4x}raw'] ]
1615					else:
1616						R += [ (
1617							r[f'D{self._4x}raw'] - (
1618								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1619								+ p[f'b_{session}'] * r[f'd{self._4x}']
1620								+	p[f'c_{session}']
1621								+ r['t'] * (
1622									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1623									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1624									+	p[f'c2_{session}']
1625									)
1626								)
1627							) / r[f'wD{self._4x}raw'] ]
1628				return R
1629
1630			M = Minimizer(residuals, params)
1631			result = M.least_squares()
1632			self.Nf = result.nfree
1633			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1634			new_names, new_covar, new_se = _fullcovar(result)[:3]
1635			result.var_names = new_names
1636			result.covar = new_covar
1637
1638			for r in self:
1639				s = pf(r["Session"])
1640				a = result.params.valuesdict()[f'a_{s}']
1641				b = result.params.valuesdict()[f'b_{s}']
1642				c = result.params.valuesdict()[f'c_{s}']
1643				a2 = result.params.valuesdict()[f'a2_{s}']
1644				b2 = result.params.valuesdict()[f'b2_{s}']
1645				c2 = result.params.valuesdict()[f'c2_{s}']
1646				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1647				
1648
1649			self.standardization = result
1650
1651			for session in self.sessions:
1652				self.sessions[session]['Np'] = 3
1653				for k in ['scrambling', 'slope', 'wg']:
1654					if self.sessions[session][f'{k}_drift']:
1655						self.sessions[session]['Np'] += 1
1656
1657			if consolidate:
1658				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1659			return result
1660
1661
1662		elif method == 'indep_sessions':
1663
1664			if weighted_sessions:
1665				for session_group in weighted_sessions:
1666					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1667					X.Nominal_D4x = self.Nominal_D4x.copy()
1668					X.refresh()
1669					# This is only done to assign r['wD47raw'] for r in X:
1670					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1671					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1672			else:
1673				self.msg('All weights set to 1 ‰')
1674				for r in self:
1675					r[f'wD{self._4x}raw'] = 1
1676
1677			for session in self.sessions:
1678				s = self.sessions[session]
1679				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1680				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1681				s['Np'] = sum(p_active)
1682				sdata = s['data']
1683
1684				A = np.array([
1685					[
1686						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1687						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1688						1 / r[f'wD{self._4x}raw'],
1689						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1690						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1691						r['t'] / r[f'wD{self._4x}raw']
1692						]
1693					for r in sdata if r['Sample'] in self.anchors
1694					])[:,p_active] # only keep columns for the active parameters
1695				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1696				s['Na'] = Y.size
1697				CM = linalg.inv(A.T @ A)
1698				bf = (CM @ A.T @ Y).T[0,:]
1699				k = 0
1700				for n,a in zip(p_names, p_active):
1701					if a:
1702						s[n] = bf[k]
1703# 						self.msg(f'{n} = {bf[k]}')
1704						k += 1
1705					else:
1706						s[n] = 0.
1707# 						self.msg(f'{n} = 0.0')
1708
1709				for r in sdata :
1710					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1711					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1712					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1713
1714				s['CM'] = np.zeros((6,6))
1715				i = 0
1716				k_active = [j for j,a in enumerate(p_active) if a]
1717				for j,a in enumerate(p_active):
1718					if a:
1719						s['CM'][j,k_active] = CM[i,:]
1720						i += 1
1721
1722			if not weighted_sessions:
1723				w = self.rmswd()['rmswd']
1724				for r in self:
1725						r[f'wD{self._4x}'] *= w
1726						r[f'wD{self._4x}raw'] *= w
1727				for session in self.sessions:
1728					self.sessions[session]['CM'] *= w**2
1729
1730			for session in self.sessions:
1731				s = self.sessions[session]
1732				s['SE_a'] = s['CM'][0,0]**.5
1733				s['SE_b'] = s['CM'][1,1]**.5
1734				s['SE_c'] = s['CM'][2,2]**.5
1735				s['SE_a2'] = s['CM'][3,3]**.5
1736				s['SE_b2'] = s['CM'][4,4]**.5
1737				s['SE_c2'] = s['CM'][5,5]**.5
1738
1739			if not weighted_sessions:
1740				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1741			else:
1742				self.Nf = 0
1743				for sg in weighted_sessions:
1744					self.Nf += self.rmswd(sessions = sg)['Nf']
1745
1746			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1747
1748			avgD4x = {
1749				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1750				for sample in self.samples
1751				}
1752			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1753			rD4x = (chi2/self.Nf)**.5
1754			self.repeatability[f'sigma_{self._4x}'] = rD4x
1755
1756			if consolidate:
1757				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1758
1759
1760	def standardization_error(self, session, d4x, D4x, t = 0):
1761		'''
1762		Compute standardization error for a given session and
1763	(δ4x, Δ4x) composition.
1764		'''
1765		a = self.sessions[session]['a']
1766		b = self.sessions[session]['b']
1767		c = self.sessions[session]['c']
1768		a2 = self.sessions[session]['a2']
1769		b2 = self.sessions[session]['b2']
1770		c2 = self.sessions[session]['c2']
1771		CM = self.sessions[session]['CM']
1772
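		# Linear error propagation: sx = sqrt(V · CM · Vᵀ), where V is the gradient
		# of the standardized value x = Δ4x with respect to the session parameters
		# (a, b, c, a2, b2, c2), computed analytically below.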
1773		x, y = D4x, d4x
1774		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1775# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1776		dxdy = -(b+b2*t) / (a+a2*t)
1777		dxdz = 1. / (a+a2*t)
1778		dxda = -x / (a+a2*t)
1779		dxdb = -y / (a+a2*t)
1780		dxdc = -1. / (a+a2*t)
1781	dxda2 = -x * t / (a+a2*t)
1782		dxdb2 = -y * t / (a+a2*t)
1783		dxdc2 = -t / (a+a2*t)
1784		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1785		sx = (V @ CM @ V.T) ** .5
1786		return sx
1787
1788
1789	@make_verbal
1790	def summary(self,
1791		dir = 'output',
1792		filename = None,
1793		save_to_file = True,
1794		print_out = True,
1795		):
1796		'''
1797	Print out and/or save to disk a summary of the standardization results.
1798
1799		**Parameters**
1800
1801		+ `dir`: the directory in which to save the table
1802	+ `filename`: the name of the csv file to write to
1803		+ `save_to_file`: whether to save the table to disk
1804		+ `print_out`: whether to print out the table
1805		'''
1806
1807		out = []
1808		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1809		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1810		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1811		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1812		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1813		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1814		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1815		out += [['Model degrees of freedom', f"{self.Nf}"]]
1816		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1817		out += [['Standardization method', self.standardization_method]]
1818
1819		if save_to_file:
1820			if not os.path.exists(dir):
1821				os.makedirs(dir)
1822			if filename is None:
1823				filename = f'D{self._4x}_summary.csv'
1824			with open(f'{dir}/{filename}', 'w') as fid:
1825				fid.write(make_csv(out))
1826		if print_out:
1827			self.msg('\n' + pretty_table(out, header = 0))
1828
1829
1830	@make_verbal
1831	def table_of_sessions(self,
1832		dir = 'output',
1833		filename = None,
1834		save_to_file = True,
1835		print_out = True,
1836		output = None,
1837		):
1838		'''
1839	Print out and/or save to disk a table of sessions.
1840
1841		**Parameters**
1842
1843		+ `dir`: the directory in which to save the table
1844	+ `filename`: the name of the csv file to write to
1845		+ `save_to_file`: whether to save the table to disk
1846		+ `print_out`: whether to print out the table
1847		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1848	    if set to `'raw'`: return a list of lists of strings
1849		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1850		'''
1851		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1852		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1853		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1854
1855		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1856		if include_a2:
1857			out[-1] += ['a2 ± SE']
1858		if include_b2:
1859			out[-1] += ['b2 ± SE']
1860		if include_c2:
1861			out[-1] += ['c2 ± SE']
1862		for session in self.sessions:
1863			out += [[
1864				session,
1865				f"{self.sessions[session]['Na']}",
1866				f"{self.sessions[session]['Nu']}",
1867				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1868				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1869				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1870				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1871				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1872				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1873				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1874				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1875				]]
1876			if include_a2:
1877				if self.sessions[session]['scrambling_drift']:
1878					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1879				else:
1880					out[-1] += ['']
1881			if include_b2:
1882				if self.sessions[session]['slope_drift']:
1883					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1884				else:
1885					out[-1] += ['']
1886			if include_c2:
1887				if self.sessions[session]['wg_drift']:
1888					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1889				else:
1890					out[-1] += ['']
1891
1892		if save_to_file:
1893			if not os.path.exists(dir):
1894				os.makedirs(dir)
1895			if filename is None:
1896				filename = f'D{self._4x}_sessions.csv'
1897			with open(f'{dir}/{filename}', 'w') as fid:
1898				fid.write(make_csv(out))
1899		if print_out:
1900			self.msg('\n' + pretty_table(out))
1901		if output == 'raw':
1902			return out
1903		elif output == 'pretty':
1904			return pretty_table(out)
1905
1906
1907	@make_verbal
1908	def table_of_analyses(
1909		self,
1910		dir = 'output',
1911		filename = None,
1912		save_to_file = True,
1913		print_out = True,
1914		output = None,
1915		):
1916		'''
1917	Print out and/or save to disk a table of analyses.
1918
1919		**Parameters**
1920
1921		+ `dir`: the directory in which to save the table
1922	+ `filename`: the name of the csv file to write to
1923		+ `save_to_file`: whether to save the table to disk
1924		+ `print_out`: whether to print out the table
1925		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1926	    if set to `'raw'`: return a list of lists of strings
1927		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1928		'''
1929
1930		out = [['UID','Session','Sample']]
1931		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1932		for f in extra_fields:
1933			out[-1] += [f[0]]
1934		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1935		for r in self:
1936			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
1937			for f in extra_fields:
1938				out[-1] += [f"{r[f[0]]:{f[1]}}"]
1939			out[-1] += [
1940				f"{r['d13Cwg_VPDB']:.3f}",
1941				f"{r['d18Owg_VSMOW']:.3f}",
1942				f"{r['d45']:.6f}",
1943				f"{r['d46']:.6f}",
1944				f"{r['d47']:.6f}",
1945				f"{r['d48']:.6f}",
1946				f"{r['d49']:.6f}",
1947				f"{r['d13C_VPDB']:.6f}",
1948				f"{r['d18O_VSMOW']:.6f}",
1949				f"{r['D47raw']:.6f}",
1950				f"{r['D48raw']:.6f}",
1951				f"{r['D49raw']:.6f}",
1952				f"{r[f'D{self._4x}']:.6f}"
1953				]
1954		if save_to_file:
1955			if not os.path.exists(dir):
1956				os.makedirs(dir)
1957			if filename is None:
1958				filename = f'D{self._4x}_analyses.csv'
1959			with open(f'{dir}/{filename}', 'w') as fid:
1960				fid.write(make_csv(out))
1961		if print_out:
1962			self.msg('\n' + pretty_table(out))
1963	return pretty_table(out) if output == 'pretty' else out
1964
1965	@make_verbal
1966	def covar_table(
1967		self,
1968		correl = False,
1969		dir = 'output',
1970		filename = None,
1971		save_to_file = True,
1972		print_out = True,
1973		output = None,
1974		):
1975		'''
1976		Print out, save to disk and/or return the variance-covariance matrix of D4x
1977		for all unknown samples.
1978
1979		**Parameters**
1980
1981		+ `dir`: the directory in which to save the csv
1982		+ `filename`: the name of the csv file to write to
1983		+ `save_to_file`: whether to save the csv
1984		+ `print_out`: whether to print out the matrix
1985		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
1986	    if set to `'raw'`: return a list of lists of strings
1987		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1988		'''
1989		samples = sorted([u for u in self.unknowns])
1990		out = [[''] + samples]
1991		for s1 in samples:
1992			out.append([s1])
1993			for s2 in samples:
1994				if correl:
1995					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
1996				else:
1997					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
1998
1999		if save_to_file:
2000			if not os.path.exists(dir):
2001				os.makedirs(dir)
2002			if filename is None:
2003				if correl:
2004					filename = f'D{self._4x}_correl.csv'
2005				else:
2006					filename = f'D{self._4x}_covar.csv'
2007			with open(f'{dir}/{filename}', 'w') as fid:
2008				fid.write(make_csv(out))
2009		if print_out:
2010			self.msg('\n'+pretty_table(out))
2011		if output == 'raw':
2012			return out
2013		elif output == 'pretty':
2014			return pretty_table(out)
2015
2016	@make_verbal
2017	def table_of_samples(
2018		self,
2019		dir = 'output',
2020		filename = None,
2021		save_to_file = True,
2022		print_out = True,
2023		output = None,
2024		):
2025		'''
2026		Print out, save to disk and/or return a table of samples.
2027
2028		**Parameters**
2029
2030		+ `dir`: the directory in which to save the csv
2031		+ `filename`: the name of the csv file to write to
2032		+ `save_to_file`: whether to save the csv
2033		+ `print_out`: whether to print out the table
2034		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2035	    if set to `'raw'`: return a list of lists of strings
2036		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2037		'''
2038
2039		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2040		for sample in self.anchors:
2041			out += [[
2042				f"{sample}",
2043				f"{self.samples[sample]['N']}",
2044				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2045				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2046				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2047				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2048				]]
2049		for sample in self.unknowns:
2050			out += [[
2051				f"{sample}",
2052				f"{self.samples[sample]['N']}",
2053				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2054				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2055				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2056				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2057	f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2058				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2059				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2060				]]
2061		if save_to_file:
2062			if not os.path.exists(dir):
2063				os.makedirs(dir)
2064			if filename is None:
2065				filename = f'D{self._4x}_samples.csv'
2066			with open(f'{dir}/{filename}', 'w') as fid:
2067				fid.write(make_csv(out))
2068		if print_out:
2069			self.msg('\n'+pretty_table(out))
2070		if output == 'raw':
2071			return out
2072		elif output == 'pretty':
2073			return pretty_table(out)
2074
2075
2076	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2077		'''
2078		Generate session plots and save them to disk.
2079
2080		**Parameters**
2081
2082		+ `dir`: the directory in which to save the plots
2083		+ `figsize`: the width and height (in inches) of each plot
2084		+ `filetype`: 'pdf' or 'png'
2085		+ `dpi`: resolution for PNG output
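		For instance (an illustrative sketch):

		```py
		mydata.plot_sessions(dir = 'output', filetype = 'png', dpi = 200)
		```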
2086		'''
2087		if not os.path.exists(dir):
2088			os.makedirs(dir)
2089
2090		for session in self.sessions:
2091			sp = self.plot_single_session(session, xylimits = 'constant')
2092			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2093			ppl.close(sp.fig)
2094			
2095
2096
2097	@make_verbal
2098	def consolidate_samples(self):
2099		'''
2100		Compile various statistics for each sample.
2101
2102		For each anchor sample:
2103
2104		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2105		+ `SE_D47` or `SE_D48`: set to zero by definition
2106
2107		For each unknown sample:
2108
2109		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2110		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2111
2112		For each anchor and unknown:
2113
2114		+ `N`: the total number of analyses of this sample
2115		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2116		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2117		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2118		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2119	variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2120		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
2121		'''
2122		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2123		for sample in self.samples:
2124			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2125			if self.samples[sample]['N'] > 1:
2126				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2127
2128			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2129			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2130
2131			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2132			if len(D4x_pop) > 2:
2133				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2134			
2135		if self.standardization_method == 'pooled':
2136			for sample in self.anchors:
2137				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2138				self.samples[sample][f'SE_D{self._4x}'] = 0.
2139			for sample in self.unknowns:
2140				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2141				try:
2142					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2143				except ValueError:
2144					# when `sample` is constrained by self.standardize(constraints = {...}),
2145					# it is no longer listed in self.standardization.var_names.
2146					# Temporary fix: define SE as zero for now
2147				self.samples[sample][f'SE_D{self._4x}'] = 0.
2148
2149		elif self.standardization_method == 'indep_sessions':
2150			for sample in self.anchors:
2151				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2152				self.samples[sample][f'SE_D{self._4x}'] = 0.
2153			for sample in self.unknowns:
2154				self.msg(f'Consolidating sample {sample}')
2155				self.unknowns[sample][f'session_D{self._4x}'] = {}
2156				session_avg = []
2157				for session in self.sessions:
2158					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2159					if sdata:
2160						self.msg(f'{sample} found in session {session}')
2161						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2162						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2163						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2164						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2165						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2166						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2167						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2168				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2169				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2170				wsum = sum([weights[s] for s in weights])
2171				for s in weights:
2172					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2173
2174		for r in self:
2175			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
2176
2177
2178
2179	def consolidate_sessions(self):
2180		'''
2181		Compute various statistics for each session.
2182
2183		+ `Na`: Number of anchor analyses in the session
2184		+ `Nu`: Number of unknown analyses in the session
2185		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2186		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2187		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2188		+ `a`: scrambling factor
2189		+ `b`: compositional slope
2190		+ `c`: WG offset
2191	+ `SE_a`: Model standard error of `a`
2192	+ `SE_b`: Model standard error of `b`
2193	+ `SE_c`: Model standard error of `c`
2194		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2195		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2196		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2197		+ `a2`: scrambling factor drift
2198		+ `b2`: compositional slope drift
2199		+ `c2`: WG offset drift
2200		+ `Np`: Number of standardization parameters to fit
2201		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2202		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2203		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
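
	**Example**

	A short sketch (for illustration): after standardization, these per-session
	statistics can be read from `self.sessions`:

	```python
	for session in mydata.sessions:
		s = mydata.sessions[session]
		print(session, s['Na'], s['Nu'], s['a'], s['b'], s['c'], s['r_D47'])
	```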
2204		'''
2205		for session in self.sessions:
2206			if 'd13Cwg_VPDB' not in self.sessions[session]:
2207				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2208			if 'd18Owg_VSMOW' not in self.sessions[session]:
2209				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2210			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2211			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2212
2213			self.msg(f'Computing repeatabilities for session {session}')
2214			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2215			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2216			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2217
2218		if self.standardization_method == 'pooled':
2219			for session in self.sessions:
2220
2221				# different (better?) computation of D4x repeatability for each session:
2222				sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
2223				self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5
2224
2225				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2226				i = self.standardization.var_names.index(f'a_{pf(session)}')
2227				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2228
2229				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2230				i = self.standardization.var_names.index(f'b_{pf(session)}')
2231				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2232
2233				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2234				i = self.standardization.var_names.index(f'c_{pf(session)}')
2235				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2236
2237				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2238				if self.sessions[session]['scrambling_drift']:
2239					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2240					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2241				else:
2242					self.sessions[session]['SE_a2'] = 0.
2243
2244				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2245				if self.sessions[session]['slope_drift']:
2246					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2247					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2248				else:
2249					self.sessions[session]['SE_b2'] = 0.
2250
2251				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2252				if self.sessions[session]['wg_drift']:
2253					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2254					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2255				else:
2256					self.sessions[session]['SE_c2'] = 0.
2257
2258				i = self.standardization.var_names.index(f'a_{pf(session)}')
2259				j = self.standardization.var_names.index(f'b_{pf(session)}')
2260				k = self.standardization.var_names.index(f'c_{pf(session)}')
2261				CM = np.zeros((6,6))
2262				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2263				try:
2264					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2265					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2266					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2267					try:
2268						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2269						CM[3,4] = self.standardization.covar[i2,j2]
2270						CM[4,3] = self.standardization.covar[j2,i2]
2271					except ValueError:
2272						pass
2273					try:
2274						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2275						CM[3,5] = self.standardization.covar[i2,k2]
2276						CM[5,3] = self.standardization.covar[k2,i2]
2277					except ValueError:
2278						pass
2279				except ValueError:
2280					pass
2281				try:
2282					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2283					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2284					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2285					try:
2286						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2287						CM[4,5] = self.standardization.covar[j2,k2]
2288						CM[5,4] = self.standardization.covar[k2,j2]
2289					except ValueError:
2290						pass
2291				except ValueError:
2292					pass
2293				try:
2294					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2295					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2296					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2297				except ValueError:
2298					pass
2299
2300				self.sessions[session]['CM'] = CM
2301
2302		elif self.standardization_method == 'indep_sessions':
2303			pass # Not implemented yet
2304
2305
2306	@make_verbal
2307	def repeatabilities(self):
2308		'''
2309		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2310		(for all samples, for anchors, and for unknowns).
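
	**Example**

	A minimal sketch, assuming `mydata` was standardized beforehand (in which case
	`repeatabilities()` has already been called by `consolidate()`):

	```python
	mydata.repeatabilities()
	print(mydata.repeatability['r_D47'])  # pooled Δ47 repeatability, all samples
	```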
2311		'''
2312		self.msg('Computing repeatabilities for all sessions')
2313
2314		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2315		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2316		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2317		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2318		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
2319
2320
2321	@make_verbal
2322	def consolidate(self, tables = True, plots = True):
2323		'''
2324		Collect information about samples, sessions and repeatabilities.
2325		'''
2326		self.consolidate_samples()
2327		self.consolidate_sessions()
2328		self.repeatabilities()
2329
2330		if tables:
2331			self.summary()
2332			self.table_of_sessions()
2333			self.table_of_analyses()
2334			self.table_of_samples()
2335
2336		if plots:
2337			self.plot_sessions()
2338
2339
2340	@make_verbal
2341	def rmswd(self,
2342		samples = 'all samples',
2343		sessions = 'all sessions',
2344		):
2345		'''
2346		Compute the χ2, the root mean squared weighted deviation
2347		(i.e., the square root of the reduced χ2), and the corresponding degrees
2348		of freedom of the Δ4x values for samples in `samples` and sessions in `sessions`.
2349		
2350		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
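
	**Example**

	An illustrative call, assuming `mydata` was standardized with
	`method = 'indep_sessions'`:

	```python
	stats = mydata.rmswd(samples = 'anchors')
	print(stats['rmswd'], stats['chisq'], stats['Nf'])
	```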
2351		'''
2352		if samples == 'all samples':
2353			mysamples = [k for k in self.samples]
2354		elif samples == 'anchors':
2355			mysamples = [k for k in self.anchors]
2356		elif samples == 'unknowns':
2357			mysamples = [k for k in self.unknowns]
2358		else:
2359			mysamples = samples
2360
2361		if sessions == 'all sessions':
2362			sessions = [k for k in self.sessions]
2363
2364		chisq, Nf = 0, 0
2365		for sample in mysamples :
2366			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2367			if len(G) > 1 :
2368				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2369				Nf += (len(G) - 1)
2370				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2371		r = (chisq / Nf)**.5 if Nf > 0 else 0
2372		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2373		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
2374
2375	
2376	@make_verbal
2377	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2378		'''
2379		Compute the repeatability of `[r[key] for r in self]`
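
	**Example**

	For instance (an illustrative sketch), the Δ47 repeatability of anchor
	analyses only:

	```python
	r = mydata.compute_r('D47', samples = 'anchors')
	```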
2380		'''
2381
2382		if samples == 'all samples':
2383			mysamples = [k for k in self.samples]
2384		elif samples == 'anchors':
2385			mysamples = [k for k in self.anchors]
2386		elif samples == 'unknowns':
2387			mysamples = [k for k in self.unknowns]
2388		else:
2389			mysamples = samples
2390
2391		if sessions == 'all sessions':
2392			sessions = [k for k in self.sessions]
2393
2394		if key in ['D47', 'D48']:
2395			# Full disclosure: the definition of Nf is tricky/debatable
2396			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2397			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2398			Nf = len(G)
2399# 			print(f'len(G) = {Nf}')
2400			Nf -= len([s for s in mysamples if s in self.unknowns])
2401# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2402			for session in sessions:
2403				Np = len([
2404					_ for _ in self.standardization.params
2405					if (
2406						self.standardization.params[_].expr is not None
2407						and (
2408							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2409							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2410							)
2411						)
2412					])
2413# 				print(f'session {session}: {Np} parameters to consider')
2414				Na = len({
2415					r['Sample'] for r in self.sessions[session]['data']
2416					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2417					})
2418# 				print(f'session {session}: {Na} different anchors in that session')
2419				Nf -= min(Np, Na)
2420# 			print(f'Nf = {Nf}')
2421
2422# 			for sample in mysamples :
2423# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2424# 				if len(X) > 1 :
2425# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2426# 					if sample in self.unknowns:
2427# 						Nf += len(X) - 1
2428# 					else:
2429# 						Nf += len(X)
2430# 			if samples in ['anchors', 'all samples']:
2431# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2432			r = (chisq / Nf)**.5 if Nf > 0 else 0
2433
2434		else: # if key not in ['D47', 'D48']
2435			chisq, Nf = 0, 0
2436			for sample in mysamples :
2437				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2438				if len(X) > 1 :
2439					Nf += len(X) - 1
2440					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2441			r = (chisq / Nf)**.5 if Nf > 0 else 0
2442
2443		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2444		return r
2445
2446	def sample_average(self, samples, weights = 'equal', normalize = True):
2447		'''
2448		Weighted average Δ4x value of a group of samples, accounting for covariance.
2449
2450		Returns the weighted average Δ4x value and associated SE
2451		of a group of samples. Weights are equal by default. If `normalize` is
2452		true, `weights` will be rescaled so that their sum equals 1.
2453
2454		**Examples**
2455
2456		```python
2457		self.sample_average(['X','Y'], [1, 2])
2458		```
2459
2460		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2461		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2462		values of samples X and Y, respectively.
2463
2464		```python
2465		self.sample_average(['X','Y'], [1, -1], normalize = False)
2466		```
2467
2468		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2469		'''
2470		if weights == 'equal':
2471			weights = [1/len(samples)] * len(samples)
2472
2473		if normalize:
2474			s = sum(weights)
2475			if s:
2476				weights = [w/s for w in weights]
2477
2478		try:
2479# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2480# 			C = self.standardization.covar[indices,:][:,indices]
2481			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2482			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2483			return correlated_sum(X, C, weights)
2484		except ValueError:
2485			return (0., 0.)
2486
2487
2488	def sample_D4x_covar(self, sample1, sample2 = None):
2489		'''
2490		Covariance between Δ4x values of samples
2491
2492		Returns the error covariance between the average Δ4x values of two
2493		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2494		returns the Δ4x variance for that sample.
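
	**Example**

	A sketch assuming two hypothetical unknowns `X` and `Y`:

	```python
	var_X = mydata.sample_D4x_covar('X')         # Δ4x variance of X
	cov_XY = mydata.sample_D4x_covar('X', 'Y')   # error covariance between X and Y
	```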
2495		'''
2496		if sample2 is None:
2497			sample2 = sample1
2498		if self.standardization_method == 'pooled':
2499			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2500			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2501			return self.standardization.covar[i, j]
2502		elif self.standardization_method == 'indep_sessions':
2503			if sample1 == sample2:
2504				return self.samples[sample1][f'SE_D{self._4x}']**2
2505			else:
2506				c = 0
2507				for session in self.sessions:
2508					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2509					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2510					if sdata1 and sdata2:
2511						a = self.sessions[session]['a']
2512						# !! TODO: CM below does not account for temporal changes in standardization parameters
2513						CM = self.sessions[session]['CM'][:3,:3]
2514						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2515						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2516						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2517						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2518						c += (
2519							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2520							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2521							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2522							@ CM
2523							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2524							) / a**2
2525				return float(c)
2526
2527	def sample_D4x_correl(self, sample1, sample2 = None):
2528		'''
2529		Correlation between Δ4x errors of samples
2530
2531		Returns the error correlation between the average Δ4x values of two samples.
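
	**Example**

	A sketch (for illustration) building the full error correlation matrix of
	all unknowns:

	```python
	unknowns = sorted(mydata.unknowns)
	correl = [[mydata.sample_D4x_correl(s1, s2) for s2 in unknowns] for s1 in unknowns]
	```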
2532		'''
2533		if sample2 is None or sample2 == sample1:
2534			return 1.
2535		return (
2536			self.sample_D4x_covar(sample1, sample2)
2537			/ self.unknowns[sample1][f'SE_D{self._4x}']
2538			/ self.unknowns[sample2][f'SE_D{self._4x}']
2539			)
2540
2541	def plot_single_session(self,
2542		session,
2543		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2544		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2545		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2546		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2547		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2548		xylimits = 'free', # | 'constant'
2549		x_label = None,
2550		y_label = None,
2551		error_contour_interval = 'auto',
2552		fig = 'new',
2553		):
2554		'''
2555		Generate plot for a single session
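
	**Example**

	A minimal sketch (the session name is hypothetical):

	```python
	sp = mydata.plot_single_session('Session01', xylimits = 'constant')
	```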
2556		'''
2557		if x_label is None:
2558			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2559		if y_label is None:
2560			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2561
2562		out = _SessionPlot()
2563		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2564		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2565		anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2566		anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2567		unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2568		unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2569		anchor_avg = (np.array([ np.array([
2570				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2571				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2572				]) for sample in anchors]).T,
2573			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
2574		unknown_avg = (np.array([ np.array([
2575				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2576				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2577				]) for sample in unknowns]).T,
2578			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)
2579		
2580		
2581		if fig == 'new':
2582			out.fig = ppl.figure(figsize = (6,6))
2583			ppl.subplots_adjust(.1,.1,.9,.9)
2584
2585		out.anchor_analyses, = ppl.plot(
2586			anchors_d,
2587			anchors_D,
2588			**kw_plot_anchors)
2589		out.unknown_analyses, = ppl.plot(
2590			unknowns_d,
2591			unknowns_D,
2592			**kw_plot_unknowns)
2593		out.anchor_avg = ppl.plot(
2594			*anchor_avg,
2595			**kw_plot_anchor_avg)
2596		out.unknown_avg = ppl.plot(
2597			*unknown_avg,
2598			**kw_plot_unknown_avg)
2599		if xylimits == 'constant':
2600			x = [r[f'd{self._4x}'] for r in self]
2601			y = [r[f'D{self._4x}'] for r in self]
2602			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2603			w, h = x2-x1, y2-y1
2604			x1 -= w/20
2605			x2 += w/20
2606			y1 -= h/20
2607			y2 += h/20
2608			ppl.axis([x1, x2, y1, y2])
2609		elif xylimits == 'free':
2610			x1, x2, y1, y2 = ppl.axis()
2611		else:
2612			x1, x2, y1, y2 = ppl.axis(xylimits)
2613				
2614		if error_contour_interval != 'none':
2615			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2616			XI,YI = np.meshgrid(xi, yi)
2617			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2618			if error_contour_interval == 'auto':
2619				rng = np.max(SI) - np.min(SI)
2620				if rng <= 0.01:
2621					cinterval = 0.001
2622				elif rng <= 0.03:
2623					cinterval = 0.004
2624				elif rng <= 0.1:
2625					cinterval = 0.01
2626				elif rng <= 0.3:
2627					cinterval = 0.03
2628				elif rng <= 1.:
2629					cinterval = 0.1
2630				else:
2631					cinterval = 0.5
2632			else:
2633				cinterval = error_contour_interval
2634
2635			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2636			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2637			out.clabel = ppl.clabel(out.contour)
2638			contour = (XI, YI, SI, cval, cinterval)
2639
2640		if fig == None:
2641			return {
2642			'anchors':anchors,
2643			'unknowns':unknowns,
2644			'anchors_d':anchors_d,
2645			'anchors_D':anchors_D,
2646			'unknowns_d':unknowns_d,
2647			'unknowns_D':unknowns_D,
2648			'anchor_avg':anchor_avg,
2649			'unknown_avg':unknown_avg,
2650			'contour': contour if error_contour_interval != 'none' else None,
2651			}
2652
2653		ppl.xlabel(x_label)
2654		ppl.ylabel(y_label)
2655		ppl.title(session, weight = 'bold')
2656		ppl.grid(alpha = .2)
2657		out.ax = ppl.gca()		
2658
2659		return out
2660
2661	def plot_residuals(
2662		self,
2663		kde = False,
2664		hist = False,
2665		binwidth = 2/3,
2666		dir = 'output',
2667		filename = None,
2668		highlight = [],
2669		colors = None,
2670		figsize = None,
2671		dpi = 100,
2672		yspan = None,
2673		):
2674		'''
2675		Plot residuals of each analysis as a function of time (actually, as a function of
2676		the order of analyses in the `D4xdata` object)
2677
2678		+ `kde`: whether to add a kernel density estimate of residuals
2679		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2680	+ `binwidth`: the width of histogram bins, expressed as a multiple of the Δ4x repeatability
2681		+ `dir`: the directory in which to save the plot
2682		+ `highlight`: a list of samples to highlight
2683		+ `colors`: a dict of `{<sample>: <color>}` for all samples
2684		+ `figsize`: (width, height) of figure
2685		+ `dpi`: resolution for PNG output
2686		+ `yspan`: factor controlling the range of y values shown in plot
2687		  (by default: `yspan = 1.5 if kde else 1.0`)
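
	**Example**

	A minimal sketch: save a residual plot with a kernel density estimate
	(an empty `filename` falls back to the default file name):

	```python
	mydata.plot_residuals(filename = '', kde = True)
	```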
2688		'''
2689		
2690		from matplotlib import ticker
2691
2692		if yspan is None:
2693			if kde:
2694				yspan = 1.5
2695			else:
2696				yspan = 1.0
2697		
2698		# Layout
2699		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2700		if hist or kde:
2701			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2702			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2703		else:
2704			ppl.subplots_adjust(.08,.05,.78,.8)
2705			ax1 = ppl.subplot(111)
2706		
2707		# Colors
2708		N = len(self.anchors)
2709		if colors is None:
2710			if len(highlight) > 0:
2711				Nh = len(highlight)
2712				if Nh == 1:
2713					colors = {highlight[0]: (0,0,0)}
2714				elif Nh == 3:
2715					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2716				elif Nh == 4:
2717					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2718				else:
2719					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2720			else:
2721				if N == 3:
2722					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2723				elif N == 4:
2724					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2725				else:
2726					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2727
2728		ppl.sca(ax1)
2729		
2730		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2731
2732		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2733
2734		session = self[0]['Session']
2735		x1 = 0
2736# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2737		x_sessions = {}
2738		one_or_more_singlets = False
2739		one_or_more_multiplets = False
2740		multiplets = set()
2741		for k,r in enumerate(self):
2742			if r['Session'] != session:
2743				x2 = k-1
2744				x_sessions[session] = (x1+x2)/2
2745				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2746				session = r['Session']
2747				x1 = k
2748			singlet = len(self.samples[r['Sample']]['data']) == 1
2749			if not singlet:
2750				multiplets.add(r['Sample'])
2751			if r['Sample'] in self.unknowns:
2752				if singlet:
2753					one_or_more_singlets = True
2754				else:
2755					one_or_more_multiplets = True
2756			kw = dict(
2757				marker = 'x' if singlet else '+',
2758				ms = 4 if singlet else 5,
2759				ls = 'None',
2760				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2761				mew = 1,
2762				alpha = 0.2 if singlet else 1,
2763				)
2764			if highlight and r['Sample'] not in highlight:
2765				kw['alpha'] = 0.2
2766			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2767		x2 = k
2768		x_sessions[session] = (x1+x2)/2
2769
2770		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2771		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2772		if not (hist or kde):
2773			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2774			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2775
2776		xmin, xmax, ymin, ymax = ppl.axis()
2777		if yspan != 1:
2778			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2779		for s in x_sessions:
2780			ppl.text(
2781				x_sessions[s],
2782				ymax +1,
2783				s,
2784				va = 'bottom',
2785				**(
2786					dict(ha = 'center')
2787					if len(self.sessions[s]['data']) > (0.15 * len(self))
2788					else dict(ha = 'left', rotation = 45)
2789					)
2790				)
2791
2792		if hist or kde:
2793			ppl.sca(ax2)
2794
2795		for s in colors:
2796			kw['marker'] = '+'
2797			kw['ms'] = 5
2798			kw['mec'] = colors[s]
2799			kw['label'] = s
2800			kw['alpha'] = 1
2801			ppl.plot([], [], **kw)
2802
2803		kw['mec'] = (0,0,0)
2804
2805		if one_or_more_singlets:
2806			kw['marker'] = 'x'
2807			kw['ms'] = 4
2808			kw['alpha'] = .2
2809			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2810			ppl.plot([], [], **kw)
2811
2812		if one_or_more_multiplets:
2813			kw['marker'] = '+'
2814			kw['ms'] = 4
2815			kw['alpha'] = 1
2816			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2817			ppl.plot([], [], **kw)
2818
2819		if hist or kde:
2820			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2821		else:
2822			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2823		leg.set_zorder(-1000)
2824
2825		ppl.sca(ax1)
2826
2827		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2828		ppl.xticks([])
2829		ppl.axis([-1, len(self), None, None])
2830
2831		if hist or kde:
2832			ppl.sca(ax2)
2833			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2834
2835			if kde:
2836				from scipy.stats import gaussian_kde
2837				yi = np.linspace(ymin, ymax, 201)
2838				xi = gaussian_kde(X).evaluate(yi)
2839				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2840# 				ppl.plot(xi, yi, 'k-', lw = 1)
2841			elif hist:
2842				ppl.hist(
2843					X,
2844					orientation = 'horizontal',
2845					histtype = 'stepfilled',
2846					ec = [.4]*3,
2847					fc = [.25]*3,
2848					alpha = .25,
2849					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2850					)
2851			ppl.text(0, 0,
2852				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2853				size = 7.5,
2854				alpha = 1,
2855				va = 'center',
2856				ha = 'left',
2857				)
2858
2859			ppl.axis([0, None, ymin, ymax])
2860			ppl.xticks([])
2861			ppl.yticks([])
2862# 			ax2.spines['left'].set_visible(False)
2863			ax2.spines['right'].set_visible(False)
2864			ax2.spines['top'].set_visible(False)
2865			ax2.spines['bottom'].set_visible(False)
2866
2867		ax1.axis([None, None, ymin, ymax])
2868
2869		if not os.path.exists(dir):
2870			os.makedirs(dir)
2871		if filename is None:
2872			return fig
2873		elif filename == '':
2874			filename = f'D{self._4x}_residuals.pdf'
2875		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2876		ppl.close(fig)
2877				
2878
2879	def simulate(self, *args, **kwargs):
2880		'''
2881		Legacy function: raises a `DeprecationWarning` pointing to `virtual_data()`
2882		'''
2883		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
2884
2885	def plot_distribution_of_analyses(
2886		self,
2887		dir = 'output',
2888		filename = None,
2889		vs_time = False,
2890		figsize = (6,4),
2891		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
2892		output = None,
2893		dpi = 100,
2894		):
2895		'''
2896		Plot temporal distribution of all analyses in the data set.
2897		
2898		**Parameters**
2899
2900	+ `dir`: the directory in which to save the plot
2901	+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
2902	+ `figsize`: (width, height) of figure
2903	+ `dpi`: resolution for PNG output
2904	+ `output`: if `None` (default), save the plot to disk; if `'ax'` or `'fig'`, return the axes or figure instead.
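
	**Example**

	A minimal sketch (for illustration; plotting versus time requires a
	`TimeTag` field in each analysis):

	```python
	mydata.plot_distribution_of_analyses(vs_time = True, filename = 'analyses.pdf')
	```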
2905		'''
2906
2907		asamples = [s for s in self.anchors]
2908		usamples = [s for s in self.unknowns]
2909		if output is None or output == 'fig':
2910			fig = ppl.figure(figsize = figsize)
2911			ppl.subplots_adjust(*subplots_adjust)
2912		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2913		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2914		Xmax += (Xmax-Xmin)/40
2915		Xmin -= (Xmax-Xmin)/41
2916		for k, s in enumerate(asamples + usamples):
2917			if vs_time:
2918				X = [r['TimeTag'] for r in self if r['Sample'] == s]
2919			else:
2920				X = [x for x,r in enumerate(self) if r['Sample'] == s]
2921			Y = [-k for x in X]
2922			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
2923			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
2924			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
2925		ppl.axis([Xmin, Xmax, -k-1, 1])
2926		ppl.xlabel('\ntime')
2927		ppl.gca().annotate('',
2928			xy = (0.6, -0.02),
2929			xycoords = 'axes fraction',
2930			xytext = (.4, -0.02),
2931			arrowprops = dict(arrowstyle = "->", color = 'k'),
2932			)
2933			
2934
2935		x2 = -1
2936		for session in self.sessions:
2937			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2938			if vs_time:
2939				ppl.axvline(x1, color = 'k', lw = .75)
2940			if x2 > -1:
2941				if not vs_time:
2942					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
2943			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2944# 			from xlrd import xldate_as_datetime
2945# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
2946			if vs_time:
2947				ppl.axvline(x2, color = 'k', lw = .75)
2948				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
2949			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
2950
2951		ppl.xticks([])
2952		ppl.yticks([])
2953
2954		if output is None:
2955			if not os.path.exists(dir):
2956				os.makedirs(dir)
2957			if filename == None:
2958				filename = f'D{self._4x}_distribution_of_analyses.pdf'
2959			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2960			ppl.close(fig)
2961		elif output == 'ax':
2962			return ppl.gca()
2963		elif output == 'fig':
2964			return fig
2965
2966
2967	def plot_bulk_compositions(
2968		self,
2969		samples = None,
2970		dir = 'output/bulk_compositions',
2971		figsize = (6,6),
2972		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
2973		show = False,
2974		sample_color = (0,.5,1),
2975		analysis_color = (.7,.7,.7),
2976		labeldist = 0.3,
2977		radius = 0.05,
2978		):
2979		'''
2980		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
2981		
2982		By default, creates a directory `./output/bulk_compositions` where plots for
2983		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
2984		
2985		
2986		**Parameters**
2987
2988		+ `samples`: Only these samples are processed (by default: all samples).
2989		+ `dir`: where to save the plots
2990		+ `figsize`: (width, height) of figure
2991		+ `subplots_adjust`: passed to `subplots_adjust()`
2992		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
2993		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
2994	+ `sample_color`: color used for sample markers/labels
2995	+ `analysis_color`: color used for analysis (replicate) markers/labels
2996	+ `labeldist`: distance (in inches) from analysis markers to their UID labels
2997		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
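
	**Example**

	An illustrative call restricted to two hypothetical samples:

	```python
	mydata.plot_bulk_compositions(samples = ['FOO', 'BAR'], show = True)
	```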
2998		'''
2999
3000		from matplotlib.patches import Ellipse
3001
3002		if samples is None:
3003			samples = [_ for _ in self.samples]
3004
3005		saved = {}
3006
3007		for s in samples:
3008
3009			fig = ppl.figure(figsize = figsize)
3010			fig.subplots_adjust(*subplots_adjust)
3011			ax = ppl.subplot(111)
3012			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3013			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3014			ppl.title(s)
3015
3016
3017			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
3018			UID = [_['UID'] for _ in self.samples[s]['data']]
3019			XY0 = XY.mean(0)
3020
3021			for xy in XY:
3022				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
3023				
3024			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
3025			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
3026			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3027			saved[s] = [XY, XY0]
3028			
3029			x1, x2, y1, y2 = ppl.axis()
3030			x0, dx = (x1+x2)/2, (x2-x1)/2
3031			y0, dy = (y1+y2)/2, (y2-y1)/2
3032			dx, dy = [max(max(dx, dy), radius)]*2
3033
3034			ppl.axis([
3035				x0 - 1.2*dx,
3036				x0 + 1.2*dx,
3037				y0 - 1.2*dy,
3038				y0 + 1.2*dy,
3039				])			
3040
3041			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3042
3043			for xy, uid in zip(XY, UID):
3044
3045				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3046				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3047
3048				if (vector_in_display_space**2).sum() > 0:
3049
3050					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3051					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3052					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3053					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3054
3055					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3056
3057				else:
3058
3059					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3060
3061			if radius:
3062				ax.add_artist(Ellipse(
3063					xy = XY0,
3064					width = radius*2,
3065					height = radius*2,
3066					ls = (0, (2,2)),
3067					lw = .7,
3068					ec = analysis_color,
3069					fc = 'None',
3070					))
3071				ppl.text(
3072					XY0[0],
3073					XY0[1]-radius,
3074					f'\n± {radius*1e3:.0f} ppm',
3075					color = analysis_color,
3076					va = 'top',
3077					ha = 'center',
3078					linespacing = 0.4,
3079					size = 8,
3080					)
3081
3082			if not os.path.exists(dir):
3083				os.makedirs(dir)
3084			fig.savefig(f'{dir}/{s}.pdf')
3085			ppl.close(fig)
3086
3087		fig = ppl.figure(figsize = figsize)
3088		fig.subplots_adjust(*subplots_adjust)
3089		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3090		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3091
3092		for s in saved:
3093			for xy in saved[s][0]:
3094				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3095			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3096			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3097			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3098
3099		x1, x2, y1, y2 = ppl.axis()
3100		ppl.axis([
3101			x1 - (x2-x1)/10,
3102			x2 + (x2-x1)/10,
3103			y1 - (y2-y1)/10,
3104			y2 + (y2-y1)/10,
3105			])			
3106
3107
3108		if not os.path.exists(dir):
3109			os.makedirs(dir)
3110		fig.savefig(f'{dir}/__all__.pdf')
3111		if show:
3112			ppl.show()
3113		ppl.close(fig)
3114		
3115
3116	def _save_D4x_correl(
3117		self,
3118		samples = None,
3119		dir = 'output',
3120		filename = None,
3121		D4x_precision = 4,
3122		correl_precision = 4,
3123		):
3124		'''
3125		Save D4x values along with their SE and correlation matrix.
3126
3127		**Parameters**
3128
3129	+ `samples`: Only these samples are output (by default: all unknown samples).
3130	+ `dir`: the directory in which to save the file (by default: `output`)
3131	+ `filename`: the name of the csv file to write to (by default: `D4x_correl.csv`)
3132	+ `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4)
3133	+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
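
	**Example**

	A minimal sketch (the file name is hypothetical); in practice this method is
	usually called through the `save_D47_correl()` / `save_D48_correl()` wrappers
	of the subclasses:

	```python
	mydata._save_D4x_correl(filename = 'my_correl.csv', correl_precision = 3)
	```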
3134		'''
3135		if samples is None:
3136			samples = sorted([s for s in self.unknowns])
3137		
3138		out = [['Sample']] + [[s] for s in samples]
3139		out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl']
3140		for k,s in enumerate(samples):
3141		out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.{D4x_precision}f}', f'{self.samples[s][f"SE_D{self._4x}"]:.{D4x_precision}f}']
3142			for s2 in samples:
3143			out[k+1] += [f'{self.sample_D4x_correl(s,s2):.{correl_precision}f}']
3144		
3145		if not os.path.exists(dir):
3146			os.makedirs(dir)
3147		if filename is None:
3148			filename = f'D{self._4x}_correl.csv'
3149		with open(f'{dir}/{filename}', 'w') as fid:
3150			fid.write(make_csv(out))
3151		
3152		
3153		
3154
3155class D47data(D4xdata):
3156	'''
3157	Store and process data for a large set of Δ47 analyses,
3158	usually comprising more than one analytical session.
3159	'''
3160
3161	Nominal_D4x = {
3162		'ETH-1':   0.2052,
3163		'ETH-2':   0.2085,
3164		'ETH-3':   0.6132,
3165		'ETH-4':   0.4511,
3166		'IAEA-C1': 0.3018,
3167		'IAEA-C2': 0.6409,
3168		'MERCK':   0.5135,
3169		} # I-CDES (Bernasconi et al., 2021)
3170	'''
3171	Nominal Δ47 values assigned to the Δ47 anchor samples, used by
3172	`D47data.standardize()` to normalize unknown samples to an absolute Δ47
3173	reference frame.
3174
3175	By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
3176	```py
3177	{
3178		'ETH-1'   : 0.2052,
3179		'ETH-2'   : 0.2085,
3180		'ETH-3'   : 0.6132,
3181		'ETH-4'   : 0.4511,
3182		'IAEA-C1' : 0.3018,
3183		'IAEA-C2' : 0.6409,
3184		'MERCK'   : 0.5135,
3185	}
3186	```
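
	These nominal values may be overridden through the `Nominal_D47` property,
	e.g. to standardize with a subset of the default anchors:

	```py
	mydata = D47data()
	mydata.Nominal_D47 = {
		'ETH-1': 0.2052,
		'ETH-2': 0.2085,
		'ETH-3': 0.6132,
		}
	```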
3187	'''
3188
3189
3190	@property
3191	def Nominal_D47(self):
3192		return self.Nominal_D4x
3193	
3194
3195	@Nominal_D47.setter
3196	def Nominal_D47(self, new):
3197		self.Nominal_D4x = dict(**new)
3198		self.refresh()
3199
3200
3201	def __init__(self, l = [], **kwargs):
3202		'''
3203		**Parameters:** same as `D4xdata.__init__()`
3204		'''
3205		D4xdata.__init__(self, l = l, mass = '47', **kwargs)
3206
3207
3208	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3209		'''
3210		Find all samples for which `Teq` is specified, compute the equilibrium Δ47
3211		value for that temperature, and treat these samples as additional anchors.
3212
3213		**Parameters**
3214
3215		+ `fCo2eqD47`: Which CO2 equilibrium law to use
3216		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3217		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3218		+ `priority`: if `replace`: forget old anchors and only use the new ones;
3219		if `new`: keep pre-existing anchors but update them in case of conflict
3220		between old and new Δ47 values;
3221		if `old`: keep pre-existing anchors but preserve their original Δ47
3222		values in case of conflict.
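
	**Example**

	A hedged sketch: if every analysis of a hypothetical sample `EQ-SAMPLE`
	carries a `Teq` field (equilibration temperature in °C), that sample can be
	promoted to an additional anchor:

	```python
	for r in mydata:
		if r['Sample'] == 'EQ-SAMPLE':
			r['Teq'] = 25.
	mydata.D47fromTeq(fCo2eqD47 = 'petersen', priority = 'new')
	```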
3223		'''
3224		f = {
3225			'petersen': fCO2eqD47_Petersen,
3226			'wang': fCO2eqD47_Wang,
3227			}[fCo2eqD47]
3228		foo = {}
3229		for r in self:
3230			if 'Teq' in r:
3231				if r['Sample'] in foo:
3232					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3233				else:
3234					foo[r['Sample']] = f(r['Teq'])
3235			else:
3236				assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3237
3238		if priority == 'replace':
3239			self.Nominal_D47 = {}
3240		for s in foo:
3241			if priority != 'old' or s not in self.Nominal_D47:
3242				self.Nominal_D47[s] = foo[s]
3243	
3244	def save_D47_correl(self, *args, **kwargs):
3245		return self._save_D4x_correl(*args, **kwargs)
3246
3247	save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47')
3248
3249
3250class D48data(D4xdata):
3251	'''
3252	Store and process data for a large set of Δ48 analyses,
3253	usually comprising more than one analytical session.
3254	'''
3255
3256	Nominal_D4x = {
3257		'ETH-1':  0.138,
3258		'ETH-2':  0.138,
3259		'ETH-3':  0.270,
3260		'ETH-4':  0.223,
3261		'GU-1':  -0.419,
3262		} # (Fiebig et al., 2019, 2021)
3263	'''
3264	Nominal Δ48 values assigned to the Δ48 anchor samples, used by
3265	`D48data.standardize()` to normalize unknown samples to an absolute Δ48
3266	reference frame.
3267
3268	By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
3269	[Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)):
3270
3271	```py
3272	{
3273		'ETH-1' :  0.138,
3274		'ETH-2' :  0.138,
3275		'ETH-3' :  0.270,
3276		'ETH-4' :  0.223,
3277		'GU-1'  : -0.419,
3278	}
3279	```
3280	'''
3281
3282
3283	@property
3284	def Nominal_D48(self):
3285		return self.Nominal_D4x
3286
3287	
3288	@Nominal_D48.setter
3289	def Nominal_D48(self, new):
3290		self.Nominal_D4x = dict(**new)
3291		self.refresh()
3292
3293
3294	def __init__(self, l = [], **kwargs):
3295		'''
3296		**Parameters:** same as `D4xdata.__init__()`
3297		'''
3298		D4xdata.__init__(self, l = l, mass = '48', **kwargs)
3299
3300	def save_D48_correl(self, *args, **kwargs):
3301		return self._save_D4x_correl(*args, **kwargs)
3302
3303	save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48')
3304
3305
3306class D49data(D4xdata):
3307	'''
3308	Store and process data for a large set of Δ49 analyses,
3309	usually comprising more than one analytical session.
3310	'''
3311	
3312	Nominal_D4x = {"1000C": 0.0, "25C": 2.228}  # Wang 2004
3313	'''
3314	Nominal Δ49 values assigned to the Δ49 anchor samples, used by
3315	`D49data.standardize()` to normalize unknown samples to an absolute Δ49
3316	reference frame.
3317
3318	By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)):
3319
3320	```py
3321	{
3322		"1000C": 0.0,
3323		"25C": 2.228
3324	}
3325	```
3326	'''
3327	
3328	@property
3329	def Nominal_D49(self):
3330		return self.Nominal_D4x
3331	
3332	@Nominal_D49.setter
3333	def Nominal_D49(self, new):
3334		self.Nominal_D4x = dict(**new)
3335		self.refresh()
3336	
3337	def __init__(self, l=[], **kwargs):
3338		'''
3339		**Parameters:** same as `D4xdata.__init__()`
3340		'''
3341		D4xdata.__init__(self, l=l, mass='49', **kwargs)
3342	
3343	def save_D49_correl(self, *args, **kwargs):
3344		return self._save_D4x_correl(*args, **kwargs)
3345	
3346	save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49')
3347
3348class _SessionPlot():
3349	'''
3350	Simple placeholder class
3351	'''
3352	def __init__(self):
3353		pass
3354
3355_app = typer.Typer(
3356	add_completion = False,
3357	context_settings={'help_option_names': ['-h', '--help']},
3358	rich_markup_mode = 'rich',
3359	)
3360
3361@_app.command()
3362def _cli(
3363	rawdata: Annotated[str, typer.Argument(help = "Specify the path of a rawdata input file")],
3364	exclude: Annotated[str, typer.Option('--exclude', '-e', help = 'The path of a file specifying UIDs and/or Samples to exclude')] = 'none',
3365	anchors: Annotated[str, typer.Option('--anchors', '-a', help = 'The path of a file specifying custom anchors')] = 'none',
3366	output_dir: Annotated[str, typer.Option('--output-dir', '-o', help = 'Specify the output directory')] = 'output',
3367	run_D48: Annotated[bool, typer.Option('--D48', help = 'Also standardize D48')] = False,
3368	):
3369	"""
3370	Process raw D47 data and return standardized results.
3371	
3372	See [b]https://mdaeron.github.io/D47crunch/#3-command-line-interface-cli[/b] for more details.
3373	
3374	Reads raw data from an input file, optionally excluding some samples and/or analyses, then standardizes
3375	the data based either on the default [b]d13C_VPDB[/b], [b]d18O_VPDB[/b], [b]D47[/b], and [b]D48[/b] anchors or on different
3376	user-specified anchors. A new directory (named `output` by default) is created to store the results and
3377	the following sequence is applied:
3378	
3379	* [b]D47data.wg()[/b]
3380	* [b]D47data.crunch()[/b]
3381	* [b]D47data.standardize()[/b]
3382	* [b]D47data.summary()[/b]
3383	* [b]D47data.table_of_samples()[/b]
3384	* [b]D47data.table_of_sessions()[/b]
3385	* [b]D47data.plot_sessions()[/b]
3386	* [b]D47data.plot_residuals()[/b]
3387	* [b]D47data.table_of_analyses()[/b]
3388	* [b]D47data.plot_distribution_of_analyses()[/b]
3389	* [b]D47data.plot_bulk_compositions()[/b]
3390	* [b]D47data.save_D47_correl()[/b]
3391	
3392	Optionally, also apply similar methods for [b]D48[/b].
3393	
3394	[b]Example CSV file for --anchors option:[/b]	
3395	[i]
3396	Sample,  d13C_VPDB,  d18O_VPDB,     D47,    D48
3397	ETH-1,        2.02,      -2.19,  0.2052,  0.138
3398	ETH-2,      -10.17,     -18.69,  0.2085,  0.138
3399	ETH-3,        1.71,      -1.78,  0.6132,  0.270
3400	ETH-4,            ,           ,  0.4511,  0.223
3401	[/i]
3402	Except for [i]Sample[/i], none of the columns above are mandatory.
3403
3404	[b]Example CSV file for --exclude option:[/b]	
3405	[i]
3406	Sample,  UID
3407	 FOO-1,
3408	 BAR-2,
3409	      ,  A04
3410	      ,  A17
3411	      ,  A88
3412	[/i]
3413	This will exclude all analyses of samples [i]FOO-1[/i] and [i]BAR-2[/i],
3414	and the analyses with UIDs [i]A04[/i], [i]A17[/i], and [i]A88[/i].
3415	Neither column is mandatory.
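
	[b]Example invocation[/b] (assuming the command-line entry point is installed
	as [b]D47crunch[/b]):
	[i]
	D47crunch rawdata.csv --exclude exclude.csv --anchors anchors.csv -o results --D48
	[/i]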
3416	"""
3417
3418	data = D47data()
3419	data.read(rawdata)
3420
3421	if exclude != 'none':
3422		exclude = read_csv(exclude)
3423		exclude_uid = {r['UID'] for r in exclude if 'UID' in r}
3424		exclude_sample = {r['Sample'] for r in exclude if 'Sample' in r}
3425	else:
3426		exclude_uid = []
3427		exclude_sample = []
3428	
3429	data = D47data([r for r in data if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample])
3430
3431	if anchors != 'none':
3432		anchors = read_csv(anchors)
3433		if len([_ for _ in anchors if 'd13C_VPDB' in _]):
3434			data.Nominal_d13C_VPDB = {
3435				_['Sample']: _['d13C_VPDB']
3436				for _ in anchors
3437				if 'd13C_VPDB' in _
3438				}
3439		if len([_ for _ in anchors if 'd18O_VPDB' in _]):
3440			data.Nominal_d18O_VPDB = {
3441				_['Sample']: _['d18O_VPDB']
3442				for _ in anchors
3443				if 'd18O_VPDB' in _
3444				}
3445		if len([_ for _ in anchors if 'D47' in _]):
3446			data.Nominal_D4x = {
3447				_['Sample']: _['D47']
3448				for _ in anchors
3449				if 'D47' in _
3450				}
3451
3452	data.refresh()
3453	data.wg()
3454	data.crunch()
3455	data.standardize()
3456	data.summary(dir = output_dir)
3457	data.plot_residuals(dir = output_dir, filename = 'D47_residuals.pdf', kde = True)
3458	data.plot_bulk_compositions(dir = output_dir + '/bulk_compositions')
3459	data.plot_sessions(dir = output_dir)
3460	data.save_D47_correl(dir = output_dir)
3461	
3462	if not run_D48:
3463		data.table_of_samples(dir = output_dir)
3464		data.table_of_analyses(dir = output_dir)
3465		data.table_of_sessions(dir = output_dir)
3466
3467
3468	if run_D48:
3469		data2 = D48data()
3470		print(rawdata)
3471		data2.read(rawdata)
3472
3473		data2 = D48data([r for r in data2 if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample])
3474
3475		if anchors != 'none':
3476			if len([_ for _ in anchors if 'd13C_VPDB' in _]):
3477				data2.Nominal_d13C_VPDB = {
3478					_['Sample']: _['d13C_VPDB']
3479					for _ in anchors
3480					if 'd13C_VPDB' in _
3481					}
3482			if len([_ for _ in anchors if 'd18O_VPDB' in _]):
3483				data2.Nominal_d18O_VPDB = {
3484					_['Sample']: _['d18O_VPDB']
3485					for _ in anchors
3486					if 'd18O_VPDB' in _
3487					}
3488			if len([_ for _ in anchors if 'D48' in _]):
3489				data2.Nominal_D4x = {
3490					_['Sample']: _['D48']
3491					for _ in anchors
3492					if 'D48' in _
3493					}
3494
3495		data2.refresh()
3496		data2.wg()
3497		data2.crunch()
3498		data2.standardize()
3499		data2.summary(dir = output_dir)
3500		data2.plot_sessions(dir = output_dir)
3501		data2.plot_residuals(dir = output_dir, filename = 'D48_residuals.pdf', kde = True)
3502		data2.plot_distribution_of_analyses(dir = output_dir)
3503		data2.save_D48_correl(dir = output_dir)
3504
3505		table_of_analyses(data, data2, dir = output_dir)
3506		table_of_samples(data, data2, dir = output_dir)
3507		table_of_sessions(data, data2, dir = output_dir)
3508		
3509def __cli():
3510	_app()
Petersen_etal_CO2eqD47 = array([[-1.20000000e+01, 1.14711357e+00], [-1.10000000e+01, 1.13996122e+00], [-1.00000000e+01, 1.13287286e+00], [-9.00000000e+00, 1.12584768e+00], [-8.00000000e+00, 1.11888489e+00], [-7.00000000e+00, 1.11198371e+00], [-6.00000000e+00, 1.10514337e+00], [-5.00000000e+00, 1.09836311e+00], [-4.00000000e+00, 1.09164218e+00], [-3.00000000e+00, 1.08497986e+00], [-2.00000000e+00, 1.07837542e+00], [-1.00000000e+00, 1.07182816e+00], [ 0.00000000e+00, 1.06533736e+00], [ 1.00000000e+00, 1.05890235e+00], [ 2.00000000e+00, 1.05252244e+00], [ 3.00000000e+00, 1.04619698e+00], [ 4.00000000e+00, 1.03992529e+00], [ 5.00000000e+00, 1.03370674e+00], [ 6.00000000e+00, 1.02754069e+00], [ 7.00000000e+00, 1.02142651e+00], [ 8.00000000e+00, 1.01536359e+00], [ 9.00000000e+00, 1.00935131e+00], [ 1.00000000e+01, 1.00338908e+00], [ 1.10000000e+01, 9.97476303e-01], [ 1.20000000e+01, 9.91612409e-01], [ 1.30000000e+01, 9.85796821e-01], [ 1.40000000e+01, 9.80028975e-01], [ 1.50000000e+01, 9.74308318e-01], [ 1.60000000e+01, 9.68634304e-01], [ 1.70000000e+01, 9.63006392e-01], [ 1.80000000e+01, 9.57424055e-01], [ 1.90000000e+01, 9.51886769e-01], [ 2.00000000e+01, 9.46394020e-01], [ 2.10000000e+01, 9.40945302e-01], [ 2.20000000e+01, 9.35540114e-01], [ 2.30000000e+01, 9.30177964e-01], [ 2.40000000e+01, 9.24858369e-01], [ 2.50000000e+01, 9.19580851e-01], [ 2.60000000e+01, 9.14344938e-01], [ 2.70000000e+01, 9.09150167e-01], [ 2.80000000e+01, 9.03996080e-01], [ 2.90000000e+01, 8.98882228e-01], [ 3.00000000e+01, 8.93808167e-01], [ 3.10000000e+01, 8.88773459e-01], [ 3.20000000e+01, 8.83777672e-01], [ 3.30000000e+01, 8.78820382e-01], [ 3.40000000e+01, 8.73901170e-01], [ 3.50000000e+01, 8.69019623e-01], [ 3.60000000e+01, 8.64175334e-01], [ 3.70000000e+01, 8.59367901e-01], [ 3.80000000e+01, 8.54596929e-01], [ 3.90000000e+01, 8.49862028e-01], [ 4.00000000e+01, 8.45162813e-01], [ 4.10000000e+01, 8.40498905e-01], [ 4.20000000e+01, 8.35869931e-01], [ 4.30000000e+01, 8.31275522e-01], [ 4.40000000e+01, 8.26715314e-01], [ 4.50000000e+01, 8.22188950e-01], [ 4.60000000e+01, 8.17696075e-01], [ 4.70000000e+01, 8.13236341e-01], [ 4.80000000e+01, 8.08809404e-01], [ 4.90000000e+01, 8.04414926e-01], [ 5.00000000e+01, 8.00052572e-01], [ 5.10000000e+01, 7.95722012e-01], [ 5.20000000e+01, 7.91422922e-01], [ 5.30000000e+01, 7.87154979e-01], [ 5.40000000e+01, 7.82917869e-01], [ 5.50000000e+01, 7.78711277e-01], [ 5.60000000e+01, 7.74534898e-01], [ 5.70000000e+01, 7.70388426e-01], [ 5.80000000e+01, 7.66271562e-01], [ 5.90000000e+01, 7.62184010e-01], [ 6.00000000e+01, 7.58125479e-01], [ 6.10000000e+01, 7.54095680e-01], [ 6.20000000e+01, 7.50094329e-01], [ 6.30000000e+01, 7.46121147e-01], [ 6.40000000e+01, 7.42175856e-01], [ 6.50000000e+01, 7.38258184e-01], [ 6.60000000e+01, 7.34367860e-01], [ 6.70000000e+01, 7.30504620e-01], [ 6.80000000e+01, 7.26668201e-01], [ 6.90000000e+01, 7.22858343e-01], [ 7.00000000e+01, 7.19074792e-01], [ 7.10000000e+01, 7.15317295e-01], [ 7.20000000e+01, 7.11585602e-01], [ 7.30000000e+01, 7.07879469e-01], [ 7.40000000e+01, 7.04198652e-01], [ 7.50000000e+01, 7.00542912e-01], [ 7.60000000e+01, 6.96912012e-01], [ 7.70000000e+01, 6.93305719e-01], [ 7.80000000e+01, 6.89723802e-01], [ 7.90000000e+01, 6.86166034e-01], [ 8.00000000e+01, 6.82632189e-01], [ 8.10000000e+01, 6.79122047e-01], [ 8.20000000e+01, 6.75635387e-01], [ 8.30000000e+01, 6.72171994e-01], [ 8.40000000e+01, 6.68731654e-01], [ 8.50000000e+01, 6.65314156e-01], [ 8.60000000e+01, 6.61919291e-01], [ 8.70000000e+01, 6.58546854e-01], [ 8.80000000e+01, 
6.55196641e-01], [ 8.90000000e+01, 6.51868451e-01], [ 9.00000000e+01, 6.48562087e-01], [ 9.10000000e+01, 6.45277352e-01], [ 9.20000000e+01, 6.42014054e-01], [ 9.30000000e+01, 6.38771999e-01], [ 9.40000000e+01, 6.35551001e-01], [ 9.50000000e+01, 6.32350872e-01], [ 9.60000000e+01, 6.29171428e-01], [ 9.70000000e+01, 6.26012487e-01], [ 9.80000000e+01, 6.22873870e-01], [ 9.90000000e+01, 6.19755397e-01], [ 1.00000000e+02, 6.16656895e-01], [ 1.02000000e+02, 6.10519107e-01], [ 1.04000000e+02, 6.04459143e-01], [ 1.06000000e+02, 5.98475670e-01], [ 1.08000000e+02, 5.92567388e-01], [ 1.10000000e+02, 5.86733026e-01], [ 1.12000000e+02, 5.80971342e-01], [ 1.14000000e+02, 5.75281125e-01], [ 1.16000000e+02, 5.69661187e-01], [ 1.18000000e+02, 5.64110371e-01], [ 1.20000000e+02, 5.58627545e-01], [ 1.22000000e+02, 5.53211600e-01], [ 1.24000000e+02, 5.47861454e-01], [ 1.26000000e+02, 5.42576048e-01], [ 1.28000000e+02, 5.37354347e-01], [ 1.30000000e+02, 5.32195337e-01], [ 1.32000000e+02, 5.27098028e-01], [ 1.34000000e+02, 5.22061450e-01], [ 1.36000000e+02, 5.17084654e-01], [ 1.38000000e+02, 5.12166711e-01], [ 1.40000000e+02, 5.07306712e-01], [ 1.42000000e+02, 5.02503768e-01], [ 1.44000000e+02, 4.97757006e-01], [ 1.46000000e+02, 4.93065573e-01], [ 1.48000000e+02, 4.88428634e-01], [ 1.50000000e+02, 4.83845370e-01], [ 1.52000000e+02, 4.79314980e-01], [ 1.54000000e+02, 4.74836677e-01], [ 1.56000000e+02, 4.70409692e-01], [ 1.58000000e+02, 4.66033271e-01], [ 1.60000000e+02, 4.61706674e-01], [ 1.62000000e+02, 4.57429176e-01], [ 1.64000000e+02, 4.53200067e-01], [ 1.66000000e+02, 4.49018650e-01], [ 1.68000000e+02, 4.44884242e-01], [ 1.70000000e+02, 4.40796174e-01], [ 1.72000000e+02, 4.36753787e-01], [ 1.74000000e+02, 4.32756438e-01], [ 1.76000000e+02, 4.28803494e-01], [ 1.78000000e+02, 4.24894334e-01], [ 1.80000000e+02, 4.21028350e-01], [ 1.82000000e+02, 4.17204944e-01], [ 1.84000000e+02, 4.13423530e-01], [ 1.86000000e+02, 4.09683531e-01], [ 1.88000000e+02, 4.05984383e-01], [ 1.90000000e+02, 4.02325531e-01], [ 1.92000000e+02, 3.98706429e-01], [ 1.94000000e+02, 3.95126543e-01], [ 1.96000000e+02, 3.91585347e-01], [ 1.98000000e+02, 3.88082324e-01], [ 2.00000000e+02, 3.84616967e-01], [ 2.02000000e+02, 3.81188778e-01], [ 2.04000000e+02, 3.77797268e-01], [ 2.06000000e+02, 3.74441954e-01], [ 2.08000000e+02, 3.71122364e-01], [ 2.10000000e+02, 3.67838033e-01], [ 2.12000000e+02, 3.64588505e-01], [ 2.14000000e+02, 3.61373329e-01], [ 2.16000000e+02, 3.58192065e-01], [ 2.18000000e+02, 3.55044277e-01], [ 2.20000000e+02, 3.51929540e-01], [ 2.22000000e+02, 3.48847432e-01], [ 2.24000000e+02, 3.45797540e-01], [ 2.26000000e+02, 3.42779460e-01], [ 2.28000000e+02, 3.39792789e-01], [ 2.30000000e+02, 3.36837136e-01], [ 2.32000000e+02, 3.33912113e-01], [ 2.34000000e+02, 3.31017339e-01], [ 2.36000000e+02, 3.28152439e-01], [ 2.38000000e+02, 3.25317046e-01], [ 2.40000000e+02, 3.22510795e-01], [ 2.42000000e+02, 3.19733329e-01], [ 2.44000000e+02, 3.16984297e-01], [ 2.46000000e+02, 3.14263352e-01], [ 2.48000000e+02, 3.11570153e-01], [ 2.50000000e+02, 3.08904364e-01], [ 2.52000000e+02, 3.06265654e-01], [ 2.54000000e+02, 3.03653699e-01], [ 2.56000000e+02, 3.01068176e-01], [ 2.58000000e+02, 2.98508771e-01], [ 2.60000000e+02, 2.95975171e-01], [ 2.62000000e+02, 2.93467070e-01], [ 2.64000000e+02, 2.90984167e-01], [ 2.66000000e+02, 2.88526163e-01], [ 2.68000000e+02, 2.86092765e-01], [ 2.70000000e+02, 2.83683684e-01], [ 2.72000000e+02, 2.81298636e-01], [ 2.74000000e+02, 2.78937339e-01], [ 2.76000000e+02, 2.76599517e-01], [ 2.78000000e+02, 2.74284898e-01], [ 
2.80000000e+02, 2.71993211e-01], [ 2.82000000e+02, 2.69724193e-01], [ 2.84000000e+02, 2.67477582e-01], [ 2.86000000e+02, 2.65253121e-01], [ 2.88000000e+02, 2.63050554e-01], [ 2.90000000e+02, 2.60869633e-01], [ 2.92000000e+02, 2.58710110e-01], [ 2.94000000e+02, 2.56571741e-01], [ 2.96000000e+02, 2.54454286e-01], [ 2.98000000e+02, 2.52357508e-01], [ 3.00000000e+02, 2.50281174e-01], [ 3.02000000e+02, 2.48225053e-01], [ 3.04000000e+02, 2.46188917e-01], [ 3.06000000e+02, 2.44172542e-01], [ 3.08000000e+02, 2.42175707e-01], [ 3.10000000e+02, 2.40198194e-01], [ 3.12000000e+02, 2.38239786e-01], [ 3.14000000e+02, 2.36300272e-01], [ 3.16000000e+02, 2.34379441e-01], [ 3.18000000e+02, 2.32477087e-01], [ 3.20000000e+02, 2.30593005e-01], [ 3.22000000e+02, 2.28726993e-01], [ 3.24000000e+02, 2.26878853e-01], [ 3.26000000e+02, 2.25048388e-01], [ 3.28000000e+02, 2.23235405e-01], [ 3.30000000e+02, 2.21439711e-01], [ 3.32000000e+02, 2.19661118e-01], [ 3.34000000e+02, 2.17899439e-01], [ 3.36000000e+02, 2.16154491e-01], [ 3.38000000e+02, 2.14426091e-01], [ 3.40000000e+02, 2.12714060e-01], [ 3.42000000e+02, 2.11018220e-01], [ 3.44000000e+02, 2.09338398e-01], [ 3.46000000e+02, 2.07674420e-01], [ 3.48000000e+02, 2.06026115e-01], [ 3.50000000e+02, 2.04393315e-01], [ 3.55000000e+02, 2.00378063e-01], [ 3.60000000e+02, 1.96456139e-01], [ 3.65000000e+02, 1.92625077e-01], [ 3.70000000e+02, 1.88882487e-01], [ 3.75000000e+02, 1.85226048e-01], [ 3.80000000e+02, 1.81653511e-01], [ 3.85000000e+02, 1.78162694e-01], [ 3.90000000e+02, 1.74751478e-01], [ 3.95000000e+02, 1.71417807e-01], [ 4.00000000e+02, 1.68159686e-01], [ 4.05000000e+02, 1.64975177e-01], [ 4.10000000e+02, 1.61862398e-01], [ 4.15000000e+02, 1.58819521e-01], [ 4.20000000e+02, 1.55844772e-01], [ 4.25000000e+02, 1.52936426e-01], [ 4.30000000e+02, 1.50092806e-01], [ 4.35000000e+02, 1.47312286e-01], [ 4.40000000e+02, 1.44593281e-01], [ 4.45000000e+02, 1.41934254e-01], [ 4.50000000e+02, 1.39333710e-01], [ 4.55000000e+02, 1.36790195e-01], [ 4.60000000e+02, 1.34302294e-01], [ 4.65000000e+02, 1.31868634e-01], [ 4.70000000e+02, 1.29487876e-01], [ 4.75000000e+02, 1.27158722e-01], [ 4.80000000e+02, 1.24879906e-01], [ 4.85000000e+02, 1.22650197e-01], [ 4.90000000e+02, 1.20468398e-01], [ 4.95000000e+02, 1.18333345e-01], [ 5.00000000e+02, 1.16243903e-01], [ 5.05000000e+02, 1.14198970e-01], [ 5.10000000e+02, 1.12197471e-01], [ 5.15000000e+02, 1.10238362e-01], [ 5.20000000e+02, 1.08320625e-01], [ 5.25000000e+02, 1.06443271e-01], [ 5.30000000e+02, 1.04605335e-01], [ 5.35000000e+02, 1.02805877e-01], [ 5.40000000e+02, 1.01043985e-01], [ 5.45000000e+02, 9.93187680e-02], [ 5.50000000e+02, 9.76293590e-02], [ 5.55000000e+02, 9.59749150e-02], [ 5.60000000e+02, 9.43546120e-02], [ 5.65000000e+02, 9.27676500e-02], [ 5.70000000e+02, 9.12132480e-02], [ 5.75000000e+02, 8.96906480e-02], [ 5.80000000e+02, 8.81991080e-02], [ 5.85000000e+02, 8.67379060e-02], [ 5.90000000e+02, 8.53063410e-02], [ 5.95000000e+02, 8.39037260e-02], [ 6.00000000e+02, 8.25293950e-02], [ 6.05000000e+02, 8.11826970e-02], [ 6.10000000e+02, 7.98629980e-02], [ 6.15000000e+02, 7.85696800e-02], [ 6.20000000e+02, 7.73021410e-02], [ 6.25000000e+02, 7.60597940e-02], [ 6.30000000e+02, 7.48420660e-02], [ 6.35000000e+02, 7.36484000e-02], [ 6.40000000e+02, 7.24782510e-02], [ 6.45000000e+02, 7.13310900e-02], [ 6.50000000e+02, 7.02063990e-02], [ 6.55000000e+02, 6.91036740e-02], [ 6.60000000e+02, 6.80224240e-02], [ 6.65000000e+02, 6.69621680e-02], [ 6.70000000e+02, 6.59224390e-02], [ 6.75000000e+02, 6.49027800e-02], [ 6.80000000e+02, 
6.39027480e-02], [ 6.85000000e+02, 6.29219090e-02], [ 6.90000000e+02, 6.19598370e-02], [ 6.95000000e+02, 6.10161220e-02], [ 7.00000000e+02, 6.00903600e-02], [ 7.05000000e+02, 5.91821570e-02], [ 7.10000000e+02, 5.82911310e-02], [ 7.15000000e+02, 5.74169070e-02], [ 7.20000000e+02, 5.65591200e-02], [ 7.25000000e+02, 5.57174140e-02], [ 7.30000000e+02, 5.48914400e-02], [ 7.35000000e+02, 5.40808600e-02], [ 7.40000000e+02, 5.32853430e-02], [ 7.45000000e+02, 5.25045650e-02], [ 7.50000000e+02, 5.17382100e-02], [ 7.55000000e+02, 5.09859710e-02], [ 7.60000000e+02, 5.02475460e-02], [ 7.65000000e+02, 4.95226430e-02], [ 7.70000000e+02, 4.88109740e-02], [ 7.75000000e+02, 4.81122600e-02], [ 7.80000000e+02, 4.74262270e-02], [ 7.85000000e+02, 4.67526090e-02], [ 7.90000000e+02, 4.60911450e-02], [ 7.95000000e+02, 4.54415810e-02], [ 8.00000000e+02, 4.48036680e-02], [ 8.05000000e+02, 4.41771640e-02], [ 8.10000000e+02, 4.35618310e-02], [ 8.15000000e+02, 4.29574380e-02], [ 8.20000000e+02, 4.23637590e-02], [ 8.25000000e+02, 4.17805730e-02], [ 8.30000000e+02, 4.12076640e-02], [ 8.35000000e+02, 4.06448220e-02], [ 8.40000000e+02, 4.00918390e-02], [ 8.45000000e+02, 3.95485160e-02], [ 8.50000000e+02, 3.90146540e-02], [ 8.55000000e+02, 3.84900630e-02], [ 8.60000000e+02, 3.79745540e-02], [ 8.65000000e+02, 3.74679440e-02], [ 8.70000000e+02, 3.69700540e-02], [ 8.75000000e+02, 3.64807070e-02], [ 8.80000000e+02, 3.59997340e-02], [ 8.85000000e+02, 3.55269650e-02], [ 8.90000000e+02, 3.50622380e-02], [ 8.95000000e+02, 3.46053930e-02], [ 9.00000000e+02, 3.41562720e-02], [ 9.05000000e+02, 3.37147240e-02], [ 9.10000000e+02, 3.32805980e-02], [ 9.15000000e+02, 3.28537490e-02], [ 9.20000000e+02, 3.24340320e-02], [ 9.25000000e+02, 3.20213090e-02], [ 9.30000000e+02, 3.16154430e-02], [ 9.35000000e+02, 3.12163000e-02], [ 9.40000000e+02, 3.08237490e-02], [ 9.45000000e+02, 3.04376630e-02], [ 9.50000000e+02, 3.00579150e-02], [ 9.55000000e+02, 2.96843850e-02], [ 9.60000000e+02, 2.93169510e-02], [ 9.65000000e+02, 2.89554980e-02], [ 9.70000000e+02, 2.85999100e-02], [ 9.75000000e+02, 2.82500750e-02], [ 9.80000000e+02, 2.79058840e-02], [ 9.85000000e+02, 2.75672290e-02], [ 9.90000000e+02, 2.72340060e-02], [ 9.95000000e+02, 2.69061120e-02], [ 1.00000000e+03, 2.65834450e-02], [ 1.00500000e+03, 2.62659080e-02], [ 1.01000000e+03, 2.59534050e-02], [ 1.01500000e+03, 2.56458410e-02], [ 1.02000000e+03, 2.53431240e-02], [ 1.02500000e+03, 2.50451630e-02], [ 1.03000000e+03, 2.47518710e-02], [ 1.03500000e+03, 2.44631600e-02], [ 1.04000000e+03, 2.41789470e-02], [ 1.04500000e+03, 2.38991470e-02], [ 1.05000000e+03, 2.36236800e-02], [ 1.05500000e+03, 2.33524670e-02], [ 1.06000000e+03, 2.30854290e-02], [ 1.06500000e+03, 2.28224910e-02], [ 1.07000000e+03, 2.25635770e-02], [ 1.07500000e+03, 2.23086150e-02], [ 1.08000000e+03, 2.20575330e-02], [ 1.08500000e+03, 2.18102600e-02], [ 1.09000000e+03, 2.15667290e-02], [ 1.09500000e+03, 2.13268720e-02], [ 1.10000000e+03, 2.10906220e-02]])
def fCO2eqD47_Petersen(T):
	'''
	CO2 equilibrium Δ47 value as a function of T (in degrees C)
	according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
	'''
	return float(_fCO2eqD47_Petersen(T))

CO2 equilibrium Δ47 value as a function of T (in degrees C) according to Petersen et al. (2019).

Wang_etal_CO2eqD47 = array([[-8.3000e+01, 1.8954e+00], [-7.3000e+01, 1.7530e+00], [-6.3000e+01, 1.6261e+00], [-5.3000e+01, 1.5126e+00], [-4.3000e+01, 1.4104e+00], [-3.3000e+01, 1.3182e+00], [-2.3000e+01, 1.2345e+00], [-1.3000e+01, 1.1584e+00], [-3.0000e+00, 1.0888e+00], [ 7.0000e+00, 1.0251e+00], [ 1.7000e+01, 9.6650e-01], [ 2.7000e+01, 9.1250e-01], [ 3.7000e+01, 8.6260e-01], [ 4.7000e+01, 8.1640e-01], [ 5.7000e+01, 7.7340e-01], [ 6.7000e+01, 7.3340e-01], [ 8.7000e+01, 6.6120e-01], [ 9.7000e+01, 6.2860e-01], [ 1.0700e+02, 5.9800e-01], [ 1.1700e+02, 5.6930e-01], [ 1.2700e+02, 5.4230e-01], [ 1.3700e+02, 5.1690e-01], [ 1.4700e+02, 4.9300e-01], [ 1.5700e+02, 4.7040e-01], [ 1.6700e+02, 4.4910e-01], [ 1.7700e+02, 4.2890e-01], [ 1.8700e+02, 4.0980e-01], [ 1.9700e+02, 3.9180e-01], [ 2.0700e+02, 3.7470e-01], [ 2.1700e+02, 3.5850e-01], [ 2.2700e+02, 3.4310e-01], [ 2.3700e+02, 3.2850e-01], [ 2.4700e+02, 3.1470e-01], [ 2.5700e+02, 3.0150e-01], [ 2.6700e+02, 2.8900e-01], [ 2.7700e+02, 2.7710e-01], [ 2.8700e+02, 2.6570e-01], [ 2.9700e+02, 2.5500e-01], [ 3.0700e+02, 2.4470e-01], [ 3.1700e+02, 2.3490e-01], [ 3.2700e+02, 2.2560e-01], [ 3.3700e+02, 2.1670e-01], [ 3.4700e+02, 2.0830e-01], [ 3.5700e+02, 2.0020e-01], [ 3.6700e+02, 1.9250e-01], [ 3.7700e+02, 1.8510e-01], [ 3.8700e+02, 1.7810e-01], [ 3.9700e+02, 1.7140e-01], [ 4.0700e+02, 1.6500e-01], [ 4.1700e+02, 1.5890e-01], [ 4.2700e+02, 1.5300e-01], [ 4.3700e+02, 1.4740e-01], [ 4.4700e+02, 1.4210e-01], [ 4.5700e+02, 1.3700e-01], [ 4.6700e+02, 1.3210e-01], [ 4.7700e+02, 1.2740e-01], [ 4.8700e+02, 1.2290e-01], [ 4.9700e+02, 1.1860e-01], [ 5.0700e+02, 1.1450e-01], [ 5.1700e+02, 1.1050e-01], [ 5.2700e+02, 1.0680e-01], [ 5.3700e+02, 1.0310e-01], [ 5.4700e+02, 9.9700e-02], [ 5.5700e+02, 9.6300e-02], [ 5.6700e+02, 9.3100e-02], [ 5.7700e+02, 9.0100e-02], [ 5.8700e+02, 8.7100e-02], [ 5.9700e+02, 8.4300e-02], [ 6.0700e+02, 8.1600e-02], [ 6.1700e+02, 7.9000e-02], [ 6.2700e+02, 7.6500e-02], [ 6.3700e+02, 7.4100e-02], [ 6.4700e+02, 7.1800e-02], [ 6.5700e+02, 6.9500e-02], [ 6.6700e+02, 6.7400e-02], [ 6.7700e+02, 6.5400e-02], [ 6.8700e+02, 6.3400e-02], [ 6.9700e+02, 6.1500e-02], [ 7.0700e+02, 5.9700e-02], [ 7.1700e+02, 5.7900e-02], [ 7.2700e+02, 5.6200e-02], [ 7.3700e+02, 5.4600e-02], [ 7.4700e+02, 5.3000e-02], [ 7.5700e+02, 5.1500e-02], [ 7.6700e+02, 5.0000e-02], [ 7.7700e+02, 4.8600e-02], [ 7.8700e+02, 4.7200e-02], [ 7.9700e+02, 4.5900e-02], [ 8.0700e+02, 4.4700e-02], [ 8.1700e+02, 4.3500e-02], [ 8.2700e+02, 4.2300e-02], [ 8.3700e+02, 4.1100e-02], [ 8.4700e+02, 4.0000e-02], [ 8.5700e+02, 3.9000e-02], [ 8.6700e+02, 3.8000e-02], [ 8.7700e+02, 3.7000e-02], [ 8.8700e+02, 3.6000e-02], [ 8.9700e+02, 3.5100e-02], [ 9.0700e+02, 3.4200e-02], [ 9.1700e+02, 3.3300e-02], [ 9.2700e+02, 3.2500e-02], [ 9.3700e+02, 3.1700e-02], [ 9.4700e+02, 3.0900e-02], [ 9.5700e+02, 3.0200e-02], [ 9.6700e+02, 2.9400e-02], [ 9.7700e+02, 2.8700e-02], [ 9.8700e+02, 2.8100e-02], [ 9.9700e+02, 2.7400e-02], [ 1.0070e+03, 2.6800e-02], [ 1.0170e+03, 2.6100e-02], [ 1.0270e+03, 2.5500e-02], [ 1.0370e+03, 2.4900e-02], [ 1.0470e+03, 2.4400e-02], [ 1.0570e+03, 2.3800e-02], [ 1.0670e+03, 2.3300e-02], [ 1.0770e+03, 2.2800e-02], [ 1.0870e+03, 2.2300e-02], [ 1.0970e+03, 2.1800e-02]])
def fCO2eqD47_Wang(T):
	'''
	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
	according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
	(supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
	'''
	return float(_fCO2eqD47_Wang(T))

CO2 equilibrium Δ47 value as a function of T (in degrees C) according to Wang et al. (2004) (supplementary data of Dennis et al., 2011).
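Both wrappers simply evaluate the tabulated equilibrium values above at a given temperature. A minimal usage sketch (the printed values follow from the tables above and are not reproduced here):

from D47crunch import fCO2eqD47_Petersen, fCO2eqD47_Wang

print(fCO2eqD47_Petersen(25.))  # equilibrium Δ47 of CO2 at 25 °C (Petersen et al., 2019)
print(fCO2eqD47_Wang(25.))      # same temperature, Wang et al. (2004) values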

def correlated_sum(X, C, w = None):
	'''
	Compute covariance-aware linear combinations

	**Parameters**

	+ `X`: list or 1-D array of values to sum
	+ `C`: covariance matrix for the elements of `X`
	+ `w`: list or 1-D array of weights to apply to the elements of `X`
	       (all equal to 1 by default)

	Return the sum (and its SE) of the elements of `X`, with optional weights equal
	to the elements of `w`, accounting for covariances between the elements of `X`.
	'''
	if w is None:
		w = [1 for x in X]
	return np.dot(w,X), (np.dot(w,np.dot(C,w)))**.5

Compute covariance-aware linear combinations

Parameters

  • X: list or 1-D array of values to sum
  • C: covariance matrix for the elements of X
  • w: list or 1-D array of weights to apply to the elements of X (all equal to 1 by default)

Return the sum (and its SE) of the elements of X, with optional weights equal to the elements of w, accounting for covariances between the elements of X.
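For instance, a minimal sketch with two positively correlated values (illustrative numbers only):

import numpy as np
from D47crunch import correlated_sum

X = [0.25, 0.30]
C = np.array([
    [1.0e-4, 0.5e-4],
    [0.5e-4, 1.0e-4],
    ])
print(correlated_sum(X, C))
# yields approximately (0.55, 0.0173); ignoring the covariance
# term would instead give SE ≈ 0.0141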

def make_csv(x, hsep = ',', vsep = '\n'):
	'''
	Formats a list of lists of strings as a CSV

	**Parameters**

	+ `x`: the list of lists of strings to format
	+ `hsep`: the field separator (`,` by default)
	+ `vsep`: the line-ending convention to use (`\\n` by default)

	**Example**

	```py
	print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
	```

	outputs:

	```py
	a,b,c
	d,e,f
	```
	'''
	return vsep.join([hsep.join(l) for l in x])

Formats a list of lists of strings as a CSV

Parameters

  • x: the list of lists of strings to format
  • hsep: the field separator (, by default)
  • vsep: the line-ending convention to use (\n by default)

Example

print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))

outputs:

a,b,c
d,e,f

def pf(txt):
	'''
	Modify string `txt` to follow `lmfit.Parameter()` naming rules.
	'''
	return txt.replace('-','_').replace('.','_').replace(' ','_')

Modify string txt to follow lmfit.Parameter() naming rules.
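For example:

from D47crunch import pf

print(pf('ETH-1'))        # yields: ETH_1
print(pf('MY SAMPLE.2'))  # yields: MY_SAMPLE_2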

def smart_type(x):
	'''
	Tries to convert string `x` to a float if it includes a decimal point, or
	to an integer if it does not. If the conversion fails, return the original
	string unchanged.
	'''
	try:
		y = float(x)
	except ValueError:
		return x
	if '.' not in x:
		return int(y)
	return y

Tries to convert string x to a float if it includes a decimal point, or to an integer if it does not. If the conversion fails, return the original string unchanged.
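For example:

from D47crunch import smart_type

print(repr(smart_type('5.79502')))  # yields: 5.79502 (a float)
print(repr(smart_type('42')))       # yields: 42 (an int)
print(repr(smart_type('ETH-1')))    # yields: 'ETH-1' (string, unchanged)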

def pretty_table(x, header = 1, hsep = '  ', vsep = '–', align = '<'):
	'''
	Reads a list of lists of strings and outputs an ascii table

	**Parameters**

	+ `x`: a list of lists of strings
	+ `header`: the number of lines to treat as header lines
	+ `hsep`: the horizontal separator between columns
	+ `vsep`: the character to use as vertical separator
	+ `align`: string of left (`<`) or right (`>`) alignment characters.

	**Example**

	```py
	x = [['A', 'B', 'C'], ['1', '1.9999', 'foo'], ['10', 'x', 'bar']]
	print(pretty_table(x))
	```
	yields:
	```
	––  ––––––  –––
	A        B    C
	––  ––––––  –––
	1   1.9999  foo
	10       x  bar
	––  ––––––  –––
	```
	'''
	txt = []
	widths = [np.max([len(e) for e in c]) for c in zip(*x)]

	if len(widths) > len(align):
		align += '>' * (len(widths)-len(align))
	sepline = hsep.join([vsep*w for w in widths])
	txt += [sepline]
	for k,l in enumerate(x):
		if k and k == header:
			txt += [sepline]
		txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
	txt += [sepline]
	txt += ['']
	return '\n'.join(txt)

Reads a list of lists of strings and outputs an ascii table

Parameters

  • x: a list of lists of strings
  • header: the number of lines to treat as header lines
  • hsep: the horizontal separator between columns
  • vsep: the character to use as vertical separator
  • align: string of left (<) or right (>) alignment characters.

Example

x = [['A', 'B', 'C'], ['1', '1.9999', 'foo'], ['10', 'x', 'bar']]
print(pretty_table(x))

yields:

––  ––––––  –––
A        B    C
––  ––––––  –––
1   1.9999  foo
10       x  bar
––  ––––––  –––

def transpose_table(x):
	'''
	Transpose a list of lists

	**Parameters**

	+ `x`: a list of lists

	**Example**

	```py
	x = [[1, 2], [3, 4]]
	print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
	```
	'''
	return [[e for e in c] for c in zip(*x)]

Transpose a list of lists

Parameters

  • x: a list of lists

Example

x = [[1, 2], [3, 4]]
print(transpose_table(x)) # yields: [[1, 3], [2, 4]]

def w_avg(X, sX):
	'''
	Compute variance-weighted average

	Returns the value and SE of the weighted average of the elements of `X`,
	with relative weights equal to their inverse variances (`1/sX**2`).

	**Parameters**

	+ `X`: array-like of elements to average
	+ `sX`: array-like of the corresponding SE values

	**Tip**

	If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
	they may be rearranged using `zip()`:

	```python
	foo = [(0, 1), (1, 0.5), (2, 0.5)]
	print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
	```
	'''
	X = [ x for x in X ]
	sX = [ sx for sx in sX ]
	W = [ sx**-2 for sx in sX ]
	W = [ w/sum(W) for w in W ]
	Xavg = sum([ w*x for w,x in zip(W,X) ])
	sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5
	return Xavg, sXavg

Compute variance-weighted average

Returns the value and SE of the weighted average of the elements of X, with relative weights equal to their inverse variances (1/sX**2).

Parameters

  • X: array-like of elements to average
  • sX: array-like of the corresponding SE values

Tip

If X and sX are initially arranged as a list of (x, sx) doublets, they may be rearranged using zip():

foo = [(0, 1), (1, 0.5), (2, 0.5)]
print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)

def read_csv(filename, sep = ''):
	'''
	Read contents of `filename` in csv format and return a list of dictionaries.

	In the csv string, spaces before and after field separators (`','` by default)
	are optional.

	**Parameters**

	+ `filename`: the csv file to read
	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
	whichever appears most often in the contents of `filename`.
	'''
	with open(filename) as fid:
		txt = fid.read()

	if sep == '':
		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
	return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]]

Read contents of filename in csv format and return a list of dictionaries.

In the csv string, spaces before and after field separators (',' by default) are optional.

Parameters

  • filename: the csv file to read
  • sep: csv separator delimiting the fields. By default, use ',', ';', or a tab character, whichever appears most often in the contents of filename.
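
A minimal sketch, writing a small csv file (hypothetical name and values) and reading it back:

from D47crunch import read_csv

with open('example.csv', 'w') as fid:
    fid.write('Sample, d45, d46\nETH-1, 5.795, 11.628\nETH-2, -6.059, -4.817')

print(read_csv('example.csv'))
# yields:
# [{'Sample': 'ETH-1', 'd45': 5.795, 'd46': 11.628},
#  {'Sample': 'ETH-2', 'd45': -6.059, 'd46': -4.817}]
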
def simulate_single_analysis(
	sample = 'MYSAMPLE',
	d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
	d13C_VPDB = None, d18O_VPDB = None,
	D47 = None, D48 = None, D49 = 0., D17O = 0.,
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	Nominal_D47 = None,
	Nominal_D48 = None,
	Nominal_d13C_VPDB = None,
	Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	):
	'''
	Compute working-gas delta values for a single analysis, assuming a stochastic working
	gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

	**Parameters**

	+ `sample`: sample name
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(respectively –4 and +26 ‰ by default)
	+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
		of the carbonate sample
	+ `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and
		Δ48 values if `D47` or `D48` are not specified
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `D4xdata` default values)

	Returns a dictionary with fields
	`['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
	'''

	if Nominal_d13C_VPDB is None:
		Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB

	if Nominal_d18O_VPDB is None:
		Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB

	if ALPHA_18O_ACID_REACTION is None:
		ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION

	if R13_VPDB is None:
		R13_VPDB = D4xdata().R13_VPDB

	if R17_VSMOW is None:
		R17_VSMOW = D4xdata().R17_VSMOW

	if R18_VSMOW is None:
		R18_VSMOW = D4xdata().R18_VSMOW

	if LAMBDA_17 is None:
		LAMBDA_17 = D4xdata().LAMBDA_17

	if R18_VPDB is None:
		R18_VPDB = D4xdata().R18_VPDB

	R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17

	if Nominal_D47 is None:
		Nominal_D47 = D47data().Nominal_D47

	if Nominal_D48 is None:
		Nominal_D48 = D48data().Nominal_D48

	if d13C_VPDB is None:
		if sample in Nominal_d13C_VPDB:
			d13C_VPDB = Nominal_d13C_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")

	if d18O_VPDB is None:
		if sample in Nominal_d18O_VPDB:
			d18O_VPDB = Nominal_d18O_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")

	if D47 is None:
		if sample in Nominal_D47:
			D47 = Nominal_D47[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")

	if D48 is None:
		if sample in Nominal_D48:
			D48 = Nominal_D48[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")

	X = D4xdata()
	X.R13_VPDB = R13_VPDB
	X.R17_VSMOW = R17_VSMOW
	X.R18_VSMOW = R18_VSMOW
	X.LAMBDA_17 = LAMBDA_17
	X.R18_VPDB = R18_VPDB
	X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17

	R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000),
		R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
		)
	R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O=D17O, D47=D47, D48=D48, D49=D49,
		)
	R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O=D17O,
		)

	d45 = 1000 * (R45/R45wg - 1)
	d46 = 1000 * (R46/R46wg - 1)
	d47 = 1000 * (R47/R47wg - 1)
	d48 = 1000 * (R48/R48wg - 1)
	d49 = 1000 * (R49/R49wg - 1)

	for k in range(3): # iterate a few times to adjust for small changes in d47/d48
		R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
		R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch
		d47 = 1000 * (R47raw/R47wg - 1)
		d48 = 1000 * (R48raw/R48wg - 1)

	return dict(
		Sample = sample,
		D17O = D17O,
		d13Cwg_VPDB = d13Cwg_VPDB,
		d18Owg_VSMOW = d18Owg_VSMOW,
		d45 = d45,
		d46 = d46,
		d47 = d47,
		d48 = d48,
		d49 = d49,
		)

Compute working-gas delta values for a single analysis, assuming a stochastic working gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

Parameters

  • sample: sample name
  • d13Cwg_VPDB, d18Owg_VSMOW: bulk composition of the working gas (respectively –4 and +26 ‰ by default)
  • d13C_VPDB, d18O_VPDB: bulk composition of the carbonate sample
  • D47, D48, D49, D17O: clumped-isotope and oxygen-17 anomalies of the carbonate sample
  • Nominal_D47, Nominal_D48: where to look up Δ47 and Δ48 values if D47 or D48 are not specified
  • Nominal_d13C_VPDB, Nominal_d18O_VPDB: where to look up δ13C and δ18O values if d13C_VPDB or d18O_VPDB are not specified
  • ALPHA_18O_ACID_REACTION: 18O/16O acid fractionation factor
  • R13_VPDB, R17_VSMOW, R18_VSMOW, LAMBDA_17, R18_VPDB: oxygen-17 correction parameters (by default equal to the D4xdata default values)

Returns a dictionary with fields ['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49'].
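
For example, simulating one perfect analysis of an anchor, with its bulk composition and Δ values looked up from the nominal dictionaries (a sketch; the numerical output is not reproduced here):

from D47crunch import simulate_single_analysis

a = simulate_single_analysis(sample = 'ETH-3')
print(a['Sample'], a['d13Cwg_VPDB'], a['d18Owg_VSMOW'])  # yields: ETH-3 -4.0 26.0
print(a['d45'], a['d47'])  # raw delta values consistent with the nominal composition of ETH-3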

def virtual_data(
	samples = [],
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	rd45 = 0.020, rd46 = 0.060,
	rD47 = 0.015, rD48 = 0.045,
	d13Cwg_VPDB = None, d18Owg_VSMOW = None,
	session = None,
	Nominal_D47 = None, Nominal_D48 = None,
	Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	seed = 0,
	shuffle = True,
	):
	'''
	Return list with simulated analyses from a single session.

	**Parameters**

	+ `samples`: a list of entries; each entry is a dictionary with the following fields:
	    * `Sample`: the name of the sample
	    * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	    * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
	    * `N`: how many analyses to generate for this sample
	+ `a47`: scrambling factor for Δ47
	+ `b47`: compositional nonlinearity for Δ47
	+ `c47`: working gas offset for Δ47
	+ `a48`: scrambling factor for Δ48
	+ `b48`: compositional nonlinearity for Δ48
	+ `c48`: working gas offset for Δ48
	+ `rd45`: analytical repeatability of δ45
	+ `rd46`: analytical repeatability of δ46
	+ `rD47`: analytical repeatability of Δ47
	+ `rD48`: analytical repeatability of Δ48
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(by default equal to the `simulate_single_analysis` default values)
	+ `session`: name of the session (no name by default)
	+ `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and Δ48 values
		if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
		(by default equal to the `simulate_single_analysis` defaults)
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
		(by default equal to the `simulate_single_analysis` defaults)
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `simulate_single_analysis` defaults)
	+ `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
	+ `shuffle`: randomly reorder the sequence of analyses

	Here is an example of using this method to generate an arbitrary combination of
	anchors and unknowns for a bunch of sessions:

	```py
	.. include:: ../../code_examples/virtual_data/example.py
	```

	This should output something like:

	```
	.. include:: ../../code_examples/virtual_data/output.txt
	```
	'''

	kwargs = locals().copy()

	from numpy import random as nprandom
	if seed:
		rng = nprandom.default_rng(seed)
	else:
		rng = nprandom.default_rng()

	N = sum([s['N'] for s in samples])
	errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors45 *= rd45 / stdev(errors45) # scale errors to rd45
	errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors46 *= rd46 / stdev(errors46) # scale errors to rd46
	errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors47 *= rD47 / stdev(errors47) # scale errors to rD47
	errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors48 *= rD48 / stdev(errors48) # scale errors to rD48

	k = 0
	out = []
	for s in samples:
		kw = {}
		kw['sample'] = s['Sample']
		kw = {
			**kw,
			**{var: kwargs[var]
				for var in [
					'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
					'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
					'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
					'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
					]
				if kwargs[var] is not None},
			**{var: s[var]
				for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
				if var in s},
			}

		sN = s['N']
		while sN:
			out.append(simulate_single_analysis(**kw))
			out[-1]['d45'] += errors45[k]
			out[-1]['d46'] += errors46[k]
			out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
			out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
			sN -= 1
			k += 1

		if session is not None:
			for r in out:
				r['Session'] = session

		if shuffle:
			nprandom.shuffle(out)

	return out

Return list with simulated analyses from a single session.

Parameters

  • samples: a list of entries; each entry is a dictionary with the following fields:
    • Sample: the name of the sample
    • d13C_VPDB, d18O_VPDB: bulk composition of the carbonate sample
    • D47, D48, D49, D17O (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
    • N: how many analyses to generate for this sample
  • a47: scrambling factor for Δ47
  • b47: compositional nonlinearity for Δ47
  • c47: working gas offset for Δ47
  • a48: scrambling factor for Δ48
  • b48: compositional nonlinearity for Δ48
  • c48: working gas offset for Δ48
  • rd45: analytical repeatability of δ45
  • rd46: analytical repeatability of δ46
  • rD47: analytical repeatability of Δ47
  • rD48: analytical repeatability of Δ48
  • d13Cwg_VPDB, d18Owg_VSMOW: bulk composition of the working gas (by default equal to the simulate_single_analysis default values)
  • session: name of the session (no name by default)
  • Nominal_D47, Nominal_D48: where to look up Δ47 and Δ48 values if D47 or D48 are not specified (by default equal to the simulate_single_analysis defaults)
  • Nominal_d13C_VPDB, Nominal_d18O_VPDB: where to look up δ13C and δ18O values if d13C_VPDB or d18O_VPDB are not specified (by default equal to the simulate_single_analysis defaults)
  • ALPHA_18O_ACID_REACTION: 18O/16O acid fractionation factor (by default equal to the simulate_single_analysis defaults)
  • R13_VPDB, R17_VSMOW, R18_VSMOW, LAMBDA_17, R18_VPDB: oxygen-17 correction parameters (by default equal to the simulate_single_analysis defaults)
  • seed: explicitly set to a non-zero value to achieve random but repeatable simulations
  • shuffle: randomly reorder the sequence of analyses

Here is an example of using this method to generate an arbitrary combination of anchors and unknowns for a bunch of sessions:

from D47crunch import virtual_data, D47data

args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 3,
            d13C_VPDB = -15., d18O_VPDB = -2.,
            D47 = 0.6, D48 = 0.2),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)

D = D47data(session1 + session2 + session3 + session4)

D.crunch()
D.standardize()

D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)

This should output something like:

[table_of_sessions] 
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––  ––––––––––––––
Session     Na  Nu  d13Cwg_VPDB  d18Owg_VSMOW  r_d13C  r_d18O   r_D47         a ± SE   1e3 x b ± SE          c ± SE
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––  ––––––––––––––
Session_01   9   6       -4.000        26.000  0.0205  0.0633  0.0075  1.015 ± 0.015  0.427 ± 0.232  -0.909 ± 0.006
Session_02   9   6       -4.000        26.000  0.0210  0.0882  0.0082  0.990 ± 0.015  0.484 ± 0.232  -0.905 ± 0.006
Session_03   9   6       -4.000        26.000  0.0186  0.0505  0.0091  0.997 ± 0.015  0.167 ± 0.233  -0.901 ± 0.006
Session_04   9   6       -4.000        26.000  0.0192  0.0467  0.0070  1.017 ± 0.015  0.229 ± 0.232  -0.910 ± 0.006
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––  ––––––––––––––

[table_of_samples] 
––––––  ––  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
Sample   N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene
––––––  ––  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
ETH-1   12       2.02       37.01  0.2052                    0.0083          
ETH-2   12     -10.17       19.88  0.2085                    0.0090          
ETH-3   12       1.71       37.46  0.6132                    0.0083          
BAR     12     -15.02       37.22  0.6057  0.0042  ± 0.0085  0.0088     0.753
FOO     12      -5.00       28.89  0.3024  0.0031  ± 0.0062  0.0070     0.497
––––––  ––  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––

[table_of_analyses] 
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––
UID     Session  Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48         d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw     D49raw       D47
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––
1    Session_01   ETH-3       -4.000        26.000   5.755174  11.255104   16.792797   22.451660   28.306614    1.723596   37.497816  -0.270825  -0.181089  -0.195908  0.621458
2    Session_01     BAR       -4.000        26.000  -9.959983  10.926995    0.053806   21.724901   10.707292  -15.041279   37.199026  -0.300066  -0.243252  -0.029371  0.599675
3    Session_01   ETH-3       -4.000        26.000   5.734896  11.229855   16.740410   22.402091   28.306614    1.702875   37.472070  -0.276998  -0.179635  -0.125368  0.615396
4    Session_01     FOO       -4.000        26.000  -0.838118   2.819853    1.310384    5.326005    4.665655   -5.004629   28.895933  -0.593755  -0.319861   0.014956  0.309692
5    Session_01     FOO       -4.000        26.000  -0.848028   2.874679    1.346196    5.439150    4.665655   -5.017230   28.951964  -0.601502  -0.316664  -0.081898  0.302042
6    Session_01   ETH-1       -4.000        26.000   6.010276  10.840276   16.207960   21.475150   27.780042    2.011176   37.073454  -0.704188  -0.315986  -0.172089  0.194589
7    Session_01     BAR       -4.000        26.000  -9.920507  10.903408    0.065076   21.704075   10.707292  -14.998270   37.174839  -0.307018  -0.216978  -0.026076  0.592818
8    Session_01   ETH-2       -4.000        26.000  -5.982229  -6.110437  -12.827036  -12.492272  -18.023381  -10.166188   19.784916  -0.693555  -0.312598   0.251040  0.217274
9    Session_01   ETH-2       -4.000        26.000  -5.991278  -5.995054  -12.741562  -12.184075  -18.023381  -10.180122   19.902809  -0.711697  -0.232746   0.032602  0.199357
10   Session_01     BAR       -4.000        26.000  -9.915975  10.968470    0.153453   21.749385   10.707292  -14.995822   37.241294  -0.286638  -0.301325  -0.157376  0.612868
11   Session_01   ETH-3       -4.000        26.000   5.727341  11.211663   16.713472   22.364770   28.306614    1.695479   37.453503  -0.278056  -0.180158  -0.082015  0.614365
12   Session_01   ETH-2       -4.000        26.000  -5.974124  -5.955517  -12.668784  -12.208184  -18.023381  -10.163274   19.943159  -0.694902  -0.336672  -0.063946  0.215880
13   Session_01   ETH-1       -4.000        26.000   6.049381  10.706856   16.135579   21.196941   27.780042    2.057827   36.937067  -0.685751  -0.324384   0.045870  0.212791
14   Session_01     FOO       -4.000        26.000  -0.876454   2.906764    1.341194    5.490264    4.665655   -5.048760   28.984806  -0.608593  -0.329808  -0.114437  0.295055
15   Session_01   ETH-1       -4.000        26.000   5.995601  10.755323   16.116087   21.285428   27.780042    1.998631   36.986704  -0.696924  -0.333640   0.008600  0.201787
16   Session_02   ETH-2       -4.000        26.000  -5.982371  -6.036210  -12.762399  -12.309944  -18.023381  -10.175178   19.819614  -0.701348  -0.277354   0.104418  0.212021
17   Session_02     BAR       -4.000        26.000  -9.963888  10.865863   -0.023549   21.615868   10.707292  -15.053743   37.174715  -0.313906  -0.229031   0.093637  0.597041
18   Session_02   ETH-3       -4.000        26.000   5.719281  11.207303   16.681693   22.370886   28.306614    1.691780   37.488633  -0.296801  -0.165556  -0.065004  0.606143
19   Session_02     FOO       -4.000        26.000  -0.848415   2.849823    1.308081    5.427767    4.665655   -5.018107   28.927036  -0.614791  -0.278426  -0.032784  0.292547
20   Session_02   ETH-1       -4.000        26.000   5.993918  10.617469   15.991900   21.070358   27.780042    2.006934   36.882679  -0.683329  -0.271476   0.278458  0.216152
21   Session_02     BAR       -4.000        26.000  -9.957566  10.903888    0.031785   21.739434   10.707292  -15.048386   37.213724  -0.302139  -0.183327   0.012926  0.608897
22   Session_02   ETH-3       -4.000        26.000   5.716356  11.091821   16.582487   22.123857   28.306614    1.692901   37.370126  -0.279100  -0.178789   0.162540  0.624067
23   Session_02   ETH-2       -4.000        26.000  -5.950370  -5.959974  -12.650784  -12.197864  -18.023381  -10.143809   19.897777  -0.696916  -0.317263  -0.080604  0.216441
24   Session_02   ETH-3       -4.000        26.000   5.757137  11.232751   16.744567   22.398244   28.306614    1.731295   37.514660  -0.298533  -0.189123  -0.154557  0.604363
25   Session_02     FOO       -4.000        26.000  -0.819742   2.826793    1.317044    5.330616    4.665655   -4.986618   28.903335  -0.612871  -0.329113  -0.018244  0.294481
26   Session_02   ETH-1       -4.000        26.000   6.019963  10.773112   16.163825   21.331060   27.780042    2.029040   37.042346  -0.692234  -0.324161  -0.051788  0.207075
27   Session_02     BAR       -4.000        26.000  -9.936020  10.862339    0.024660   21.563307   10.707292  -15.023836   37.171034  -0.291333  -0.273498   0.070452  0.619812
28   Session_02   ETH-2       -4.000        26.000  -5.993476  -5.944866  -12.696865  -12.149754  -18.023381  -10.190430   19.913381  -0.713779  -0.298963  -0.064251  0.199436
29   Session_02     FOO       -4.000        26.000  -0.835046   2.870518    1.355370    5.487896    4.665655   -5.004585   28.948243  -0.601666  -0.259900  -0.087592  0.305777
30   Session_02   ETH-1       -4.000        26.000   6.030532  10.851030   16.245571   21.457100   27.780042    2.037466   37.122284  -0.698413  -0.354920  -0.214443  0.200795
31   Session_03     BAR       -4.000        26.000  -9.952115  11.034508    0.169809   21.885915   10.707292  -15.002819   37.370451  -0.296804  -0.298351  -0.246731  0.606414
32   Session_03     BAR       -4.000        26.000  -9.957114  10.898997    0.044946   21.602296   10.707292  -15.003175   37.230716  -0.284699  -0.307849   0.021944  0.618578
33   Session_03     FOO       -4.000        26.000  -0.823857   2.761300    1.258060    5.239992    4.665655   -4.973383   28.817444  -0.603327  -0.288652   0.114488  0.298751
34   Session_03   ETH-3       -4.000        26.000   5.753467  11.206589   16.719131   22.373244   28.306614    1.723960   37.511190  -0.294350  -0.161838  -0.099835  0.606103
35   Session_03   ETH-1       -4.000        26.000   6.040566  10.786620   16.205283   21.374963   27.780042    2.045244   37.077432  -0.685706  -0.307909  -0.099869  0.213609
36   Session_03   ETH-1       -4.000        26.000   5.994622  10.743980   16.116098   21.243734   27.780042    1.997857   37.033567  -0.684883  -0.352014   0.031692  0.214449
37   Session_03     BAR       -4.000        26.000  -9.928709  10.989665    0.148059   21.852677   10.707292  -14.976237   37.324152  -0.299358  -0.242185  -0.184835  0.603855
38   Session_03   ETH-2       -4.000        26.000  -6.000290  -5.947172  -12.697463  -12.164602  -18.023381  -10.167221   19.848953  -0.705037  -0.309350  -0.052386  0.199061
39   Session_03   ETH-3       -4.000        26.000   5.748546  11.079879   16.580826   22.120063   28.306614    1.723364   37.380534  -0.302133  -0.158882   0.151641  0.598318
40   Session_03     FOO       -4.000        26.000  -0.873798   2.820799    1.272165    5.370745    4.665655   -5.028782   28.878917  -0.596008  -0.277258   0.051165  0.306090
41   Session_03   ETH-3       -4.000        26.000   5.718991  11.146227   16.640814   22.243185   28.306614    1.689442   37.449023  -0.277332  -0.169668   0.053997  0.623187
42   Session_03   ETH-1       -4.000        26.000   6.004078  10.683951   16.045192   21.214355   27.780042    2.010134   36.971642  -0.705956  -0.262026   0.138399  0.193323
43   Session_03     FOO       -4.000        26.000  -0.800284   2.851299    1.376828    5.379547    4.665655   -4.951581   28.910199  -0.597293  -0.329315  -0.087015  0.304784
44   Session_03   ETH-2       -4.000        26.000  -5.997147  -5.905858  -12.655382  -12.081612  -18.023381  -10.165400   19.891551  -0.706536  -0.308464  -0.137414  0.197550
45   Session_03   ETH-2       -4.000        26.000  -6.008525  -5.909707  -12.647727  -12.075913  -18.023381  -10.177379   19.887608  -0.683183  -0.294956  -0.117608  0.220975
46   Session_04   ETH-1       -4.000        26.000   6.023822  10.730714   16.121184   21.235757   27.780042    2.012958   36.989833  -0.696908  -0.333582   0.026555  0.205610
47   Session_04   ETH-3       -4.000        26.000   5.739420  11.128582   16.641344   22.166106   28.306614    1.695046   37.399884  -0.280608  -0.210162   0.066645  0.614665
48   Session_04     BAR       -4.000        26.000  -9.951025  10.951923    0.089386   21.738926   10.707292  -15.031949   37.254709  -0.298065  -0.278834  -0.087463  0.601230
49   Session_04     FOO       -4.000        26.000  -0.848192   2.777763    1.251297    5.280272    4.665655   -5.023358   28.822585  -0.601094  -0.281419   0.108186  0.303128
50   Session_04     BAR       -4.000        26.000  -9.931741  10.819830   -0.023748   21.529372   10.707292  -15.006533   37.118743  -0.302866  -0.222623   0.148462  0.596536
51   Session_04     FOO       -4.000        26.000  -0.853969   2.805035    1.267571    5.353907    4.665655   -5.030523   28.850660  -0.605611  -0.262571   0.060903  0.298685
52   Session_04   ETH-3       -4.000        26.000   5.751908  11.207110   16.726741   22.380392   28.306614    1.705481   37.480657  -0.285776  -0.155878  -0.099197  0.609567
53   Session_04   ETH-3       -4.000        26.000   5.798016  11.254135   16.832228   22.432473   28.306614    1.752928   37.528936  -0.275047  -0.197935  -0.239408  0.620088
54   Session_04     FOO       -4.000        26.000  -0.791191   2.708220    1.256167    5.145784    4.665655   -4.960004   28.750896  -0.586913  -0.276505   0.183674  0.317065
55   Session_04   ETH-2       -4.000        26.000  -5.966627  -5.893789  -12.597717  -12.120719  -18.023381  -10.161842   19.911776  -0.691757  -0.372308  -0.193986  0.217132
56   Session_04   ETH-1       -4.000        26.000   6.017312  10.735930   16.123043   21.270597   27.780042    2.005824   36.995214  -0.693479  -0.309795   0.023309  0.208980
57   Session_04   ETH-2       -4.000        26.000  -5.986501  -5.915157  -12.656583  -12.060382  -18.023381  -10.182247   19.889836  -0.709603  -0.268277  -0.130450  0.199604
58   Session_04   ETH-2       -4.000        26.000  -5.973623  -5.975018  -12.694278  -12.194472  -18.023381  -10.166297   19.828211  -0.701951  -0.283570  -0.025935  0.207135
59   Session_04   ETH-1       -4.000        26.000   6.029937  10.766997   16.151273   21.345479   27.780042    2.018148   37.027152  -0.708855  -0.297953  -0.050465  0.193862
60   Session_04     BAR       -4.000        26.000  -9.926078  10.884823    0.060864   21.650722   10.707292  -15.002880   37.185606  -0.287358  -0.232425   0.016044  0.611760
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––


def table_of_samples(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of samples
	for a pair of `D47data` and `D48data` objects.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_samples.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)

Print out, save to disk and/or return a combined table of samples for a pair of D47data and D48data objects.

Parameters

  • data47: D47data instance
  • data48: D48data instance
  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of lists of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
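
A sketch combining Δ47 and Δ48 results from a single simulated session (the same pattern applies to table_of_sessions() and table_of_analyses() below):

from D47crunch import D47data, D48data, virtual_data, table_of_samples

rawdata = virtual_data(
    samples = [
        dict(Sample = 'ETH-1', N = 4),
        dict(Sample = 'ETH-2', N = 4),
        dict(Sample = 'ETH-3', N = 4),
        dict(Sample = 'FOO', N = 4,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        ],
    session = 'Session_01', seed = 123)

D47 = D47data(rawdata)
D48 = D48data(rawdata)
for D in (D47, D48):
    D.crunch()
    D.standardize()

# single table with the Δ47 columns followed by the Δ48 columns:
table_of_samples(data47 = D47, data48 = D48, save_to_file = False, print_out = True)
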
def table_of_sessions(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of sessions
	for a pair of `D47data` and `D48data` objects.
	***Only applicable if the sessions in `data47` and those in `data48`
	consist of the exact same sets of analyses.***

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_sessions(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_sessions(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
			for k,x in enumerate(out47[0]):
				if k>7:
					out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
					out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_sessions.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)

Print out, save to disk and/or return a combined table of sessions for a pair of D47data and D48data objects. Only applicable if the sessions in data47 and those in data48 consist of the exact same sets of analyses.

Parameters

  • data47: D47data instance
  • data48: D48data instance
  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of lists of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])

def table_of_analyses(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of analyses
	for a pair of `D47data` and `D48data` objects.

	If the sessions in `data47` and those in `data48` do not consist of
	the exact same sets of analyses, the table will have two columns
	`Session_47` and `Session_48` instead of a single `Session` column.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_analyses(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_analyses(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')

			if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
				out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
			else:
				out47[0][1] = 'Session_47'
				out48[0][1] = 'Session_48'
				out47 = transpose_table(out47)
				out48 = transpose_table(out48)
				out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_analyses.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)

Print out, save to disk and/or return a combined table of analyses for a pair of D47data and D48data objects.

If the sessions in data47 and those in data48 do not consist of the exact same sets of analyses, the table will have two columns Session_47 and Session_48 instead of a single Session column.

Parameters

  • data47: D47data instance
  • data48: D48data instance
  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of lists of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
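
A short sketch of the session-column behavior, reusing the D47 and D48 objects standardized in the sketch above (they hold the exact same analyses, so a single Session column is kept):

out = table_of_analyses(data47 = D47, data48 = D48,
    save_to_file = False, print_out = False, output = 'raw')
print(out[0][:3])  # yields: ['UID', 'Session', 'Sample']
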
class D4xdata(list):
	'''
	Store and process data for a large set of Δ47 and/or Δ48
	analyses, usually comprising more than one analytical session.
	'''

	### 17O CORRECTION PARAMETERS
	R13_VPDB = 0.01118  # (Chang & Li, 1990)
	'''
	Absolute (13C/12C) ratio of VPDB.
	By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
	'''

	R18_VSMOW = 0.0020052  # (Baertschi, 1976)
	'''
	Absolute (18O/16O) ratio of VSMOW.
	By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
	'''

	LAMBDA_17 = 0.528  # (Barkan & Luz, 2005)
	'''
	Mass-dependent exponent for triple oxygen isotopes.
	By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
	'''

	R17_VSMOW = 0.00038475  # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
	'''
	Absolute (17O/16O) ratio of VSMOW.
	By default equal to 0.00038475
	([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
	rescaled to `R13_VPDB`)
	'''

	R18_VPDB = R18_VSMOW * 1.03092
	'''
	Absolute (18O/16O) ratio of VPDB.
	By definition equal to `R18_VSMOW * 1.03092`.
	'''

	R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
	'''
	Absolute (17O/16O) ratio of VPDB.
	By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
	'''

	LEVENE_REF_SAMPLE = 'ETH-3'
	'''
	After the Δ4x standardization step, each sample is tested to
	assess whether the Δ4x variance within all analyses for that
	sample differs significantly from that observed for a given reference
	sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
	which yields a p-value corresponding to the null hypothesis that the
	underlying variances are equal).

	`LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
	sample should be used as a reference for this test.
	'''

 859	ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6)  # (Kim et al., 2007, calcite)
 860	'''
 861	Specifies the 18O/16O fractionation factor generally applicable
 862	to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
 863	`D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`.
 864
 865	By default equal to 1.008129 (calcite reacted at 90 °C,
 866	[Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
 867	'''
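For calcite reacted at a temperature other than 90 °C, this attribute may be overridden using the same Kim et al. (2007) expression as above; a minimal sketch, with 70 °C chosen purely for illustration:

```py
import numpy as np
import D47crunch

mydata = D47crunch.D47data()
# same functional form as the 90 °C default, evaluated at 70 °C:
mydata.ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (70 + 273.15) - 1.79e-3), 6)
```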
 868
 869	Nominal_d13C_VPDB = {
 870		'ETH-1': 2.02,
 871		'ETH-2': -10.17,
 872		'ETH-3': 1.71,
 873		}	# (Bernasconi et al., 2018)
 874	'''
 875	Nominal δ13C_VPDB values assigned to carbonate standards, used by
 876	`D4xdata.standardize_d13C()`.
 877
 878	By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
 879	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 880	'''
 881
 882	Nominal_d18O_VPDB = {
 883		'ETH-1': -2.19,
 884		'ETH-2': -18.69,
 885		'ETH-3': -1.78,
 886		}	# (Bernasconi et al., 2018)
 887	'''
 888	Nominal δ18O_VPDB values assigned to carbonate standards, used by
 889	`D4xdata.standardize_d18O()`.
 890
 891	By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
 892	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 893	'''
 894
 895	d13C_STANDARDIZATION_METHOD = '2pt'
 896	'''
 897	Method by which to standardize δ13C values:
 898	
 899	+ `'none'`: do not apply any δ13C standardization.
 900	+ `'1pt'`: within each session, offset all initial δ13C values so as to
 901	minimize the difference between final δ13C_VPDB values and
 902	`Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
 903	+ `'2pt'`: within each session, apply an affine transformation to all δ13C
 904	values so as to minimize the difference between final δ13C_VPDB
 905	values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
 906	is defined).
 907	'''
 908
 909	d18O_STANDARDIZATION_METHOD = '2pt'
 910	'''
 911	Method by which to standardize δ18O values:
 912	
 913	+ `'none'`: do not apply any δ18O standardization.
 914	+ `'1pt'`: within each session, offset all initial δ18O values so as to
 915	minimize the difference between final δ18O_VPDB values and
 916	`Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
 917	+ `'2pt'`: within each session, apply an affine transformation to all δ18O
 918	values so as to minimize the difference between final δ18O_VPDB
 919	values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
 920	is defined).
 921	'''
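Both attributes are copied into each session dictionary by `refresh_sessions()`, so the standardization method can also be overridden per session after the data are loaded; a sketch, assuming a session named `'Session1'` exists in the dataset:

```py
import D47crunch

mydata = D47crunch.D47data()
mydata.read('rawdata.csv')
# use a simple offset correction in one session only
# ('Session1' is a hypothetical session name):
mydata.sessions['Session1']['d13C_standardization_method'] = '1pt'
mydata.sessions['Session1']['d18O_standardization_method'] = '1pt'
```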
 922
 923	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
 924		'''
 925		**Parameters**
 926
 927		+ `l`: a list of dictionaries, with each dictionary including at least the keys
 928		`Sample`, `d45`, `d46`, and `d47` or `d48`.
 929		+ `mass`: `'47'` or `'48'`
 930		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
 931		+ `session`: define session name for analyses without a `Session` key
 932		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
 933
 934		Returns a `D4xdata` object derived from `list`.
 935		'''
 936		self._4x = mass
 937		self.verbose = verbose
 938		self.prefix = 'D4xdata'
 939		self.logfile = logfile
 940		list.__init__(self, l)
 941		self.Nf = None
 942		self.repeatability = {}
 943		self.refresh(session = session)
 944
 945
 946	def make_verbal(oldfun):
 947		'''
 948		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
 949		'''
 950		@wraps(oldfun)
 951		def newfun(*args, verbose = '', **kwargs):
 952			myself = args[0]
 953			oldprefix = myself.prefix
 954			myself.prefix = oldfun.__name__
 955			if verbose != '':
 956				oldverbose = myself.verbose
 957				myself.verbose = verbose
 958			out = oldfun(*args, **kwargs)
 959			myself.prefix = oldprefix
 960			if verbose != '':
 961				myself.verbose = oldverbose
 962			return out
 963		return newfun
 964
 965
 966	def msg(self, txt):
 967		'''
 968		Log a message to `self.logfile`, and print it out if `verbose = True`
 969		'''
 970		self.log(txt)
 971		if self.verbose:
 972			print(f'{f"[{self.prefix}]":<16} {txt}')
 973
 974
 975	def vmsg(self, txt):
 976		'''
 977		Log a message to `self.logfile` and print it out
 978		'''
 979		self.log(txt)
 980		print(txt)
 981
 982
 983	def log(self, *txts):
 984		'''
 985		Log a message to `self.logfile`
 986		'''
 987		if self.logfile:
 988			with open(self.logfile, 'a') as fid:
 989				for txt in txts:
 990					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
 991
 992
 993	def refresh(self, session = 'mySession'):
 994		'''
 995		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
 996		'''
 997		self.fill_in_missing_info(session = session)
 998		self.refresh_sessions()
 999		self.refresh_samples()
1000
1001
1002	def refresh_sessions(self):
1003		'''
1004		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1005		to `False` for all sessions.
1006		'''
1007		self.sessions = {
1008			s: {'data': [r for r in self if r['Session'] == s]}
1009			for s in sorted({r['Session'] for r in self})
1010			}
1011		for s in self.sessions:
1012			self.sessions[s]['scrambling_drift'] = False
1013			self.sessions[s]['slope_drift'] = False
1014			self.sessions[s]['wg_drift'] = False
1015			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1016			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
1017
1018
1019	def refresh_samples(self):
1020		'''
1021		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1022		'''
1023		self.samples = {
1024			s: {'data': [r for r in self if r['Sample'] == s]}
1025			for s in sorted({r['Sample'] for r in self})
1026			}
1027		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1028		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
1029
1030
1031	def read(self, filename, sep = '', session = ''):
1032		'''
1033		Read file in csv format to load data into a `D47data` object.
1034
1035		In the csv file, spaces before and after field separators (`','` by default)
1036		are optional. Each line corresponds to a single analysis.
1037
1038		The required fields are:
1039
1040		+ `UID`: a unique identifier
1041		+ `Session`: an identifier for the analytical session
1042		+ `Sample`: a sample identifier
1043		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1044
1045	Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1046	VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Beyond the requirement
1047	above, working-gas deltas `d47`, `d48` and `d49` are optional, and set to NaN by default.
1048
1049		**Parameters**
1050
1051	+ `filename`: the path of the file to read
1052		+ `sep`: csv separator delimiting the fields
1053		+ `session`: set `Session` field to this string for all analyses
1054		'''
1055		with open(filename) as fid:
1056			self.input(fid.read(), sep = sep, session = session)
1057
1058
1059	def input(self, txt, sep = '', session = ''):
1060		'''
1061		Read `txt` string in csv format to load analysis data into a `D47data` object.
1062
1063		In the csv string, spaces before and after field separators (`','` by default)
1064		are optional. Each line corresponds to a single analysis.
1065
1066		The required fields are:
1067
1068		+ `UID`: a unique identifier
1069		+ `Session`: an identifier for the analytical session
1070		+ `Sample`: a sample identifier
1071		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1072
1073	Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1074	VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Beyond the requirement
1075	above, working-gas deltas `d47`, `d48` and `d49` are optional, and set to NaN by default.
1076
1077		**Parameters**
1078
1079		+ `txt`: the csv string to read
1080		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1081	whichever appears most often in `txt`.
1082		+ `session`: set `Session` field to this string for all analyses
1083		'''
1084		if sep == '':
1085			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1086		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1087		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1088
1089		if session != '':
1090			for r in data:
1091				r['Session'] = session
1092
1093		self += data
1094		self.refresh()
1095
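Because `input()` accepts a raw csv string, small data sets can be loaded without an intermediate file; a minimal sketch (the numerical values are made up for illustration):

```py
import D47crunch

mydata = D47crunch.D47data()
mydata.input('''UID, Session, Sample,    d45,      d46,       d47
A01, S1,      ETH-1,   5.795,   11.627,   16.893
A02, S1,      ETH-2,  -6.058,   -4.817,  -11.635''')
```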
1096
1097	@make_verbal
1098	def wg(self, samples = None, a18_acid = None):
1099		'''
1100		Compute bulk composition of the working gas for each session based on
1101		the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1102		`self.Nominal_d18O_VPDB`.
1103		'''
1104
1105		self.msg('Computing WG composition:')
1106
1107		if a18_acid is None:
1108			a18_acid = self.ALPHA_18O_ACID_REACTION
1109		if samples is None:
1110			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1111
1112		assert a18_acid, f'Acid fractionation factor should not be zero.'
1113
1114		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1115		R45R46_standards = {}
1116		for sample in samples:
1117			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1118			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1119			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1120			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1121			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1122
1123			C12_s = 1 / (1 + R13_s)
1124			C13_s = R13_s / (1 + R13_s)
1125			C16_s = 1 / (1 + R17_s + R18_s)
1126			C17_s = R17_s / (1 + R17_s + R18_s)
1127			C18_s = R18_s / (1 + R17_s + R18_s)
1128
1129			C626_s = C12_s * C16_s ** 2
1130			C627_s = 2 * C12_s * C16_s * C17_s
1131			C628_s = 2 * C12_s * C16_s * C18_s
1132			C636_s = C13_s * C16_s ** 2
1133			C637_s = 2 * C13_s * C16_s * C17_s
1134			C727_s = C12_s * C17_s ** 2
1135
1136			R45_s = (C627_s + C636_s) / C626_s
1137			R46_s = (C628_s + C637_s + C727_s) / C626_s
1138			R45R46_standards[sample] = (R45_s, R46_s)
1139		
1140		for s in self.sessions:
1141			db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
1142			assert db, f'No sample from {samples} found in session "{s}".'
1143# 			dbsamples = sorted({r['Sample'] for r in db})
1144
1145			X = [r['d45'] for r in db]
1146			Y = [R45R46_standards[r['Sample']][0] for r in db]
1147			x1, x2 = np.min(X), np.max(X)
1148
1149			if x1 < x2:
1150				wgcoord = x1/(x1-x2)
1151			else:
1152				wgcoord = 999
1153
1154			if wgcoord < -.5 or wgcoord > 1.5:
1155				# unreasonable to extrapolate to d45 = 0
1156				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1157			else :
1158				# d45 = 0 is reasonably well bracketed
1159				R45_wg = np.polyfit(X, Y, 1)[1]
1160
1161			X = [r['d46'] for r in db]
1162			Y = [R45R46_standards[r['Sample']][1] for r in db]
1163			x1, x2 = np.min(X), np.max(X)
1164
1165			if x1 < x2:
1166				wgcoord = x1/(x1-x2)
1167			else:
1168				wgcoord = 999
1169
1170			if wgcoord < -.5 or wgcoord > 1.5:
1171				# unreasonable to extrapolate to d46 = 0
1172				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1173			else :
1174				# d46 = 0 is reasonably well bracketed
1175				R46_wg = np.polyfit(X, Y, 1)[1]
1176
1177			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1178
1179			self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1180
1181			self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1182			self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1183			for r in self.sessions[s]['data']:
1184				r['d13Cwg_VPDB'] = d13Cwg_VPDB
1185				r['d18Owg_VSMOW'] = d18Owg_VSMOW
1186
1187
1188	def compute_bulk_delta(self, R45, R46, D17O = 0):
1189		'''
1190		Compute δ13C_VPDB and δ18O_VSMOW,
1191		by solving the generalized form of equation (17) from
1192		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1193		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
1194		solving the corresponding second-order Taylor polynomial.
1195		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
1196		'''
1197
1198		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1199
1200		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1201		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1202		C = 2 * self.R18_VSMOW
1203		D = -R46
1204
1205		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1206		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1207		cc = A + B + C + D
1208
1209		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1210
1211		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1212		R17 = K * R18 ** self.LAMBDA_17
1213		R13 = R45 - 2 * R17
1214
1215		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1216
1217		return d13C_VPDB, d18O_VSMOW
1218
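A sketch of a round-trip consistency check, using `compute_isobar_ratios()` (documented further below) to build stochastic R45 and R46 values and then recovering the bulk composition:

```py
import D47crunch

mydata = D47crunch.D47data()
# stochastic isobar ratios for CO2 with δ13C_VPDB = 0 and R18 = R18_VPDB:
R45, R46, R47, R48, R49 = mydata.compute_isobar_ratios(mydata.R13_VPDB, mydata.R18_VPDB)
d13C, d18O = mydata.compute_bulk_delta(R45, R46)
# d13C should be close to 0 ‰ and d18O close to +30.92 ‰
# (i.e. 1000 * (1.03092 - 1), the VPDB/VSMOW 18O scale offset)
```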
1219
1220	@make_verbal
1221	def crunch(self, verbose = ''):
1222		'''
1223		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1224		'''
1225		for r in self:
1226			self.compute_bulk_and_clumping_deltas(r)
1227		self.standardize_d13C()
1228		self.standardize_d18O()
1229		self.msg(f"Crunched {len(self)} analyses.")
1230
1231
1232	def fill_in_missing_info(self, session = 'mySession'):
1233		'''
1234		Fill in optional fields with default values
1235		'''
1236		for i,r in enumerate(self):
1237			if 'D17O' not in r:
1238				r['D17O'] = 0.
1239			if 'UID' not in r:
1240				r['UID'] = f'{i+1}'
1241			if 'Session' not in r:
1242				r['Session'] = session
1243			for k in ['d47', 'd48', 'd49']:
1244				if k not in r:
1245					r[k] = np.nan
1246
1247
1248	def standardize_d13C(self):
1249		'''
1250	Perform δ13C standardization within each session `s` according to
1251	`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1252	by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1253	may be redefined arbitrarily at a later stage.
1254		'''
1255		for s in self.sessions:
1256			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1257				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1258				X,Y = zip(*XY)
1259				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1260					offset = np.mean(Y) - np.mean(X)
1261					for r in self.sessions[s]['data']:
1262						r['d13C_VPDB'] += offset				
1263				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1264					a,b = np.polyfit(X,Y,1)
1265					for r in self.sessions[s]['data']:
1266						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b
1267
1268	def standardize_d18O(self):
1269		'''
1270	Perform δ18O standardization within each session `s` according to
1271	`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1272	which is defined by default by `D47data.refresh_sessions()` as equal to
1273	`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1274		'''
1275		for s in self.sessions:
1276			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1277				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1278				X,Y = zip(*XY)
1279				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1280				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1281					offset = np.mean(Y) - np.mean(X)
1282					for r in self.sessions[s]['data']:
1283						r['d18O_VSMOW'] += offset				
1284				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1285					a,b = np.polyfit(X,Y,1)
1286					for r in self.sessions[s]['data']:
1287						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
1288	
1289
1290	def compute_bulk_and_clumping_deltas(self, r):
1291		'''
1292		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1293		'''
1294
1295		# Compute working gas R13, R18, and isobar ratios
1296		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1297		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1298		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1299
1300		# Compute analyte isobar ratios
1301		R45 = (1 + r['d45'] / 1000) * R45_wg
1302		R46 = (1 + r['d46'] / 1000) * R46_wg
1303		R47 = (1 + r['d47'] / 1000) * R47_wg
1304		R48 = (1 + r['d48'] / 1000) * R48_wg
1305		R49 = (1 + r['d49'] / 1000) * R49_wg
1306
1307		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1308		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1309		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1310
1311		# Compute stochastic isobar ratios of the analyte
1312		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1313			R13, R18, D17O = r['D17O']
1314		)
1315
1316		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1317		# and raise a warning if the corresponding anomalies exceed 0.05 ppm (5e-8).
1318		if (R45 / R45stoch - 1) > 5e-8:
1319			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1320		if (R46 / R46stoch - 1) > 5e-8:
1321			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1322
1323		# Compute raw clumped isotope anomalies
1324		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1325		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1326		r['D49raw'] = 1000 * (R49 / R49stoch - 1)
1327
1328
1329	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1330		'''
1331		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1332		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1333		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1334		'''
1335
1336		# Compute R17
1337		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1338
1339		# Compute isotope concentrations
1340		C12 = (1 + R13) ** -1
1341		C13 = C12 * R13
1342		C16 = (1 + R17 + R18) ** -1
1343		C17 = C16 * R17
1344		C18 = C16 * R18
1345
1346		# Compute stochastic isotopologue concentrations
1347		C626 = C16 * C12 * C16
1348		C627 = C16 * C12 * C17 * 2
1349		C628 = C16 * C12 * C18 * 2
1350		C636 = C16 * C13 * C16
1351		C637 = C16 * C13 * C17 * 2
1352		C638 = C16 * C13 * C18 * 2
1353		C727 = C17 * C12 * C17
1354		C728 = C17 * C12 * C18 * 2
1355		C737 = C17 * C13 * C17
1356		C738 = C17 * C13 * C18 * 2
1357		C828 = C18 * C12 * C18
1358		C838 = C18 * C13 * C18
1359
1360		# Compute stochastic isobar ratios
1361		R45 = (C636 + C627) / C626
1362		R46 = (C628 + C637 + C727) / C626
1363		R47 = (C638 + C728 + C737) / C626
1364		R48 = (C738 + C828) / C626
1365		R49 = C838 / C626
1366
1367		# Account for stochastic anomalies
1368		R47 *= 1 + D47 / 1000
1369		R48 *= 1 + D48 / 1000
1370		R49 *= 1 + D49 / 1000
1371
1372		# Return isobar ratios
1373		return R45, R46, R47, R48, R49
1374
1375
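The effect of a non-zero clumping anomaly can be made explicit by comparing stochastic and Δ47-shifted isobar ratios; a minimal sketch:

```py
import D47crunch

mydata = D47crunch.D47data()
R13, R18 = mydata.R13_VPDB, mydata.R18_VSMOW
stoch = mydata.compute_isobar_ratios(R13, R18)
clumped = mydata.compute_isobar_ratios(R13, R18, D47 = 0.3)
# R47 is the third returned ratio; by construction the anomaly is 0.3 ‰:
print(1000 * (clumped[2] / stoch[2] - 1))  # ≈ 0.3
```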
1376	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1377		'''
1378		Split unknown samples by UID (treat all analyses as different samples)
1379		or by session (treat analyses of a given sample in different sessions as
1380		different samples).
1381
1382		**Parameters**
1383
1384		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
1385		+ `grouping`: `by_uid` | `by_session`
1386		'''
1387		if samples_to_split == 'all':
1388			samples_to_split = [s for s in self.unknowns]
1389		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1390		self.grouping = grouping.lower()
1391		if self.grouping in gkeys:
1392			gkey = gkeys[self.grouping]
1393		for r in self:
1394			if r['Sample'] in samples_to_split:
1395				r['Sample_original'] = r['Sample']
1396				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1397			elif r['Sample'] in self.unknowns:
1398				r['Sample_original'] = r['Sample']
1399		self.refresh_samples()
1400
1401
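Together with `unsplit_samples()` below, this makes it easy to check whether an unknown yields consistent Δ47 values from one session to the next; a sketch, continuing from a `D47data` object (`mydata`) that has already been read and crunched:

```py
# treat each session's analyses of MYSAMPLE-1 as a separate sample:
mydata.split_samples(['MYSAMPLE-1'], grouping = 'by_session')
mydata.standardize(method = 'pooled')
# inspect per-session values (e.g., with mydata.table_of_samples()), then revert:
mydata.unsplit_samples()
```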
1402	def unsplit_samples(self, tables = False):
1403		'''
1404		Reverse the effects of `D47data.split_samples()`.
1405		
1406		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1407		
1408		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1409		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1410		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1411	effects of `D47data.split_samples()` with `grouping='by_session'` (because in
1412		that case session-averaged Δ4x values are statistically independent).
1413		'''
1414		unknowns_old = sorted({s for s in self.unknowns})
1415		CM_old = self.standardization.covar[:,:]
1416		VD_old = self.standardization.params.valuesdict().copy()
1417		vars_old = self.standardization.var_names
1418
1419		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1420
1421		Ns = len(vars_old) - len(unknowns_old)
1422		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1423		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1424
1425		W = np.zeros((len(vars_new), len(vars_old)))
1426		W[:Ns,:Ns] = np.eye(Ns)
1427		for u in unknowns_new:
1428			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1429			if self.grouping == 'by_session':
1430				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1431			elif self.grouping == 'by_uid':
1432				weights = [1 for s in splits]
1433			sw = sum(weights)
1434			weights = [w/sw for w in weights]
1435			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1436
1437		CM_new = W @ CM_old @ W.T
1438		V = W @ np.array([[VD_old[k]] for k in vars_old])
1439		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1440
1441		self.standardization.covar = CM_new
1442		self.standardization.params.valuesdict = lambda : VD_new
1443		self.standardization.var_names = vars_new
1444
1445		for r in self:
1446			if r['Sample'] in self.unknowns:
1447				r['Sample_split'] = r['Sample']
1448				r['Sample'] = r['Sample_original']
1449
1450		self.refresh_samples()
1451		self.consolidate_samples()
1452		self.repeatabilities()
1453
1454		if tables:
1455			self.table_of_analyses()
1456			self.table_of_samples()
1457
1458	def assign_timestamps(self):
1459		'''
1460		Assign a time field `t` of type `float` to each analysis.
1461
1462		If `TimeTag` is one of the data fields, `t` is equal within a given session
1463		to `TimeTag` minus the mean value of `TimeTag` for that session.
1464	Otherwise, `t` defaults to the index of each analysis within its session,
1465	recentered on the session mean (i.e. on `(len(session data) - 1) / 2`).
1466		'''
1467		for session in self.sessions:
1468			sdata = self.sessions[session]['data']
1469			try:
1470				t0 = np.mean([r['TimeTag'] for r in sdata])
1471				for r in sdata:
1472					r['t'] = r['TimeTag'] - t0
1473			except KeyError:
1474				t0 = (len(sdata)-1)/2
1475				for t,r in enumerate(sdata):
1476					r['t'] = t - t0
1477
1478
1479	def report(self):
1480		'''
1481		Prints a report on the standardization fit.
1482		Only applicable after `D4xdata.standardize(method='pooled')`.
1483		'''
1484		report_fit(self.standardization)
1485
1486
1487	def combine_samples(self, sample_groups):
1488		'''
1489		Combine analyses of different samples to compute weighted average Δ4x
1490		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1491		dictionary.
1492		
1493		Caution: samples are weighted by number of replicate analyses, which is a
1494		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1495		correlated analytical errors for one or more samples).
1496		
1497	Returns a tuple of:
1498		
1499		+ the list of group names
1500		+ an array of the corresponding Δ4x values
1501		+ the corresponding (co)variance matrix
1502		
1503		**Parameters**
1504
1505		+ `sample_groups`: a dictionary of the form:
1506		```py
1507		{'group1': ['sample_1', 'sample_2'],
1508		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1509		```
1510		'''
1511		
1512		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1513		groups = sorted(sample_groups.keys())
1514		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1515		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1516		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1517		W = np.array([
1518			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1519			for j in groups])
1520		D4x_new = W @ D4x_old
1521		CM_new = W @ CM_old @ W.T
1522
1523		return groups, D4x_new[:,0], CM_new
1524		
1525
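A sketch of grouping two standardized unknowns into one weighted average (`MYGROUP` is a hypothetical group name; `mydata` is assumed to be standardized already):

```py
groups, D47_avg, CM = mydata.combine_samples({
	'MYGROUP': ['MYSAMPLE-1', 'MYSAMPLE-2'],
	})
# one group name, its replicate-weighted Δ47 value, and its 1x1 covariance matrix:
print(groups[0], D47_avg[0], CM[0,0]**.5)
```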
1526	@make_verbal
1527	def standardize(self,
1528		method = 'pooled',
1529		weighted_sessions = [],
1530		consolidate = True,
1531		consolidate_tables = False,
1532		consolidate_plots = False,
1533		constraints = {},
1534		):
1535		'''
1536		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1537		If `method` argument is set to `'pooled'`, the standardization processes all sessions
1538		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1539		i.e. that their true Δ4x value does not change between sessions,
1540		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
1541		`'indep_sessions'`, the standardization processes each session independently, based only
1542		on anchors analyses.
1543		'''
1544
1545		self.standardization_method = method
1546		self.assign_timestamps()
1547
1548		if method == 'pooled':
1549			if weighted_sessions:
1550				for session_group in weighted_sessions:
1551					if self._4x == '47':
1552						X = D47data([r for r in self if r['Session'] in session_group])
1553					elif self._4x == '48':
1554						X = D48data([r for r in self if r['Session'] in session_group])
1555					X.Nominal_D4x = self.Nominal_D4x.copy()
1556					X.refresh()
1557					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1558					w = np.sqrt(result.redchi)
1559					self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
1560					for r in X:
1561						r[f'wD{self._4x}raw'] *= w
1562			else:
1563				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1564				for r in self:
1565					r[f'wD{self._4x}raw'] = 1.
1566
1567			params = Parameters()
1568			for k,session in enumerate(self.sessions):
1569				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1570				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1571				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1572				s = pf(session)
1573				params.add(f'a_{s}', value = 0.9)
1574				params.add(f'b_{s}', value = 0.)
1575				params.add(f'c_{s}', value = -0.9)
1576				params.add(f'a2_{s}', value = 0.,
1577# 					vary = self.sessions[session]['scrambling_drift'],
1578					)
1579				params.add(f'b2_{s}', value = 0.,
1580# 					vary = self.sessions[session]['slope_drift'],
1581					)
1582				params.add(f'c2_{s}', value = 0.,
1583# 					vary = self.sessions[session]['wg_drift'],
1584					)
1585				if not self.sessions[session]['scrambling_drift']:
1586					params[f'a2_{s}'].expr = '0'
1587				if not self.sessions[session]['slope_drift']:
1588					params[f'b2_{s}'].expr = '0'
1589				if not self.sessions[session]['wg_drift']:
1590					params[f'c2_{s}'].expr = '0'
1591
1592			for sample in self.unknowns:
1593				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1594
1595			for k in constraints:
1596				params[k].expr = constraints[k]
1597
1598			def residuals(p):
1599				R = []
1600				for r in self:
1601					session = pf(r['Session'])
1602					sample = pf(r['Sample'])
1603					if r['Sample'] in self.Nominal_D4x:
1604						R += [ (
1605							r[f'D{self._4x}raw'] - (
1606								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1607								+ p[f'b_{session}'] * r[f'd{self._4x}']
1608								+	p[f'c_{session}']
1609								+ r['t'] * (
1610									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1611									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1612									+	p[f'c2_{session}']
1613									)
1614								)
1615							) / r[f'wD{self._4x}raw'] ]
1616					else:
1617						R += [ (
1618							r[f'D{self._4x}raw'] - (
1619								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1620								+ p[f'b_{session}'] * r[f'd{self._4x}']
1621								+	p[f'c_{session}']
1622								+ r['t'] * (
1623									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1624									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1625									+	p[f'c2_{session}']
1626									)
1627								)
1628							) / r[f'wD{self._4x}raw'] ]
1629				return R
1630
1631			M = Minimizer(residuals, params)
1632			result = M.least_squares()
1633			self.Nf = result.nfree
1634			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1635			new_names, new_covar, new_se = _fullcovar(result)[:3]
1636			result.var_names = new_names
1637			result.covar = new_covar
1638
1639			for r in self:
1640				s = pf(r["Session"])
1641				a = result.params.valuesdict()[f'a_{s}']
1642				b = result.params.valuesdict()[f'b_{s}']
1643				c = result.params.valuesdict()[f'c_{s}']
1644				a2 = result.params.valuesdict()[f'a2_{s}']
1645				b2 = result.params.valuesdict()[f'b2_{s}']
1646				c2 = result.params.valuesdict()[f'c2_{s}']
1647				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1648				
1649
1650			self.standardization = result
1651
1652			for session in self.sessions:
1653				self.sessions[session]['Np'] = 3
1654				for k in ['scrambling', 'slope', 'wg']:
1655					if self.sessions[session][f'{k}_drift']:
1656						self.sessions[session]['Np'] += 1
1657
1658			if consolidate:
1659				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1660			return result
1661
1662
1663		elif method == 'indep_sessions':
1664
1665			if weighted_sessions:
1666				for session_group in weighted_sessions:
1667					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1668					X.Nominal_D4x = self.Nominal_D4x.copy()
1669					X.refresh()
1670					# This is only done to assign r['wD47raw'] for r in X:
1671					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1672					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1673			else:
1674				self.msg('All weights set to 1 ‰')
1675				for r in self:
1676					r[f'wD{self._4x}raw'] = 1
1677
1678			for session in self.sessions:
1679				s = self.sessions[session]
1680				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1681				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1682				s['Np'] = sum(p_active)
1683				sdata = s['data']
1684
1685				A = np.array([
1686					[
1687						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1688						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1689						1 / r[f'wD{self._4x}raw'],
1690						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1691						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1692						r['t'] / r[f'wD{self._4x}raw']
1693						]
1694					for r in sdata if r['Sample'] in self.anchors
1695					])[:,p_active] # only keep columns for the active parameters
1696				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1697				s['Na'] = Y.size
1698				CM = linalg.inv(A.T @ A)
1699				bf = (CM @ A.T @ Y).T[0,:]
1700				k = 0
1701				for n,a in zip(p_names, p_active):
1702					if a:
1703						s[n] = bf[k]
1704# 						self.msg(f'{n} = {bf[k]}')
1705						k += 1
1706					else:
1707						s[n] = 0.
1708# 						self.msg(f'{n} = 0.0')
1709
1710				for r in sdata :
1711					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1712					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1713					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1714
1715				s['CM'] = np.zeros((6,6))
1716				i = 0
1717				k_active = [j for j,a in enumerate(p_active) if a]
1718				for j,a in enumerate(p_active):
1719					if a:
1720						s['CM'][j,k_active] = CM[i,:]
1721						i += 1
1722
1723			if not weighted_sessions:
1724				w = self.rmswd()['rmswd']
1725				for r in self:
1726						r[f'wD{self._4x}'] *= w
1727						r[f'wD{self._4x}raw'] *= w
1728				for session in self.sessions:
1729					self.sessions[session]['CM'] *= w**2
1730
1731			for session in self.sessions:
1732				s = self.sessions[session]
1733				s['SE_a'] = s['CM'][0,0]**.5
1734				s['SE_b'] = s['CM'][1,1]**.5
1735				s['SE_c'] = s['CM'][2,2]**.5
1736				s['SE_a2'] = s['CM'][3,3]**.5
1737				s['SE_b2'] = s['CM'][4,4]**.5
1738				s['SE_c2'] = s['CM'][5,5]**.5
1739
1740			if not weighted_sessions:
1741				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1742			else:
1743				self.Nf = 0
1744				for sg in weighted_sessions:
1745					self.Nf += self.rmswd(sessions = sg)['Nf']
1746
1747			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1748
1749			avgD4x = {
1750				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1751				for sample in self.samples
1752				}
1753			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1754			rD4x = (chi2/self.Nf)**.5
1755			self.repeatability[f'sigma_{self._4x}'] = rD4x
1756
1757			if consolidate:
1758				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1759
1760
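The `constraints` argument maps fit-parameter names to expressions, e.g. to force two unknowns to share a single Δ47 value; a sketch, noting that the exact parameter names (`D47_...`, with sample names sanitized by `pf()`) are an assumption best checked against `mydata.standardization.var_names`:

```py
mydata.standardize(
	method = 'pooled',
	# hypothetical parameter names; verify against mydata.standardization.var_names:
	constraints = {'D47_MYSAMPLE_2': 'D47_MYSAMPLE_1'},
	)
```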
1761	def standardization_error(self, session, d4x, D4x, t = 0):
1762		'''
1763		Compute standardization error for a given session and
1764		(δ47, Δ47) composition.
1765		'''
1766		a = self.sessions[session]['a']
1767		b = self.sessions[session]['b']
1768		c = self.sessions[session]['c']
1769		a2 = self.sessions[session]['a2']
1770		b2 = self.sessions[session]['b2']
1771		c2 = self.sessions[session]['c2']
1772		CM = self.sessions[session]['CM']
1773
1774		x, y = D4x, d4x
1775		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1776# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1777		dxdy = -(b+b2*t) / (a+a2*t)
1778		dxdz = 1. / (a+a2*t)
1779		dxda = -x / (a+a2*t)
1780		dxdb = -y / (a+a2*t)
1781		dxdc = -1. / (a+a2*t)
1782		dxda2 = -x * t / (a+a2*t)
1783		dxdb2 = -y * t / (a+a2*t)
1784		dxdc2 = -t / (a+a2*t)
1785		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1786		sx = (V @ CM @ V.T) ** .5
1787		return sx
1788
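A sketch of evaluating this error for one session, only meaningful after `standardize(method = 'indep_sessions')` has populated the session's `CM` matrix (`'mySession'` is the default session name for data lacking a `Session` field; the δ47 and Δ47 coordinates below are arbitrary):

```py
mydata.standardize(method = 'indep_sessions')
sx = mydata.standardization_error('mySession', d4x = 20.0, D4x = 0.6)
print(f'{1000 * sx:.1f} ppm')
```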
1789
1790	@make_verbal
1791	def summary(self,
1792		dir = 'output',
1793		filename = None,
1794		save_to_file = True,
1795		print_out = True,
1796		):
1797		'''
1798		Print out and/or save to disk a summary of the standardization results.
1799
1800		**Parameters**
1801
1802		+ `dir`: the directory in which to save the table
1803		+ `filename`: the name of the csv file to write to
1804		+ `save_to_file`: whether to save the table to disk
1805		+ `print_out`: whether to print out the table
1806		'''
1807
1808		out = []
1809		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1810		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1811		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1812		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1813		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1814		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1815		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1816		out += [['Model degrees of freedom', f"{self.Nf}"]]
1817		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1818		out += [['Standardization method', self.standardization_method]]
1819
1820		if save_to_file:
1821			if not os.path.exists(dir):
1822				os.makedirs(dir)
1823			if filename is None:
1824				filename = f'D{self._4x}_summary.csv'
1825			with open(f'{dir}/{filename}', 'w') as fid:
1826				fid.write(make_csv(out))
1827		if print_out:
1828			self.msg('\n' + pretty_table(out, header = 0))
1829
1830
1831	@make_verbal
1832	def table_of_sessions(self,
1833		dir = 'output',
1834		filename = None,
1835		save_to_file = True,
1836		print_out = True,
1837		output = None,
1838		):
1839		'''
1840		Print out and/or save to disk a table of sessions.
1841
1842		**Parameters**
1843
1844		+ `dir`: the directory in which to save the table
1845		+ `filename`: the name of the csv file to write to
1846		+ `save_to_file`: whether to save the table to disk
1847		+ `print_out`: whether to print out the table
1848		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1849		    if set to `'raw'`: return a list of list of strings
1850		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1851		'''
1852		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1853		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1854		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1855
1856		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1857		if include_a2:
1858			out[-1] += ['a2 ± SE']
1859		if include_b2:
1860			out[-1] += ['b2 ± SE']
1861		if include_c2:
1862			out[-1] += ['c2 ± SE']
1863		for session in self.sessions:
1864			out += [[
1865				session,
1866				f"{self.sessions[session]['Na']}",
1867				f"{self.sessions[session]['Nu']}",
1868				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1869				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1870				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1871				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1872				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1873				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1874				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1875				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1876				]]
1877			if include_a2:
1878				if self.sessions[session]['scrambling_drift']:
1879					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1880				else:
1881					out[-1] += ['']
1882			if include_b2:
1883				if self.sessions[session]['slope_drift']:
1884					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1885				else:
1886					out[-1] += ['']
1887			if include_c2:
1888				if self.sessions[session]['wg_drift']:
1889					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1890				else:
1891					out[-1] += ['']
1892
1893		if save_to_file:
1894			if not os.path.exists(dir):
1895				os.makedirs(dir)
1896			if filename is None:
1897				filename = f'D{self._4x}_sessions.csv'
1898			with open(f'{dir}/{filename}', 'w') as fid:
1899				fid.write(make_csv(out))
1900		if print_out:
1901			self.msg('\n' + pretty_table(out))
1902		if output == 'raw':
1903			return out
1904		elif output == 'pretty':
1905			return pretty_table(out)
1906
1907
1908	@make_verbal
1909	def table_of_analyses(
1910		self,
1911		dir = 'output',
1912		filename = None,
1913		save_to_file = True,
1914		print_out = True,
1915		output = None,
1916		):
1917		'''
1918		Print out and/or save to disk a table of analyses.
1919
1920		**Parameters**
1921
1922		+ `dir`: the directory in which to save the table
1923		+ `filename`: the name of the csv file to write to
1924		+ `save_to_file`: whether to save the table to disk
1925		+ `print_out`: whether to print out the table
1926		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1927		    if set to `'raw'`: return a list of list of strings
1928		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1929		'''
1930
1931		out = [['UID','Session','Sample']]
1932		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1933		for f in extra_fields:
1934			out[-1] += [f[0]]
1935		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1936		for r in self:
1937			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
1938			for f in extra_fields:
1939				out[-1] += [f"{r[f[0]]:{f[1]}}"]
1940			out[-1] += [
1941				f"{r['d13Cwg_VPDB']:.3f}",
1942				f"{r['d18Owg_VSMOW']:.3f}",
1943				f"{r['d45']:.6f}",
1944				f"{r['d46']:.6f}",
1945				f"{r['d47']:.6f}",
1946				f"{r['d48']:.6f}",
1947				f"{r['d49']:.6f}",
1948				f"{r['d13C_VPDB']:.6f}",
1949				f"{r['d18O_VSMOW']:.6f}",
1950				f"{r['D47raw']:.6f}",
1951				f"{r['D48raw']:.6f}",
1952				f"{r['D49raw']:.6f}",
1953				f"{r[f'D{self._4x}']:.6f}"
1954				]
1955		if save_to_file:
1956			if not os.path.exists(dir):
1957				os.makedirs(dir)
1958			if filename is None:
1959				filename = f'D{self._4x}_analyses.csv'
1960			with open(f'{dir}/{filename}', 'w') as fid:
1961				fid.write(make_csv(out))
1962		if print_out:
1963			self.msg('\n' + pretty_table(out))
1964		return out
1965
1966	@make_verbal
1967	def covar_table(
1968		self,
1969		correl = False,
1970		dir = 'output',
1971		filename = None,
1972		save_to_file = True,
1973		print_out = True,
1974		output = None,
1975		):
1976		'''
1977		Print out, save to disk and/or return the variance-covariance matrix of D4x
1978		for all unknown samples.
1979
1980		**Parameters**
1981
1982		+ `dir`: the directory in which to save the csv
1983		+ `filename`: the name of the csv file to write to
1984		+ `save_to_file`: whether to save the csv
1985		+ `print_out`: whether to print out the matrix
1986		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
1987		    if set to `'raw'`: return a list of list of strings
1988		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1989		'''
1990		samples = sorted([u for u in self.unknowns])
1991		out = [[''] + samples]
1992		for s1 in samples:
1993			out.append([s1])
1994			for s2 in samples:
1995				if correl:
1996					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
1997				else:
1998					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
1999
2000		if save_to_file:
2001			if not os.path.exists(dir):
2002				os.makedirs(dir)
2003			if filename is None:
2004				if correl:
2005					filename = f'D{self._4x}_correl.csv'
2006				else:
2007					filename = f'D{self._4x}_covar.csv'
2008			with open(f'{dir}/{filename}', 'w') as fid:
2009				fid.write(make_csv(out))
2010		if print_out:
2011			self.msg('\n'+pretty_table(out))
2012		if output == 'raw':
2013			return out
2014		elif output == 'pretty':
2015			return pretty_table(out)
2016
2017	@make_verbal
2018	def table_of_samples(
2019		self,
2020		dir = 'output',
2021		filename = None,
2022		save_to_file = True,
2023		print_out = True,
2024		output = None,
2025		):
2026		'''
2027		Print out, save to disk and/or return a table of samples.
2028
2029		**Parameters**
2030
2031		+ `dir`: the directory in which to save the csv
2032		+ `filename`: the name of the csv file to write to
2033		+ `save_to_file`: whether to save the csv
2034		+ `print_out`: whether to print out the table
2035		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2036		    if set to `'raw'`: return a list of list of strings
2037		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2038		'''
2039
2040		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2041		for sample in self.anchors:
2042			out += [[
2043				f"{sample}",
2044				f"{self.samples[sample]['N']}",
2045				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2046				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2047				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2048				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2049				]]
2050		for sample in self.unknowns:
2051			out += [[
2052				f"{sample}",
2053				f"{self.samples[sample]['N']}",
2054				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2055				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2056				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2057				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2058				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2059				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2060				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2061				]]
2062		if save_to_file:
2063			if not os.path.exists(dir):
2064				os.makedirs(dir)
2065			if filename is None:
2066				filename = f'D{self._4x}_samples.csv'
2067			with open(f'{dir}/{filename}', 'w') as fid:
2068				fid.write(make_csv(out))
2069		if print_out:
2070			self.msg('\n'+pretty_table(out))
2071		if output == 'raw':
2072			return out
2073		elif output == 'pretty':
2074			return pretty_table(out)
2075
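Because the table methods can return raw rows, further post-processing is straightforward; a sketch converting the samples table to a pandas DataFrame (pandas is not a D47crunch dependency, merely used here for illustration):

```py
import pandas as pd

rows = mydata.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
df = pd.DataFrame(rows[1:], columns = rows[0])
```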
2076
2077	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2078		'''
2079		Generate session plots and save them to disk.
2080
2081		**Parameters**
2082
2083		+ `dir`: the directory in which to save the plots
2084		+ `figsize`: the width and height (in inches) of each plot
2085		+ `filetype`: 'pdf' or 'png'
2086		+ `dpi`: resolution for PNG output
2087		'''
2088		if not os.path.exists(dir):
2089			os.makedirs(dir)
2090
2091		for session in self.sessions:
2092			sp = self.plot_single_session(session, xylimits = 'constant')
2093			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2094			ppl.close(sp.fig)
2095			
2096
2097
2098	@make_verbal
2099	def consolidate_samples(self):
2100		'''
2101		Compile various statistics for each sample.
2102
2103		For each anchor sample:
2104
2105		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2106		+ `SE_D47` or `SE_D48`: set to zero by definition
2107
2108		For each unknown sample:
2109
2110		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2111		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2112
2113		For each anchor and unknown:
2114
2115		+ `N`: the total number of analyses of this sample
2116		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2117		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2118		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2119		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2120	variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2121		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
2122		'''
2123		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2124		for sample in self.samples:
2125			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2126			if self.samples[sample]['N'] > 1:
2127				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2128
2129			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2130			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2131
2132			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2133			if len(D4x_pop) > 2:
2134				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2135			
2136		if self.standardization_method == 'pooled':
2137			for sample in self.anchors:
2138				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2139				self.samples[sample][f'SE_D{self._4x}'] = 0.
2140			for sample in self.unknowns:
2141				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2142				try:
2143					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2144				except ValueError:
2145					# when `sample` is constrained by self.standardize(constraints = {...}),
2146					# it is no longer listed in self.standardization.var_names.
2147					# Temporary fix: define SE as zero for now
2148					self.samples[sample][f'SE_D{self._4x}'] = 0.
2149
2150		elif self.standardization_method == 'indep_sessions':
2151			for sample in self.anchors:
2152				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2153				self.samples[sample][f'SE_D{self._4x}'] = 0.
2154			for sample in self.unknowns:
2155				self.msg(f'Consolidating sample {sample}')
2156				self.unknowns[sample][f'session_D{self._4x}'] = {}
2157				session_avg = []
2158				for session in self.sessions:
2159					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2160					if sdata:
2161						self.msg(f'{sample} found in session {session}')
2162						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2163						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2164						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2165						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2166						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2167						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2168						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2169				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2170				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2171				wsum = sum([weights[s] for s in weights])
2172				for s in weights:
2173					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2174
2175		for r in self:
2176			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
2177
2178
2179
2180	def consolidate_sessions(self):
2181		'''
2182		Compute various statistics for each session.
2183
2184		+ `Na`: Number of anchor analyses in the session
2185		+ `Nu`: Number of unknown analyses in the session
2186		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2187		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2188		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2189		+ `a`: scrambling factor
2190		+ `b`: compositional slope
2191		+ `c`: WG offset
2192	+ `SE_a`: Model standard error of `a`
2193	+ `SE_b`: Model standard error of `b`
2194	+ `SE_c`: Model standard error of `c`
2195		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2196		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2197		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2198		+ `a2`: scrambling factor drift
2199		+ `b2`: compositional slope drift
2200		+ `c2`: WG offset drift
2201		+ `Np`: Number of standardization parameters to fit
2202		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2203		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2204		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
2205		'''
2206		for session in self.sessions:
2207			if 'd13Cwg_VPDB' not in self.sessions[session]:
2208				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2209			if 'd18Owg_VSMOW' not in self.sessions[session]:
2210				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2211			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2212			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2213
2214			self.msg(f'Computing repeatabilities for session {session}')
2215			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2216			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2217			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2218
2219		if self.standardization_method == 'pooled':
2220			for session in self.sessions:
2221
2222				# different (better?) computation of D4x repeatability for each session:
2223				sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
2224				self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5
2225
2226				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2227				i = self.standardization.var_names.index(f'a_{pf(session)}')
2228				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2229
2230				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2231				i = self.standardization.var_names.index(f'b_{pf(session)}')
2232				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2233
2234				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2235				i = self.standardization.var_names.index(f'c_{pf(session)}')
2236				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2237
2238				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2239				if self.sessions[session]['scrambling_drift']:
2240					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2241					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2242				else:
2243					self.sessions[session]['SE_a2'] = 0.
2244
2245				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2246				if self.sessions[session]['slope_drift']:
2247					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2248					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2249				else:
2250					self.sessions[session]['SE_b2'] = 0.
2251
2252				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2253				if self.sessions[session]['wg_drift']:
2254					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2255					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2256				else:
2257					self.sessions[session]['SE_c2'] = 0.
2258
2259				i = self.standardization.var_names.index(f'a_{pf(session)}')
2260				j = self.standardization.var_names.index(f'b_{pf(session)}')
2261				k = self.standardization.var_names.index(f'c_{pf(session)}')
2262				CM = np.zeros((6,6))
2263				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2264				try:
2265					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2266					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2267					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2268					try:
2269						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2270						CM[3,4] = self.standardization.covar[i2,j2]
2271						CM[4,3] = self.standardization.covar[j2,i2]
2272					except ValueError:
2273						pass
2274					try:
2275						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2276						CM[3,5] = self.standardization.covar[i2,k2]
2277						CM[5,3] = self.standardization.covar[k2,i2]
2278					except ValueError:
2279						pass
2280				except ValueError:
2281					pass
2282				try:
2283					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2284					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2285					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2286					try:
2287						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2288						CM[4,5] = self.standardization.covar[j2,k2]
2289						CM[5,4] = self.standardization.covar[k2,j2]
2290					except ValueError:
2291						pass
2292				except ValueError:
2293					pass
2294				try:
2295					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2296					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2297					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2298				except ValueError:
2299					pass
2300
2301				self.sessions[session]['CM'] = CM
2302
2303		elif self.standardization_method == 'indep_sessions':
2304			pass # Not implemented yet
2305
2306
2307	@make_verbal
2308	def repeatabilities(self):
2309		'''
2310		Compute analytical repeatabilities for δ13C_VPDB and δ18O_VSMOW (anchors only)
2311		and for Δ4x (for all samples, for anchors, and for unknowns).
2312		'''
2313		self.msg('Computing repeatabilities for all sessions')
2314
2315		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2316		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2317		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2318		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2319		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
2320
2321
2322	@make_verbal
2323	def consolidate(self, tables = True, plots = True):
2324		'''
2325		Collect information about samples, sessions and repeatabilities.
2326		'''
2327		self.consolidate_samples()
2328		self.consolidate_sessions()
2329		self.repeatabilities()
2330
2331		if tables:
2332			self.summary()
2333			self.table_of_sessions()
2334			self.table_of_analyses()
2335			self.table_of_samples()
2336
2337		if plots:
2338			self.plot_sessions()
2339
2340
2341	@make_verbal
2342	def rmswd(self,
2343		samples = 'all samples',
2344		sessions = 'all sessions',
2345		):
2346		'''
2347		Compute the χ2 statistic, the root mean squared weighted deviation
2348		(i.e. the square root of the reduced χ2), and the corresponding degrees of
2349		freedom of the Δ4x values for samples in `samples` and sessions in `sessions`.
2350		
2351		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
2352		'''
2353		if samples == 'all samples':
2354			mysamples = [k for k in self.samples]
2355		elif samples == 'anchors':
2356			mysamples = [k for k in self.anchors]
2357		elif samples == 'unknowns':
2358			mysamples = [k for k in self.unknowns]
2359		else:
2360			mysamples = samples
2361
2362		if sessions == 'all sessions':
2363			sessions = [k for k in self.sessions]
2364
2365		chisq, Nf = 0, 0
2366		for sample in mysamples :
2367			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2368			if len(G) > 1 :
2369				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2370				Nf += (len(G) - 1)
2371				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2372		r = (chisq / Nf)**.5 if Nf > 0 else 0
2373		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2374		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
2375
2376	
2377	@make_verbal
2378	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2379		'''
2380		Compute the repeatability of `[r[key] for r in self]`
2381		'''
2382
2383		if samples == 'all samples':
2384			mysamples = [k for k in self.samples]
2385		elif samples == 'anchors':
2386			mysamples = [k for k in self.anchors]
2387		elif samples == 'unknowns':
2388			mysamples = [k for k in self.unknowns]
2389		else:
2390			mysamples = samples
2391
2392		if sessions == 'all sessions':
2393			sessions = [k for k in self.sessions]
2394
2395		if key in ['D47', 'D48']:
2396			# Full disclosure: the definition of Nf is tricky/debatable
2397			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2398			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2399			Nf = len(G)
2400# 			print(f'len(G) = {Nf}')
2401			Nf -= len([s for s in mysamples if s in self.unknowns])
2402# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2403			for session in sessions:
2404				Np = len([
2405					_ for _ in self.standardization.params
2406					if (
2407						self.standardization.params[_].expr is not None
2408						and (
2409							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2410							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2411							)
2412						)
2413					])
2414# 				print(f'session {session}: {Np} parameters to consider')
2415				Na = len({
2416					r['Sample'] for r in self.sessions[session]['data']
2417					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2418					})
2419# 				print(f'session {session}: {Na} different anchors in that session')
2420				Nf -= min(Np, Na)
2421# 			print(f'Nf = {Nf}')
2422
2423# 			for sample in mysamples :
2424# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2425# 				if len(X) > 1 :
2426# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2427# 					if sample in self.unknowns:
2428# 						Nf += len(X) - 1
2429# 					else:
2430# 						Nf += len(X)
2431# 			if samples in ['anchors', 'all samples']:
2432# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2433			r = (chisq / Nf)**.5 if Nf > 0 else 0
2434
2435		else: # if key not in ['D47', 'D48']
2436			chisq, Nf = 0, 0
2437			for sample in mysamples :
2438				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2439				if len(X) > 1 :
2440					Nf += len(X) - 1
2441					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2442			r = (chisq / Nf)**.5 if Nf > 0 else 0
2443
2444		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2445		return r
2446
2447	def sample_average(self, samples, weights = 'equal', normalize = True):
2448		'''
2449		Weighted average Δ4x value of a group of samples, accounting for covariance.
2450
2451		Returns the weighted average Δ4x value and associated SE
2452		of a group of samples. Weights are equal by default. If `normalize` is
2453		true, `weights` will be rescaled so that their sum equals 1.
2454
2455		**Examples**
2456
2457		```python
2458		self.sample_average(['X','Y'], [1, 2])
2459		```
2460
2461		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2462		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2463		values of samples X and Y, respectively.
2464
2465		```python
2466		self.sample_average(['X','Y'], [1, -1], normalize = False)
2467		```
2468
2469		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2470		'''
2471		if weights == 'equal':
2472			weights = [1/len(samples)] * len(samples)
2473
2474		if normalize:
2475			s = sum(weights)
2476			if s:
2477				weights = [w/s for w in weights]
2478
2479		try:
2480# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2481# 			C = self.standardization.covar[indices,:][:,indices]
2482			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2483			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2484			return correlated_sum(X, C, weights)
2485		except ValueError:
2486			return (0., 0.)
2487
2488
2489	def sample_D4x_covar(self, sample1, sample2 = None):
2490		'''
2491		Covariance between Δ4x values of samples
2492
2493		Returns the error covariance between the average Δ4x values of two
2494		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2495		returns the Δ4x variance for that sample.
2496		'''
2497		if sample2 is None:
2498			sample2 = sample1
2499		if self.standardization_method == 'pooled':
2500			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2501			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2502			return self.standardization.covar[i, j]
2503		elif self.standardization_method == 'indep_sessions':
2504			if sample1 == sample2:
2505				return self.samples[sample1][f'SE_D{self._4x}']**2
2506			else:
2507				c = 0
2508				for session in self.sessions:
2509					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2510					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2511					if sdata1 and sdata2:
2512						a = self.sessions[session]['a']
2513						# !! TODO: CM below does not account for temporal changes in standardization parameters
2514						CM = self.sessions[session]['CM'][:3,:3]
2515						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2516						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2517						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2518						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2519						c += (
2520							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2521							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2522							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2523							@ CM
2524							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2525							) / a**2
2526				return float(c)
2527
2528	def sample_D4x_correl(self, sample1, sample2 = None):
2529		'''
2530		Correlation between Δ4x errors of samples
2531
2532		Returns the error correlation between the average Δ4x values of two samples.
2533		'''
2534		if sample2 is None or sample2 == sample1:
2535			return 1.
2536		return (
2537			self.sample_D4x_covar(sample1, sample2)
2538			/ self.unknowns[sample1][f'SE_D{self._4x}']
2539			/ self.unknowns[sample2][f'SE_D{self._4x}']
2540			)
2541
2542	def plot_single_session(self,
2543		session,
2544		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2545		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2546		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2547		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2548		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2549		xylimits = 'free', # | 'constant'
2550		x_label = None,
2551		y_label = None,
2552		error_contour_interval = 'auto',
2553		fig = 'new',
2554		):
2555		'''
2556		Generate plot for a single session
2557		'''
2558		if x_label is None:
2559			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2560		if y_label is None:
2561			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2562
2563		out = _SessionPlot()
2564		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2565		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2566		anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2567		anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2568		unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2569		unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2570		anchor_avg = (np.array([ np.array([
2571				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2572				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2573				]) for sample in anchors]).T,
2574			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
2575		unknown_avg = (np.array([ np.array([
2576				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2577				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2578				]) for sample in unknowns]).T,
2579			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)
2580		
2581		
2582		if fig == 'new':
2583			out.fig = ppl.figure(figsize = (6,6))
2584			ppl.subplots_adjust(.1,.1,.9,.9)
2585
2586		out.anchor_analyses, = ppl.plot(
2587			anchors_d,
2588			anchors_D,
2589			**kw_plot_anchors)
2590		out.unknown_analyses, = ppl.plot(
2591			unknowns_d,
2592			unknowns_D,
2593			**kw_plot_unknowns)
2594		out.anchor_avg = ppl.plot(
2595			*anchor_avg,
2596			**kw_plot_anchor_avg)
2597		out.unknown_avg = ppl.plot(
2598			*unknown_avg,
2599			**kw_plot_unknown_avg)
2600		if xylimits == 'constant':
2601			x = [r[f'd{self._4x}'] for r in self]
2602			y = [r[f'D{self._4x}'] for r in self]
2603			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2604			w, h = x2-x1, y2-y1
2605			x1 -= w/20
2606			x2 += w/20
2607			y1 -= h/20
2608			y2 += h/20
2609			ppl.axis([x1, x2, y1, y2])
2610		elif xylimits == 'free':
2611			x1, x2, y1, y2 = ppl.axis()
2612		else:
2613			x1, x2, y1, y2 = ppl.axis(xylimits)
2614				
2615		if error_contour_interval != 'none':
2616			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2617			XI,YI = np.meshgrid(xi, yi)
2618			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2619			if error_contour_interval == 'auto':
2620				rng = np.max(SI) - np.min(SI)
2621				if rng <= 0.01:
2622					cinterval = 0.001
2623				elif rng <= 0.03:
2624					cinterval = 0.004
2625				elif rng <= 0.1:
2626					cinterval = 0.01
2627				elif rng <= 0.3:
2628					cinterval = 0.03
2629				elif rng <= 1.:
2630					cinterval = 0.1
2631				else:
2632					cinterval = 0.5
2633			else:
2634				cinterval = error_contour_interval
2635
2636			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2637			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2638			out.clabel = ppl.clabel(out.contour)
2639			contour = (XI, YI, SI, cval, cinterval)
2640
2641		if fig is None:
2642			return {
2643			'anchors':anchors,
2644			'unknowns':unknowns,
2645			'anchors_d':anchors_d,
2646			'anchors_D':anchors_D,
2647			'unknowns_d':unknowns_d,
2648			'unknowns_D':unknowns_D,
2649			'anchor_avg':anchor_avg,
2650			'unknown_avg':unknown_avg,
2651			'contour': contour if error_contour_interval != 'none' else None, # avoid NameError when contours are disabled
2652			}
2653
2654		ppl.xlabel(x_label)
2655		ppl.ylabel(y_label)
2656		ppl.title(session, weight = 'bold')
2657		ppl.grid(alpha = .2)
2658		out.ax = ppl.gca()		
2659
2660		return out
2661
2662	def plot_residuals(
2663		self,
2664		kde = False,
2665		hist = False,
2666		binwidth = 2/3,
2667		dir = 'output',
2668		filename = None,
2669		highlight = [],
2670		colors = None,
2671		figsize = None,
2672		dpi = 100,
2673		yspan = None,
2674		):
2675		'''
2676		Plot residuals of each analysis as a function of time (actually, as a function of
2677		the order of analyses in the `D4xdata` object)
2678
2679		+ `kde`: whether to add a kernel density estimate of residuals
2680		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2681		+ `binwidth`: histogram bin width, in units of the Δ4x repeatability (SD)
2682		+ `dir`: the directory in which to save the plot
2683		+ `highlight`: a list of samples to highlight
2684		+ `colors`: a dict of `{<sample>: <color>}` for all samples
2685		+ `figsize`: (width, height) of figure
2686		+ `dpi`: resolution for PNG output
2687		+ `yspan`: factor controlling the range of y values shown in plot
2688		  (by default: `yspan = 1.5 if kde else 1.0`)
2689		'''
2690		
2691		from matplotlib import ticker
2692
2693		if yspan is None:
2694			if kde:
2695				yspan = 1.5
2696			else:
2697				yspan = 1.0
2698		
2699		# Layout
2700		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2701		if hist or kde:
2702			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2703			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2704		else:
2705			ppl.subplots_adjust(.08,.05,.78,.8)
2706			ax1 = ppl.subplot(111)
2707		
2708		# Colors
2709		N = len(self.anchors)
2710		if colors is None:
2711			if len(highlight) > 0:
2712				Nh = len(highlight)
2713				if Nh == 1:
2714					colors = {highlight[0]: (0,0,0)}
2715				elif Nh == 3:
2716					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2717				elif Nh == 4:
2718					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2719				else:
2720					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2721			else:
2722				if N == 3:
2723					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2724				elif N == 4:
2725					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2726				else:
2727					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2728
2729		ppl.sca(ax1)
2730		
2731		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2732
2733		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2734
2735		session = self[0]['Session']
2736		x1 = 0
2737# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2738		x_sessions = {}
2739		one_or_more_singlets = False
2740		one_or_more_multiplets = False
2741		multiplets = set()
2742		for k,r in enumerate(self):
2743			if r['Session'] != session:
2744				x2 = k-1
2745				x_sessions[session] = (x1+x2)/2
2746				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2747				session = r['Session']
2748				x1 = k
2749			singlet = len(self.samples[r['Sample']]['data']) == 1
2750			if not singlet:
2751				multiplets.add(r['Sample'])
2752			if r['Sample'] in self.unknowns:
2753				if singlet:
2754					one_or_more_singlets = True
2755				else:
2756					one_or_more_multiplets = True
2757			kw = dict(
2758				marker = 'x' if singlet else '+',
2759				ms = 4 if singlet else 5,
2760				ls = 'None',
2761				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2762				mew = 1,
2763				alpha = 0.2 if singlet else 1,
2764				)
2765			if highlight and r['Sample'] not in highlight:
2766				kw['alpha'] = 0.2
2767			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2768		x2 = k
2769		x_sessions[session] = (x1+x2)/2
2770
2771		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2772		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2773		if not (hist or kde):
2774			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2775			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2776
2777		xmin, xmax, ymin, ymax = ppl.axis()
2778		if yspan != 1:
2779			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2780		for s in x_sessions:
2781			ppl.text(
2782				x_sessions[s],
2783				ymax +1,
2784				s,
2785				va = 'bottom',
2786				**(
2787					dict(ha = 'center')
2788					if len(self.sessions[s]['data']) > (0.15 * len(self))
2789					else dict(ha = 'left', rotation = 45)
2790					)
2791				)
2792
2793		if hist or kde:
2794			ppl.sca(ax2)
2795
2796		for s in colors:
2797			kw['marker'] = '+'
2798			kw['ms'] = 5
2799			kw['mec'] = colors[s]
2800			kw['label'] = s
2801			kw['alpha'] = 1
2802			ppl.plot([], [], **kw)
2803
2804		kw['mec'] = (0,0,0)
2805
2806		if one_or_more_singlets:
2807			kw['marker'] = 'x'
2808			kw['ms'] = 4
2809			kw['alpha'] = .2
2810			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2811			ppl.plot([], [], **kw)
2812
2813		if one_or_more_multiplets:
2814			kw['marker'] = '+'
2815			kw['ms'] = 4
2816			kw['alpha'] = 1
2817			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2818			ppl.plot([], [], **kw)
2819
2820		if hist or kde:
2821			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2822		else:
2823			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2824		leg.set_zorder(-1000)
2825
2826		ppl.sca(ax1)
2827
2828		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2829		ppl.xticks([])
2830		ppl.axis([-1, len(self), None, None])
2831
2832		if hist or kde:
2833			ppl.sca(ax2)
2834			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2835
2836			if kde:
2837				from scipy.stats import gaussian_kde
2838				yi = np.linspace(ymin, ymax, 201)
2839				xi = gaussian_kde(X).evaluate(yi)
2840				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2841# 				ppl.plot(xi, yi, 'k-', lw = 1)
2842			elif hist:
2843				ppl.hist(
2844					X,
2845					orientation = 'horizontal',
2846					histtype = 'stepfilled',
2847					ec = [.4]*3,
2848					fc = [.25]*3,
2849					alpha = .25,
2850					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2851					)
2852			ppl.text(0, 0,
2853				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2854				size = 7.5,
2855				alpha = 1,
2856				va = 'center',
2857				ha = 'left',
2858				)
2859
2860			ppl.axis([0, None, ymin, ymax])
2861			ppl.xticks([])
2862			ppl.yticks([])
2863# 			ax2.spines['left'].set_visible(False)
2864			ax2.spines['right'].set_visible(False)
2865			ax2.spines['top'].set_visible(False)
2866			ax2.spines['bottom'].set_visible(False)
2867
2868		ax1.axis([None, None, ymin, ymax])
2869
2870		if not os.path.exists(dir):
2871			os.makedirs(dir)
2872		if filename is None:
2873			return fig
2874		elif filename == '':
2875			filename = f'D{self._4x}_residuals.pdf'
2876		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2877		ppl.close(fig)
2878				
2879
2880	def simulate(self, *args, **kwargs):
2881		'''
2882		Legacy function: raises a `DeprecationWarning` pointing to `virtual_data()`
2883		'''
2884		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
2885
2886	def plot_distribution_of_analyses(
2887		self,
2888		dir = 'output',
2889		filename = None,
2890		vs_time = False,
2891		figsize = (6,4),
2892		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
2893		output = None,
2894		dpi = 100,
2895		):
2896		'''
2897		Plot temporal distribution of all analyses in the data set.
2898		
2899		**Parameters**
2900
2901		+ `dir`: the directory in which to save the plot
2902		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
2903		+ `figsize`: (width, height) of figure
2904		+ `dpi`: resolution for PNG output
2905		+ `output`: if `None`, save the plot to `dir/filename`; if `'fig'`, return the figure; if `'ax'`, return the axes
2906		'''
2907
2908		asamples = [s for s in self.anchors]
2909		usamples = [s for s in self.unknowns]
2910		if output is None or output == 'fig':
2911			fig = ppl.figure(figsize = figsize)
2912			ppl.subplots_adjust(*subplots_adjust)
2913		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2914		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2915		Xmax += (Xmax-Xmin)/40
2916		Xmin -= (Xmax-Xmin)/41
2917		for k, s in enumerate(asamples + usamples):
2918			if vs_time:
2919				X = [r['TimeTag'] for r in self if r['Sample'] == s]
2920			else:
2921				X = [x for x,r in enumerate(self) if r['Sample'] == s]
2922			Y = [-k for x in X]
2923			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
2924			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
2925			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
2926		ppl.axis([Xmin, Xmax, -k-1, 1])
2927		ppl.xlabel('\ntime')
2928		ppl.gca().annotate('',
2929			xy = (0.6, -0.02),
2930			xycoords = 'axes fraction',
2931			xytext = (.4, -0.02), 
2932            arrowprops = dict(arrowstyle = "->", color = 'k'),
2933            )
2934			
2935
2936		x2 = -1
2937		for session in self.sessions:
2938			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2939			if vs_time:
2940				ppl.axvline(x1, color = 'k', lw = .75)
2941			if x2 > -1:
2942				if not vs_time:
2943					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
2944			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2945# 			from xlrd import xldate_as_datetime
2946# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
2947			if vs_time:
2948				ppl.axvline(x2, color = 'k', lw = .75)
2949				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
2950			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
2951
2952		ppl.xticks([])
2953		ppl.yticks([])
2954
2955		if output is None:
2956			if not os.path.exists(dir):
2957				os.makedirs(dir)
2958			if filename is None:
2959				filename = f'D{self._4x}_distribution_of_analyses.pdf'
2960			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2961			ppl.close(fig)
2962		elif output == 'ax':
2963			return ppl.gca()
2964		elif output == 'fig':
2965			return fig
2966
2967
2968	def plot_bulk_compositions(
2969		self,
2970		samples = None,
2971		dir = 'output/bulk_compositions',
2972		figsize = (6,6),
2973		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
2974		show = False,
2975		sample_color = (0,.5,1),
2976		analysis_color = (.7,.7,.7),
2977		labeldist = 0.3,
2978		radius = 0.05,
2979		):
2980		'''
2981		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
2982		
2983		By default, creates a directory `./output/bulk_compositions` where plots for
2984		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
2985		
2986		
2987		**Parameters**
2988
2989		+ `samples`: Only these samples are processed (by default: all samples).
2990		+ `dir`: where to save the plots
2991		+ `figsize`: (width, height) of figure
2992		+ `subplots_adjust`: passed to `subplots_adjust()`
2993		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
2994		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
2995		+ `sample_color`: color used for sample markers/labels
2996		+ `analysis_color`: color used for analysis (replicate) markers/labels
2997		+ `labeldist`: distance (in inches) from replicate markers to replicate labels
2998		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
2999		'''
3000
3001		from matplotlib.patches import Ellipse
3002
3003		if samples is None:
3004			samples = [_ for _ in self.samples]
3005
3006		saved = {}
3007
3008		for s in samples:
3009
3010			fig = ppl.figure(figsize = figsize)
3011			fig.subplots_adjust(*subplots_adjust)
3012			ax = ppl.subplot(111)
3013			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3014			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3015			ppl.title(s)
3016
3017
3018			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
3019			UID = [_['UID'] for _ in self.samples[s]['data']]
3020			XY0 = XY.mean(0)
3021
3022			for xy in XY:
3023				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
3024				
3025			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
3026			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
3027			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3028			saved[s] = [XY, XY0]
3029			
3030			x1, x2, y1, y2 = ppl.axis()
3031			x0, dx = (x1+x2)/2, (x2-x1)/2
3032			y0, dy = (y1+y2)/2, (y2-y1)/2
3033			dx, dy = [max(max(dx, dy), radius)]*2
3034
3035			ppl.axis([
3036				x0 - 1.2*dx,
3037				x0 + 1.2*dx,
3038				y0 - 1.2*dy,
3039				y0 + 1.2*dy,
3040				])			
3041
3042			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3043
3044			for xy, uid in zip(XY, UID):
3045
3046				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3047				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3048
3049				if (vector_in_display_space**2).sum() > 0:
3050
3051					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3052					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3053					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3054					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3055
3056					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3057
3058				else:
3059
3060					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3061
3062			if radius:
3063				ax.add_artist(Ellipse(
3064					xy = XY0,
3065					width = radius*2,
3066					height = radius*2,
3067					ls = (0, (2,2)),
3068					lw = .7,
3069					ec = analysis_color,
3070					fc = 'None',
3071					))
3072				ppl.text(
3073					XY0[0],
3074					XY0[1]-radius,
3075					f'\n± {radius*1e3:.0f} ppm',
3076					color = analysis_color,
3077					va = 'top',
3078					ha = 'center',
3079					linespacing = 0.4,
3080					size = 8,
3081					)
3082
3083			if not os.path.exists(dir):
3084				os.makedirs(dir)
3085			fig.savefig(f'{dir}/{s}.pdf')
3086			ppl.close(fig)
3087
3088		fig = ppl.figure(figsize = figsize)
3089		fig.subplots_adjust(*subplots_adjust)
3090		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3091		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3092
3093		for s in saved:
3094			for xy in saved[s][0]:
3095				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3096			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3097			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3098			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3099
3100		x1, x2, y1, y2 = ppl.axis()
3101		ppl.axis([
3102			x1 - (x2-x1)/10,
3103			x2 + (x2-x1)/10,
3104			y1 - (y2-y1)/10,
3105			y2 + (y2-y1)/10,
3106			])			
3107
3108
3109		if not os.path.exists(dir):
3110			os.makedirs(dir)
3111		fig.savefig(f'{dir}/__all__.pdf')
3112		if show:
3113			ppl.show()
3114		ppl.close(fig)
3115		
3116
3117	def _save_D4x_correl(
3118		self,
3119		samples = None,
3120		dir = 'output',
3121		filename = None,
3122		D4x_precision = 4,
3123		correl_precision = 4,
3124		):
3125		'''
3126		Save D4x values along with their SE and correlation matrix.
3127
3128		**Parameters**
3129
3130		+ `samples`: Only these samples are output (by default: all samples).
3131		+ `dir`: the directory in which to save the file (by default: `output`)
3132		+ `filename`: the name of the csv file to write to (by default: `D4x_correl.csv`)
3133		+ `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4)
3134		+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
3135		'''
3136		if samples is None:
3137			samples = sorted([s for s in self.unknowns])
3138		
3139		out = [['Sample']] + [[s] for s in samples]
3140		out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl']
3141		for k,s in enumerate(samples):
3142			out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.{D4x_precision}f}', f'{self.samples[s][f"SE_D{self._4x}"]:.{D4x_precision}f}']
3143			for s2 in samples:
3144				out[k+1] += [f'{self.sample_D4x_correl(s,s2):.{correl_precision}f}']
3145		
3146		if not os.path.exists(dir):
3147			os.makedirs(dir)
3148		if filename is None:
3149			filename = f'D{self._4x}_correl.csv'
3150		with open(f'{dir}/{filename}', 'w') as fid:
3151			fid.write(make_csv(out))

Store and process data for a large set of Δ47 and/or Δ48 analyses, usually comprising more than one analytical session.

D4xdata(l=[], mass='47', logfile='', session='mySession', verbose=False)
923	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
924		'''
925		**Parameters**
926
927		+ `l`: a list of dictionaries, with each dictionary including at least the keys
928		`Sample`, `d45`, `d46`, and `d47` or `d48`.
929		+ `mass`: `'47'` or `'48'`
930		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
931		+ `session`: define session name for analyses without a `Session` key
932		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
933
934		Returns a `D4xdata` object derived from `list`.
935		'''
936		self._4x = mass
937		self.verbose = verbose
938		self.prefix = 'D4xdata'
939		self.logfile = logfile
940		list.__init__(self, l)
941		self.Nf = None
942		self.repeatability = {}
943		self.refresh(session = session)

Parameters

  • l: a list of dictionaries, with each dictionary including at least the keys Sample, d45, d46, and d47 or d48.
  • mass: '47' or '48'
  • logfile: if specified, write detailed logs to this file path when calling D4xdata methods.
  • session: define session name for analyses without a Session key
  • verbose: if True, print out detailed logs when calling D4xdata methods.

Returns a D4xdata object derived from list.
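
For instance, a D47data object (which shares this constructor) may be built directly from a list of dictionaries rather than from a csv file. A minimal sketch, with illustrative delta values:

```python
import D47crunch

# Each dict is one analysis; 'UID' and 'Session' are filled in
# automatically, the latter using the `session` argument:
rawdata = [
	{'Sample': 'ETH-1', 'd45':  5.795, 'd46': 11.628, 'd47':  16.894},
	{'Sample': 'ETH-2', 'd45': -6.059, 'd46': -4.817, 'd47': -11.635},
	]
mydata = D47crunch.D47data(rawdata, session = 'Session01')
```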

R13_VPDB = 0.01118

Absolute (13C/12C) ratio of VPDB. By default equal to 0.01118 (Chang & Li, 1990)

R18_VSMOW = 0.0020052

Absolute (18O/16O) ratio of VSMOW. By default equal to 0.0020052 (Baertschi, 1976)

LAMBDA_17 = 0.528

Mass-dependent exponent for triple oxygen isotopes. By default equal to 0.528 (Barkan & Luz, 2005)

R17_VSMOW = 0.00038475

Absolute (17O/16O) ratio of VSMOW. By default equal to 0.00038475 (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)

R18_VPDB = 0.0020672007840000003

Absolute (18O/16O) ratio of VPDB. By definition equal to R18_VSMOW * 1.03092.

R17_VPDB = 0.0003909861828790272

Absolute (17O/16O) ratio of VPDB. By definition equal to R17_VSMOW * 1.03092 ** LAMBDA_17.
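
A quick sanity check (not part of the library's API) confirms that the two VPDB ratios above follow from the VSMOW ratios by definition:

```python
import D47crunch

D = D47crunch.D47data  # class attributes are enough here

# These equalities hold by definition, up to floating-point rounding:
assert abs(D.R18_VPDB - D.R18_VSMOW * 1.03092) < 1e-15
assert abs(D.R17_VPDB - D.R17_VSMOW * 1.03092 ** D.LAMBDA_17) < 1e-15
```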

LEVENE_REF_SAMPLE = 'ETH-3'

After the Δ4x standardization step, each sample is tested to assess whether the Δ4x variance within all analyses for that sample differs significantly from that observed for a given reference sample (using Levene's test, which yields a p-value corresponding to the null hypothesis that the underlying variances are equal).

LEVENE_REF_SAMPLE (by default equal to 'ETH-3') specifies which sample should be used as a reference for this test.
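
A sketch of reassigning the reference sample before standardizing:

```python
import D47crunch

mydata = D47crunch.D47data()
# Use ETH-1 rather than ETH-3 as the reference for Levene's test:
mydata.LEVENE_REF_SAMPLE = 'ETH-1'
```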

ALPHA_18O_ACID_REACTION = np.float64(1.008129)

Specifies the 18O/16O fractionation factor generally applicable to acid reactions in the dataset. Currently used by D4xdata.wg(), D4xdata.standardize_d13C(), and D4xdata.standardize_d18O().

By default equal to 1.008129 (calcite reacted at 90 °C, Kim et al., 2007).
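
A different fractionation factor may either be assigned to the data set or passed directly to D4xdata.wg(). A sketch, with a purely illustrative value:

```python
import D47crunch

mydata = D47crunch.D47data()
mydata.read('rawdata.csv')

# Option 1: override the default for all subsequent computations:
mydata.ALPHA_18O_ACID_REACTION = 1.00871  # illustrative value only

# Option 2: pass the value explicitly for this call only:
mydata.wg(a18_acid = 1.00871)
```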

Nominal_d13C_VPDB = {'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}

Nominal δ13C_VPDB values assigned to carbonate standards, used by D4xdata.standardize_d13C().

By default equal to {'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71} after Bernasconi et al. (2018).

Nominal_d18O_VPDB = {'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}

Nominal δ18O_VPDB values assigned to carbonate standards, used by D4xdata.standardize_d18O().

By default equal to {'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78} after Bernasconi et al. (2018).
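
An in-house standard with independently known bulk composition may be added to these dictionaries so that it contributes to δ13C/δ18O standardization. A sketch (sample name and values are hypothetical):

```python
import D47crunch

mydata = D47crunch.D47data()
# Copy the class-level dicts before editing, so other instances are unaffected:
mydata.Nominal_d13C_VPDB = {**mydata.Nominal_d13C_VPDB, 'MYREF': 1.23}
mydata.Nominal_d18O_VPDB = {**mydata.Nominal_d18O_VPDB, 'MYREF': -4.56}
```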

d13C_STANDARDIZATION_METHOD = '2pt'

Method by which to standardize δ13C values:

  • 'none': do not apply any δ13C standardization.
  • '1pt': within each session, offset all initial δ13C values so as to minimize the difference between final δ13C_VPDB values and Nominal_d13C_VPDB (averaged over all analyses for which Nominal_d13C_VPDB is defined).
  • '2pt': within each session, apply an affine transformation to all δ13C values so as to minimize the difference between final δ13C_VPDB values and Nominal_d13C_VPDB (averaged over all analyses for which Nominal_d13C_VPDB is defined).

d18O_STANDARDIZATION_METHOD = '2pt'

Method by which to standardize δ18O values:

  • 'none': do not apply any δ18O standardization.
  • '1pt': within each session, offset all initial δ18O values so as to minimize the difference between final δ18O_VPDB values and Nominal_d18O_VPDB (averaged over all analyses for which Nominal_d18O_VPDB is defined).
  • '2pt': within each session, apply an affine transformation to all δ18O values so as to minimize the difference between final δ18O_VPDB values and Nominal_d18O_VPDB (averaged over all analyses for which Nominal_d18O_VPDB is defined); the sketch below illustrates this using δ13C.
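
A minimal sketch of the '2pt' correction, using δ13C and hypothetical anchor values (this mirrors the np.polyfit-based affine fit in D4xdata.standardize_d13C()):

```python
import numpy as np

X = [2.10, -10.05, 1.80]   # observed δ13C_VPDB of anchor analyses (illustrative)
Y = [2.02, -10.17, 1.71]   # corresponding Nominal_d13C_VPDB values

# Fit the affine transformation mapping observed to nominal values,
# then apply it to every analysis in the session:
a, b = np.polyfit(X, Y, 1)
corrected = [a * x + b for x in X]
```
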
verbose
prefix
logfile
Nf
repeatability
def make_verbal(oldfun):
946	def make_verbal(oldfun):
947		'''
948		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
949		'''
950		@wraps(oldfun)
951		def newfun(*args, verbose = '', **kwargs):
952			myself = args[0]
953			oldprefix = myself.prefix
954			myself.prefix = oldfun.__name__
955			if verbose != '':
956				oldverbose = myself.verbose
957				myself.verbose = verbose
958			out = oldfun(*args, **kwargs)
959			myself.prefix = oldprefix
960			if verbose != '':
961				myself.verbose = oldverbose
962			return out
963		return newfun

Decorator: allow temporarily changing self.prefix and overriding self.verbose.
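
In practice, any decorated method (e.g., D4xdata.crunch()) thus accepts a verbose keyword that overrides self.verbose for that single call. A sketch:

```python
import D47crunch

mydata = D47crunch.D47data()
mydata.read('rawdata.csv')
mydata.wg()

# Print detailed logs for this one call only, regardless of mydata.verbose:
mydata.crunch(verbose = True)
```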

def msg(self, txt):
966	def msg(self, txt):
967		'''
968		Log a message to `self.logfile`, and print it out if `verbose = True`
969		'''
970		self.log(txt)
971		if self.verbose:
972			print(f'{f"[{self.prefix}]":<16} {txt}')

Log a message to self.logfile, and print it out if verbose = True

def vmsg(self, txt):
975	def vmsg(self, txt):
976		'''
977		Log a message to `self.logfile` and print it out
978		'''
979		self.log(txt)
980		print(txt)

Log a message to self.logfile and print it out

def log(self, *txts):
983	def log(self, *txts):
984		'''
985		Log a message to `self.logfile`
986		'''
987		if self.logfile:
988			with open(self.logfile, 'a') as fid:
989				for txt in txts:
990					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')

Log a message to self.logfile

def refresh(self, session='mySession'):
993	def refresh(self, session = 'mySession'):
994		'''
995		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
996		'''
997		self.fill_in_missing_info(session = session)
998		self.refresh_sessions()
999		self.refresh_samples()

Update self.sessions, self.samples, self.anchors, and self.unknowns.

def refresh_sessions(self):
1002	def refresh_sessions(self):
1003		'''
1004		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1005		to `False` for all sessions.
1006		'''
1007		self.sessions = {
1008			s: {'data': [r for r in self if r['Session'] == s]}
1009			for s in sorted({r['Session'] for r in self})
1010			}
1011		for s in self.sessions:
1012			self.sessions[s]['scrambling_drift'] = False
1013			self.sessions[s]['slope_drift'] = False
1014			self.sessions[s]['wg_drift'] = False
1015			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1016			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD

Update self.sessions and set scrambling_drift, slope_drift, and wg_drift to False for all sessions.
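
The drift flags may be switched back on per session after loading the data and before standardizing, allowing the corresponding standardization parameters (a2, b2, c2) to drift within that session. A sketch, assuming a session named 'Session01':

```python
import D47crunch

mydata = D47crunch.D47data()
mydata.read('rawdata.csv', session = 'Session01')
mydata.wg()
mydata.crunch()

# Allow scrambling and slope drift within this session:
mydata.sessions['Session01']['scrambling_drift'] = True
mydata.sessions['Session01']['slope_drift'] = True

mydata.standardize()
```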

def refresh_samples(self):
1019	def refresh_samples(self):
1020		'''
1021		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1022		'''
1023		self.samples = {
1024			s: {'data': [r for r in self if r['Sample'] == s]}
1025			for s in sorted({r['Sample'] for r in self})
1026			}
1027		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1028		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}

Define self.samples, self.anchors, and self.unknowns.

def read(self, filename, sep='', session=''):
1031	def read(self, filename, sep = '', session = ''):
1032		'''
1033		Read file in csv format to load data into a `D47data` object.
1034
1035		In the csv file, spaces before and after field separators (`','` by default)
1036		are optional. Each line corresponds to a single analysis.
1037
1038		The required fields are:
1039
1040		+ `UID`: a unique identifier
1041		+ `Session`: an identifier for the analytical session
1042		+ `Sample`: a sample identifier
1043		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1044
1045		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1046		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
1047		and `d49` are optional, and set to NaN by default.
1048
1049		**Parameters**
1050
1051		+ `filename`: the path of the file to read
1052		+ `sep`: csv separator delimiting the fields
1053		+ `session`: set `Session` field to this string for all analyses
1054		'''
1055		with open(filename) as fid:
1056			self.input(fid.read(), sep = sep, session = session)

Read file in csv format to load data into a D47data object.

In the csv file, spaces before and after field separators (',' by default) are optional. Each line corresponds to a single analysis.

The required fields are:

  • UID: a unique identifier
  • Session: an identifier for the analytical session
  • Sample: a sample identifier
  • d45, d46, and at least one of d47 or d48: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as D17O (in ‰ relative to VSMOW, λ = self.LAMBDA_17), and are otherwise assumed to be zero. Working-gas deltas d47, d48 and d49 are optional, and set to NaN by default.

Parameters

  • filename: the path of the file to read
  • sep: csv separator delimiting the fields
  • session: set Session field to this string for all analyses
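
For example, a tab-separated file may be read while assigning all of its analyses to a single session (a sketch):

```python
import D47crunch

mydata = D47crunch.D47data()
mydata.read('rawdata.csv', sep = '\t', session = 'Session01')
```
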
def input(self, txt, sep='', session=''):
1059	def input(self, txt, sep = '', session = ''):
1060		'''
1061		Read `txt` string in csv format to load analysis data into a `D47data` object.
1062
1063		In the csv string, spaces before and after field separators (`','` by default)
1064		are optional. Each line corresponds to a single analysis.
1065
1066		The required fields are:
1067
1068		+ `UID`: a unique identifier
1069		+ `Session`: an identifier for the analytical session
1070		+ `Sample`: a sample identifier
1071		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1072
1073		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1074		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
1075		and `d49` are optional, and set to NaN by default.
1076
1077		**Parameters**
1078
1079		+ `txt`: the csv string to read
1080		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1081		whichever appears most often in `txt`.
1082		+ `session`: set `Session` field to this string for all analyses
1083		'''
1084		if sep == '':
1085			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1086		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1087		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1088
1089		if session != '':
1090			for r in data:
1091				r['Session'] = session
1092
1093		self += data
1094		self.refresh()

Read txt string in csv format to load analysis data into a D47data object.

In the csv string, spaces before and after field separators (',' by default) are optional. Each line corresponds to a single analysis.

The required fields are:

  • UID: a unique identifier
  • Session: an identifier for the analytical session
  • Sample: a sample identifier
  • d45, d46, and at least one of d47 or d48: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as D17O (in ‰ relative to VSMOW, λ = self.LAMBDA_17), and are otherwise assumed to be zero. Working-gas deltas d47, d48 and d49 are optional, and set to NaN by default.

Parameters

  • txt: the csv string to read
  • sep: csv separator delimiting the fields. By default, use ',', ';', or tab, whichever appears most often in txt.
  • session: set Session field to this string for all analyses
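
This makes it possible to load analyses from an in-memory string rather than a file; when sep is left empty, the separator is inferred automatically. A sketch:

```python
import D47crunch

mydata = D47crunch.D47data()
mydata.input('''UID,Session,Sample,d45,d46,d47
A01,S1,ETH-1,5.795,11.628,16.894
A02,S1,ETH-2,-6.059,-4.817,-11.635''')
```
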
@make_verbal
def wg(self, samples=None, a18_acid=None):
1097	@make_verbal
1098	def wg(self, samples = None, a18_acid = None):
1099		'''
1100		Compute bulk composition of the working gas for each session based on
1101		the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1102		`self.Nominal_d18O_VPDB`.
1103		'''
1104
1105		self.msg('Computing WG composition:')
1106
1107		if a18_acid is None:
1108			a18_acid = self.ALPHA_18O_ACID_REACTION
1109		if samples is None:
1110			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1111
1112		assert a18_acid, f'Acid fractionation factor should not be zero.'
1113
1114		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1115		R45R46_standards = {}
1116		for sample in samples:
1117			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1118			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1119			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1120			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1121			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1122
1123			C12_s = 1 / (1 + R13_s)
1124			C13_s = R13_s / (1 + R13_s)
1125			C16_s = 1 / (1 + R17_s + R18_s)
1126			C17_s = R17_s / (1 + R17_s + R18_s)
1127			C18_s = R18_s / (1 + R17_s + R18_s)
1128
1129			C626_s = C12_s * C16_s ** 2
1130			C627_s = 2 * C12_s * C16_s * C17_s
1131			C628_s = 2 * C12_s * C16_s * C18_s
1132			C636_s = C13_s * C16_s ** 2
1133			C637_s = 2 * C13_s * C16_s * C17_s
1134			C727_s = C12_s * C17_s ** 2
1135
1136			R45_s = (C627_s + C636_s) / C626_s
1137			R46_s = (C628_s + C637_s + C727_s) / C626_s
1138			R45R46_standards[sample] = (R45_s, R46_s)
1139		
1140		for s in self.sessions:
1141			db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
1142			assert db, f'No sample from {samples} found in session "{s}".'
1143# 			dbsamples = sorted({r['Sample'] for r in db})
1144
1145			X = [r['d45'] for r in db]
1146			Y = [R45R46_standards[r['Sample']][0] for r in db]
1147			x1, x2 = np.min(X), np.max(X)
1148
1149			if x1 < x2:
1150				wgcoord = x1/(x1-x2)
1151			else:
1152				wgcoord = 999
1153
1154			if wgcoord < -.5 or wgcoord > 1.5:
1155				# unreasonable to extrapolate to d45 = 0
1156				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1157			else :
1158				# d45 = 0 is reasonably well bracketed
1159				R45_wg = np.polyfit(X, Y, 1)[1]
1160
1161			X = [r['d46'] for r in db]
1162			Y = [R45R46_standards[r['Sample']][1] for r in db]
1163			x1, x2 = np.min(X), np.max(X)
1164
1165			if x1 < x2:
1166				wgcoord = x1/(x1-x2)
1167			else:
1168				wgcoord = 999
1169
1170			if wgcoord < -.5 or wgcoord > 1.5:
1171				# unreasonable to extrapolate to d46 = 0
1172				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1173			else :
1174				# d46 = 0 is reasonably well bracketed
1175				R46_wg = np.polyfit(X, Y, 1)[1]
1176
1177			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1178
1179			self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1180
1181			self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1182			self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1183			for r in self.sessions[s]['data']:
1184				r['d13Cwg_VPDB'] = d13Cwg_VPDB
1185				r['d18Owg_VSMOW'] = d18Owg_VSMOW

Compute bulk composition of the working gas for each session based on the carbonate standards defined in both self.Nominal_d13C_VPDB and self.Nominal_d18O_VPDB.
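
After calling wg(), the working-gas composition is stored both at the session level and in each analysis record. A sketch:

```python
import D47crunch

mydata = D47crunch.D47data()
mydata.read('rawdata.csv')
mydata.wg()

for s in mydata.sessions:
	print(s, mydata.sessions[s]['d13Cwg_VPDB'], mydata.sessions[s]['d18Owg_VSMOW'])
```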

def compute_bulk_delta(self, R45, R46, D17O=0):
1188	def compute_bulk_delta(self, R45, R46, D17O = 0):
1189		'''
1190		Compute δ13C_VPDB and δ18O_VSMOW,
1191		by solving the generalized form of equation (17) from
1192		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1193		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
1194		solving the corresponding second-order Taylor polynomial.
1195		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
1196		'''
1197
1198		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1199
1200		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1201		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1202		C = 2 * self.R18_VSMOW
1203		D = -R46
1204
1205		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1206		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1207		cc = A + B + C + D
1208
1209		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1210
1211		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1212		R17 = K * R18 ** self.LAMBDA_17
1213		R13 = R45 - 2 * R17
1214
1215		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1216
1217		return d13C_VPDB, d18O_VSMOW

Compute δ13C_VPDB and δ18O_VSMOW, by solving the generalized form of equation (17) from Brand et al. (2010), assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and solving the corresponding second-order Taylor polynomial. (Appendix A of Daëron et al., 2016)
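
A round-trip sanity check (a sketch, not part of the normal workflow): generate stochastic isobar ratios for a known bulk composition with compute_isobar_ratios(), then recover the bulk deltas with compute_bulk_delta():

```python
import D47crunch

D = D47crunch.D47data()

# Known bulk composition (illustrative): δ13C_VPDB = +2.02 ‰, δ18O_VSMOW = +37 ‰
R13 = D.R13_VPDB * (1 + 2.02 / 1000)
R18 = D.R18_VSMOW * (1 + 37.0 / 1000)

R45, R46, R47, R48, R49 = D.compute_isobar_ratios(R13, R18)
d13C, d18O = D.compute_bulk_delta(R45, R46)
print(f'{d13C:.3f}  {d18O:.3f}')  # recovers ≈ 2.020 and ≈ 37.000
```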

@make_verbal
def crunch(self, verbose=''):
1220	@make_verbal
1221	def crunch(self, verbose = ''):
1222		'''
1223		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1224		'''
1225		for r in self:
1226			self.compute_bulk_and_clumping_deltas(r)
1227		self.standardize_d13C()
1228		self.standardize_d18O()
1229		self.msg(f"Crunched {len(self)} analyses.")

Compute bulk composition and raw clumped isotope anomalies for all analyses.

def fill_in_missing_info(self, session='mySession'):
1232	def fill_in_missing_info(self, session = 'mySession'):
1233		'''
1234		Fill in optional fields with default values
1235		'''
1236		for i,r in enumerate(self):
1237			if 'D17O' not in r:
1238				r['D17O'] = 0.
1239			if 'UID' not in r:
1240				r['UID'] = f'{i+1}'
1241			if 'Session' not in r:
1242				r['Session'] = session
1243			for k in ['d47', 'd48', 'd49']:
1244				if k not in r:
1245					r[k] = np.nan

Fill in optional fields with default values

def standardize_d13C(self):
1248	def standardize_d13C(self):
1249		'''
1250		Perform δ13C standardization within each session `s` according to
1251		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1252		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1253		may be redefined arbitrarily at a later stage.
1254		'''
1255		for s in self.sessions:
1256			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1257				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1258				X,Y = zip(*XY)
1259				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1260					offset = np.mean(Y) - np.mean(X)
1261					for r in self.sessions[s]['data']:
1262						r['d13C_VPDB'] += offset				
1263				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1264					a,b = np.polyfit(X,Y,1)
1265					for r in self.sessions[s]['data']:
1266						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b

Perform δ13C standardization within each session s according to self.sessions[s]['d13C_standardization_method'], which is defined by default by D47data.refresh_sessions() as equal to self.d13C_STANDARDIZATION_METHOD, but may be redefined arbitrarily at a later stage.

def standardize_d18O(self):
1268	def standardize_d18O(self):
1269		'''
1270		Perform δ18O standardization within each session `s` according to
1271		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1272		which is defined by default by `D47data.refresh_sessions()` as equal to
1273		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1274		'''
1275		for s in self.sessions:
1276			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1277				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1278				X,Y = zip(*XY)
1279				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1280				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1281					offset = np.mean(Y) - np.mean(X)
1282					for r in self.sessions[s]['data']:
1283						r['d18O_VSMOW'] += offset				
1284				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1285					a,b = np.polyfit(X,Y,1)
1286					for r in self.sessions[s]['data']:
1287						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b

Perform δ18O standardization within each session s according to self.ALPHA_18O_ACID_REACTION and self.sessions[s]['d18O_standardization_method'], which is defined by default by D47data.refresh_sessions() as equal to self.d18O_STANDARDIZATION_METHOD, but may be redefined arbitrarily at a later stage.

def compute_bulk_and_clumping_deltas(self, r):
1290	def compute_bulk_and_clumping_deltas(self, r):
1291		'''
1292		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1293		'''
1294
1295		# Compute working gas R13, R18, and isobar ratios
1296		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1297		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1298		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1299
1300		# Compute analyte isobar ratios
1301		R45 = (1 + r['d45'] / 1000) * R45_wg
1302		R46 = (1 + r['d46'] / 1000) * R46_wg
1303		R47 = (1 + r['d47'] / 1000) * R47_wg
1304		R48 = (1 + r['d48'] / 1000) * R48_wg
1305		R49 = (1 + r['d49'] / 1000) * R49_wg
1306
1307		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1308		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1309		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1310
1311		# Compute stochastic isobar ratios of the analyte
1312		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1313			R13, R18, D17O = r['D17O']
1314		)
1315
1316		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1317		# and raise a warning if the corresponding anomalies exceed 0.02 ppm.
1318		if (R45 / R45stoch - 1) > 5e-8:
1319			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1320		if (R46 / R46stoch - 1) > 5e-8:
1321			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1322
1323		# Compute raw clumped isotope anomalies
1324		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1325		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1326		r['D49raw'] = 1000 * (R49 / R49stoch - 1)

Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis r.

def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1329	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1330		'''
1331		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1332		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1333		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1334		'''
1335
1336		# Compute R17
1337		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1338
1339		# Compute isotope concentrations
1340		C12 = (1 + R13) ** -1
1341		C13 = C12 * R13
1342		C16 = (1 + R17 + R18) ** -1
1343		C17 = C16 * R17
1344		C18 = C16 * R18
1345
1346		# Compute stochastic isotopologue concentrations
1347		C626 = C16 * C12 * C16
1348		C627 = C16 * C12 * C17 * 2
1349		C628 = C16 * C12 * C18 * 2
1350		C636 = C16 * C13 * C16
1351		C637 = C16 * C13 * C17 * 2
1352		C638 = C16 * C13 * C18 * 2
1353		C727 = C17 * C12 * C17
1354		C728 = C17 * C12 * C18 * 2
1355		C737 = C17 * C13 * C17
1356		C738 = C17 * C13 * C18 * 2
1357		C828 = C18 * C12 * C18
1358		C838 = C18 * C13 * C18
1359
1360		# Compute stochastic isobar ratios
1361		R45 = (C636 + C627) / C626
1362		R46 = (C628 + C637 + C727) / C626
1363		R47 = (C638 + C728 + C737) / C626
1364		R48 = (C738 + C828) / C626
1365		R49 = C838 / C626
1366
1367		# Account for stochastic anomalies
1368		R47 *= 1 + D47 / 1000
1369		R48 *= 1 + D48 / 1000
1370		R49 *= 1 + D49 / 1000
1371
1372		# Return isobar ratios
1373		return R45, R46, R47, R48, R49

Compute isobar ratios for a sample with isotopic ratios R13 and R18, optionally accounting for non-zero values of Δ17O (D17O) and clumped isotope anomalies (D47, D48, D49), all expressed in permil.

def split_samples(self, samples_to_split='all', grouping='by_session'):
1376	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1377		'''
1378		Split unknown samples by UID (treat all analyses as different samples)
1379		or by session (treat analyses of a given sample in different sessions as
1380		different samples).
1381
1382		**Parameters**
1383
1384		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
1385		+ `grouping`: `by_uid` | `by_session`
1386		'''
1387		if samples_to_split == 'all':
1388			samples_to_split = [s for s in self.unknowns]
1389		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1390		self.grouping = grouping.lower()
1391		if self.grouping in gkeys:
1392			gkey = gkeys[self.grouping]
1393		for r in self:
1394			if r['Sample'] in samples_to_split:
1395				r['Sample_original'] = r['Sample']
1396				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1397			elif r['Sample'] in self.unknowns:
1398				r['Sample_original'] = r['Sample']
1399		self.refresh_samples()

Split unknown samples by UID (treat all analyses as different samples) or by session (treat analyses of a given sample in different sessions as different samples).

Parameters

  • samples_to_split: a list of samples to split, e.g., ['IAEA-C1', 'IAEA-C2']
  • grouping: by_uid | by_session
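For example, a short sketch (assuming mydata already holds crunched analyses of an unknown named 'MYSAMPLE-1' spanning several sessions):

# treat the analyses of MYSAMPLE-1 from each session as a separate sample:
mydata.split_samples(['MYSAMPLE-1'], grouping = 'by_session')
mydata.standardize()
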
def unsplit_samples(self, tables=False):
1402	def unsplit_samples(self, tables = False):
1403		'''
1404		Reverse the effects of `D47data.split_samples()`.
1405		
1406		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1407		
1408		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1409		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1410		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1411		effects of `D47data.split_samples()` with `grouping='by_session'` (because in
1412		that case session-averaged Δ4x values are statistically independent).
1413		'''
1414		unknowns_old = sorted({s for s in self.unknowns})
1415		CM_old = self.standardization.covar[:,:]
1416		VD_old = self.standardization.params.valuesdict().copy()
1417		vars_old = self.standardization.var_names
1418
1419		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1420
1421		Ns = len(vars_old) - len(unknowns_old)
1422		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1423		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1424
1425		W = np.zeros((len(vars_new), len(vars_old)))
1426		W[:Ns,:Ns] = np.eye(Ns)
1427		for u in unknowns_new:
1428			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1429			if self.grouping == 'by_session':
1430				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1431			elif self.grouping == 'by_uid':
1432				weights = [1 for s in splits]
1433			sw = sum(weights)
1434			weights = [w/sw for w in weights]
1435			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1436
1437		CM_new = W @ CM_old @ W.T
1438		V = W @ np.array([[VD_old[k]] for k in vars_old])
1439		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1440
1441		self.standardization.covar = CM_new
1442		self.standardization.params.valuesdict = lambda : VD_new
1443		self.standardization.var_names = vars_new
1444
1445		for r in self:
1446			if r['Sample'] in self.unknowns:
1447				r['Sample_split'] = r['Sample']
1448				r['Sample'] = r['Sample_original']
1449
1450		self.refresh_samples()
1451		self.consolidate_samples()
1452		self.repeatabilities()
1453
1454		if tables:
1455			self.table_of_analyses()
1456			self.table_of_samples()

Reverse the effects of D47data.split_samples().

This should only be used after D4xdata.standardize() with method='pooled'.

After D4xdata.standardize() with method='indep_sessions', one should probably use D4xdata.combine_samples() instead to reverse the effects of D47data.split_samples() with grouping='by_uid', or w_avg() to reverse the effects of D47data.split_samples() with grouping='by_session' (because in that case session-averaged Δ4x values are statistically independent).
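Continuing the split_samples() sketch above, after a pooled standardization the split samples may be recombined as follows:

# merge the per-session versions of MYSAMPLE-1 back into a single sample:
mydata.unsplit_samples()
mydata.table_of_samples()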

def assign_timestamps(self):
1458	def assign_timestamps(self):
1459		'''
1460		Assign a time field `t` of type `float` to each analysis.
1461
1462		If `TimeTag` is one of the data fields, `t` is equal within a given session
1463		to `TimeTag` minus the mean value of `TimeTag` for that session.
1464		Otherwise, `TimeTag` defaults to the index of each analysis
1465		within its session, and `t` is defined as above.
1466		'''
1467		for session in self.sessions:
1468			sdata = self.sessions[session]['data']
1469			try:
1470				t0 = np.mean([r['TimeTag'] for r in sdata])
1471				for r in sdata:
1472					r['t'] = r['TimeTag'] - t0
1473			except KeyError:
1474				t0 = (len(sdata)-1)/2
1475				for t,r in enumerate(sdata):
1476					r['t'] = t - t0

Assign a time field t of type float to each analysis.

If TimeTag is one of the data fields, t is equal within a given session to TimeTag minus the mean value of TimeTag for that session. Otherwise, TimeTag defaults to the index of each analysis within its session, and t is defined as above.
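As a sketch of the default behavior: in a session of three analyses without a TimeTag field, t takes the values -1, 0 and +1. Note that assign_timestamps() is called internally by D4xdata.standardize(), so calling it directly (as below, assuming mydata holds loaded analyses) is rarely necessary:

mydata.assign_timestamps()
print([r['t'] for r in mydata])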

def report(self):
1479	def report(self):
1480		'''
1481		Prints a report on the standardization fit.
1482		Only applicable after `D4xdata.standardize(method='pooled')`.
1483		'''
1484		report_fit(self.standardization)

Prints a report on the standardization fit. Only applicable after D4xdata.standardize(method='pooled').

def combine_samples(self, sample_groups):
1487	def combine_samples(self, sample_groups):
1488		'''
1489		Combine analyses of different samples to compute weighted average Δ4x
1490		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1491		dictionary.
1492		
1493		Caution: samples are weighted by number of replicate analyses, which is a
1494		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1495		correlated analytical errors for one or more samples).
1496		
1497		Returns a tuple of:
1498		
1499		+ the list of group names
1500		+ an array of the corresponding Δ4x values
1501		+ the corresponding (co)variance matrix
1502		
1503		**Parameters**
1504
1505		+ `sample_groups`: a dictionary of the form:
1506		```py
1507		{'group1': ['sample_1', 'sample_2'],
1508		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1509		```
1510		'''
1511		
1512		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1513		groups = sorted(sample_groups.keys())
1514		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1515		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1516		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1517		W = np.array([
1518			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1519			for j in groups])
1520		D4x_new = W @ D4x_old
1521		CM_new = W @ CM_old @ W.T
1522
1523		return groups, D4x_new[:,0], CM_new

Combine analyses of different samples to compute weighted average Δ4x and new error (co)variances corresponding to the groups defined by the sample_groups dictionary.

Caution: samples are weighted by number of replicate analyses, which is a reasonable default behavior but is not always optimal (e.g., in the case of strongly correlated analytical errors for one or more samples).

Returns a tuple of:

  • the list of group names
  • an array of the corresponding Δ4x values
  • the corresponding (co)variance matrix

Parameters

  • sample_groups: a dictionary of the form:
{'group1': ['sample_1', 'sample_2'],
 'group2': ['sample_3', 'sample_4', 'sample_5']}
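A minimal sketch, assuming 'MYSAMPLE-1' and 'MYSAMPLE-2' are standardized unknowns (the group name below is arbitrary):

groups, D47_avg, CM = mydata.combine_samples(
    {'mygroup': ['MYSAMPLE-1', 'MYSAMPLE-2']},
    )
print(groups, D47_avg, CM)
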
@make_verbal
def standardize(self, method='pooled', weighted_sessions=[], consolidate=True, consolidate_tables=False, consolidate_plots=False, constraints={}):
1526	@make_verbal
1527	def standardize(self,
1528		method = 'pooled',
1529		weighted_sessions = [],
1530		consolidate = True,
1531		consolidate_tables = False,
1532		consolidate_plots = False,
1533		constraints = {},
1534		):
1535		'''
1536		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1537		If `method` argument is set to `'pooled'`, the standardization processes all sessions
1538		If the `method` argument is set to `'pooled'`, the standardization processes all sessions
1539		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1540		i.e. that their true Δ4x value does not change between sessions
1541		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If the `method` argument is set to
1542		`'indep_sessions'`, the standardization processes each session independently, based only
1543		on anchor analyses.
1544
1545		self.standardization_method = method
1546		self.assign_timestamps()
1547
1548		if method == 'pooled':
1549			if weighted_sessions:
1550				for session_group in weighted_sessions:
1551					if self._4x == '47':
1552						X = D47data([r for r in self if r['Session'] in session_group])
1553					elif self._4x == '48':
1554						X = D48data([r for r in self if r['Session'] in session_group])
1555					X.Nominal_D4x = self.Nominal_D4x.copy()
1556					X.refresh()
1557					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1558					w = np.sqrt(result.redchi)
1559					self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
1560					for r in X:
1561						r[f'wD{self._4x}raw'] *= w
1562			else:
1563				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1564				for r in self:
1565					r[f'wD{self._4x}raw'] = 1.
1566
1567			params = Parameters()
1568			for k,session in enumerate(self.sessions):
1569				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1570				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1571				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1572				s = pf(session)
1573				params.add(f'a_{s}', value = 0.9)
1574				params.add(f'b_{s}', value = 0.)
1575				params.add(f'c_{s}', value = -0.9)
1576				params.add(f'a2_{s}', value = 0.,
1577# 					vary = self.sessions[session]['scrambling_drift'],
1578					)
1579				params.add(f'b2_{s}', value = 0.,
1580# 					vary = self.sessions[session]['slope_drift'],
1581					)
1582				params.add(f'c2_{s}', value = 0.,
1583# 					vary = self.sessions[session]['wg_drift'],
1584					)
1585				if not self.sessions[session]['scrambling_drift']:
1586					params[f'a2_{s}'].expr = '0'
1587				if not self.sessions[session]['slope_drift']:
1588					params[f'b2_{s}'].expr = '0'
1589				if not self.sessions[session]['wg_drift']:
1590					params[f'c2_{s}'].expr = '0'
1591
1592			for sample in self.unknowns:
1593				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1594
1595			for k in constraints:
1596				params[k].expr = constraints[k]
1597
1598			def residuals(p):
1599				R = []
1600				for r in self:
1601					session = pf(r['Session'])
1602					sample = pf(r['Sample'])
1603					if r['Sample'] in self.Nominal_D4x:
1604						R += [ (
1605							r[f'D{self._4x}raw'] - (
1606								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1607								+ p[f'b_{session}'] * r[f'd{self._4x}']
1608								+	p[f'c_{session}']
1609								+ r['t'] * (
1610									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1611									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1612									+	p[f'c2_{session}']
1613									)
1614								)
1615							) / r[f'wD{self._4x}raw'] ]
1616					else:
1617						R += [ (
1618							r[f'D{self._4x}raw'] - (
1619								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1620								+ p[f'b_{session}'] * r[f'd{self._4x}']
1621								+	p[f'c_{session}']
1622								+ r['t'] * (
1623									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1624									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1625									+	p[f'c2_{session}']
1626									)
1627								)
1628							) / r[f'wD{self._4x}raw'] ]
1629				return R
1630
1631			M = Minimizer(residuals, params)
1632			result = M.least_squares()
1633			self.Nf = result.nfree
1634			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1635			new_names, new_covar, new_se = _fullcovar(result)[:3]
1636			result.var_names = new_names
1637			result.covar = new_covar
1638
1639			for r in self:
1640				s = pf(r["Session"])
1641				a = result.params.valuesdict()[f'a_{s}']
1642				b = result.params.valuesdict()[f'b_{s}']
1643				c = result.params.valuesdict()[f'c_{s}']
1644				a2 = result.params.valuesdict()[f'a2_{s}']
1645				b2 = result.params.valuesdict()[f'b2_{s}']
1646				c2 = result.params.valuesdict()[f'c2_{s}']
1647				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1648				
1649
1650			self.standardization = result
1651
1652			for session in self.sessions:
1653				self.sessions[session]['Np'] = 3
1654				for k in ['scrambling', 'slope', 'wg']:
1655					if self.sessions[session][f'{k}_drift']:
1656						self.sessions[session]['Np'] += 1
1657
1658			if consolidate:
1659				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1660			return result
1661
1662
1663		elif method == 'indep_sessions':
1664
1665			if weighted_sessions:
1666				for session_group in weighted_sessions:
1667					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1668					X.Nominal_D4x = self.Nominal_D4x.copy()
1669					X.refresh()
1670					# This is only done to assign r['wD47raw'] for r in X:
1671					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1672					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1673			else:
1674				self.msg('All weights set to 1 ‰')
1675				for r in self:
1676					r[f'wD{self._4x}raw'] = 1
1677
1678			for session in self.sessions:
1679				s = self.sessions[session]
1680				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1681				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1682				s['Np'] = sum(p_active)
1683				sdata = s['data']
1684
1685				A = np.array([
1686					[
1687						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1688						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1689						1 / r[f'wD{self._4x}raw'],
1690						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1691						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1692						r['t'] / r[f'wD{self._4x}raw']
1693						]
1694					for r in sdata if r['Sample'] in self.anchors
1695					])[:,p_active] # only keep columns for the active parameters
1696				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1697				s['Na'] = Y.size
1698				CM = linalg.inv(A.T @ A)
1699				bf = (CM @ A.T @ Y).T[0,:]
1700				k = 0
1701				for n,a in zip(p_names, p_active):
1702					if a:
1703						s[n] = bf[k]
1704# 						self.msg(f'{n} = {bf[k]}')
1705						k += 1
1706					else:
1707						s[n] = 0.
1708# 						self.msg(f'{n} = 0.0')
1709
1710				for r in sdata :
1711					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1712					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1713					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1714
1715				s['CM'] = np.zeros((6,6))
1716				i = 0
1717				k_active = [j for j,a in enumerate(p_active) if a]
1718				for j,a in enumerate(p_active):
1719					if a:
1720						s['CM'][j,k_active] = CM[i,:]
1721						i += 1
1722
1723			if not weighted_sessions:
1724				w = self.rmswd()['rmswd']
1725				for r in self:
1726						r[f'wD{self._4x}'] *= w
1727						r[f'wD{self._4x}raw'] *= w
1728				for session in self.sessions:
1729					self.sessions[session]['CM'] *= w**2
1730
1731			for session in self.sessions:
1732				s = self.sessions[session]
1733				s['SE_a'] = s['CM'][0,0]**.5
1734				s['SE_b'] = s['CM'][1,1]**.5
1735				s['SE_c'] = s['CM'][2,2]**.5
1736				s['SE_a2'] = s['CM'][3,3]**.5
1737				s['SE_b2'] = s['CM'][4,4]**.5
1738				s['SE_c2'] = s['CM'][5,5]**.5
1739
1740			if not weighted_sessions:
1741				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1742			else:
1743				self.Nf = 0
1744				for sg in weighted_sessions:
1745					self.Nf += self.rmswd(sessions = sg)['Nf']
1746
1747			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1748
1749			avgD4x = {
1750				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1751				for sample in self.samples
1752				}
1753			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1754			rD4x = (chi2/self.Nf)**.5
1755			self.repeatability[f'sigma_{self._4x}'] = rD4x
1756
1757			if consolidate:
1758				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)

Compute absolute Δ4x values for all replicate analyses and for sample averages. If the method argument is set to 'pooled', the standardization processes all sessions in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous, i.e. that their true Δ4x value does not change between sessions (Daëron, 2021). If the method argument is set to 'indep_sessions', the standardization processes each session independently, based only on anchor analyses.
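A short sketch of both approaches (only one of them would normally be applied to a given data set):

# pooled standardization (the default), also generating tables and plots:
mydata.standardize(consolidate_tables = True, consolidate_plots = True)

# or, alternatively, standardize each session independently, using only anchors:
mydata.standardize(method = 'indep_sessions')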

def standardization_error(self, session, d4x, D4x, t=0):
1761	def standardization_error(self, session, d4x, D4x, t = 0):
1762		'''
1763		Compute standardization error for a given session and
1764		(δ47, Δ47) composition.
1765		'''
1766		a = self.sessions[session]['a']
1767		b = self.sessions[session]['b']
1768		c = self.sessions[session]['c']
1769		a2 = self.sessions[session]['a2']
1770		b2 = self.sessions[session]['b2']
1771		c2 = self.sessions[session]['c2']
1772		CM = self.sessions[session]['CM']
1773
1774		x, y = D4x, d4x
1775		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1776# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1777		dxdy = -(b+b2*t) / (a+a2*t)
1778		dxdz = 1. / (a+a2*t)
1779		dxda = -x / (a+a2*t)
1780		dxdb = -y / (a+a2*t)
1781		dxdc = -1. / (a+a2*t)
1782		dxda2 = -x * a2 / (a+a2*t)
1783		dxdb2 = -y * t / (a+a2*t)
1784		dxdc2 = -t / (a+a2*t)
1785		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1786		sx = (V @ CM @ V.T) ** .5
1787		return sx

Compute standardization error for a given session and (δ47, Δ47) composition.
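For instance, a sketch evaluating, after standardization, the standardization error at a hypothetical (δ47, Δ47) composition in a session named 'Session01' (the session name is an assumption):

sx = mydata.standardization_error('Session01', 20.0, 0.6)  # session, δ47, Δ47
print(f'standardization error: {1000 * sx:.1f} ppm')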

@make_verbal
def summary(self, dir='output', filename=None, save_to_file=True, print_out=True):
1790	@make_verbal
1791	def summary(self,
1792		dir = 'output',
1793		filename = None,
1794		save_to_file = True,
1795		print_out = True,
1796		):
1797		'''
1798		Print out and/or save to disk a summary of the standardization results.
1799
1800		**Parameters**
1801
1802		+ `dir`: the directory in which to save the table
1803		+ `filename`: the name of the csv file to write to
1804		+ `save_to_file`: whether to save the table to disk
1805		+ `print_out`: whether to print out the table
1806		'''
1807
1808		out = []
1809		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1810		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1811		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1812		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1813		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1814		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1815		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1816		out += [['Model degrees of freedom', f"{self.Nf}"]]
1817		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1818		out += [['Standardization method', self.standardization_method]]
1819
1820		if save_to_file:
1821			if not os.path.exists(dir):
1822				os.makedirs(dir)
1823			if filename is None:
1824				filename = f'D{self._4x}_summary.csv'
1825			with open(f'{dir}/{filename}', 'w') as fid:
1826				fid.write(make_csv(out))
1827		if print_out:
1828			self.msg('\n' + pretty_table(out, header = 0))

Print out and/or save to disk a summary of the standardization results.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
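For instance, to save the summary under a custom file name without printing it out:

mydata.summary(dir = 'output', filename = 'my_summary.csv', print_out = False)
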
@make_verbal
def table_of_sessions(self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
1831	@make_verbal
1832	def table_of_sessions(self,
1833		dir = 'output',
1834		filename = None,
1835		save_to_file = True,
1836		print_out = True,
1837		output = None,
1838		):
1839		'''
1840		Print out and/or save to disk a table of sessions.
1841
1842		**Parameters**
1843
1844		+ `dir`: the directory in which to save the table
1845		+ `filename`: the name of the csv file to write to
1846		+ `save_to_file`: whether to save the table to disk
1847		+ `print_out`: whether to print out the table
1848		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1849		    if set to `'raw'`: return a list of list of strings
1850		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1851		'''
1852		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1853		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1854		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1855
1856		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1857		if include_a2:
1858			out[-1] += ['a2 ± SE']
1859		if include_b2:
1860			out[-1] += ['b2 ± SE']
1861		if include_c2:
1862			out[-1] += ['c2 ± SE']
1863		for session in self.sessions:
1864			out += [[
1865				session,
1866				f"{self.sessions[session]['Na']}",
1867				f"{self.sessions[session]['Nu']}",
1868				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1869				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1870				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1871				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1872				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1873				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1874				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1875				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1876				]]
1877			if include_a2:
1878				if self.sessions[session]['scrambling_drift']:
1879					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1880				else:
1881					out[-1] += ['']
1882			if include_b2:
1883				if self.sessions[session]['slope_drift']:
1884					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1885				else:
1886					out[-1] += ['']
1887			if include_c2:
1888				if self.sessions[session]['wg_drift']:
1889					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1890				else:
1891					out[-1] += ['']
1892
1893		if save_to_file:
1894			if not os.path.exists(dir):
1895				os.makedirs(dir)
1896			if filename is None:
1897				filename = f'D{self._4x}_sessions.csv'
1898			with open(f'{dir}/{filename}', 'w') as fid:
1899				fid.write(make_csv(out))
1900		if print_out:
1901			self.msg('\n' + pretty_table(out))
1902		if output == 'raw':
1903			return out
1904		elif output == 'pretty':
1905			return pretty_table(out)

Print out and/or save to disk a table of sessions.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
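For instance, to retrieve the table as a list of lists of strings, without writing anything to disk:

sessions_table = mydata.table_of_sessions(
    save_to_file = False,
    print_out = False,
    output = 'raw',
    )
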
@make_verbal
def table_of_analyses(self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
1908	@make_verbal
1909	def table_of_analyses(
1910		self,
1911		dir = 'output',
1912		filename = None,
1913		save_to_file = True,
1914		print_out = True,
1915		output = None,
1916		):
1917		'''
1918		Print out and/or save to disk a table of analyses.
1919
1920		**Parameters**
1921
1922		+ `dir`: the directory in which to save the table
1923		+ `filename`: the name of the csv file to write to
1924		+ `save_to_file`: whether to save the table to disk
1925		+ `print_out`: whether to print out the table
1926		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1927		    if set to `'raw'`: return a list of list of strings
1928		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1929		'''
1930
1931		out = [['UID','Session','Sample']]
1932		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1933		for f in extra_fields:
1934			out[-1] += [f[0]]
1935		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1936		for r in self:
1937			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
1938			for f in extra_fields:
1939				out[-1] += [f"{r[f[0]]:{f[1]}}"]
1940			out[-1] += [
1941				f"{r['d13Cwg_VPDB']:.3f}",
1942				f"{r['d18Owg_VSMOW']:.3f}",
1943				f"{r['d45']:.6f}",
1944				f"{r['d46']:.6f}",
1945				f"{r['d47']:.6f}",
1946				f"{r['d48']:.6f}",
1947				f"{r['d49']:.6f}",
1948				f"{r['d13C_VPDB']:.6f}",
1949				f"{r['d18O_VSMOW']:.6f}",
1950				f"{r['D47raw']:.6f}",
1951				f"{r['D48raw']:.6f}",
1952				f"{r['D49raw']:.6f}",
1953				f"{r[f'D{self._4x}']:.6f}"
1954				]
1955		if save_to_file:
1956			if not os.path.exists(dir):
1957				os.makedirs(dir)
1958			if filename is None:
1959				filename = f'D{self._4x}_analyses.csv'
1960			with open(f'{dir}/{filename}', 'w') as fid:
1961				fid.write(make_csv(out))
1962		if print_out:
1963			self.msg('\n' + pretty_table(out))
1964		return out

Print out and/or save to disk a table of analyses.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
@make_verbal
def covar_table(self, correl=False, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
1966	@make_verbal
1967	def covar_table(
1968		self,
1969		correl = False,
1970		dir = 'output',
1971		filename = None,
1972		save_to_file = True,
1973		print_out = True,
1974		output = None,
1975		):
1976		'''
1977		Print out, save to disk and/or return the variance-covariance matrix of D4x
1978		for all unknown samples.
1979
1980		**Parameters**
1981
1982		+ `dir`: the directory in which to save the csv
1983		+ `filename`: the name of the csv file to write to
1984		+ `save_to_file`: whether to save the csv
1985		+ `print_out`: whether to print out the matrix
1986		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
1987		    if set to `'raw'`: return a list of list of strings
1988		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1989		'''
1990		samples = sorted([u for u in self.unknowns])
1991		out = [[''] + samples]
1992		for s1 in samples:
1993			out.append([s1])
1994			for s2 in samples:
1995				if correl:
1996					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
1997				else:
1998					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
1999
2000		if save_to_file:
2001			if not os.path.exists(dir):
2002				os.makedirs(dir)
2003			if filename is None:
2004				if correl:
2005					filename = f'D{self._4x}_correl.csv'
2006				else:
2007					filename = f'D{self._4x}_covar.csv'
2008			with open(f'{dir}/{filename}', 'w') as fid:
2009				fid.write(make_csv(out))
2010		if print_out:
2011			self.msg('\n'+pretty_table(out))
2012		if output == 'raw':
2013			return out
2014		elif output == 'pretty':
2015			return pretty_table(out)

Print out, save to disk and/or return the variance-covariance matrix of D4x for all unknown samples.

Parameters

  • correl: if set to True, tabulate the error correlation matrix instead of the (co)variance matrix
  • dir: the directory in which to save the csv
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the csv
  • print_out: whether to print out the matrix
  • output: if set to 'pretty': return a pretty text matrix (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
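For instance, to print out the error correlation matrix of the unknowns without saving it:

mydata.covar_table(correl = True, save_to_file = False)
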
@make_verbal
def table_of_samples(self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
2017	@make_verbal
2018	def table_of_samples(
2019		self,
2020		dir = 'output',
2021		filename = None,
2022		save_to_file = True,
2023		print_out = True,
2024		output = None,
2025		):
2026		'''
2027		Print out, save to disk and/or return a table of samples.
2028
2029		**Parameters**
2030
2031		+ `dir`: the directory in which to save the csv
2032		+ `filename`: the name of the csv file to write to
2033		+ `save_to_file`: whether to save the csv
2034		+ `print_out`: whether to print out the table
2035		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2036		    if set to `'raw'`: return a list of list of strings
2037		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2038		'''
2039
2040		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2041		for sample in self.anchors:
2042			out += [[
2043				f"{sample}",
2044				f"{self.samples[sample]['N']}",
2045				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2046				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2047				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2048				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2049				]]
2050		for sample in self.unknowns:
2051			out += [[
2052				f"{sample}",
2053				f"{self.samples[sample]['N']}",
2054				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2055				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2056				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2057				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2058				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2059				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2060				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2061				]]
2062		if save_to_file:
2063			if not os.path.exists(dir):
2064				os.makedirs(dir)
2065			if filename is None:
2066				filename = f'D{self._4x}_samples.csv'
2067			with open(f'{dir}/{filename}', 'w') as fid:
2068				fid.write(make_csv(out))
2069		if print_out:
2070			self.msg('\n'+pretty_table(out))
2071		if output == 'raw':
2072			return out
2073		elif output == 'pretty':
2074			return pretty_table(out)

Print out, save to disk and/or return a table of samples.

Parameters

  • dir: the directory in which to save the csv
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the csv
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
def plot_sessions(self, dir='output', figsize=(8, 8), filetype='pdf', dpi=100):
2077	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2078		'''
2079		Generate session plots and save them to disk.
2080
2081		**Parameters**
2082
2083		+ `dir`: the directory in which to save the plots
2084		+ `figsize`: the width and height (in inches) of each plot
2085		+ `filetype`: 'pdf' or 'png'
2086		+ `dpi`: resolution for PNG output
2087		'''
2088		if not os.path.exists(dir):
2089			os.makedirs(dir)
2090
2091		for session in self.sessions:
2092			sp = self.plot_single_session(session, xylimits = 'constant')
2093			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2094			ppl.close(sp.fig)

Generate session plots and save them to disk.

Parameters

  • dir: the directory in which to save the plots
  • figsize: the width and height (in inches) of each plot
  • filetype: 'pdf' or 'png'
  • dpi: resolution for PNG output
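For instance, to save the session plots as PNG files at a higher resolution:

mydata.plot_sessions(dir = 'output', filetype = 'png', dpi = 200)
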
@make_verbal
def consolidate_samples(self):
2098	@make_verbal
2099	def consolidate_samples(self):
2100		'''
2101		Compile various statistics for each sample.
2102
2103		For each anchor sample:
2104
2105		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2106		+ `SE_D47` or `SE_D48`: set to zero by definition
2107
2108		For each unknown sample:
2109
2110		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2111		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2112
2113		For each anchor and unknown:
2114
2115		+ `N`: the total number of analyses of this sample
2116		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2117		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2118		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2119		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2120		variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2121		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
2122		'''
2123		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2124		for sample in self.samples:
2125			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2126			if self.samples[sample]['N'] > 1:
2127				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2128
2129			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2130			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2131
2132			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2133			if len(D4x_pop) > 2:
2134				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2135			
2136		if self.standardization_method == 'pooled':
2137			for sample in self.anchors:
2138				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2139				self.samples[sample][f'SE_D{self._4x}'] = 0.
2140			for sample in self.unknowns:
2141				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2142				try:
2143					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2144				except ValueError:
2145					# when `sample` is constrained by self.standardize(constraints = {...}),
2146					# it is no longer listed in self.standardization.var_names.
2147					# Temporary fix: define SE as zero for now
2148					self.samples[sample][f'SE_D{self._4x}'] = 0.
2149
2150		elif self.standardization_method == 'indep_sessions':
2151			for sample in self.anchors:
2152				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2153				self.samples[sample][f'SE_D{self._4x}'] = 0.
2154			for sample in self.unknowns:
2155				self.msg(f'Consolidating sample {sample}')
2156				self.unknowns[sample][f'session_D{self._4x}'] = {}
2157				session_avg = []
2158				for session in self.sessions:
2159					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2160					if sdata:
2161						self.msg(f'{sample} found in session {session}')
2162						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2163						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2164						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2165						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2166						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2167						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2168						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2169				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2170				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2171				wsum = sum([weights[s] for s in weights])
2172				for s in weights:
2173					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2174
2175		for r in self:
2176			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']

Compile various statistics for each sample.

For each anchor sample:

  • D47 or D48: the nominal Δ4x value for this anchor, specified by self.Nominal_D4x
  • SE_D47 or SE_D48: set to zero by definition

For each unknown sample:

  • D47 or D48: the standardized Δ4x value for this unknown
  • SE_D47 or SE_D48: the standard error of Δ4x for this unknown

For each anchor and unknown:

  • N: the total number of analyses of this sample
  • SD_D47 or SD_D48: the “sample” (in the statistical sense) standard deviation for this sample
  • d13C_VPDB: the average δ13C_VPDB value for this sample
  • d18O_VSMOW: the average δ18O_VSMOW value for this sample (as CO2)
  • p_Levene: the p-value from a Levene test of equal variance, indicating whether the Δ4x repeatability of this sample differs significantly from that observed for the reference sample specified by self.LEVENE_REF_SAMPLE.
def consolidate_sessions(self):
2180	def consolidate_sessions(self):
2181		'''
2182		Compute various statistics for each session.
2183
2184		+ `Na`: Number of anchor analyses in the session
2185		+ `Nu`: Number of unknown analyses in the session
2186		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2187		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2188		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2189		+ `a`: scrambling factor
2190		+ `b`: compositional slope
2191		+ `c`: WG offset
2192		+ `SE_a`: Model standard error of `a`
2193		+ `SE_b`: Model standard error of `b`
2194		+ `SE_c`: Model standard error of `c`
2195		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2196		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2197		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2198		+ `a2`: scrambling factor drift
2199		+ `b2`: compositional slope drift
2200		+ `c2`: WG offset drift
2201		+ `Np`: Number of standardization parameters to fit
2202		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2203		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2204		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
2205		'''
2206		for session in self.sessions:
2207			if 'd13Cwg_VPDB' not in self.sessions[session]:
2208				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2209			if 'd18Owg_VSMOW' not in self.sessions[session]:
2210				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2211			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2212			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2213
2214			self.msg(f'Computing repeatabilities for session {session}')
2215			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2216			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2217			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2218
2219		if self.standardization_method == 'pooled':
2220			for session in self.sessions:
2221
2222				# different (better?) computation of D4x repeatability for each session:
2223				sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
2224				self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5
2225
2226				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2227				i = self.standardization.var_names.index(f'a_{pf(session)}')
2228				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2229
2230				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2231				i = self.standardization.var_names.index(f'b_{pf(session)}')
2232				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2233
2234				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2235				i = self.standardization.var_names.index(f'c_{pf(session)}')
2236				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2237
2238				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2239				if self.sessions[session]['scrambling_drift']:
2240					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2241					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2242				else:
2243					self.sessions[session]['SE_a2'] = 0.
2244
2245				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2246				if self.sessions[session]['slope_drift']:
2247					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2248					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2249				else:
2250					self.sessions[session]['SE_b2'] = 0.
2251
2252				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2253				if self.sessions[session]['wg_drift']:
2254					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2255					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2256				else:
2257					self.sessions[session]['SE_c2'] = 0.
2258
2259				i = self.standardization.var_names.index(f'a_{pf(session)}')
2260				j = self.standardization.var_names.index(f'b_{pf(session)}')
2261				k = self.standardization.var_names.index(f'c_{pf(session)}')
2262				CM = np.zeros((6,6))
2263				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2264				try:
2265					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2266					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2267					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2268					try:
2269						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2270						CM[3,4] = self.standardization.covar[i2,j2]
2271						CM[4,3] = self.standardization.covar[j2,i2]
2272					except ValueError:
2273						pass
2274					try:
2275						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2276						CM[3,5] = self.standardization.covar[i2,k2]
2277						CM[5,3] = self.standardization.covar[k2,i2]
2278					except ValueError:
2279						pass
2280				except ValueError:
2281					pass
2282				try:
2283					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2284					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2285					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2286					try:
2287						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2288						CM[4,5] = self.standardization.covar[j2,k2]
2289						CM[5,4] = self.standardization.covar[k2,j2]
2290					except ValueError:
2291						pass
2292				except ValueError:
2293					pass
2294				try:
2295					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2296					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2297					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2298				except ValueError:
2299					pass
2300
2301				self.sessions[session]['CM'] = CM
2302
2303		elif self.standardization_method == 'indep_sessions':
2304			pass # Not implemented yet

Compute various statistics for each session.

  • Na: Number of anchor analyses in the session
  • Nu: Number of unknown analyses in the session
  • r_d13C_VPDB: δ13C_VPDB repeatability of analyses within the session
  • r_d18O_VSMOW: δ18O_VSMOW repeatability of analyses within the session
  • r_D47 or r_D48: Δ4x repeatability of analyses within the session
  • a: scrambling factor
  • b: compositional slope
  • c: WG offset
  • SE_a: Model standard error of a
  • SE_b: Model standard error of b
  • SE_c: Model standard error of c
  • scrambling_drift (boolean): whether to allow a temporal drift in the scrambling factor (a)
  • slope_drift (boolean): whether to allow a temporal drift in the compositional slope (b)
  • wg_drift (boolean): whether to allow a temporal drift in the WG offset (c)
  • a2: scrambling factor drift
  • b2: compositional slope drift
  • c2: WG offset drift
  • Np: Number of standardization parameters to fit
  • CM: model covariance matrix for (a, b, c, a2, b2, c2)
  • d13Cwg_VPDB: δ13C_VPDB of WG
  • d18Owg_VSMOW: δ18O_VSMOW of WG
@make_verbal
def repeatabilities(self):
2307	@make_verbal
2308	def repeatabilities(self):
2309		'''
2310		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2311		(for all samples, for anchors, and for unknowns).
2312		'''
2313		self.msg('Computing repeatabilities for all sessions')
2314
2315		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2316		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2317		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2318		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2319		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')

Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x (for all samples, for anchors, and for unknowns).

@make_verbal
def consolidate(self, tables=True, plots=True):
2322	@make_verbal
2323	def consolidate(self, tables = True, plots = True):
2324		'''
2325		Collect information about samples, sessions and repeatabilities.
2326		'''
2327		self.consolidate_samples()
2328		self.consolidate_sessions()
2329		self.repeatabilities()
2330
2331		if tables:
2332			self.summary()
2333			self.table_of_sessions()
2334			self.table_of_analyses()
2335			self.table_of_samples()
2336
2337		if plots:
2338			self.plot_sessions()

Collect information about samples, sessions and repeatabilities.

@make_verbal
def rmswd(self, samples='all samples', sessions='all sessions'):
2341	@make_verbal
2342	def rmswd(self,
2343		samples = 'all samples',
2344		sessions = 'all sessions',
2345		):
2346		'''
2347		Compute the χ2, the root mean squared weighted deviation
2348		(i.e., the square root of the reduced χ2), and the corresponding degrees of freedom
2349		of the Δ4x values for samples in `samples` and sessions in `sessions`.
2350		
2351		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
2352		'''
2353		if samples == 'all samples':
2354			mysamples = [k for k in self.samples]
2355		elif samples == 'anchors':
2356			mysamples = [k for k in self.anchors]
2357		elif samples == 'unknowns':
2358			mysamples = [k for k in self.unknowns]
2359		else:
2360			mysamples = samples
2361
2362		if sessions == 'all sessions':
2363			sessions = [k for k in self.sessions]
2364
2365		chisq, Nf = 0, 0
2366		for sample in mysamples :
2367			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2368			if len(G) > 1 :
2369				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2370				Nf += (len(G) - 1)
2371				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2372		r = (chisq / Nf)**.5 if Nf > 0 else 0
2373		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2374		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}

Compute the χ2, the root mean squared weighted deviation (i.e., the square root of the reduced χ2), and the corresponding degrees of freedom of the Δ4x values for samples in samples and sessions in sessions.

Only used in D4xdata.standardize() with method='indep_sessions'.

@make_verbal
def compute_r(self, key, samples='all samples', sessions='all sessions'):
2377	@make_verbal
2378	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2379		'''
2380		Compute the repeatability of `[r[key] for r in self]`
2381		'''
2382
2383		if samples == 'all samples':
2384			mysamples = [k for k in self.samples]
2385		elif samples == 'anchors':
2386			mysamples = [k for k in self.anchors]
2387		elif samples == 'unknowns':
2388			mysamples = [k for k in self.unknowns]
2389		else:
2390			mysamples = samples
2391
2392		if sessions == 'all sessions':
2393			sessions = [k for k in self.sessions]
2394
2395		if key in ['D47', 'D48']:
2396			# Full disclosure: the definition of Nf is tricky/debatable
2397			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2398			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2399			Nf = len(G)
2400# 			print(f'len(G) = {Nf}')
2401			Nf -= len([s for s in mysamples if s in self.unknowns])
2402# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2403			for session in sessions:
2404				Np = len([
2405					_ for _ in self.standardization.params
2406					if (
2407						self.standardization.params[_].expr is not None
2408						and (
2409							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2410							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2411							)
2412						)
2413					])
2414# 				print(f'session {session}: {Np} parameters to consider')
2415				Na = len({
2416					r['Sample'] for r in self.sessions[session]['data']
2417					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2418					})
2419# 				print(f'session {session}: {Na} different anchors in that session')
2420				Nf -= min(Np, Na)
2421# 			print(f'Nf = {Nf}')
2422
2423# 			for sample in mysamples :
2424# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2425# 				if len(X) > 1 :
2426# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2427# 					if sample in self.unknowns:
2428# 						Nf += len(X) - 1
2429# 					else:
2430# 						Nf += len(X)
2431# 			if samples in ['anchors', 'all samples']:
2432# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2433			r = (chisq / Nf)**.5 if Nf > 0 else 0
2434
2435		else: # if key not in ['D47', 'D48']
2436			chisq, Nf = 0, 0
2437			for sample in mysamples :
2438				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2439				if len(X) > 1 :
2440					Nf += len(X) - 1
2441					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2442			r = (chisq / Nf)**.5 if Nf > 0 else 0
2443
2444		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2445		return r

Compute the repeatability of [r[key] for r in self]
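For instance, a sketch computing, after standardization, the Δ47 repeatability of anchor analyses only:

r = mydata.compute_r('D47', samples = 'anchors')
print(f'Repeatability: {1000 * r:.1f} ppm')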

def sample_average(self, samples, weights='equal', normalize=True):
2447	def sample_average(self, samples, weights = 'equal', normalize = True):
2448		'''
2449		Weighted average Δ4x value of a group of samples, accounting for covariance.
2450
2451		Returns the weighted average Δ4x value and associated SE
2452		of a group of samples. Weights are equal by default. If `normalize` is
2453		true, `weights` will be rescaled so that their sum equals 1.
2454
2455		**Examples**
2456
2457		```python
2458		self.sample_average(['X','Y'], [1, 2])
2459		```
2460
2461		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2462		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2463		values of samples X and Y, respectively.
2464
2465		```python
2466		self.sample_average(['X','Y'], [1, -1], normalize = False)
2467		```
2468
2469		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2470		'''
2471		if weights == 'equal':
2472			weights = [1/len(samples)] * len(samples)
2473
2474		if normalize:
2475			s = sum(weights)
2476			if s:
2477				weights = [w/s for w in weights]
2478
2479		try:
2480# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2481# 			C = self.standardization.covar[indices,:][:,indices]
2482			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2483			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2484			return correlated_sum(X, C, weights)
2485		except ValueError:
2486			return (0., 0.)

Weighted average Δ4x value of a group of samples, accounting for covariance.

Returns the weighted average Δ4x value and associated SE of a group of samples. Weights are equal by default. If normalize is true, weights will be rescaled so that their sum equals 1.

Examples

self.sample_average(['X','Y'], [1, 2])

returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3, where Δ4x(X) and Δ4x(Y) are the average Δ4x values of samples X and Y, respectively.

self.sample_average(['X','Y'], [1, -1], normalize = False)

returns the value and SE of the difference Δ4x(X) - Δ4x(Y).

def sample_D4x_covar(self, sample1, sample2=None):
2489	def sample_D4x_covar(self, sample1, sample2 = None):
2490		'''
2491		Covariance between Δ4x values of samples
2492
2493		Returns the error covariance between the average Δ4x values of two
2494		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2495		returns the Δ4x variance for that sample.
2496		'''
2497		if sample2 is None:
2498			sample2 = sample1
2499		if self.standardization_method == 'pooled':
2500			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2501			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2502			return self.standardization.covar[i, j]
2503		elif self.standardization_method == 'indep_sessions':
2504			if sample1 == sample2:
2505				return self.samples[sample1][f'SE_D{self._4x}']**2
2506			else:
2507				c = 0
2508				for session in self.sessions:
2509					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2510					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2511					if sdata1 and sdata2:
2512						a = self.sessions[session]['a']
2513						# !! TODO: CM below does not account for temporal changes in standardization parameters
2514						CM = self.sessions[session]['CM'][:3,:3]
2515						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2516						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2517						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2518						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2519						c += (
2520							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2521							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2522							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2523							@ CM
2524							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2525							) / a**2
2526				return float(c)

Covariance between Δ4x values of samples

Returns the error covariance between the average Δ4x values of two samples. If only sample1 is specified, or if sample1 == sample2, returns the Δ4x variance for that sample.
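For instance, a sketch propagating these covariances into the standard error of the difference between two unknowns (equivalent to sample_average(['MYSAMPLE-1', 'MYSAMPLE-2'], [1, -1], normalize = False)):

var_diff = (
    mydata.sample_D4x_covar('MYSAMPLE-1')
    + mydata.sample_D4x_covar('MYSAMPLE-2')
    - 2 * mydata.sample_D4x_covar('MYSAMPLE-1', 'MYSAMPLE-2')
    )
print(f'SE of difference: {var_diff**.5:.4f} ‰')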

def sample_D4x_correl(self, sample1, sample2=None):
2528	def sample_D4x_correl(self, sample1, sample2 = None):
2529		'''
2530		Correlation between Δ4x errors of samples
2531
2532		Returns the error correlation between the average Δ4x values of two samples.
2533		'''
2534		if sample2 is None or sample2 == sample1:
2535			return 1.
2536		return (
2537			self.sample_D4x_covar(sample1, sample2)
2538			/ self.unknowns[sample1][f'SE_D{self._4x}']
2539			/ self.unknowns[sample2][f'SE_D{self._4x}']
2540			)

Correlation between Δ4x errors of samples

Returns the error correlation between the average Δ4x values of two samples.

def plot_single_session(self, session, kw_plot_anchors={'ls': 'None', 'marker': 'x', 'mec': (0.75, 0, 0), 'mew': 0.75, 'ms': 4}, kw_plot_unknowns={'ls': 'None', 'marker': 'x', 'mec': (0, 0, 0.75), 'mew': 0.75, 'ms': 4}, kw_plot_anchor_avg={'ls': '-', 'marker': 'None', 'color': (0.75, 0, 0), 'lw': 0.75}, kw_plot_unknown_avg={'ls': '-', 'marker': 'None', 'color': (0, 0, 0.75), 'lw': 0.75}, kw_contour_error={'colors': [[0, 0, 0]], 'alpha': 0.5, 'linewidths': 0.75}, xylimits='free', x_label=None, y_label=None, error_contour_interval='auto', fig='new'):
2542	def plot_single_session(self,
2543		session,
2544		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2545		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2546		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2547		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2548		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2549		xylimits = 'free', # | 'constant'
2550		x_label = None,
2551		y_label = None,
2552		error_contour_interval = 'auto',
2553		fig = 'new',
2554		):
2555		'''
2556		Generate plot for a single session
2557		'''
2558		if x_label is None:
2559			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2560		if y_label is None:
2561			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2562
2563		out = _SessionPlot()
2564		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2565		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2566		anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2567		anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2568		unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2569		unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2570		anchor_avg = (np.array([ np.array([
2571				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2572				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2573				]) for sample in anchors]).T,
2574			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
2575		unknown_avg = (np.array([ np.array([
2576				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2577				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2578				]) for sample in unknowns]).T,
2579			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)
2580		
2581		
2582		if fig == 'new':
2583			out.fig = ppl.figure(figsize = (6,6))
2584			ppl.subplots_adjust(.1,.1,.9,.9)
2585
2586		out.anchor_analyses, = ppl.plot(
2587			anchors_d,
2588			anchors_D,
2589			**kw_plot_anchors)
2590		out.unknown_analyses, = ppl.plot(
2591			unknowns_d,
2592			unknowns_D,
2593			**kw_plot_unknowns)
2594		out.anchor_avg = ppl.plot(
2595			*anchor_avg,
2596			**kw_plot_anchor_avg)
2597		out.unknown_avg = ppl.plot(
2598			*unknown_avg,
2599			**kw_plot_unknown_avg)
2600		if xylimits == 'constant':
2601			x = [r[f'd{self._4x}'] for r in self]
2602			y = [r[f'D{self._4x}'] for r in self]
2603			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2604			w, h = x2-x1, y2-y1
2605			x1 -= w/20
2606			x2 += w/20
2607			y1 -= h/20
2608			y2 += h/20
2609			ppl.axis([x1, x2, y1, y2])
2610		elif xylimits == 'free':
2611			x1, x2, y1, y2 = ppl.axis()
2612		else:
2613			x1, x2, y1, y2 = ppl.axis(xylimits)
2614				
2615		if error_contour_interval != 'none':
2616			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2617			XI,YI = np.meshgrid(xi, yi)
2618			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2619			if error_contour_interval == 'auto':
2620				rng = np.max(SI) - np.min(SI)
2621				if rng <= 0.01:
2622					cinterval = 0.001
2623				elif rng <= 0.03:
2624					cinterval = 0.004
2625				elif rng <= 0.1:
2626					cinterval = 0.01
2627				elif rng <= 0.3:
2628					cinterval = 0.03
2629				elif rng <= 1.:
2630					cinterval = 0.1
2631				else:
2632					cinterval = 0.5
2633			else:
2634				cinterval = error_contour_interval
2635
2636			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2637			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2638			out.clabel = ppl.clabel(out.contour)
2639			contour = (XI, YI, SI, cval, cinterval)
2640
2641		if fig == None:
2642			return {
2643			'anchors':anchors,
2644			'unknowns':unknowns,
2645			'anchors_d':anchors_d,
2646			'anchors_D':anchors_D,
2647			'unknowns_d':unknowns_d,
2648			'unknowns_D':unknowns_D,
2649			'anchor_avg':anchor_avg,
2650			'unknown_avg':unknown_avg,
2651			'contour':contour,
2652			}
2653
2654		ppl.xlabel(x_label)
2655		ppl.ylabel(y_label)
2656		ppl.title(session, weight = 'bold')
2657		ppl.grid(alpha = .2)
2658		out.ax = ppl.gca()		
2659
2660		return out

Generate plot for a single session
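
For example, assuming the data set contains a session named 'Session01' (session names depend on your raw data):

out = mydata.plot_single_session('Session01')
out.fig.savefig('Session01.pdf')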

def plot_residuals( self, kde=False, hist=False, binwidth=2/3, dir='output', filename=None, highlight=[], colors=None, figsize=None, dpi=100, yspan=None):
2662	def plot_residuals(
2663		self,
2664		kde = False,
2665		hist = False,
2666		binwidth = 2/3,
2667		dir = 'output',
2668		filename = None,
2669		highlight = [],
2670		colors = None,
2671		figsize = None,
2672		dpi = 100,
2673		yspan = None,
2674		):
2675		'''
2676		Plot residuals of each analysis as a function of time (actually, as a function of
2677		the order of analyses in the `D4xdata` object)
2678
2679		+ `kde`: whether to add a kernel density estimate of residuals
2680		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2681		+ `binwidth`: the width of histogram bins
2682		+ `dir`: the directory in which to save the plot
2683		+ `highlight`: a list of samples to highlight
2684		+ `colors`: a dict of `{<sample>: <color>}` for all samples
2685		+ `figsize`: (width, height) of figure
2686		+ `dpi`: resolution for PNG output
2687		+ `yspan`: factor controlling the range of y values shown in plot
2688		  (by default: `yspan = 1.5 if kde else 1.0`)
2689		'''
2690		
2691		from matplotlib import ticker
2692
2693		if yspan is None:
2694			if kde:
2695				yspan = 1.5
2696			else:
2697				yspan = 1.0
2698		
2699		# Layout
2700		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2701		if hist or kde:
2702			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2703			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2704		else:
2705			ppl.subplots_adjust(.08,.05,.78,.8)
2706			ax1 = ppl.subplot(111)
2707		
2708		# Colors
2709		N = len(self.anchors)
2710		if colors is None:
2711			if len(highlight) > 0:
2712				Nh = len(highlight)
2713				if Nh == 1:
2714					colors = {highlight[0]: (0,0,0)}
2715				elif Nh == 3:
2716					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2717				elif Nh == 4:
2718					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2719				else:
2720					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2721			else:
2722				if N == 3:
2723					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2724				elif N == 4:
2725					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2726				else:
2727					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2728
2729		ppl.sca(ax1)
2730		
2731		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2732
2733		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2734
2735		session = self[0]['Session']
2736		x1 = 0
2737# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2738		x_sessions = {}
2739		one_or_more_singlets = False
2740		one_or_more_multiplets = False
2741		multiplets = set()
2742		for k,r in enumerate(self):
2743			if r['Session'] != session:
2744				x2 = k-1
2745				x_sessions[session] = (x1+x2)/2
2746				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2747				session = r['Session']
2748				x1 = k
2749			singlet = len(self.samples[r['Sample']]['data']) == 1
2750			if not singlet:
2751				multiplets.add(r['Sample'])
2752			if r['Sample'] in self.unknowns:
2753				if singlet:
2754					one_or_more_singlets = True
2755				else:
2756					one_or_more_multiplets = True
2757			kw = dict(
2758				marker = 'x' if singlet else '+',
2759				ms = 4 if singlet else 5,
2760				ls = 'None',
2761				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2762				mew = 1,
2763				alpha = 0.2 if singlet else 1,
2764				)
2765			if highlight and r['Sample'] not in highlight:
2766				kw['alpha'] = 0.2
2767			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2768		x2 = k
2769		x_sessions[session] = (x1+x2)/2
2770
2771		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2772		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2773		if not (hist or kde):
2774			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2775			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2776
2777		xmin, xmax, ymin, ymax = ppl.axis()
2778		if yspan != 1:
2779			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2780		for s in x_sessions:
2781			ppl.text(
2782				x_sessions[s],
2783				ymax +1,
2784				s,
2785				va = 'bottom',
2786				**(
2787					dict(ha = 'center')
2788					if len(self.sessions[s]['data']) > (0.15 * len(self))
2789					else dict(ha = 'left', rotation = 45)
2790					)
2791				)
2792
2793		if hist or kde:
2794			ppl.sca(ax2)
2795
2796		for s in colors:
2797			kw['marker'] = '+'
2798			kw['ms'] = 5
2799			kw['mec'] = colors[s]
2800			kw['label'] = s
2801			kw['alpha'] = 1
2802			ppl.plot([], [], **kw)
2803
2804		kw['mec'] = (0,0,0)
2805
2806		if one_or_more_singlets:
2807			kw['marker'] = 'x'
2808			kw['ms'] = 4
2809			kw['alpha'] = .2
2810			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2811			ppl.plot([], [], **kw)
2812
2813		if one_or_more_multiplets:
2814			kw['marker'] = '+'
2815			kw['ms'] = 4
2816			kw['alpha'] = 1
2817			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2818			ppl.plot([], [], **kw)
2819
2820		if hist or kde:
2821			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2822		else:
2823			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2824		leg.set_zorder(-1000)
2825
2826		ppl.sca(ax1)
2827
2828		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2829		ppl.xticks([])
2830		ppl.axis([-1, len(self), None, None])
2831
2832		if hist or kde:
2833			ppl.sca(ax2)
2834			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2835
2836			if kde:
2837				from scipy.stats import gaussian_kde
2838				yi = np.linspace(ymin, ymax, 201)
2839				xi = gaussian_kde(X).evaluate(yi)
2840				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2841# 				ppl.plot(xi, yi, 'k-', lw = 1)
2842			elif hist:
2843				ppl.hist(
2844					X,
2845					orientation = 'horizontal',
2846					histtype = 'stepfilled',
2847					ec = [.4]*3,
2848					fc = [.25]*3,
2849					alpha = .25,
2850					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2851					)
2852			ppl.text(0, 0,
2853				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2854				size = 7.5,
2855				alpha = 1,
2856				va = 'center',
2857				ha = 'left',
2858				)
2859
2860			ppl.axis([0, None, ymin, ymax])
2861			ppl.xticks([])
2862			ppl.yticks([])
2863# 			ax2.spines['left'].set_visible(False)
2864			ax2.spines['right'].set_visible(False)
2865			ax2.spines['top'].set_visible(False)
2866			ax2.spines['bottom'].set_visible(False)
2867
2868		ax1.axis([None, None, ymin, ymax])
2869
2870		if not os.path.exists(dir):
2871			os.makedirs(dir)
2872		if filename is None:
2873			return fig
2874		elif filename == '':
2875			filename = f'D{self._4x}_residuals.pdf'
2876		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2877		ppl.close(fig)

Plot residuals of each analysis as a function of time (actually, as a function of the order of analyses in the D4xdata object)

  • kde: whether to add a kernel density estimate of residuals
  • hist: whether to add a histogram of residuals (incompatible with kde)
  • binwidth: the width of histogram bins
  • dir: the directory in which to save the plot
  • highlight: a list of samples to highlight
  • colors: a dict of {<sample>: <color>} for all samples
  • figsize: (width, height) of figure
  • dpi: resolution for PNG output
  • yspan: factor controlling the range of y values shown in plot (by default: yspan = 1.5 if kde else 1.0)
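
For example (a minimal sketch): passing filename = '' saves the plot under its default name, whereas leaving filename = None returns the figure object instead of saving it:

mydata.plot_residuals(kde = True, filename = '')  # saves to output/D47_residuals.pdf
fig = mydata.plot_residuals(hist = True)          # returns the matplotlib figure
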
def simulate(self, *args, **kwargs):
2880	def simulate(self, *args, **kwargs):
2881		'''
2882		Legacy function with warning message pointing to `virtual_data()`
2883		'''
2884		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')

Legacy function with warning message pointing to virtual_data()

def plot_distribution_of_analyses( self, dir='output', filename=None, vs_time=False, figsize=(6, 4), subplots_adjust=(0.02, 0.13, 0.85, 0.8), output=None, dpi=100):
2886	def plot_distribution_of_analyses(
2887		self,
2888		dir = 'output',
2889		filename = None,
2890		vs_time = False,
2891		figsize = (6,4),
2892		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
2893		output = None,
2894		dpi = 100,
2895		):
2896		'''
2897		Plot temporal distribution of all analyses in the data set.
2898		
2899		**Parameters**
2900
2901		+ `dir`: the directory in which to save the plot
2902		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
2903		+ `figsize`: (width, height) of figure
2904		+ `dpi`: resolution for PNG output
2906		'''
2907
2908		asamples = [s for s in self.anchors]
2909		usamples = [s for s in self.unknowns]
2910		if output is None or output == 'fig':
2911			fig = ppl.figure(figsize = figsize)
2912			ppl.subplots_adjust(*subplots_adjust)
2913		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2914		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2915		Xmax += (Xmax-Xmin)/40
2916		Xmin -= (Xmax-Xmin)/41
2917		for k, s in enumerate(asamples + usamples):
2918			if vs_time:
2919				X = [r['TimeTag'] for r in self if r['Sample'] == s]
2920			else:
2921				X = [x for x,r in enumerate(self) if r['Sample'] == s]
2922			Y = [-k for x in X]
2923			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
2924			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
2925			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
2926		ppl.axis([Xmin, Xmax, -k-1, 1])
2927		ppl.xlabel('\ntime')
2928		ppl.gca().annotate('',
2929			xy = (0.6, -0.02),
2930			xycoords = 'axes fraction',
2931			xytext = (.4, -0.02), 
2932            arrowprops = dict(arrowstyle = "->", color = 'k'),
2933            )
2934			
2935
2936		x2 = -1
2937		for session in self.sessions:
2938			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2939			if vs_time:
2940				ppl.axvline(x1, color = 'k', lw = .75)
2941			if x2 > -1:
2942				if not vs_time:
2943					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
2944			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2945# 			from xlrd import xldate_as_datetime
2946# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
2947			if vs_time:
2948				ppl.axvline(x2, color = 'k', lw = .75)
2949				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
2950			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
2951
2952		ppl.xticks([])
2953		ppl.yticks([])
2954
2955		if output is None:
2956			if not os.path.exists(dir):
2957				os.makedirs(dir)
2958			if filename == None:
2959				filename = f'D{self._4x}_distribution_of_analyses.pdf'
2960			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2961			ppl.close(fig)
2962		elif output == 'ax':
2963			return ppl.gca()
2964		elif output == 'fig':
2965			return fig

Plot temporal distribution of all analyses in the data set.

Parameters

  • dir: the directory in which to save the plot
  • vs_time: if True, plot as a function of TimeTag rather than sequentially.
  • figsize: (width, height) of figure
  • dpi: resolution for PNG output
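
For example, to save the default plot to output/D47_distribution_of_analyses.pdf, or to retrieve the figure instead of saving it:

mydata.plot_distribution_of_analyses()
fig = mydata.plot_distribution_of_analyses(vs_time = True, output = 'fig')  # requires a 'TimeTag' field in each analysis
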
def plot_bulk_compositions( self, samples=None, dir='output/bulk_compositions', figsize=(6, 6), subplots_adjust=(0.15, 0.12, 0.95, 0.92), show=False, sample_color=(0, 0.5, 1), analysis_color=(0.7, 0.7, 0.7), labeldist=0.3, radius=0.05):
2968	def plot_bulk_compositions(
2969		self,
2970		samples = None,
2971		dir = 'output/bulk_compositions',
2972		figsize = (6,6),
2973		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
2974		show = False,
2975		sample_color = (0,.5,1),
2976		analysis_color = (.7,.7,.7),
2977		labeldist = 0.3,
2978		radius = 0.05,
2979		):
2980		'''
2981		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
2982		
2983		By default, creates a directory `./output/bulk_compositions` where plots for
2984		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
2985		
2986		
2987		**Parameters**
2988
2989		+ `samples`: Only these samples are processed (by default: all samples).
2990		+ `dir`: where to save the plots
2991		+ `figsize`: (width, height) of figure
2992		+ `subplots_adjust`: passed to `subplots_adjust()`
2993		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
2994		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
2995		+ `sample_color`: color used for sample markers/labels
2996		+ `analysis_color`: color used for analysis (replicate) markers/labels
2997		+ `labeldist`: distance (in inches) from replicate markers to replicate labels
2998		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
2999		'''
3000
3001		from matplotlib.patches import Ellipse
3002
3003		if samples is None:
3004			samples = [_ for _ in self.samples]
3005
3006		saved = {}
3007
3008		for s in samples:
3009
3010			fig = ppl.figure(figsize = figsize)
3011			fig.subplots_adjust(*subplots_adjust)
3012			ax = ppl.subplot(111)
3013			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3014			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3015			ppl.title(s)
3016
3017
3018			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
3019			UID = [_['UID'] for _ in self.samples[s]['data']]
3020			XY0 = XY.mean(0)
3021
3022			for xy in XY:
3023				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
3024				
3025			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
3026			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
3027			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3028			saved[s] = [XY, XY0]
3029			
3030			x1, x2, y1, y2 = ppl.axis()
3031			x0, dx = (x1+x2)/2, (x2-x1)/2
3032			y0, dy = (y1+y2)/2, (y2-y1)/2
3033			dx, dy = [max(max(dx, dy), radius)]*2
3034
3035			ppl.axis([
3036				x0 - 1.2*dx,
3037				x0 + 1.2*dx,
3038				y0 - 1.2*dy,
3039				y0 + 1.2*dy,
3040				])			
3041
3042			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3043
3044			for xy, uid in zip(XY, UID):
3045
3046				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3047				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3048
3049				if (vector_in_display_space**2).sum() > 0:
3050
3051					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3052					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3053					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3054					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3055
3056					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3057
3058				else:
3059
3060					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3061
3062			if radius:
3063				ax.add_artist(Ellipse(
3064					xy = XY0,
3065					width = radius*2,
3066					height = radius*2,
3067					ls = (0, (2,2)),
3068					lw = .7,
3069					ec = analysis_color,
3070					fc = 'None',
3071					))
3072				ppl.text(
3073					XY0[0],
3074					XY0[1]-radius,
3075					f'\n± {radius*1e3:.0f} ppm',
3076					color = analysis_color,
3077					va = 'top',
3078					ha = 'center',
3079					linespacing = 0.4,
3080					size = 8,
3081					)
3082
3083			if not os.path.exists(dir):
3084				os.makedirs(dir)
3085			fig.savefig(f'{dir}/{s}.pdf')
3086			ppl.close(fig)
3087
3088		fig = ppl.figure(figsize = figsize)
3089		fig.subplots_adjust(*subplots_adjust)
3090		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3091		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3092
3093		for s in saved:
3094			for xy in saved[s][0]:
3095				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3096			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3097			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3098			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3099
3100		x1, x2, y1, y2 = ppl.axis()
3101		ppl.axis([
3102			x1 - (x2-x1)/10,
3103			x2 + (x2-x1)/10,
3104			y1 - (y2-y1)/10,
3105			y2 + (y2-y1)/10,
3106			])			
3107
3108
3109		if not os.path.exists(dir):
3110			os.makedirs(dir)
3111		fig.savefig(f'{dir}/__all__.pdf')
3112		if show:
3113			ppl.show()
3114		ppl.close(fig)

Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

By default, creates a directory ./output/bulk_compositions where plots for each sample are saved. Another plot named __all__.pdf shows all analyses together.

Parameters

  • samples: Only these samples are processed (by default: all samples).
  • dir: where to save the plots
  • figsize: (width, height) of figure
  • subplots_adjust: passed to subplots_adjust()
  • show: whether to call matplotlib.pyplot.show() on the plot with all samples, allowing for interactive visualization/exploration in (δ13C, δ18O) space.
  • sample_color: color used for sample markers/labels
  • analysis_color: color used for analysis (replicate) markers/labels
  • labeldist: distance (in inches) from replicate markers to replicate labels
  • radius: radius of the dashed circle providing scale. No circle if radius = 0.
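
For example, to generate one plot per sample plus the combined __all__.pdf plot in the default directory, or to explore selected samples interactively:

mydata.plot_bulk_compositions()
mydata.plot_bulk_compositions(samples = ['MYSAMPLE-1', 'MYSAMPLE-2'], show = True)
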
Inherited Members
builtins.list
clear
copy
append
insert
extend
pop
remove
index
count
reverse
sort
class D47data(D4xdata):
3156class D47data(D4xdata):
3157	'''
3158	Store and process data for a large set of Δ47 analyses,
3159	usually comprising more than one analytical session.
3160	'''
3161
3162	Nominal_D4x = {
3163		'ETH-1':   0.2052,
3164		'ETH-2':   0.2085,
3165		'ETH-3':   0.6132,
3166		'ETH-4':   0.4511,
3167		'IAEA-C1': 0.3018,
3168		'IAEA-C2': 0.6409,
3169		'MERCK':   0.5135,
3170		} # I-CDES (Bernasconi et al., 2021)
3171	'''
3172	Nominal Δ47 values assigned to the Δ47 anchor samples, used by
3173	`D47data.standardize()` to normalize unknown samples to an absolute Δ47
3174	reference frame.
3175
3176	By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
3177	```py
3178	{
3179		'ETH-1'   : 0.2052,
3180		'ETH-2'   : 0.2085,
3181		'ETH-3'   : 0.6132,
3182		'ETH-4'   : 0.4511,
3183		'IAEA-C1' : 0.3018,
3184		'IAEA-C2' : 0.6409,
3185		'MERCK'   : 0.5135,
3186	}
3187	```
3188	'''
3189
3190
3191	@property
3192	def Nominal_D47(self):
3193		return self.Nominal_D4x
3194	
3195
3196	@Nominal_D47.setter
3197	def Nominal_D47(self, new):
3198		self.Nominal_D4x = dict(**new)
3199		self.refresh()
3200
3201
3202	def __init__(self, l = [], **kwargs):
3203		'''
3204		**Parameters:** same as `D4xdata.__init__()`
3205		'''
3206		D4xdata.__init__(self, l = l, mass = '47', **kwargs)
3207
3208
3209	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3210		'''
3211		Find all samples for which `Teq` is specified, compute equilibrium Δ47
3212		value for that temperature, and treat these samples as additional anchors.
3213
3214		**Parameters**
3215
3216		+ `fCo2eqD47`: Which CO2 equilibrium law to use
3217		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3218		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3219		+ `priority`: if `replace`: forget old anchors and only use the new ones;
3220		if `new`: keep pre-existing anchors but update them in case of conflict
3221		between old and new Δ47 values;
3222		if `old`: keep pre-existing anchors but preserve their original Δ47
3223		values in case of conflict.
3224		'''
3225		f = {
3226			'petersen': fCO2eqD47_Petersen,
3227			'wang': fCO2eqD47_Wang,
3228			}[fCo2eqD47]
3229		foo = {}
3230		for r in self:
3231			if 'Teq' in r:
3232				if r['Sample'] in foo:
3233					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3234				else:
3235					foo[r['Sample']] = f(r['Teq'])
3236			else:
3237					assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3238
3239		if priority == 'replace':
3240			self.Nominal_D47 = {}
3241		for s in foo:
3242			if priority != 'old' or s not in self.Nominal_D47:
3243				self.Nominal_D47[s] = foo[s]
3244	
3245	def save_D47_correl(self, *args, **kwargs):
3246		return self._save_D4x_correl(*args, **kwargs)
3247
3248	save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47')

Store and process data for a large set of Δ47 analyses, usually comprising more than one analytical session.

D47data(l=[], **kwargs)
3202	def __init__(self, l = [], **kwargs):
3203		'''
3204		**Parameters:** same as `D4xdata.__init__()`
3205		'''
3206		D4xdata.__init__(self, l = l, mass = '47', **kwargs)

Parameters: same as D4xdata.__init__()

Nominal_D4x = {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6132, 'ETH-4': 0.4511, 'IAEA-C1': 0.3018, 'IAEA-C2': 0.6409, 'MERCK': 0.5135}

Nominal Δ47 values assigned to the Δ47 anchor samples, used by D47data.standardize() to normalize unknown samples to an absolute Δ47 reference frame.

By default equal to (after Bernasconi et al. (2021)):

{
        'ETH-1'   : 0.2052,
        'ETH-2'   : 0.2085,
        'ETH-3'   : 0.6132,
        'ETH-4'   : 0.4511,
        'IAEA-C1' : 0.3018,
        'IAEA-C2' : 0.6409,
        'MERCK'   : 0.5135,
}
Nominal_D47
3191	@property
3192	def Nominal_D47(self):
3193		return self.Nominal_D4x
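
Assigning to Nominal_D47 replaces the anchor values and refreshes the data set. For example, to standardize against the three ETH anchors only (a sketch reusing the default I-CDES values):

mydata = D47data()
mydata.Nominal_D47 = {
	'ETH-1': 0.2052,
	'ETH-2': 0.2085,
	'ETH-3': 0.6132,
	}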
def D47fromTeq(self, fCo2eqD47='petersen', priority='new'):
3209	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3210		'''
3211		Find all samples for which `Teq` is specified, compute equilibrium Δ47
3212		value for that temperature, and treat these samples as additional anchors.
3213
3214		**Parameters**
3215
3216		+ `fCo2eqD47`: Which CO2 equilibrium law to use
3217		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3218		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3219		+ `priority`: if `replace`: forget old anchors and only use the new ones;
3220		if `new`: keep pre-existing anchors but update them in case of conflict
3221		between old and new Δ47 values;
3222		if `old`: keep pre-existing anchors but preserve their original Δ47
3223		values in case of conflict.
3224		'''
3225		f = {
3226			'petersen': fCO2eqD47_Petersen,
3227			'wang': fCO2eqD47_Wang,
3228			}[fCo2eqD47]
3229		foo = {}
3230		for r in self:
3231			if 'Teq' in r:
3232				if r['Sample'] in foo:
3233					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3234				else:
3235					foo[r['Sample']] = f(r['Teq'])
3236			else:
3237					assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3238
3239		if priority == 'replace':
3240			self.Nominal_D47 = {}
3241		for s in foo:
3242			if priority != 'old' or s not in self.Nominal_D47:
3243				self.Nominal_D47[s] = foo[s]

Find all samples for which Teq is specified, compute the equilibrium Δ47 value for that temperature, and treat these samples as additional anchors.

Parameters

  • fCo2eqD47: Which CO2 equilibrium law to use (petersen: Petersen et al. (2019); wang: Wang et al. (2004)).
  • priority: if replace: forget old anchors and only use the new ones; if new: keep pre-existing anchors but update them in case of conflict between old and new Δ47 values; if old: keep pre-existing anchors but preserve their original Δ47 values in case of conflict.
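
For example, assuming some analyses in the raw data carry a Teq field giving their CO2 equilibration temperature (a sketch with a hypothetical input file):

mydata = D47data()
mydata.read('rawdata_with_Teq.csv')  # hypothetical file including a 'Teq' column
mydata.wg()
mydata.crunch()
mydata.D47fromTeq(fCo2eqD47 = 'petersen', priority = 'new')
mydata.standardize()
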
def save_D47_correl(self, *args, **kwargs):
3245	def save_D47_correl(self, *args, **kwargs):
3246		return self._save_D4x_correl(*args, **kwargs)

Save D47 values along with their SE and correlation matrix.

Parameters

  • samples: Only these samples are output (by default: all samples).
  • dir: the directory in which to save the file (by default: output)
  • filename: the name of the csv file to write to (by default: D47_correl.csv)
  • D47_precision: the precision to use when writing D47 and D47_SE values (by default: 4)
  • correl_precision: the precision to use when writing correlation factor values (by default: 4)
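
For example, to write the Δ47 values, standard errors, and correlation matrix to output/D47_correl.csv, or to a custom destination:

mydata.save_D47_correl()
mydata.save_D47_correl(dir = 'results', filename = 'my_D47_correl.csv')
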
class D48data(D4xdata):
3251class D48data(D4xdata):
3252	'''
3253	Store and process data for a large set of Δ48 analyses,
3254	usually comprising more than one analytical session.
3255	'''
3256
3257	Nominal_D4x = {
3258		'ETH-1':  0.138,
3259		'ETH-2':  0.138,
3260		'ETH-3':  0.270,
3261		'ETH-4':  0.223,
3262		'GU-1':  -0.419,
3263		} # (Fiebig et al., 2019, 2021)
3264	'''
3265	Nominal Δ48 values assigned to the Δ48 anchor samples, used by
3266	`D48data.standardize()` to normalize unknown samples to an absolute Δ48
3267	reference frame.
3268
3269	By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
3270	[Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)):
3271
3272	```py
3273	{
3274		'ETH-1' :  0.138,
3275		'ETH-2' :  0.138,
3276		'ETH-3' :  0.270,
3277		'ETH-4' :  0.223,
3278		'GU-1'  : -0.419,
3279	}
3280	```
3281	'''
3282
3283
3284	@property
3285	def Nominal_D48(self):
3286		return self.Nominal_D4x
3287
3288	
3289	@Nominal_D48.setter
3290	def Nominal_D48(self, new):
3291		self.Nominal_D4x = dict(**new)
3292		self.refresh()
3293
3294
3295	def __init__(self, l = [], **kwargs):
3296		'''
3297		**Parameters:** same as `D4xdata.__init__()`
3298		'''
3299		D4xdata.__init__(self, l = l, mass = '48', **kwargs)
3300
3301	def save_D48_correl(self, *args, **kwargs):
3302		return self._save_D4x_correl(*args, **kwargs)
3303
3304	save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48')

Store and process data for a large set of Δ48 analyses, usually comprising more than one analytical session.

D48data(l=[], **kwargs)
3295	def __init__(self, l = [], **kwargs):
3296		'''
3297		**Parameters:** same as `D4xdata.__init__()`
3298		'''
3299		D4xdata.__init__(self, l = l, mass = '48', **kwargs)

Parameters: same as D4xdata.__init__()

Nominal_D4x = {'ETH-1': 0.138, 'ETH-2': 0.138, 'ETH-3': 0.27, 'ETH-4': 0.223, 'GU-1': -0.419}

Nominal Δ48 values assigned to the Δ48 anchor samples, used by D48data.standardize() to normalize unknown samples to an absolute Δ48 reference frame.

By default equal to (after Fiebig et al. (2019), Fiebig et al. (2021)):

{
        'ETH-1' :  0.138,
        'ETH-2' :  0.138,
        'ETH-3' :  0.270,
        'ETH-4' :  0.223,
        'GU-1'  : -0.419,
}
Nominal_D48
3284	@property
3285	def Nominal_D48(self):
3286		return self.Nominal_D4x
def save_D48_correl(self, *args, **kwargs):
3301	def save_D48_correl(self, *args, **kwargs):
3302		return self._save_D4x_correl(*args, **kwargs)

Save D48 values along with their SE and correlation matrix.

Parameters

  • samples: Only these samples are output (by default: all samples).
  • dir: the directory in which to save the file (by default: output)
  • filename: the name of the csv file to write to (by default: D48_correl.csv)
  • D48_precision: the precision to use when writing D48 and D48_SE values (by default: 4)
  • correl_precision: the precision to use when writing correlation factor values (by default: 4)
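
Δ48 processing mirrors the Δ47 workflow; since the ETH carbonates also serve as Δ48 anchors by default, the tutorial's rawdata.csv can be standardized for Δ48 as well (a minimal sketch):

mydata48 = D48data()
mydata48.read('rawdata.csv')
mydata48.wg()
mydata48.crunch()
mydata48.standardize()
mydata48.save_D48_correl()
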
class D49data(D4xdata):
3307class D49data(D4xdata):
3308	'''
3309	Store and process data for a large set of Δ49 analyses,
3310	usually comprising more than one analytical session.
3311	'''
3312	
3313	Nominal_D4x = {"1000C": 0.0, "25C": 2.228}  # Wang 2004
3314	'''
3315	Nominal Δ49 values assigned to the Δ49 anchor samples, used by
3316	`D49data.standardize()` to normalize unknown samples to an absolute Δ49
3317	reference frame.
3318
3319	By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)):
3320
3321	```py
3322	{
3323		"1000C": 0.0,
3324		"25C": 2.228
3325	}
3326	```
3327	'''
3328	
3329	@property
3330	def Nominal_D49(self):
3331		return self.Nominal_D4x
3332	
3333	@Nominal_D49.setter
3334	def Nominal_D49(self, new):
3335		self.Nominal_D4x = dict(**new)
3336		self.refresh()
3337	
3338	def __init__(self, l=[], **kwargs):
3339		'''
3340		**Parameters:** same as `D4xdata.__init__()`
3341		'''
3342		D4xdata.__init__(self, l=l, mass='49', **kwargs)
3343	
3344	def save_D49_correl(self, *args, **kwargs):
3345		return self._save_D4x_correl(*args, **kwargs)
3346	
3347	save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49')

Store and process data for a large set of Δ49 analyses, usually comprising more than one analytical session.

D49data(l=[], **kwargs)
3338	def __init__(self, l=[], **kwargs):
3339		'''
3340		**Parameters:** same as `D4xdata.__init__()`
3341		'''
3342		D4xdata.__init__(self, l=l, mass='49', **kwargs)

Parameters: same as D4xdata.__init__()

Nominal_D4x = {'1000C': 0.0, '25C': 2.228}

Nominal Δ49 values assigned to the Δ49 anchor samples, used by D49data.standardize() to normalize unknown samples to an absolute Δ49 reference frame.

By default equal to (after Wang et al. (2004)):

{
        "1000C": 0.0,
        "25C": 2.228
}
Nominal_D49
3329	@property
3330	def Nominal_D49(self):
3331		return self.Nominal_D4x
def save_D49_correl(self, *args, **kwargs):
3344	def save_D49_correl(self, *args, **kwargs):
3345		return self._save_D4x_correl(*args, **kwargs)

Save D49 values along with their SE and correlation matrix.

Parameters

  • samples: Only these samples are output (by default: all samples).
  • dir: the directory in which to save the file (by default: output)
  • filename: the name of the csv file to write to (by default: D49_correl.csv)
  • D49_precision: the precision to use when writing D49 and D49_SE values (by default: 4)
  • correl_precision: the precision to use when writing correlation factor values (by default: 4)
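
The Δ49 workflow is analogous, but the default anchors are CO2 gases equilibrated at 1000 °C and 25 °C, so the data set must include analyses of samples named '1000C' and '25C' (or anchors redefined via Nominal_D49). A minimal sketch with a hypothetical input file:

mydata49 = D49data()
mydata49.read('rawdata49.csv')  # hypothetical file including '1000C' and '25C' analyses
mydata49.wg()
mydata49.crunch()
mydata49.standardize()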