D47crunch

Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements

Process and standardize carbonate and/or CO2 clumped-isotope analyses, from low-level data out of a dual-inlet mass spectrometer to final, “absolute” Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates (Daëron, 2021).

The tutorial section takes you through a series of simple steps to import/process data and print out the results. The how-to section provides instructions applicable to various specific tasks.

1. Tutorial

1.1 Installation

The easy option is to use pip; open a shell terminal and simply type:

python -m pip install D47crunch
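
To check that the installation worked, print out the package version (a quick sanity check; the exact number will differ depending on the current release):

python -c "import D47crunch; print(D47crunch.__version__)"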

For those wishing to experiment with the bleeding-edge development version, this can be done through the following steps:

  1. Download the dev branch source code here and rename it to D47crunch.py.
  2. Do any of the following:
    • copy D47crunch.py to somewhere in your Python path
    • copy D47crunch.py to a working directory (import D47crunch will only work if called within that directory)
    • copy D47crunch.py to any other location (e.g., /foo/bar) and then use the following code snippet in your own code to import D47crunch:
import sys
sys.path.append('/foo/bar')
import D47crunch

Documentation for the development version can be downloaded here (save the HTML file and open it locally).

1.2 Usage

Start by creating a file named rawdata.csv with the following contents:

UID,  Sample,           d45,       d46,        d47,        d48,       d49
A01,  ETH-1,        5.79502,  11.62767,   16.89351,   24.56708,   0.79486
A02,  MYSAMPLE-1,   6.21907,  11.49107,   17.27749,   24.58270,   1.56318
A03,  ETH-2,       -6.05868,  -4.81718,  -11.63506,  -10.32578,   0.61352
A04,  MYSAMPLE-2,  -3.86184,   4.94184,    0.60612,   10.52732,   0.57118
A05,  ETH-3,        5.54365,  12.05228,   17.40555,   25.96919,   0.74608
A06,  ETH-2,       -6.06706,  -4.87710,  -11.69927,  -10.64421,   1.61234
A07,  ETH-1,        5.78821,  11.55910,   16.80191,   24.56423,   1.47963
A08,  MYSAMPLE-2,  -3.87692,   4.86889,    0.52185,   10.40390,   1.07032

Then instantiate a D47data object, which will store and process this data:

import D47crunch
mydata = D47crunch.D47data()

For now, this object is empty:

>>> print(mydata)
[]

To load the analyses saved in rawdata.csv into our D47data object and process the data:

mydata.read('rawdata.csv')

# compute δ13C, δ18O of working gas:
mydata.wg()

# compute δ13C, δ18O, raw Δ47 values for each analysis:
mydata.crunch()

# compute absolute Δ47 values for each analysis
# as well as average Δ47 values for each sample:
mydata.standardize()

We can now print a summary of the data processing:

>>> mydata.summary(verbose = True, save_to_file = False)
[summary]        
–––––––––––––––––––––––––––––––  –––––––––
N samples (anchors + unknowns)   5 (3 + 2)
N analyses (anchors + unknowns)  8 (5 + 3)
Repeatability of δ13C_VPDB         4.2 ppm
Repeatability of δ18O_VSMOW       47.5 ppm
Repeatability of Δ47 (anchors)    13.4 ppm
Repeatability of Δ47 (unknowns)    2.5 ppm
Repeatability of Δ47 (all)         9.6 ppm
Model degrees of freedom                 3
Student's 95% t-factor                3.18
Standardization method              pooled
–––––––––––––––––––––––––––––––  –––––––––

This tells us that our data set contains 5 different samples: 3 anchors (ETH-1, ETH-2, ETH-3) and 2 unknowns (MYSAMPLE-1, MYSAMPLE-2). The total number of analyses is 8, with 5 anchor analyses and 3 unknown analyses. We get an estimate of the analytical repeatability (i.e. the overall, pooled standard deviation) for δ13C, δ18O and Δ47, as well as the number of degrees of freedom (here, 3) that these estimated standard deviations are based on, along with the corresponding Student's t-factor (here, 3.18) for 95 % confidence limits. Finally, the summary indicates that we used a “pooled” standardization approach (see [Daëron, 2021]).
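
The same quantities can also be retrieved programmatically instead of being read off the printed table. The attribute names below (repeatability, Nf, t95) are assumptions based on the current API; check the D4xdata documentation for your version:

# after calling mydata.standardize():
print(mydata.repeatability['r_D47'])  # pooled Δ47 repeatability, all samples
print(mydata.Nf)                      # model degrees of freedom
print(mydata.t95)                     # Student's t-factor for 95 % confidence limits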

To see the actual results:

>>> mydata.table_of_samples(verbose = True, save_to_file = False)
[table_of_samples] 
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
Sample      N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
ETH-1       2       2.01       37.01  0.2052                    0.0131          
ETH-2       2     -10.17       19.88  0.2085                    0.0026          
ETH-3       1       1.73       37.49  0.6132                                    
MYSAMPLE-1  1       2.48       36.90  0.2996  0.0091  ± 0.0291                  
MYSAMPLE-2  2      -8.17       30.05  0.6600  0.0115  ± 0.0366  0.0025          
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––

This table lists, for each sample, the number of analytical replicates, average δ13C and δ18O values (for the analyte CO2, not for the carbonate itself), the average Δ47 value, and the SD of Δ47 for all replicates of this sample. For unknown samples, the SE and 95 % confidence limits for the mean Δ47 are also listed. These 95 % CL take into account the number of degrees of freedom of the regression model, so that in large data sets the 95 % CL will tend to 1.96 times the SE, whereas in this case the applicable t-factor is much larger.
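
To use these results downstream rather than reading them off the table, the per-sample statistics may be pulled out of the samples dictionary of the D47data object. The field names below are assumptions based on the current API; check D4xdata.samples for your version:

s = mydata.samples['MYSAMPLE-1']
print(s['D47'], s['SE_D47'])  # mean Δ47 of this sample and its standard error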

We can also generate a table of all analyses in the data set (again, note that d18O_VSMOW is the composition of the CO2 analyte):

>>> mydata.table_of_analyses(verbose = True, save_to_file = False)
[table_of_analyses] 
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
UID    Session      Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48       d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw      D49raw       D47
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
A01  mySession       ETH-1       -3.807        24.921   5.795020  11.627670   16.893510   24.567080  0.794860    2.014086   37.041843  -0.574686   1.149684  -27.690250  0.214454
A02  mySession  MYSAMPLE-1       -3.807        24.921   6.219070  11.491070   17.277490   24.582700  1.563180    2.476827   36.898281  -0.499264   1.435380  -27.122614  0.299589
A03  mySession       ETH-2       -3.807        24.921  -6.058680  -4.817180  -11.635060  -10.325780  0.613520  -10.166796   19.907706  -0.685979  -0.721617   16.716901  0.206693
A04  mySession  MYSAMPLE-2       -3.807        24.921  -3.861840   4.941840    0.606120   10.527320  0.571180   -8.159927   30.087230  -0.248531   0.613099   -4.979413  0.658270
A05  mySession       ETH-3       -3.807        24.921   5.543650  12.052280   17.405550   25.969190  0.746080    1.727029   37.485567  -0.226150   1.678699  -28.280301  0.613200
A06  mySession       ETH-2       -3.807        24.921  -6.067060  -4.877100  -11.699270  -10.644210  1.612340  -10.173599   19.845192  -0.683054  -0.922832   17.861363  0.210328
A07  mySession       ETH-1       -3.807        24.921   5.788210  11.559100   16.801910   24.564230  1.479630    2.009281   36.970298  -0.591129   1.282632  -26.888335  0.195926
A08  mySession  MYSAMPLE-2       -3.807        24.921  -3.876920   4.868890    0.521850   10.403900  1.070320   -8.173486   30.011134  -0.245768   0.636159   -4.324964  0.661803
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––

2. How-to

2.1 Simulate a virtual data set to play with

It is sometimes convenient to quickly build a virtual data set of analyses, for instance to assess the final analytical precision achievable for a given combination of anchor and unknown analyses (see also Fig. 6 of Daëron, 2021).

This can be achieved with virtual_data(). The example below creates a dataset with four sessions, each of which comprises three analyses of anchor ETH-1, three of ETH-2, three of ETH-3, and three analyses each of two unknown samples named FOO and BAR with an arbitrarily defined isotopic composition. Analytical repeatabilities for Δ47 and Δ48 are also specified arbitrarily. See the virtual_data() documentation for additional configuration parameters.

from D47crunch import virtual_data, D47data

args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 3,
            d13C_VPDB = -15., d18O_VPDB = -2.,
            D47 = 0.6, D48 = 0.2),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)

D = D47data(session1 + session2 + session3 + session4)

D.crunch()
D.standardize()

D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)
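
If you wish to re-use the same simulated analyses later (for instance, to feed them to the CLI described below), one simple approach is to write them back out as a csv file that D47data.read() can ingest. This is a minimal sketch using only the standard library, with virtualdata.csv as an arbitrary file name:

import csv

data = session1 + session2 + session3 + session4
fields = ['Session', 'Sample', 'd45', 'd46', 'd47', 'd48', 'd49']
with open('virtualdata.csv', 'w', newline = '') as f:
    w = csv.DictWriter(f, fieldnames = fields, extrasaction = 'ignore')
    w.writeheader()  # write the column names expected by D47data.read()
    w.writerows(data)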

2.2 Control data quality

D47crunch offers several tools to visualize processed data. The examples below use the same virtual data set, generated with:

from D47crunch import *
from random import shuffle

# generate virtual data:
args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 8),
        dict(Sample = 'ETH-2', N = 8),
        dict(Sample = 'ETH-3', N = 8),
        dict(Sample = 'FOO', N = 4,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 4,
            d13C_VPDB = -15., d18O_VPDB = -15.,
            D47 = 0.5, D48 = 0.2),
        ])

sessions = [
    virtual_data(session = f'Session_{k+1:02.0f}', seed = 123456+k, **args)
    for k in range(10)]

# shuffle the data:
data = [r for s in sessions for r in s]
shuffle(data)
data = sorted(data, key = lambda r: r['Session'])

# create D47data instance:
data47 = D47data(data)

# process D47data instance:
data47.crunch()
data47.standardize()

2.2.1 Plotting the distribution of analyses through time

data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf')

[figure: time_distribution.png]

The plot above shows the succession of analyses as if they were all distributed at regular time intervals. See D4xdata.plot_distribution_of_analyses() for how to plot analyses as a function of “true” time (based on the TimeTag for each analysis).
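
To plot against "true" time instead, each analysis record needs a TimeTag value before plotting. The sketch below assumes numeric TimeTags (e.g., in days) and relies on the vs_time option suggested by the D4xdata.plot_distribution_of_analyses() documentation:

for k, r in enumerate(data47):
    r['TimeTag'] = k / 3.  # hypothetical acquisition times, in days
data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf', vs_time = True)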

2.2.2 Generating session plots

data47.plot_sessions()

Below is one of the resulting session plots. Each cross marker is an analysis. Anchors are in red and unknowns in blue. Short horizontal lines show the nominal Δ47 value for anchors, in red, or the average Δ47 value for unknowns, in blue (overall average for all sessions). Curved grey contours correspond to Δ47 standardization errors in this session.

[figure: D47_plot_Session_03.png]

2.2.3 Plotting Δ47 or Δ48 residuals

data47.plot_residuals(filename = 'residuals.pdf', kde = True)

[figure: residuals.png]

Again, note that this plot only shows the succession of analyses as if they were all distributed at regular time intervals.

2.2.4 Checking δ13C and δ18O dispersion

mydata = D47data(virtual_data(
    session = 'mysession',
    samples = [
        dict(Sample = 'ETH-1', N = 4),
        dict(Sample = 'ETH-2', N = 4),
        dict(Sample = 'ETH-3', N = 4),
        dict(Sample = 'MYSAMPLE', N = 8, D47 = 0.6, D48 = 0.1, d13C_VPDB = -4.0, d18O_VPDB = -12.0),
    ], seed = 123))

mydata.refresh()
mydata.wg()
mydata.crunch()
mydata.plot_bulk_compositions()

D4xdata.plot_bulk_compositions() produces a series of plots, one for each sample, and an additional plot with all samples together. For example, here is the plot for sample MYSAMPLE:

[figure: bulk_compositions.png]

2.3 Use a different set of anchors, change anchor nominal values, and/or change oxygen-17 correction parameters

Nominal values for various carbonate standards are defined in four places:

  • D4xdata.Nominal_d13C_VPDB
  • D4xdata.Nominal_d18O_VPDB
  • D47data.Nominal_D47
  • D48data.Nominal_D48

17O correction parameters are defined by:

  • D4xdata.R13_VPDB
  • D4xdata.R17_VSMOW
  • D4xdata.R17_VPDB
  • D4xdata.R18_VSMOW
  • D4xdata.R18_VPDB
  • D4xdata.LAMBDA_17

When creating a new instance of D47data or D48data, the current values of these variables are copied as properties of the new object. Applying custom values for, e.g., R17_VSMOW and Nominal_D47 can thus be done in several ways:

Option 1: by redefining D4xdata.R17_VSMOW and D47data.Nominal_D47 before creating a D47data object:

from D47crunch import D4xdata, D47data

# redefine R17_VSMOW:
D4xdata.R17_VSMOW = 0.00037 # new value

# redefine R17_VPDB for consistency:
D4xdata.R17_VPDB = D4xdata.R17_VSMOW * (D4xdata.R18_VPDB/D4xdata.R18_VSMOW) ** D4xdata.LAMBDA_17

# edit Nominal_D47 to only include ETH-1/2/3:
D47data.Nominal_D4x = {
    a: D47data.Nominal_D4x[a]
    for a in ['ETH-1', 'ETH-2', 'ETH-3']
    }
# redefine ETH-3:
D47data.Nominal_D4x['ETH-3'] = 0.600

# only now create D47data object:
mydata = D47data()

# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)
# NB: mydata.Nominal_D47 is just an alias for mydata.Nominal_D4x

# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}

Option 2: by redefining R17_VSMOW and Nominal_D47 after creating a D47data object:

from D47crunch import D47data

# first create D47data object:
mydata = D47data()

# redefine R17_VSMOW:
mydata.R17_VSMOW = 0.00037 # new value

# redefine R17_VPDB for consistency:
mydata.R17_VPDB = mydata.R17_VSMOW * (mydata.R18_VPDB/mydata.R18_VSMOW) ** mydata.LAMBDA_17

# edit Nominal_D47 to only include ETH-1/2/3:
mydata.Nominal_D47 = {
    a: mydata.Nominal_D47[a]
    for a in ['ETH-1', 'ETH-2', 'ETH-3']
    }
# redefine ETH-3:
mydata.Nominal_D47['ETH-3'] = 0.600

# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)

# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}

The two options above are equivalent, but the latter provides a simple way to compare different data processing choices:

from D47crunch import D47data

# create two D47data objects:
foo = D47data()
bar = D47data()

# modify foo in various ways:
foo.LAMBDA_17 = 0.52
foo.R17_VSMOW = 0.00037 # new value
foo.R17_VPDB = foo.R17_VSMOW * (foo.R18_VPDB/foo.R18_VSMOW) ** foo.LAMBDA_17
foo.Nominal_D47 = {
    'ETH-1': foo.Nominal_D47['ETH-1'],
    'ETH-2': foo.Nominal_D47['ETH-2'],
    'IAEA-C2': foo.Nominal_D47['IAEA-C2'],
    'INLAB_REF_MATERIAL': 0.666,
    }

# now import the same raw data into foo and bar:
foo.read('rawdata.csv')
foo.wg()          # compute δ13C, δ18O of working gas
foo.crunch()      # compute all δ13C, δ18O and raw Δ47 values
foo.standardize() # compute absolute Δ47 values

bar.read('rawdata.csv')
bar.wg()          # compute δ13C, δ18O of working gas
bar.crunch()      # compute all δ13C, δ18O and raw Δ47 values
bar.standardize() # compute absolute Δ47 values

# and compare the final results:
foo.table_of_samples(verbose = True, save_to_file = False)
bar.table_of_samples(verbose = True, save_to_file = False)

2.4 Process paired Δ47 and Δ48 values

Purely in terms of data processing, it is not obvious why Δ47 and Δ48 data should not be handled separately. For now, D47crunch uses two independent classes — D47data and D48data — which crunch numbers and deal with standardization in very similar ways. The following example demonstrates how to print out combined outputs for D47data and D48data.

from D47crunch import *

# generate virtual data:
args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args)
session2 = virtual_data(session = 'Session_02', **args)

# create D47data instance:
data47 = D47data(session1 + session2)

# process D47data instance:
data47.crunch()
data47.standardize()

# create D48data instance:
data48 = D48data(data47) # alternatively: data48 = D48data(session1 + session2)

# process D48data instance:
data48.crunch()
data48.standardize()

# output combined results:
table_of_sessions(data47, data48)
table_of_samples(data47, data48)
table_of_analyses(data47, data48)

Expected output:

––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––
Session     Na  Nu  d13Cwg_VPDB  d18Owg_VSMOW  r_d13C  r_d18O   r_D47      a_47 ± SE  1e3 x b_47 ± SE       c_47 ± SE   r_D48      a_48 ± SE  1e3 x b_48 ± SE       c_48 ± SE
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––
Session_01   9   3       -4.000        26.000  0.0000  0.0000  0.0098  1.021 ± 0.019   -0.398 ± 0.260  -0.903 ± 0.006  0.0486  0.540 ± 0.151    1.235 ± 0.607  -0.390 ± 0.025
Session_02   9   3       -4.000        26.000  0.0000  0.0000  0.0090  1.015 ± 0.019    0.376 ± 0.260  -0.905 ± 0.006  0.0186  1.350 ± 0.156   -0.871 ± 0.608  -0.504 ± 0.027
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––


––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
Sample  N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene     D48      SE    95% CL      SD  p_Levene
––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
ETH-1   6       2.02       37.02  0.2052                    0.0078            0.1380                    0.0223          
ETH-2   6     -10.17       19.88  0.2085                    0.0036            0.1380                    0.0482          
ETH-3   6       1.71       37.45  0.6132                    0.0080            0.2700                    0.0176          
FOO     6      -5.00       28.91  0.3026  0.0044  ± 0.0093  0.0121     0.164  0.1397  0.0121  ± 0.0255  0.0267     0.127
––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––


–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––
UID     Session  Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48         d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw     D49raw       D47       D48
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––
1    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.120787   21.286237   27.780042    2.020000   37.024281  -0.708176  -0.316435  -0.000013  0.197297  0.087763
2    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.132240   21.307795   27.780042    2.020000   37.024281  -0.696913  -0.295333  -0.000013  0.208328  0.126791
3    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.132438   21.313884   27.780042    2.020000   37.024281  -0.696718  -0.289374  -0.000013  0.208519  0.137813
4    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.700300  -12.210735  -18.023381  -10.170000   19.875825  -0.683938  -0.297902  -0.000002  0.209785  0.198705
5    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.707421  -12.270781  -18.023381  -10.170000   19.875825  -0.691145  -0.358673  -0.000002  0.202726  0.086308
6    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.700061  -12.278310  -18.023381  -10.170000   19.875825  -0.683696  -0.366292  -0.000002  0.210022  0.072215
7    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.684379   22.225827   28.306614    1.710000   37.450394  -0.273094  -0.216392  -0.000014  0.623472  0.270873
8    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.660163   22.233729   28.306614    1.710000   37.450394  -0.296906  -0.208664  -0.000014  0.600150  0.285167
9    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.675191   22.215632   28.306614    1.710000   37.450394  -0.282128  -0.226363  -0.000014  0.614623  0.252432
10   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.328380    5.374933    4.665655   -5.000000   28.907344  -0.582131  -0.288924  -0.000006  0.314928  0.175105
11   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.302220    5.384454    4.665655   -5.000000   28.907344  -0.608241  -0.279457  -0.000006  0.289356  0.192614
12   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.322530    5.372841    4.665655   -5.000000   28.907344  -0.587970  -0.291004  -0.000006  0.309209  0.171257
13   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.140853   21.267202   27.780042    2.020000   37.024281  -0.688442  -0.335067  -0.000013  0.207730  0.138730
14   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.127087   21.256983   27.780042    2.020000   37.024281  -0.701980  -0.345071  -0.000013  0.194396  0.131311
15   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.148253   21.287779   27.780042    2.020000   37.024281  -0.681165  -0.314926  -0.000013  0.214898  0.153668
16   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.715859  -12.204791  -18.023381  -10.170000   19.875825  -0.699685  -0.291887  -0.000002  0.207349  0.149128
17   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.709763  -12.188685  -18.023381  -10.170000   19.875825  -0.693516  -0.275587  -0.000002  0.213426  0.161217
18   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.715427  -12.253049  -18.023381  -10.170000   19.875825  -0.699249  -0.340727  -0.000002  0.207780  0.112907
19   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.685994   22.249463   28.306614    1.710000   37.450394  -0.271506  -0.193275  -0.000014  0.618328  0.244431
20   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.681351   22.298166   28.306614    1.710000   37.450394  -0.276071  -0.145641  -0.000014  0.613831  0.279758
21   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.676169   22.306848   28.306614    1.710000   37.450394  -0.281167  -0.137150  -0.000014  0.608813  0.286056
22   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.324359    5.339497    4.665655   -5.000000   28.907344  -0.586144  -0.324160  -0.000006  0.314015  0.136535
23   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.297658    5.325854    4.665655   -5.000000   28.907344  -0.612794  -0.337727  -0.000006  0.287767  0.126473
24   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.310185    5.339898    4.665655   -5.000000   28.907344  -0.600291  -0.323761  -0.000006  0.300082  0.136830
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––

3. Command-Line Interface (CLI)

Instead of writing Python code, you may use the CLI directly to process raw Δ47 and Δ48 data with reasonable defaults. The simplest approach is to call:

D47crunch rawdata.csv

This will create a directory named output and populate it by calling methods such as D4xdata.wg(), D4xdata.crunch(), D4xdata.standardize(), D4xdata.summary(), table_of_samples(), table_of_sessions(), table_of_analyses(), and the plotting methods demonstrated above.

You may specify a custom set of anchors instead of the default ones using the --anchors or -a option:

D47crunch -a anchors.csv rawdata.csv

In this case, the anchors.csv file (you may use any other file name) must have the following format:

Sample, d13C_VPDB, d18O_VPDB,    D47
 ETH-1,      2.02,     -2.19, 0.2052
 ETH-2,    -10.17,    -18.69, 0.2085
 ETH-3,      1.71,     -1.78, 0.6132
 ETH-4,          ,          , 0.4511

The samples with non-empty d13C_VPDB, d18O_VPDB, and D47 values are used to standardize δ13C, δ18O, and Δ47 values respectively.

You may also provide a list of analyses and/or samples to exclude from the input. This is done with the --exclude or -e option:

D47crunch -e badbatch.csv rawdata.csv

In this case, the badbatch.csv file (again, you may use a different file name) must have the following format:

UID, Sample
A03
A09
B06
   , MYBADSAMPLE-1
   , MYBADSAMPLE-2

This will exclude (ignore) analyses with the UIDs A03, A09, and B06, as well as all analyses of samples MYBADSAMPLE-1 and MYBADSAMPLE-2. It is possible to have an exclude file with only the UID column, or only the Sample column, or both, in any order.

The --output-dir or -o option may be used to specify a custom directory name for the output. For example, in unix-like shells the following command will create a time-stamped output directory:

D47crunch -o `date "+%Y-%m-%d-%Hh%M"` rawdata.csv

To process Δ48 as well as Δ47, just add the --D48 option.
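
These options can be combined. For instance, the following call standardizes both Δ47 and Δ48 using custom anchors, excludes a bad batch, and writes everything to a custom output directory:

D47crunch --D48 -a anchors.csv -e badbatch.csv -o myresults rawdata.csv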

4. API Documentation

'''
Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements

Process and standardize carbonate and/or CO2 clumped-isotope analyses,
from low-level data out of a dual-inlet mass spectrometer to final, “absolute”
Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates
([Daëron, 2021](https://doi.org/10.1029/2020GC009592)).

The **tutorial** section takes you through a series of simple steps to import/process data and print out the results.
The **how-to** section provides instructions applicable to various specific tasks.

.. include:: ../../docpages/tutorial.md
.. include:: ../../docpages/howto.md
.. include:: ../../docpages/cli.md

<h1>API Documentation</h1>
'''

__docformat__ = "restructuredtext"
__author__    = 'Mathieu Daëron'
__contact__   = 'daeron@lsce.ipsl.fr'
__copyright__ = 'Copyright (c) Mathieu Daëron'
__license__   = 'MIT License - https://opensource.org/licenses/MIT'
__date__      = '2025-03-16'
__version__   = '2.4.2'

import os
import numpy as np
import typer
from typing_extensions import Annotated
from statistics import stdev
from scipy.stats import t as tstudent
from scipy.stats import levene
from scipy.interpolate import interp1d
from numpy import linalg
from lmfit import Minimizer, Parameters, report_fit
from matplotlib import pyplot as ppl
from datetime import datetime as dt
from functools import wraps
from colorsys import hls_to_rgb
from matplotlib import rcParams

typer.rich_utils.STYLE_HELPTEXT = ''

rcParams['font.family'] = 'sans-serif'
rcParams['font.sans-serif'] = 'Helvetica'
rcParams['font.size'] = 10
rcParams['mathtext.fontset'] = 'custom'
rcParams['mathtext.rm'] = 'sans'
rcParams['mathtext.bf'] = 'sans:bold'
rcParams['mathtext.it'] = 'sans:italic'
rcParams['mathtext.cal'] = 'sans:italic'
rcParams['mathtext.default'] = 'rm'
rcParams['xtick.major.size'] = 4
rcParams['xtick.major.width'] = 1
rcParams['ytick.major.size'] = 4
rcParams['ytick.major.width'] = 1
rcParams['axes.grid'] = False
rcParams['axes.linewidth'] = 1
rcParams['grid.linewidth'] = .75
rcParams['grid.linestyle'] = '-'
rcParams['grid.alpha'] = .15
rcParams['savefig.dpi'] = 150
Petersen_etal_CO2eqD47 = np.array([[-12, 1.147113572], [-11, 1.139961218], [-10, 1.132872856], [-9, 1.125847677], [-8, 1.118884889], [-7, 1.111983708], [-6, 1.105143366], [-5, 1.098363105], [-4, 1.091642182], [-3, 1.084979862], [-2, 1.078375423], [-1, 1.071828156], [0, 1.065337360], [1, 1.058902349], [2, 1.052522443], [3, 1.046196976], [4, 1.039925291], [5, 1.033706741], [6, 1.027540690], [7, 1.021426510], [8, 1.015363585], [9, 1.009351306], [10, 1.003389075], [11, 0.997476303], [12, 0.991612409], [13, 0.985796821], [14, 0.980028975], [15, 0.974308318], [16, 0.968634304], [17, 0.963006392], [18, 0.957424055], [19, 0.951886769], [20, 0.946394020], [21, 0.940945302], [22, 0.935540114], [23, 0.930177964], [24, 0.924858369], [25, 0.919580851], [26, 0.914344938], [27, 0.909150167], [28, 0.903996080], [29, 0.898882228], [30, 0.893808167], [31, 0.888773459], [32, 0.883777672], [33, 0.878820382], [34, 0.873901170], [35, 0.869019623], [36, 0.864175334], [37, 0.859367901], [38, 0.854596929], [39, 0.849862028], [40, 0.845162813], [41, 0.840498905], [42, 0.835869931], [43, 0.831275522], [44, 0.826715314], [45, 0.822188950], [46, 0.817696075], [47, 0.813236341], [48, 0.808809404], [49, 0.804414926], [50, 0.800052572], [51, 0.795722012], [52, 0.791422922], [53, 0.787154979], [54, 0.782917869], [55, 0.778711277], [56, 0.774534898], [57, 0.770388426], [58, 0.766271562], [59, 0.762184010], [60, 0.758125479], [61, 0.754095680], [62, 0.750094329], [63, 0.746121147], [64, 0.742175856], [65, 0.738258184], [66, 0.734367860], [67, 0.730504620], [68, 0.726668201], [69, 0.722858343], [70, 0.719074792], [71, 0.715317295], [72, 0.711585602], [73, 0.707879469], [74, 0.704198652], [75, 0.700542912], [76, 0.696912012], [77, 0.693305719], [78, 0.689723802], [79, 0.686166034], [80, 0.682632189], [81, 0.679122047], [82, 0.675635387], [83, 0.672171994], [84, 0.668731654], [85, 0.665314156], [86, 0.661919291], [87, 0.658546854], [88, 0.655196641], [89, 0.651868451], [90, 0.648562087], [91, 0.645277352], [92, 0.642014054], [93, 0.638771999], [94, 0.635551001], [95, 0.632350872], [96, 0.629171428], [97, 0.626012487], [98, 0.622873870], [99, 0.619755397], [100, 0.616656895], [102, 0.610519107], [104, 0.604459143], [106, 0.598475670], [108, 0.592567388], [110, 0.586733026], [112, 0.580971342], [114, 0.575281125], [116, 0.569661187], [118, 0.564110371], [120, 0.558627545], [122, 0.553211600], [124, 0.547861454], [126, 0.542576048], [128, 0.537354347], [130, 0.532195337], [132, 0.527098028], [134, 0.522061450], [136, 0.517084654], [138, 0.512166711], [140, 0.507306712], [142, 0.502503768], [144, 0.497757006], [146, 0.493065573], [148, 0.488428634], [150, 0.483845370], [152, 0.479314980], [154, 0.474836677], [156, 0.470409692], [158, 0.466033271], [160, 0.461706674], [162, 0.457429176], [164, 0.453200067], [166, 0.449018650], [168, 0.444884242], [170, 0.440796174], [172, 0.436753787], [174, 0.432756438], [176, 0.428803494], [178, 0.424894334], [180, 0.421028350], [182, 0.417204944], [184, 0.413423530], [186, 0.409683531], [188, 0.405984383], [190, 0.402325531], [192, 0.398706429], [194, 0.395126543], [196, 0.391585347], [198, 0.388082324], [200, 0.384616967], [202, 0.381188778], [204, 0.377797268], [206, 0.374441954], [208, 0.371122364], [210, 0.367838033], [212, 0.364588505], [214, 0.361373329], [216, 0.358192065], [218, 0.355044277], [220, 0.351929540], [222, 0.348847432], [224, 0.345797540], [226, 0.342779460], [228, 0.339792789], [230, 0.336837136], [232, 0.333912113], [234, 0.331017339], [236, 0.328152439], [238, 
0.325317046], [240, 0.322510795], [242, 0.319733329], [244, 0.316984297], [246, 0.314263352], [248, 0.311570153], [250, 0.308904364], [252, 0.306265654], [254, 0.303653699], [256, 0.301068176], [258, 0.298508771], [260, 0.295975171], [262, 0.293467070], [264, 0.290984167], [266, 0.288526163], [268, 0.286092765], [270, 0.283683684], [272, 0.281298636], [274, 0.278937339], [276, 0.276599517], [278, 0.274284898], [280, 0.271993211], [282, 0.269724193], [284, 0.267477582], [286, 0.265253121], [288, 0.263050554], [290, 0.260869633], [292, 0.258710110], [294, 0.256571741], [296, 0.254454286], [298, 0.252357508], [300, 0.250281174], [302, 0.248225053], [304, 0.246188917], [306, 0.244172542], [308, 0.242175707], [310, 0.240198194], [312, 0.238239786], [314, 0.236300272], [316, 0.234379441], [318, 0.232477087], [320, 0.230593005], [322, 0.228726993], [324, 0.226878853], [326, 0.225048388], [328, 0.223235405], [330, 0.221439711], [332, 0.219661118], [334, 0.217899439], [336, 0.216154491], [338, 0.214426091], [340, 0.212714060], [342, 0.211018220], [344, 0.209338398], [346, 0.207674420], [348, 0.206026115], [350, 0.204393315], [355, 0.200378063], [360, 0.196456139], [365, 0.192625077], [370, 0.188882487], [375, 0.185226048], [380, 0.181653511], [385, 0.178162694], [390, 0.174751478], [395, 0.171417807], [400, 0.168159686], [405, 0.164975177], [410, 0.161862398], [415, 0.158819521], [420, 0.155844772], [425, 0.152936426], [430, 0.150092806], [435, 0.147312286], [440, 0.144593281], [445, 0.141934254], [450, 0.139333710], [455, 0.136790195], [460, 0.134302294], [465, 0.131868634], [470, 0.129487876], [475, 0.127158722], [480, 0.124879906], [485, 0.122650197], [490, 0.120468398], [495, 0.118333345], [500, 0.116243903], [505, 0.114198970], [510, 0.112197471], [515, 0.110238362], [520, 0.108320625], [525, 0.106443271], [530, 0.104605335], [535, 0.102805877], [540, 0.101043985], [545, 0.099318768], [550, 0.097629359], [555, 0.095974915], [560, 0.094354612], [565, 0.092767650], [570, 0.091213248], [575, 0.089690648], [580, 0.088199108], [585, 0.086737906], [590, 0.085306341], [595, 0.083903726], [600, 0.082529395], [605, 0.081182697], [610, 0.079862998], [615, 0.078569680], [620, 0.077302141], [625, 0.076059794], [630, 0.074842066], [635, 0.073648400], [640, 0.072478251], [645, 0.071331090], [650, 0.070206399], [655, 0.069103674], [660, 0.068022424], [665, 0.066962168], [670, 0.065922439], [675, 0.064902780], [680, 0.063902748], [685, 0.062921909], [690, 0.061959837], [695, 0.061016122], [700, 0.060090360], [705, 0.059182157], [710, 0.058291131], [715, 0.057416907], [720, 0.056559120], [725, 0.055717414], [730, 0.054891440], [735, 0.054080860], [740, 0.053285343], [745, 0.052504565], [750, 0.051738210], [755, 0.050985971], [760, 0.050247546], [765, 0.049522643], [770, 0.048810974], [775, 0.048112260], [780, 0.047426227], [785, 0.046752609], [790, 0.046091145], [795, 0.045441581], [800, 0.044803668], [805, 0.044177164], [810, 0.043561831], [815, 0.042957438], [820, 0.042363759], [825, 0.041780573], [830, 0.041207664], [835, 0.040644822], [840, 0.040091839], [845, 0.039548516], [850, 0.039014654], [855, 0.038490063], [860, 0.037974554], [865, 0.037467944], [870, 0.036970054], [875, 0.036480707], [880, 0.035999734], [885, 0.035526965], [890, 0.035062238], [895, 0.034605393], [900, 0.034156272], [905, 0.033714724], [910, 0.033280598], [915, 0.032853749], [920, 0.032434032], [925, 0.032021309], [930, 0.031615443], [935, 0.031216300], [940, 0.030823749], [945, 0.030437663], [950, 0.030057915], [955, 0.029684385], 
[960, 0.029316951], [965, 0.028955498], [970, 0.028599910], [975, 0.028250075], [980, 0.027905884], [985, 0.027567229], [990, 0.027234006], [995, 0.026906112], [1000, 0.026583445], [1005, 0.026265908], [1010, 0.025953405], [1015, 0.025645841], [1020, 0.025343124], [1025, 0.025045163], [1030, 0.024751871], [1035, 0.024463160], [1040, 0.024178947], [1045, 0.023899147], [1050, 0.023623680], [1055, 0.023352467], [1060, 0.023085429], [1065, 0.022822491], [1070, 0.022563577], [1075, 0.022308615], [1080, 0.022057533], [1085, 0.021810260], [1090, 0.021566729], [1095, 0.021326872], [1100, 0.021090622]])
_fCO2eqD47_Petersen = interp1d(Petersen_etal_CO2eqD47[:,0], Petersen_etal_CO2eqD47[:,1])
def fCO2eqD47_Petersen(T):
	'''
	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
	according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
	'''
	return float(_fCO2eqD47_Petersen(T))
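# Example (exact node of the Petersen et al. table above):
# fCO2eqD47_Petersen(25) -> 0.919580851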


Wang_etal_CO2eqD47 = np.array([[-83., 1.8954], [-73., 1.7530], [-63., 1.6261], [-53., 1.5126], [-43., 1.4104], [-33., 1.3182], [-23., 1.2345], [-13., 1.1584], [-3., 1.0888], [7., 1.0251], [17., 0.9665], [27., 0.9125], [37., 0.8626], [47., 0.8164], [57., 0.7734], [67., 0.7334], [87., 0.6612], [97., 0.6286], [107., 0.5980], [117., 0.5693], [127., 0.5423], [137., 0.5169], [147., 0.4930], [157., 0.4704], [167., 0.4491], [177., 0.4289], [187., 0.4098], [197., 0.3918], [207., 0.3747], [217., 0.3585], [227., 0.3431], [237., 0.3285], [247., 0.3147], [257., 0.3015], [267., 0.2890], [277., 0.2771], [287., 0.2657], [297., 0.2550], [307., 0.2447], [317., 0.2349], [327., 0.2256], [337., 0.2167], [347., 0.2083], [357., 0.2002], [367., 0.1925], [377., 0.1851], [387., 0.1781], [397., 0.1714], [407., 0.1650], [417., 0.1589], [427., 0.1530], [437., 0.1474], [447., 0.1421], [457., 0.1370], [467., 0.1321], [477., 0.1274], [487., 0.1229], [497., 0.1186], [507., 0.1145], [517., 0.1105], [527., 0.1068], [537., 0.1031], [547., 0.0997], [557., 0.0963], [567., 0.0931], [577., 0.0901], [587., 0.0871], [597., 0.0843], [607., 0.0816], [617., 0.0790], [627., 0.0765], [637., 0.0741], [647., 0.0718], [657., 0.0695], [667., 0.0674], [677., 0.0654], [687., 0.0634], [697., 0.0615], [707., 0.0597], [717., 0.0579], [727., 0.0562], [737., 0.0546], [747., 0.0530], [757., 0.0515], [767., 0.0500], [777., 0.0486], [787., 0.0472], [797., 0.0459], [807., 0.0447], [817., 0.0435], [827., 0.0423], [837., 0.0411], [847., 0.0400], [857., 0.0390], [867., 0.0380], [877., 0.0370], [887., 0.0360], [897., 0.0351], [907., 0.0342], [917., 0.0333], [927., 0.0325], [937., 0.0317], [947., 0.0309], [957., 0.0302], [967., 0.0294], [977., 0.0287], [987., 0.0281], [997., 0.0274], [1007., 0.0268], [1017., 0.0261], [1027., 0.0255], [1037., 0.0249], [1047., 0.0244], [1057., 0.0238], [1067., 0.0233], [1077., 0.0228], [1087., 0.0223], [1097., 0.0218]])
_fCO2eqD47_Wang = interp1d(Wang_etal_CO2eqD47[:,0] - 0.15, Wang_etal_CO2eqD47[:,1])
def fCO2eqD47_Wang(T):
	'''
	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
	according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
	(supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
	'''
	return float(_fCO2eqD47_Wang(T))


def correlated_sum(X, C, w = None):
	'''
	Compute covariance-aware linear combinations

	**Parameters**

	+ `X`: list or 1-D array of values to sum
	+ `C`: covariance matrix for the elements of `X`
	+ `w`: list or 1-D array of weights to apply to the elements of `X`
	       (all equal to 1 by default)

	Return the sum (and its SE) of the elements of `X`, with optional weights equal
	to the elements of `w`, accounting for covariances between the elements of `X`.
	'''
	if w is None:
		w = [1 for x in X]
	return np.dot(w,X), (np.dot(w,np.dot(C,w)))**.5
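# Example: correlated_sum([1, 2], [[1, 0], [0, 1]]) -> (3, 1.414...),
# i.e. the sum of two independent, unit-variance values, with SE = sqrt(2).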


def make_csv(x, hsep = ',', vsep = '\n'):
	'''
	Formats a list of lists of strings as a CSV

	**Parameters**

	+ `x`: the list of lists of strings to format
	+ `hsep`: the field separator (`,` by default)
	+ `vsep`: the line-ending convention to use (`\\n` by default)

	**Example**

	```py
	print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
	```

	outputs:

	```py
	a,b,c
	d,e,f
	```
	'''
	return vsep.join([hsep.join(l) for l in x])


def pf(txt):
	'''
	Modify string `txt` to follow `lmfit.Parameter()` naming rules.
	'''
	return txt.replace('-','_').replace('.','_').replace(' ','_')


def smart_type(x):
	'''
	Tries to convert string `x` to a float if it includes a decimal point, or
	to an integer if it does not. If the conversion fails, return the original
	string unchanged.
	'''
	try:
		y = float(x)
	except ValueError:
		return x
	if '.' not in x:
		return int(y)
	return y
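# Examples: smart_type('3.14') -> 3.14, smart_type('42') -> 42, smart_type('foo') -> 'foo'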

class _Defaults():
	def __init__(self):
		pass

D47crunch_defaults = _Defaults()
D47crunch_defaults.PRETTY_TABLE_VSEP = '—'

def pretty_table(x, header = 1, hsep = '  ', vsep = None, align = '<'):
	'''
	Reads a list of lists of strings and outputs an ascii table

	**Parameters**

	+ `x`: a list of lists of strings
	+ `header`: the number of lines to treat as header lines
	+ `hsep`: the horizontal separator between columns
	+ `vsep`: the character to use as vertical separator
	+ `align`: string of left (`<`) or right (`>`) alignment characters.

	**Example**

	```py
	print(pretty_table([
		['A', 'B', 'C'],
		['1', '1.9999', 'foo'],
		['10', 'x', 'bar'],
	]))
	```
	yields:
	```
	——  ——————  ———
	A        B    C
	——  ——————  ———
	1   1.9999  foo
	10       x  bar
	——  ——————  ———
	```

	To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`:

	```py
	D47crunch_defaults.PRETTY_TABLE_VSEP = '='
	print(pretty_table([
		['A', 'B', 'C'],
		['1', '1.9999', 'foo'],
		['10', 'x', 'bar'],
	]))
	```
	yields:
	```
	==  ======  ===
	A        B    C
	==  ======  ===
	1   1.9999  foo
	10       x  bar
	==  ======  ===
	```
	'''

	if vsep is None:
		vsep = D47crunch_defaults.PRETTY_TABLE_VSEP

	txt = []
	widths = [np.max([len(e) for e in c]) for c in zip(*x)]

	if len(widths) > len(align):
		align += '>' * (len(widths)-len(align))
	sepline = hsep.join([vsep*w for w in widths])
	txt += [sepline]
	for k,l in enumerate(x):
		if k and k == header:
			txt += [sepline]
		txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
	txt += [sepline]
	txt += ['']
	return '\n'.join(txt)


def transpose_table(x):
	'''
	Transpose a list of lists

	**Parameters**

	+ `x`: a list of lists

	**Example**

	```py
	x = [[1, 2], [3, 4]]
	print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
	```
	'''
	return [[e for e in c] for c in zip(*x)]


def w_avg(X, sX):
	'''
	Compute variance-weighted average

	Returns the value and SE of the weighted average of the elements of `X`,
	with relative weights equal to their inverse variances (`1/sX**2`).

	**Parameters**

	+ `X`: array-like of elements to average
	+ `sX`: array-like of the corresponding SE values

	**Tip**

	If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
	they may be rearranged using `zip()`:

	```python
	foo = [(0, 1), (1, 0.5), (2, 0.5)]
	print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
	```
	'''
	X = [ x for x in X ]
	sX = [ sx for sx in sX ]
	W = [ sx**-2 for sx in sX ]
	W = [ w/sum(W) for w in W ]
	Xavg = sum([ w*x for w,x in zip(W,X) ])
	sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5
	return Xavg, sXavg


def read_csv(filename, sep = ''):
	'''
	Read contents of `filename` in csv format and return a list of dictionaries.

	In the csv string, spaces before and after field separators (`','` by default)
	are optional.

	**Parameters**

	+ `filename`: the csv file to read
	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
	whichever appears most often in the contents of `filename`.
	'''
	with open(filename) as fid:
		txt = fid.read()

	if sep == '':
		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
	return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]]
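# Example, using the rawdata.csv file from the tutorial section:
# read_csv('rawdata.csv')[0] -> {'UID': 'A01', 'Sample': 'ETH-1', 'd45': 5.79502, ...}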


def simulate_single_analysis(
	sample = 'MYSAMPLE',
	d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
	d13C_VPDB = None, d18O_VPDB = None,
	D47 = None, D48 = None, D49 = 0., D17O = 0.,
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	Nominal_D47 = None,
	Nominal_D48 = None,
	Nominal_d13C_VPDB = None,
	Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	):
	'''
	Compute working-gas delta values for a single analysis, assuming a stochastic working
	gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

	**Parameters**

	+ `sample`: sample name
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(respectively –4 and +26 ‰ by default)
	+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
		of the carbonate sample
	+ `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and
		Δ48 values if `D47` or `D48` are not specified
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `D4xdata` default values)

	Returns a dictionary with fields
	`['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
	'''

	if Nominal_d13C_VPDB is None:
		Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB

	if Nominal_d18O_VPDB is None:
		Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB

	if ALPHA_18O_ACID_REACTION is None:
		ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION

	if R13_VPDB is None:
		R13_VPDB = D4xdata().R13_VPDB

	if R17_VSMOW is None:
		R17_VSMOW = D4xdata().R17_VSMOW

	if R18_VSMOW is None:
		R18_VSMOW = D4xdata().R18_VSMOW

	if LAMBDA_17 is None:
		LAMBDA_17 = D4xdata().LAMBDA_17

	if R18_VPDB is None:
		R18_VPDB = D4xdata().R18_VPDB

	R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17

	if Nominal_D47 is None:
		Nominal_D47 = D47data().Nominal_D47

	if Nominal_D48 is None:
		Nominal_D48 = D48data().Nominal_D48

	if d13C_VPDB is None:
		if sample in Nominal_d13C_VPDB:
			d13C_VPDB = Nominal_d13C_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")

	if d18O_VPDB is None:
		if sample in Nominal_d18O_VPDB:
			d18O_VPDB = Nominal_d18O_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")

	if D47 is None:
		if sample in Nominal_D47:
			D47 = Nominal_D47[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")

	if D48 is None:
		if sample in Nominal_D48:
			D48 = Nominal_D48[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")

	X = D4xdata()
	X.R13_VPDB = R13_VPDB
	X.R17_VSMOW = R17_VSMOW
	X.R18_VSMOW = R18_VSMOW
	X.LAMBDA_17 = LAMBDA_17
	X.R18_VPDB = R18_VPDB
	X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17

	R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000),
		R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
		)
	R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O=D17O, D47=D47, D48=D48, D49=D49,
		)
	R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O=D17O,
		)

	d45 = 1000 * (R45/R45wg - 1)
	d46 = 1000 * (R46/R46wg - 1)
	d47 = 1000 * (R47/R47wg - 1)
	d48 = 1000 * (R48/R48wg - 1)
	d49 = 1000 * (R49/R49wg - 1)

	for k in range(3): # dumb iteration to adjust for small changes in d47
		R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
		R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch
		d47 = 1000 * (R47raw/R47wg - 1)
		d48 = 1000 * (R48raw/R48wg - 1)

	return dict(
		Sample = sample,
		D17O = D17O,
		d13Cwg_VPDB = d13Cwg_VPDB,
		d18Owg_VSMOW = d18Owg_VSMOW,
		d45 = d45,
		d46 = d46,
		d47 = d47,
		d48 = d48,
		d49 = d49,
		)


def virtual_data(
	samples = [],
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	rd45 = 0.020, rd46 = 0.060,
	rD47 = 0.015, rD48 = 0.045,
	d13Cwg_VPDB = None, d18Owg_VSMOW = None,
	session = None,
	Nominal_D47 = None, Nominal_D48 = None,
	Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	seed = 0,
	shuffle = True,
	):
	'''
	Return a list of simulated analyses from a single session.

	**Parameters**

	+ `samples`: a list of entries; each entry is a dictionary with the following fields:
	    * `Sample`: the name of the sample
	    * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	    * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
	    * `N`: how many analyses to generate for this sample
	+ `a47`: scrambling factor for Δ47
	+ `b47`: compositional nonlinearity for Δ47
	+ `c47`: working gas offset for Δ47
	+ `a48`: scrambling factor for Δ48
	+ `b48`: compositional nonlinearity for Δ48
	+ `c48`: working gas offset for Δ48
	+ `rd45`: analytical repeatability of δ45
	+ `rd46`: analytical repeatability of δ46
	+ `rD47`: analytical repeatability of Δ47
	+ `rD48`: analytical repeatability of Δ48
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(by default equal to the `simulate_single_analysis` default values)
	+ `session`: name of the session (no name by default)
	+ `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and Δ48 values
		if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
		(by default equal to the `simulate_single_analysis` defaults)
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
		(by default equal to the `simulate_single_analysis` defaults)
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `simulate_single_analysis` defaults)
	+ `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
	+ `shuffle`: randomly reorder the sequence of analyses

	Here is an example of using this method to generate an arbitrary combination of
	anchors and unknowns for a bunch of sessions:

	```py
	.. include:: ../../code_examples/virtual_data/example.py
	```

	This should output something like:

	```
	.. include:: ../../code_examples/virtual_data/output.txt
	```
	'''

	kwargs = locals().copy()

	from numpy import random as nprandom
	if seed:
		nprandom.seed(seed)
		rng = nprandom.default_rng(seed)
	else:
		rng = nprandom.default_rng()

	N = sum([s['N'] for s in samples])
	errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors45 *= rd45 / stdev(errors45) # scale errors to rd45
	errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors46 *= rd46 / stdev(errors46) # scale errors to rd46
	errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors47 *= rD47 / stdev(errors47) # scale errors to rD47
	errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors48 *= rD48 / stdev(errors48) # scale errors to rD48

	k = 0
	out = []
	for s in samples:
		kw = {}
		kw['sample'] = s['Sample']
		kw = {
			**kw,
			**{var: kwargs[var]
				for var in [
					'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
					'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
					'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
					'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
					]
				if kwargs[var] is not None},
			**{var: s[var]
				for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
				if var in s},
			}

		sN = s['N']
		while sN:
			out.append(simulate_single_analysis(**kw))
			out[-1]['d45'] += errors45[k]
			out[-1]['d46'] += errors46[k]
			out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
			out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
			sN -= 1
			k += 1

		if session is not None:
			for r in out:
				r['Session'] = session

		if shuffle:
			nprandom.shuffle(out)

	return out

def table_of_samples(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of samples
	for a pair of `D47data` and `D48data` objects.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_samples.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)


def table_of_sessions(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of sessions
	for a pair of `D47data` and `D48data` objects.
	***Only applicable if the sessions in `data47` and those in `data48`
	consist of the exact same sets of analyses.***

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_sessions(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_sessions(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
			for k,x in enumerate(out47[0]):
				if k>7:
					out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
					out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_sessions.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)

 710
 711def table_of_analyses(
 712	data47 = None,
 713	data48 = None,
 714	dir = 'output',
 715	filename = None,
 716	save_to_file = True,
 717	print_out = True,
 718	output = None,
 719	):
 720	'''
 721	Print out, save to disk and/or return a combined table of analyses
 722	for a pair of `D47data` and `D48data` objects.
 723
 724	If the sessions in `data47` and those in `data48` do not consist of
 725	the exact same sets of analyses, the table will have two columns
 726	`Session_47` and `Session_48` instead of a single `Session` column.
 727
 728	**Parameters**
 729
 730	+ `data47`: `D47data` instance
 731	+ `data48`: `D48data` instance
 732	+ `dir`: the directory in which to save the table
733	+ `filename`: the name of the csv file to write to
 734	+ `save_to_file`: whether to save the table to disk
 735	+ `print_out`: whether to print out the table
 736	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
 737		if set to `'raw'`: return a list of list of strings
 738		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
 739	'''
 740	if data47 is None:
 741		if data48 is None:
 742			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
 743		else:
 744			return data48.table_of_analyses(
 745				dir = dir,
 746				filename = filename,
 747				save_to_file = save_to_file,
 748				print_out = print_out,
 749				output = output
 750				)
 751	else:
 752		if data48 is None:
 753			return data47.table_of_analyses(
 754				dir = dir,
 755				filename = filename,
 756				save_to_file = save_to_file,
 757				print_out = print_out,
 758				output = output
 759				)
 760		else:
 761			out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
 762			out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
 763			
 764			if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
 765				out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
 766			else:
 767				out47[0][1] = 'Session_47'
 768				out48[0][1] = 'Session_48'
 769				out47 = transpose_table(out47)
 770				out48 = transpose_table(out48)
 771				out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])
 772
 773			if save_to_file:
 774				if not os.path.exists(dir):
 775					os.makedirs(dir)
 776				if filename is None:
777				filename = f'D47D48_analyses.csv'
 778				with open(f'{dir}/{filename}', 'w') as fid:
 779					fid.write(make_csv(out))
 780			if print_out:
 781				print('\n'+pretty_table(out))
 782			if output == 'raw':
 783				return out
 784			elif output == 'pretty':
 785				return pretty_table(out)
 786
 787
 788def _fullcovar(minresult, epsilon = 0.01, named = False):
 789	'''
 790	Construct full covariance matrix in the case of constrained parameters
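
	Constrained parameters (those defined through an `expr`) are linearized
	around the best-fit values of the free parameters, so that the full
	covariance matrix may be estimated as `J.T @ minresult.covar @ J`,
	where the Jacobian `J` is computed below by central finite differences.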
 791	'''
 792	
 793	import asteval
 794	
 795	def f(values):
 796		interp = asteval.Interpreter()
 797		for n,v in zip(minresult.var_names, values):
 798			interp(f'{n} = {v}')
 799		for q in minresult.params:
 800			if minresult.params[q].expr:
 801				interp(f'{q} = {minresult.params[q].expr}')
 802		return np.array([interp.symtable[q] for q in minresult.params])
 803
 804	# construct Jacobian
 805	J = np.zeros((minresult.nvarys, len(minresult.params)))
 806	X = np.array([minresult.params[p].value for p in minresult.var_names])
 807	sX = np.array([minresult.params[p].stderr for p in minresult.var_names])
 808
 809	for j in range(minresult.nvarys):
 810		x1 = [_ for _ in X]
 811		x1[j] += epsilon * sX[j]
 812		x2 = [_ for _ in X]
 813		x2[j] -= epsilon * sX[j]
 814		J[j,:] = (f(x1) - f(x2)) / (2 * epsilon * sX[j])
 815
 816	_names = [q for q in minresult.params]
 817	_covar = J.T @ minresult.covar @ J
 818	_se = np.diag(_covar)**.5
 819	_correl = _covar.copy()
 820	for k,s in enumerate(_se):
 821		if s:
 822			_correl[k,:] /= s
 823			_correl[:,k] /= s
 824
825	if named:
826		# map parameter names to integer indices, since numpy arrays
827		# cannot be indexed directly by parameter name:
		_covar = {ni: {nj: _covar[i,j] for j,nj in enumerate(_names)} for i,ni in enumerate(_names)}
		_se = {ni: _se[i] for i,ni in enumerate(_names)}
		_correl = {ni: {nj: _correl[i,j] for j,nj in enumerate(_names)} for i,ni in enumerate(_names)}
 829
 830	return _names, _covar, _se, _correl
 831
 832
 833class D4xdata(list):
 834	'''
 835	Store and process data for a large set of Δ47 and/or Δ48
 836	analyses, usually comprising more than one analytical session.
 837	'''
 838
 839	### 17O CORRECTION PARAMETERS
 840	R13_VPDB = 0.01118  # (Chang & Li, 1990)
 841	'''
 842	Absolute (13C/12C) ratio of VPDB.
 843	By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
 844	'''
 845
 846	R18_VSMOW = 0.0020052  # (Baertschi, 1976)
 847	'''
848	Absolute (18O/16O) ratio of VSMOW.
 849	By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
 850	'''
 851
 852	LAMBDA_17 = 0.528  # (Barkan & Luz, 2005)
 853	'''
 854	Mass-dependent exponent for triple oxygen isotopes.
 855	By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
 856	'''
 857
 858	R17_VSMOW = 0.00038475  # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
 859	'''
860	Absolute (17O/16O) ratio of VSMOW.
 861	By default equal to 0.00038475
 862	([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
 863	rescaled to `R13_VPDB`)
 864	'''
 865
 866	R18_VPDB = R18_VSMOW * 1.03092
 867	'''
868	Absolute (18O/16O) ratio of VPDB.
 869	By definition equal to `R18_VSMOW * 1.03092`.
 870	'''
 871
 872	R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
 873	'''
874	Absolute (17O/16O) ratio of VPDB.
 875	By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
 876	'''
 877
 878	LEVENE_REF_SAMPLE = 'ETH-3'
 879	'''
 880	After the Δ4x standardization step, each sample is tested to
 881	assess whether the Δ4x variance within all analyses for that
 882	sample differs significantly from that observed for a given reference
 883	sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
 884	which yields a p-value corresponding to the null hypothesis that the
 885	underlying variances are equal).
 886
 887	`LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
 888	sample should be used as a reference for this test.
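
	For example, one may instead use ETH-1 as the reference sample
	(a minimal sketch, assuming `mydata` is a `D47data` instance):

	```py
	mydata.LEVENE_REF_SAMPLE = 'ETH-1'
	```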
 889	'''
 890
 891	ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6)  # (Kim et al., 2007, calcite)
 892	'''
 893	Specifies the 18O/16O fractionation factor generally applicable
 894	to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
 895	`D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`.
 896
 897	By default equal to 1.008129 (calcite reacted at 90 °C,
 898	[Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
 899	'''
 900
 901	Nominal_d13C_VPDB = {
 902		'ETH-1': 2.02,
 903		'ETH-2': -10.17,
 904		'ETH-3': 1.71,
 905		}	# (Bernasconi et al., 2018)
 906	'''
 907	Nominal δ13C_VPDB values assigned to carbonate standards, used by
 908	`D4xdata.standardize_d13C()`.
 909
 910	By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
 911	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 912	'''
 913
 914	Nominal_d18O_VPDB = {
 915		'ETH-1': -2.19,
 916		'ETH-2': -18.69,
 917		'ETH-3': -1.78,
 918		}	# (Bernasconi et al., 2018)
 919	'''
 920	Nominal δ18O_VPDB values assigned to carbonate standards, used by
 921	`D4xdata.standardize_d18O()`.
 922
 923	By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
 924	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 925	'''
 926
 927	d13C_STANDARDIZATION_METHOD = '2pt'
 928	'''
 929	Method by which to standardize δ13C values:
 930	
 931	+ `none`: do not apply any δ13C standardization.
 932	+ `'1pt'`: within each session, offset all initial δ13C values so as to
 933	minimize the difference between final δ13C_VPDB values and
 934	`Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
935	+ `'2pt'`: within each session, apply an affine transformation to all δ13C
 936	values so as to minimize the difference between final δ13C_VPDB
 937	values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
 938	is defined).
 939	'''
 940
 941	d18O_STANDARDIZATION_METHOD = '2pt'
 942	'''
 943	Method by which to standardize δ18O values:
 944	
 945	+ `none`: do not apply any δ18O standardization.
 946	+ `'1pt'`: within each session, offset all initial δ18O values so as to
 947	minimize the difference between final δ18O_VPDB values and
 948	`Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
949	+ `'2pt'`: within each session, apply an affine transformation to all δ18O
 950	values so as to minimize the difference between final δ18O_VPDB
 951	values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
 952	is defined).
 953	'''
 954
 955	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
 956		'''
 957		**Parameters**
 958
 959		+ `l`: a list of dictionaries, with each dictionary including at least the keys
 960		`Sample`, `d45`, `d46`, and `d47` or `d48`.
 961		+ `mass`: `'47'` or `'48'`
 962		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
 963		+ `session`: define session name for analyses without a `Session` key
 964		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
 965
 966		Returns a `D4xdata` object derived from `list`.
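
		**Example**

		A minimal sketch, with made-up delta values; in practice one would
		instantiate the `D47data` or `D48data` subclass, which defines the
		nominal anchor values required by the standardization steps:

		```py
		mydata = D47data([
			{'Sample': 'ETH-1', 'd45': 5.795, 'd46': 11.628, 'd47': 16.894},
			{'Sample': 'FOO', 'd45': 6.219, 'd46': 11.491, 'd47': 17.277},
			], session = 'Session01')
		```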
 967		'''
 968		self._4x = mass
 969		self.verbose = verbose
 970		self.prefix = 'D4xdata'
 971		self.logfile = logfile
 972		list.__init__(self, l)
 973		self.Nf = None
 974		self.repeatability = {}
 975		self.refresh(session = session)
 976
 977
 978	def make_verbal(oldfun):
 979		'''
 980		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
 981		'''
 982		@wraps(oldfun)
 983		def newfun(*args, verbose = '', **kwargs):
 984			myself = args[0]
 985			oldprefix = myself.prefix
 986			myself.prefix = oldfun.__name__
 987			if verbose != '':
 988				oldverbose = myself.verbose
 989				myself.verbose = verbose
 990			out = oldfun(*args, **kwargs)
 991			myself.prefix = oldprefix
 992			if verbose != '':
 993				myself.verbose = oldverbose
 994			return out
 995		return newfun
 996
 997
 998	def msg(self, txt):
 999		'''
1000		Log a message to `self.logfile`, and print it out if `verbose = True`
1001		'''
1002		self.log(txt)
1003		if self.verbose:
1004			print(f'{f"[{self.prefix}]":<16} {txt}')
1005
1006
1007	def vmsg(self, txt):
1008		'''
1009		Log a message to `self.logfile` and print it out
1010		'''
1011		self.log(txt)
1012		print(txt)
1013
1014
1015	def log(self, *txts):
1016		'''
1017		Log a message to `self.logfile`
1018		'''
1019		if self.logfile:
1020			with open(self.logfile, 'a') as fid:
1021				for txt in txts:
1022					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
1023
1024
1025	def refresh(self, session = 'mySession'):
1026		'''
1027		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
1028		'''
1029		self.fill_in_missing_info(session = session)
1030		self.refresh_sessions()
1031		self.refresh_samples()
1032
1033
1034	def refresh_sessions(self):
1035		'''
1036		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1037		to `False` for all sessions.
1038		'''
1039		self.sessions = {
1040			s: {'data': [r for r in self if r['Session'] == s]}
1041			for s in sorted({r['Session'] for r in self})
1042			}
1043		for s in self.sessions:
1044			self.sessions[s]['scrambling_drift'] = False
1045			self.sessions[s]['slope_drift'] = False
1046			self.sessions[s]['wg_drift'] = False
1047			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1048			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
1049
1050
1051	def refresh_samples(self):
1052		'''
1053		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1054		'''
1055		self.samples = {
1056			s: {'data': [r for r in self if r['Sample'] == s]}
1057			for s in sorted({r['Sample'] for r in self})
1058			}
1059		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1060		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
1061
1062
1063	def read(self, filename, sep = '', session = ''):
1064		'''
1065		Read file in csv format to load data into a `D47data` object.
1066
1067		In the csv file, spaces before and after field separators (`','` by default)
1068		are optional. Each line corresponds to a single analysis.
1069
1070		The required fields are:
1071
1072		+ `UID`: a unique identifier
1073		+ `Session`: an identifier for the analytical session
1074		+ `Sample`: a sample identifier
1075		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1076
1077	Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1078	VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas beyond
1079	those required above (i.e. `d49`, and whichever of `d47`/`d48` is not used) are optional, and set to NaN by default.
1080
1081		**Parameters**
1082
1083	+ `filename`: the path of the file to read
1084		+ `sep`: csv separator delimiting the fields
1085		+ `session`: set `Session` field to this string for all analyses
1086		'''
1087		with open(filename) as fid:
1088			self.input(fid.read(), sep = sep, session = session)
1089
1090
1091	def input(self, txt, sep = '', session = ''):
1092		'''
1093		Read `txt` string in csv format to load analysis data into a `D47data` object.
1094
1095		In the csv string, spaces before and after field separators (`','` by default)
1096		are optional. Each line corresponds to a single analysis.
1097
1098		The required fields are:
1099
1100		+ `UID`: a unique identifier
1101		+ `Session`: an identifier for the analytical session
1102		+ `Sample`: a sample identifier
1103		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1104
1105	Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1106	VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas beyond
1107	those required above (i.e. `d49`, and whichever of `d47`/`d48` is not used) are optional, and set to NaN by default.
1108
1109		**Parameters**
1110
1111		+ `txt`: the csv string to read
1112		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1113	whichever appears most often in `txt`.
1114		+ `session`: set `Session` field to this string for all analyses
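
		**Example**

		A minimal sketch, with made-up values, assuming `mydata` is a `D47data` instance:

		```py
		txt = 'UID,Session,Sample,d45,d46,d47\nA01,S1,ETH-1,5.795,11.628,16.894'
		mydata.input(txt)
		```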
1115		'''
1116		if sep == '':
1117			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1118		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1119		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1120
1121		if session != '':
1122			for r in data:
1123				r['Session'] = session
1124
1125		self += data
1126		self.refresh()
1127
1128
1129	@make_verbal
1130	def wg(self, samples = None, a18_acid = None):
1131		'''
1132		Compute bulk composition of the working gas for each session based on
1133		the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1134		`self.Nominal_d18O_VPDB`.
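
		**Example**

		A minimal sketch; both arguments are optional, defaulting to all suitable
		standards and to `self.ALPHA_18O_ACID_REACTION`, respectively:

		```py
		mydata.wg(samples = ['ETH-1', 'ETH-2'], a18_acid = 1.008129)
		```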
1135		'''
1136
1137		self.msg('Computing WG composition:')
1138
1139		if a18_acid is None:
1140			a18_acid = self.ALPHA_18O_ACID_REACTION
1141		if samples is None:
1142			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1143
1144		assert a18_acid, 'Acid fractionation factor should not be zero.'
1145
1146		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1147		R45R46_standards = {}
1148		for sample in samples:
1149			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1150			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1151			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1152			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1153			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1154
1155			C12_s = 1 / (1 + R13_s)
1156			C13_s = R13_s / (1 + R13_s)
1157			C16_s = 1 / (1 + R17_s + R18_s)
1158			C17_s = R17_s / (1 + R17_s + R18_s)
1159			C18_s = R18_s / (1 + R17_s + R18_s)
1160
1161			C626_s = C12_s * C16_s ** 2
1162			C627_s = 2 * C12_s * C16_s * C17_s
1163			C628_s = 2 * C12_s * C16_s * C18_s
1164			C636_s = C13_s * C16_s ** 2
1165			C637_s = 2 * C13_s * C16_s * C17_s
1166			C727_s = C12_s * C17_s ** 2
1167
1168			R45_s = (C627_s + C636_s) / C626_s
1169			R46_s = (C628_s + C637_s + C727_s) / C626_s
1170			R45R46_standards[sample] = (R45_s, R46_s)
1171		
1172		for s in self.sessions:
1173			db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
1174			assert db, f'No sample from {samples} found in session "{s}".'
1175# 			dbsamples = sorted({r['Sample'] for r in db})
1176
1177			X = [r['d45'] for r in db]
1178			Y = [R45R46_standards[r['Sample']][0] for r in db]
1179			x1, x2 = np.min(X), np.max(X)
1180
1181			if x1 < x2:
1182				wgcoord = x1/(x1-x2)
1183			else:
1184				wgcoord = 999
1185
1186			if wgcoord < -.5 or wgcoord > 1.5:
1187				# unreasonable to extrapolate to d45 = 0
1188				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1189			else :
1190				# d45 = 0 is reasonably well bracketed
1191				R45_wg = np.polyfit(X, Y, 1)[1]
1192
1193			X = [r['d46'] for r in db]
1194			Y = [R45R46_standards[r['Sample']][1] for r in db]
1195			x1, x2 = np.min(X), np.max(X)
1196
1197			if x1 < x2:
1198				wgcoord = x1/(x1-x2)
1199			else:
1200				wgcoord = 999
1201
1202			if wgcoord < -.5 or wgcoord > 1.5:
1203				# unreasonable to extrapolate to d46 = 0
1204				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1205			else :
1206				# d46 = 0 is reasonably well bracketed
1207				R46_wg = np.polyfit(X, Y, 1)[1]
1208
1209			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1210
1211			self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1212
1213			self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1214			self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1215			for r in self.sessions[s]['data']:
1216				r['d13Cwg_VPDB'] = d13Cwg_VPDB
1217				r['d18Owg_VSMOW'] = d18Owg_VSMOW
1218
1219
1220	def compute_bulk_delta(self, R45, R46, D17O = 0):
1221		'''
1222		Compute δ13C_VPDB and δ18O_VSMOW,
1223		by solving the generalized form of equation (17) from
1224		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1225		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
1226		solving the corresponding second-order Taylor polynomial.
1227		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
1228		'''
1229
1230		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1231
1232		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1233		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1234		C = 2 * self.R18_VSMOW
1235		D = -R46
1236
1237		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1238		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1239		cc = A + B + C + D
1240
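		# Solve the second-order Taylor polynomial aa * x^2 + bb * x + cc = 0,
		# where x = d18O_VSMOW / 1000; the root below is the one consistent
		# with small (0 ± 50 ‰) values of d18O_VSMOW.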
1241		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1242
1243		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1244		R17 = K * R18 ** self.LAMBDA_17
1245		R13 = R45 - 2 * R17
1246
1247		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1248
1249		return d13C_VPDB, d18O_VSMOW
1250
1251
1252	@make_verbal
1253	def crunch(self, verbose = ''):
1254		'''
1255		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1256		'''
1257		for r in self:
1258			self.compute_bulk_and_clumping_deltas(r)
1259		self.standardize_d13C()
1260		self.standardize_d18O()
1261		self.msg(f"Crunched {len(self)} analyses.")
1262
1263
1264	def fill_in_missing_info(self, session = 'mySession'):
1265		'''
1266		Fill in optional fields with default values
1267		'''
1268		for i,r in enumerate(self):
1269			if 'D17O' not in r:
1270				r['D17O'] = 0.
1271			if 'UID' not in r:
1272				r['UID'] = f'{i+1}'
1273			if 'Session' not in r:
1274				r['Session'] = session
1275			for k in ['d47', 'd48', 'd49']:
1276				if k not in r:
1277					r[k] = np.nan
1278
1279
1280	def standardize_d13C(self):
1281		'''
1282		Perform δ13C standardization within each session `s` according to
1283		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1284		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1285		may be redefined arbitrarily at a later stage.
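
		For example, to disable δ13C standardization in a single session
		(a sketch, assuming a session named `'Session01'`):

		```py
		mydata.sessions['Session01']['d13C_standardization_method'] = 'none'
		```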
1286		'''
1287		for s in self.sessions:
1288			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1289				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1290				X,Y = zip(*XY)
1291				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1292					offset = np.mean(Y) - np.mean(X)
1293					for r in self.sessions[s]['data']:
1294						r['d13C_VPDB'] += offset				
1295				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1296					a,b = np.polyfit(X,Y,1)
1297					for r in self.sessions[s]['data']:
1298						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b
1299
1300	def standardize_d18O(self):
1301		'''
1302		Perform δ18O standardization within each session `s` according to
1303		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1304		which is defined by default by `D47data.refresh_sessions()` as equal to
1305		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1306		'''
1307		for s in self.sessions:
1308			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1309				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1310				X,Y = zip(*XY)
1311				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1312				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1313					offset = np.mean(Y) - np.mean(X)
1314					for r in self.sessions[s]['data']:
1315						r['d18O_VSMOW'] += offset				
1316				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1317					a,b = np.polyfit(X,Y,1)
1318					for r in self.sessions[s]['data']:
1319						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
1320	
1321
1322	def compute_bulk_and_clumping_deltas(self, r):
1323		'''
1324		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1325		'''
1326
1327		# Compute working gas R13, R18, and isobar ratios
1328		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1329		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1330		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1331
1332		# Compute analyte isobar ratios
1333		R45 = (1 + r['d45'] / 1000) * R45_wg
1334		R46 = (1 + r['d46'] / 1000) * R46_wg
1335		R47 = (1 + r['d47'] / 1000) * R47_wg
1336		R48 = (1 + r['d48'] / 1000) * R48_wg
1337		R49 = (1 + r['d49'] / 1000) * R49_wg
1338
1339		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1340		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1341		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1342
1343		# Compute stochastic isobar ratios of the analyte
1344		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1345			R13, R18, D17O = r['D17O']
1346		)
1347
1348		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1349		# and issue a warning if the corresponding anomalies exceed 0.05 ppm (5e-8).
1350		if (R45 / R45stoch - 1) > 5e-8:
1351			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1352		if (R46 / R46stoch - 1) > 5e-8:
1353			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1354
1355		# Compute raw clumped isotope anomalies
1356		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1357		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1358		r['D49raw'] = 1000 * (R49 / R49stoch - 1)
1359
1360
1361	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1362		'''
1363		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1364		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1365		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1366		'''
1367
1368		# Compute R17
1369		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1370
1371		# Compute isotope concentrations
1372		C12 = (1 + R13) ** -1
1373		C13 = C12 * R13
1374		C16 = (1 + R17 + R18) ** -1
1375		C17 = C16 * R17
1376		C18 = C16 * R18
1377
1378		# Compute stochastic isotopologue concentrations
1379		C626 = C16 * C12 * C16
1380		C627 = C16 * C12 * C17 * 2
1381		C628 = C16 * C12 * C18 * 2
1382		C636 = C16 * C13 * C16
1383		C637 = C16 * C13 * C17 * 2
1384		C638 = C16 * C13 * C18 * 2
1385		C727 = C17 * C12 * C17
1386		C728 = C17 * C12 * C18 * 2
1387		C737 = C17 * C13 * C17
1388		C738 = C17 * C13 * C18 * 2
1389		C828 = C18 * C12 * C18
1390		C838 = C18 * C13 * C18
1391
1392		# Compute stochastic isobar ratios
1393		R45 = (C636 + C627) / C626
1394		R46 = (C628 + C637 + C727) / C626
1395		R47 = (C638 + C728 + C737) / C626
1396		R48 = (C738 + C828) / C626
1397		R49 = C838 / C626
1398
1399		# Account for stochastic anomalies
1400		R47 *= 1 + D47 / 1000
1401		R48 *= 1 + D48 / 1000
1402		R49 *= 1 + D49 / 1000
1403
1404		# Return isobar ratios
1405		return R45, R46, R47, R48, R49
1406
1407
1408	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1409		'''
1410		Split unknown samples by UID (treat all analyses as different samples)
1411		or by session (treat analyses of a given sample in different sessions as
1412		different samples).
1413
1414		**Parameters**
1415
1416		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
1417		+ `grouping`: `by_uid` | `by_session`
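
		**Example**

		A minimal sketch:

		```py
		mydata.split_samples(['IAEA-C1', 'IAEA-C2'], grouping = 'by_session')
		```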
1418		'''
1419		if samples_to_split == 'all':
1420			samples_to_split = [s for s in self.unknowns]
1421		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1422		self.grouping = grouping.lower()
1423		if self.grouping in gkeys:
1424			gkey = gkeys[self.grouping]
1425		for r in self:
1426			if r['Sample'] in samples_to_split:
1427				r['Sample_original'] = r['Sample']
1428				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1429			elif r['Sample'] in self.unknowns:
1430				r['Sample_original'] = r['Sample']
1431		self.refresh_samples()
1432
1433
1434	def unsplit_samples(self, tables = False):
1435		'''
1436		Reverse the effects of `D47data.split_samples()`.
1437		
1438		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1439		
1440		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1441		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1442		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1443		effects of `D47data.split_samples()` with `grouping='by_sessions'` (because in
1444		that case session-averaged Δ4x values are statistically independent).
1445		'''
1446		unknowns_old = sorted({s for s in self.unknowns})
1447		CM_old = self.standardization.covar[:,:]
1448		VD_old = self.standardization.params.valuesdict().copy()
1449		vars_old = self.standardization.var_names
1450
1451		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1452
1453		Ns = len(vars_old) - len(unknowns_old)
1454		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1455		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1456
1457		W = np.zeros((len(vars_new), len(vars_old)))
1458		W[:Ns,:Ns] = np.eye(Ns)
1459		for u in unknowns_new:
1460			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1461			if self.grouping == 'by_session':
1462				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1463			elif self.grouping == 'by_uid':
1464				weights = [1 for s in splits]
1465			sw = sum(weights)
1466			weights = [w/sw for w in weights]
1467			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1468
1469		CM_new = W @ CM_old @ W.T
1470		V = W @ np.array([[VD_old[k]] for k in vars_old])
1471		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1472
1473		self.standardization.covar = CM_new
1474		self.standardization.params.valuesdict = lambda : VD_new
1475		self.standardization.var_names = vars_new
1476
1477		for r in self:
1478			if r['Sample'] in self.unknowns:
1479				r['Sample_split'] = r['Sample']
1480				r['Sample'] = r['Sample_original']
1481
1482		self.refresh_samples()
1483		self.consolidate_samples()
1484		self.repeatabilities()
1485
1486		if tables:
1487			self.table_of_analyses()
1488			self.table_of_samples()
1489
1490	def assign_timestamps(self):
1491		'''
1492		Assign a time field `t` of type `float` to each analysis.
1493
1494		If `TimeTag` is one of the data fields, `t` is equal within a given session
1495		to `TimeTag` minus the mean value of `TimeTag` for that session.
1496		Otherwise, `TimeTag` defaults to the index of each analysis
1497		within its session, and `t` is defined as above.
1498		'''
1499		for session in self.sessions:
1500			sdata = self.sessions[session]['data']
1501			try:
1502				t0 = np.mean([r['TimeTag'] for r in sdata])
1503				for r in sdata:
1504					r['t'] = r['TimeTag'] - t0
1505			except KeyError:
1506				t0 = (len(sdata)-1)/2
1507				for t,r in enumerate(sdata):
1508					r['t'] = t - t0
1509
1510
1511	def report(self):
1512		'''
1513		Prints a report on the standardization fit.
1514		Only applicable after `D4xdata.standardize(method='pooled')`.
1515		'''
1516		report_fit(self.standardization)
1517
1518
1519	def combine_samples(self, sample_groups):
1520		'''
1521		Combine analyses of different samples to compute weighted average Δ4x
1522		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1523		dictionary.
1524		
1525		Caution: samples are weighted by number of replicate analyses, which is a
1526		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1527		correlated analytical errors for one or more samples).
1528		
1529		Returns a tuple of:
1530		
1531		+ the list of group names
1532		+ an array of the corresponding Δ4x values
1533		+ the corresponding (co)variance matrix
1534		
1535		**Parameters**
1536
1537		+ `sample_groups`: a dictionary of the form:
1538		```py
1539		{'group1': ['sample_1', 'sample_2'],
1540		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1541		```
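
		A minimal sketch of the corresponding call, assuming `mydata` has been
		standardized and includes the samples listed above:

		```py
		groups, D4x_avg, covar = mydata.combine_samples(
			{'group1': ['sample_1', 'sample_2'],
			 'group2': ['sample_3', 'sample_4', 'sample_5']})
		```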
1542		'''
1543		
1544		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1545		groups = sorted(sample_groups.keys())
1546		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1547		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1548		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1549		W = np.array([
1550			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1551			for j in groups])
1552		D4x_new = W @ D4x_old
1553		CM_new = W @ CM_old @ W.T
1554
1555		return groups, D4x_new[:,0], CM_new
1556		
1557
1558	@make_verbal
1559	def standardize(self,
1560		method = 'pooled',
1561		weighted_sessions = [],
1562		consolidate = True,
1563		consolidate_tables = False,
1564		consolidate_plots = False,
1565		constraints = {},
1566		):
1567		'''
1568		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1569		If `method` argument is set to `'pooled'`, the standardization processes all sessions
1570		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1571		i.e. that their true Δ4x value does not change between sessions,
1572		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
1573		`'indep_sessions'`, the standardization processes each session independently, based only
1574		on anchor analyses.
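
		**Example**

		```py
		mydata.standardize()                           # pooled standardization (default)
		mydata.standardize(method = 'indep_sessions')  # session-by-session, anchors only
		```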
1575		'''
1576
1577		self.standardization_method = method
1578		self.assign_timestamps()
1579
1580		if method == 'pooled':
1581			if weighted_sessions:
1582				for session_group in weighted_sessions:
1583					if self._4x == '47':
1584						X = D47data([r for r in self if r['Session'] in session_group])
1585					elif self._4x == '48':
1586						X = D48data([r for r in self if r['Session'] in session_group])
1587					X.Nominal_D4x = self.Nominal_D4x.copy()
1588					X.refresh()
1589					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1590					w = np.sqrt(result.redchi)
1591					self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
1592					for r in X:
1593						r[f'wD{self._4x}raw'] *= w
1594			else:
1595				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1596				for r in self:
1597					r[f'wD{self._4x}raw'] = 1.
1598
1599			params = Parameters()
1600			for k,session in enumerate(self.sessions):
1601				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1602				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1603				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1604				s = pf(session)
1605				params.add(f'a_{s}', value = 0.9)
1606				params.add(f'b_{s}', value = 0.)
1607				params.add(f'c_{s}', value = -0.9)
1608				params.add(f'a2_{s}', value = 0.,
1609# 					vary = self.sessions[session]['scrambling_drift'],
1610					)
1611				params.add(f'b2_{s}', value = 0.,
1612# 					vary = self.sessions[session]['slope_drift'],
1613					)
1614				params.add(f'c2_{s}', value = 0.,
1615# 					vary = self.sessions[session]['wg_drift'],
1616					)
1617				if not self.sessions[session]['scrambling_drift']:
1618					params[f'a2_{s}'].expr = '0'
1619				if not self.sessions[session]['slope_drift']:
1620					params[f'b2_{s}'].expr = '0'
1621				if not self.sessions[session]['wg_drift']:
1622					params[f'c2_{s}'].expr = '0'
1623
1624			for sample in self.unknowns:
1625				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1626
1627			for k in constraints:
1628				params[k].expr = constraints[k]
1629
1630			def residuals(p):
1631				R = []
1632				for r in self:
1633					session = pf(r['Session'])
1634					sample = pf(r['Sample'])
1635					if r['Sample'] in self.Nominal_D4x:
1636						R += [ (
1637							r[f'D{self._4x}raw'] - (
1638								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1639								+ p[f'b_{session}'] * r[f'd{self._4x}']
1640								+	p[f'c_{session}']
1641								+ r['t'] * (
1642									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1643									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1644									+	p[f'c2_{session}']
1645									)
1646								)
1647							) / r[f'wD{self._4x}raw'] ]
1648					else:
1649						R += [ (
1650							r[f'D{self._4x}raw'] - (
1651								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1652								+ p[f'b_{session}'] * r[f'd{self._4x}']
1653								+	p[f'c_{session}']
1654								+ r['t'] * (
1655									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1656									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1657									+	p[f'c2_{session}']
1658									)
1659								)
1660							) / r[f'wD{self._4x}raw'] ]
1661				return R
1662
1663			M = Minimizer(residuals, params)
1664			result = M.least_squares()
1665			self.Nf = result.nfree
1666			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1667			new_names, new_covar, new_se = _fullcovar(result)[:3]
1668			result.var_names = new_names
1669			result.covar = new_covar
1670
1671			for r in self:
1672				s = pf(r["Session"])
1673				a = result.params.valuesdict()[f'a_{s}']
1674				b = result.params.valuesdict()[f'b_{s}']
1675				c = result.params.valuesdict()[f'c_{s}']
1676				a2 = result.params.valuesdict()[f'a2_{s}']
1677				b2 = result.params.valuesdict()[f'b2_{s}']
1678				c2 = result.params.valuesdict()[f'c2_{s}']
1679				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1680				
1681
1682			self.standardization = result
1683
1684			for session in self.sessions:
1685				self.sessions[session]['Np'] = 3
1686				for k in ['scrambling', 'slope', 'wg']:
1687					if self.sessions[session][f'{k}_drift']:
1688						self.sessions[session]['Np'] += 1
1689
1690			if consolidate:
1691				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1692			return result
1693
1694
1695		elif method == 'indep_sessions':
1696
1697			if weighted_sessions:
1698				for session_group in weighted_sessions:
1699					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1700					X.Nominal_D4x = self.Nominal_D4x.copy()
1701					X.refresh()
1702					# This is only done to assign r['wD47raw'] for r in X:
1703					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1704					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1705			else:
1706				self.msg('All weights set to 1 ‰')
1707				for r in self:
1708					r[f'wD{self._4x}raw'] = 1
1709
1710			for session in self.sessions:
1711				s = self.sessions[session]
1712				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1713				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1714				s['Np'] = sum(p_active)
1715				sdata = s['data']
1716
1717				A = np.array([
1718					[
1719						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1720						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1721						1 / r[f'wD{self._4x}raw'],
1722						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1723						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1724						r['t'] / r[f'wD{self._4x}raw']
1725						]
1726					for r in sdata if r['Sample'] in self.anchors
1727					])[:,p_active] # only keep columns for the active parameters
1728				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1729				s['Na'] = Y.size
1730				CM = linalg.inv(A.T @ A)
1731				bf = (CM @ A.T @ Y).T[0,:]
1732				k = 0
1733				for n,a in zip(p_names, p_active):
1734					if a:
1735						s[n] = bf[k]
1736# 						self.msg(f'{n} = {bf[k]}')
1737						k += 1
1738					else:
1739						s[n] = 0.
1740# 						self.msg(f'{n} = 0.0')
1741
1742				for r in sdata :
1743					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1744					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1745					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1746
1747				s['CM'] = np.zeros((6,6))
1748				i = 0
1749				k_active = [j for j,a in enumerate(p_active) if a]
1750				for j,a in enumerate(p_active):
1751					if a:
1752						s['CM'][j,k_active] = CM[i,:]
1753						i += 1
1754
1755			if not weighted_sessions:
1756				w = self.rmswd()['rmswd']
1757				for r in self:
1758						r[f'wD{self._4x}'] *= w
1759						r[f'wD{self._4x}raw'] *= w
1760				for session in self.sessions:
1761					self.sessions[session]['CM'] *= w**2
1762
1763			for session in self.sessions:
1764				s = self.sessions[session]
1765				s['SE_a'] = s['CM'][0,0]**.5
1766				s['SE_b'] = s['CM'][1,1]**.5
1767				s['SE_c'] = s['CM'][2,2]**.5
1768				s['SE_a2'] = s['CM'][3,3]**.5
1769				s['SE_b2'] = s['CM'][4,4]**.5
1770				s['SE_c2'] = s['CM'][5,5]**.5
1771
1772			if not weighted_sessions:
1773				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1774			else:
1775				self.Nf = 0
1776				for sg in weighted_sessions:
1777					self.Nf += self.rmswd(sessions = sg)['Nf']
1778
1779			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1780
1781			avgD4x = {
1782				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1783				for sample in self.samples
1784				}
1785			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1786			rD4x = (chi2/self.Nf)**.5
1787			self.repeatability[f'sigma_{self._4x}'] = rD4x
1788
1789			if consolidate:
1790				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1791
1792
1793	def standardization_error(self, session, d4x, D4x, t = 0):
1794		'''
1795		Compute standardization error for a given session and
1796		(δ4x, Δ4x) composition.
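
		The standardization error is computed as `(V @ CM @ V.T) ** 0.5`, where `CM`
		is the covariance matrix of the session parameters (`a`, `b`, `c`, `a2`, `b2`,
		`c2`) and `V` is the vector of partial derivatives of Δ4x with respect to
		these parameters.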
1797		'''
1798		a = self.sessions[session]['a']
1799		b = self.sessions[session]['b']
1800		c = self.sessions[session]['c']
1801		a2 = self.sessions[session]['a2']
1802		b2 = self.sessions[session]['b2']
1803		c2 = self.sessions[session]['c2']
1804		CM = self.sessions[session]['CM']
1805
1806		x, y = D4x, d4x
1807		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1808# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1809		dxdy = -(b+b2*t) / (a+a2*t)
1810		dxdz = 1. / (a+a2*t)
1811		dxda = -x / (a+a2*t)
1812		dxdb = -y / (a+a2*t)
1813		dxdc = -1. / (a+a2*t)
1814		dxda2 = -x * t / (a+a2*t)
1815		dxdb2 = -y * t / (a+a2*t)
1816		dxdc2 = -t / (a+a2*t)
1817		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1818		sx = (V @ CM @ V.T) ** .5
1819		return sx
1820
1821
1822	@make_verbal
1823	def summary(self,
1824		dir = 'output',
1825		filename = None,
1826		save_to_file = True,
1827		print_out = True,
1828		):
1829		'''
1830		Print out and/or save to disk a summary of the standardization results.
1831
1832		**Parameters**
1833
1834		+ `dir`: the directory in which to save the table
1835		+ `filename`: the name of the csv file to write to
1836		+ `save_to_file`: whether to save the table to disk
1837		+ `print_out`: whether to print out the table
1838		'''
1839
1840		out = []
1841		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1842		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1843		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1844		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1845		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1846		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1847		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1848		out += [['Model degrees of freedom', f"{self.Nf}"]]
1849		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1850		out += [['Standardization method', self.standardization_method]]
1851
1852		if save_to_file:
1853			if not os.path.exists(dir):
1854				os.makedirs(dir)
1855			if filename is None:
1856				filename = f'D{self._4x}_summary.csv'
1857			with open(f'{dir}/{filename}', 'w') as fid:
1858				fid.write(make_csv(out))
1859		if print_out:
1860			self.msg('\n' + pretty_table(out, header = 0))
1861
1862
1863	@make_verbal
1864	def table_of_sessions(self,
1865		dir = 'output',
1866		filename = None,
1867		save_to_file = True,
1868		print_out = True,
1869		output = None,
1870		):
1871		'''
1872		Print out and/or save to disk a table of sessions.
1873
1874		**Parameters**
1875
1876		+ `dir`: the directory in which to save the table
1877		+ `filename`: the name of the csv file to write to
1878		+ `save_to_file`: whether to save the table to disk
1879		+ `print_out`: whether to print out the table
1880		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1881		    if set to `'raw'`: return a list of list of strings
1882		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1883		'''
1884		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1885		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1886		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1887
1888		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1889		if include_a2:
1890			out[-1] += ['a2 ± SE']
1891		if include_b2:
1892			out[-1] += ['b2 ± SE']
1893		if include_c2:
1894			out[-1] += ['c2 ± SE']
1895		for session in self.sessions:
1896			out += [[
1897				session,
1898				f"{self.sessions[session]['Na']}",
1899				f"{self.sessions[session]['Nu']}",
1900				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1901				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1902				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1903				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1904				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1905				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1906				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1907				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1908				]]
1909			if include_a2:
1910				if self.sessions[session]['scrambling_drift']:
1911					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1912				else:
1913					out[-1] += ['']
1914			if include_b2:
1915				if self.sessions[session]['slope_drift']:
1916					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1917				else:
1918					out[-1] += ['']
1919			if include_c2:
1920				if self.sessions[session]['wg_drift']:
1921					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1922				else:
1923					out[-1] += ['']
1924
1925		if save_to_file:
1926			if not os.path.exists(dir):
1927				os.makedirs(dir)
1928			if filename is None:
1929				filename = f'D{self._4x}_sessions.csv'
1930			with open(f'{dir}/{filename}', 'w') as fid:
1931				fid.write(make_csv(out))
1932		if print_out:
1933			self.msg('\n' + pretty_table(out))
1934		if output == 'raw':
1935			return out
1936		elif output == 'pretty':
1937			return pretty_table(out)
1938
1939
1940	@make_verbal
1941	def table_of_analyses(
1942		self,
1943		dir = 'output',
1944		filename = None,
1945		save_to_file = True,
1946		print_out = True,
1947		output = None,
1948		):
1949		'''
1950		Print out and/or save to disk a table of analyses.
1951
1952		**Parameters**
1953
1954		+ `dir`: the directory in which to save the table
1955		+ `filename`: the name of the csv file to write to
1956		+ `save_to_file`: whether to save the table to disk
1957		+ `print_out`: whether to print out the table
1958		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1959		    if set to `'raw'`: return a list of list of strings
1960		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1961		'''
1962
1963		out = [['UID','Session','Sample']]
1964		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1965		for f in extra_fields:
1966			out[-1] += [f[0]]
1967		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1968		for r in self:
1969			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
1970			for f in extra_fields:
1971				out[-1] += [f"{r[f[0]]:{f[1]}}"]
1972			out[-1] += [
1973				f"{r['d13Cwg_VPDB']:.3f}",
1974				f"{r['d18Owg_VSMOW']:.3f}",
1975				f"{r['d45']:.6f}",
1976				f"{r['d46']:.6f}",
1977				f"{r['d47']:.6f}",
1978				f"{r['d48']:.6f}",
1979				f"{r['d49']:.6f}",
1980				f"{r['d13C_VPDB']:.6f}",
1981				f"{r['d18O_VSMOW']:.6f}",
1982				f"{r['D47raw']:.6f}",
1983				f"{r['D48raw']:.6f}",
1984				f"{r['D49raw']:.6f}",
1985				f"{r[f'D{self._4x}']:.6f}"
1986				]
1987		if save_to_file:
1988			if not os.path.exists(dir):
1989				os.makedirs(dir)
1990			if filename is None:
1991				filename = f'D{self._4x}_analyses.csv'
1992			with open(f'{dir}/{filename}', 'w') as fid:
1993				fid.write(make_csv(out))
1994		if print_out:
1995			self.msg('\n' + pretty_table(out))
1996		if output == 'raw':
			return out
		elif output == 'pretty':
			return pretty_table(out)
1997
1998	@make_verbal
1999	def covar_table(
2000		self,
2001		correl = False,
2002		dir = 'output',
2003		filename = None,
2004		save_to_file = True,
2005		print_out = True,
2006		output = None,
2007		):
2008		'''
2009		Print out, save to disk and/or return the variance-covariance matrix of D4x
2010		for all unknown samples.
2011
2012		**Parameters**
2013
2014		+ `correl`: if `True`, tabulate correlation coefficients instead of (co)variances
		+ `dir`: the directory in which to save the csv
2015		+ `filename`: the name of the csv file to write to
2016		+ `save_to_file`: whether to save the csv
2017		+ `print_out`: whether to print out the matrix
2018		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
2019		    if set to `'raw'`: return a list of list of strings
2020		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
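
		**Example**

		A minimal sketch, only meaningful after `D4xdata.standardize()`:

		```py
		mydata.covar_table(correl = True, save_to_file = False)
		```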
2021		'''
2022		samples = sorted([u for u in self.unknowns])
2023		out = [[''] + samples]
2024		for s1 in samples:
2025			out.append([s1])
2026			for s2 in samples:
2027				if correl:
2028					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
2029				else:
2030					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
2031
2032		if save_to_file:
2033			if not os.path.exists(dir):
2034				os.makedirs(dir)
2035			if filename is None:
2036				if correl:
2037					filename = f'D{self._4x}_correl.csv'
2038				else:
2039					filename = f'D{self._4x}_covar.csv'
2040			with open(f'{dir}/{filename}', 'w') as fid:
2041				fid.write(make_csv(out))
2042		if print_out:
2043			self.msg('\n'+pretty_table(out))
2044		if output == 'raw':
2045			return out
2046		elif output == 'pretty':
2047			return pretty_table(out)
2048
2049	@make_verbal
2050	def table_of_samples(
2051		self,
2052		dir = 'output',
2053		filename = None,
2054		save_to_file = True,
2055		print_out = True,
2056		output = None,
2057		):
2058		'''
2059		Print out, save to disk and/or return a table of samples.
2060
2061		**Parameters**
2062
2063		+ `dir`: the directory in which to save the csv
2064		+ `filename`: the name of the csv file to write to
2065		+ `save_to_file`: whether to save the csv
2066		+ `print_out`: whether to print out the table
2067		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2068		    if set to `'raw'`: return a list of list of strings
2069		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2070		'''
2071
2072		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2073		for sample in self.anchors:
2074			out += [[
2075				f"{sample}",
2076				f"{self.samples[sample]['N']}",
2077				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2078				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2079				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2080				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2081				]]
2082		for sample in self.unknowns:
2083			out += [[
2084				f"{sample}",
2085				f"{self.samples[sample]['N']}",
2086				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2087				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2088				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2089				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2090				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2091				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2092				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2093				]]
2094		if save_to_file:
2095			if not os.path.exists(dir):
2096				os.makedirs(dir)
2097			if filename is None:
2098				filename = f'D{self._4x}_samples.csv'
2099			with open(f'{dir}/{filename}', 'w') as fid:
2100				fid.write(make_csv(out))
2101		if print_out:
2102			self.msg('\n'+pretty_table(out))
2103		if output == 'raw':
2104			return out
2105		elif output == 'pretty':
2106			return pretty_table(out)
2107
2108
2109	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2110		'''
2111		Generate session plots and save them to disk.
2112
2113		**Parameters**
2114
2115		+ `dir`: the directory in which to save the plots
2116		+ `figsize`: the width and height (in inches) of each plot
2117		+ `filetype`: 'pdf' or 'png'
2118		+ `dpi`: resolution for PNG output
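
		**Example**

		```py
		mydata.plot_sessions(filetype = 'png', dpi = 200)
		```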
2119		'''
2120		if not os.path.exists(dir):
2121			os.makedirs(dir)
2122
2123		for session in self.sessions:
2124			sp = self.plot_single_session(session, xylimits = 'constant')
2125			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2126			ppl.close(sp.fig)
2127			
2128
2129
2130	@make_verbal
2131	def consolidate_samples(self):
2132		'''
2133		Compile various statistics for each sample.
2134
2135		For each anchor sample:
2136
2137		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2138		+ `SE_D47` or `SE_D48`: set to zero by definition
2139
2140		For each unknown sample:
2141
2142		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2143		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2144
2145		For each anchor and unknown:
2146
2147		+ `N`: the total number of analyses of this sample
2148		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2149		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2150		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2151		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2152		variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2153		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
2154		'''
2155		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2156		for sample in self.samples:
2157			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2158			if self.samples[sample]['N'] > 1:
2159				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2160
2161			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2162			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2163
2164			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2165			if len(D4x_pop) > 2:
2166				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2167			
2168		if self.standardization_method == 'pooled':
2169			for sample in self.anchors:
2170				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2171				self.samples[sample][f'SE_D{self._4x}'] = 0.
2172			for sample in self.unknowns:
2173				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2174				try:
2175					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2176				except ValueError:
2177					# when `sample` is constrained by self.standardize(constraints = {...}),
2178					# it is no longer listed in self.standardization.var_names.
2179					# Temporary fix: define SE as zero for now
2180					self.samples[sample][f'SE_D{self._4x}'] = 0.
2181
2182		elif self.standardization_method == 'indep_sessions':
2183			for sample in self.anchors:
2184				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2185				self.samples[sample][f'SE_D{self._4x}'] = 0.
2186			for sample in self.unknowns:
2187				self.msg(f'Consolidating sample {sample}')
2188				self.unknowns[sample][f'session_D{self._4x}'] = {}
2189				session_avg = []
2190				for session in self.sessions:
2191					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2192					if sdata:
2193						self.msg(f'{sample} found in session {session}')
2194						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2195						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2196						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2197						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2198						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2199						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2200						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2201				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2202				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2203				wsum = sum([weights[s] for s in weights])
2204				for s in weights:
2205					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2206
2207		for r in self:
2208			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
2209
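	# Usage sketch for `consolidate_samples()` (not part of the library source): once
	# this method has run (it is called by `consolidate()`, typically at the end of
	# standardization), the compiled statistics can be read from `self.samples`, e.g.:
	#
	#     for s in mydata.unknowns:
	#         smp = mydata.samples[s]
	#         print(s, smp['N'], smp['D47'], smp['SE_D47'], smp.get('SD_D47', ''))
	#
	# (keys shown for a D47data object; D48data uses 'D48', 'SE_D48', etc.)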
2210
2211
2212	def consolidate_sessions(self):
2213		'''
2214		Compute various statistics for each session.
2215
2216		+ `Na`: Number of anchor analyses in the session
2217		+ `Nu`: Number of unknown analyses in the session
2218		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2219		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2220		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2221		+ `a`: scrambling factor
2222		+ `b`: compositional slope
2223		+ `c`: WG offset
2224		+ `SE_a`: Model standard error of `a`
2225		+ `SE_b`: Model standard error of `b`
2226		+ `SE_c`: Model standard error of `c`
2227		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2228		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2229		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2230		+ `a2`: scrambling factor drift
2231		+ `b2`: compositional slope drift
2232		+ `c2`: WG offset drift
2233		+ `Np`: Number of standardization parameters to fit
2234		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2235		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2236		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
2237		'''
2238		for session in self.sessions:
2239			if 'd13Cwg_VPDB' not in self.sessions[session]:
2240				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2241			if 'd18Owg_VSMOW' not in self.sessions[session]:
2242				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2243			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2244			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2245
2246			self.msg(f'Computing repeatabilities for session {session}')
2247			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2248			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2249			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2250
2251		if self.standardization_method == 'pooled':
2252			for session in self.sessions:
2253
2254				# different (better?) computation of D4x repeatability for each session:
2255				sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
2256				self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5
2257
2258				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2259				i = self.standardization.var_names.index(f'a_{pf(session)}')
2260				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2261
2262				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2263				i = self.standardization.var_names.index(f'b_{pf(session)}')
2264				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2265
2266				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2267				i = self.standardization.var_names.index(f'c_{pf(session)}')
2268				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2269
2270				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2271				if self.sessions[session]['scrambling_drift']:
2272					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2273					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2274				else:
2275					self.sessions[session]['SE_a2'] = 0.
2276
2277				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2278				if self.sessions[session]['slope_drift']:
2279					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2280					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2281				else:
2282					self.sessions[session]['SE_b2'] = 0.
2283
2284				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2285				if self.sessions[session]['wg_drift']:
2286					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2287					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2288				else:
2289					self.sessions[session]['SE_c2'] = 0.
2290
2291				i = self.standardization.var_names.index(f'a_{pf(session)}')
2292				j = self.standardization.var_names.index(f'b_{pf(session)}')
2293				k = self.standardization.var_names.index(f'c_{pf(session)}')
2294				CM = np.zeros((6,6))
2295				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2296				try:
2297					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2298					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2299					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2300					try:
2301						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2302						CM[3,4] = self.standardization.covar[i2,j2]
2303						CM[4,3] = self.standardization.covar[j2,i2]
2304					except ValueError:
2305						pass
2306					try:
2307						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2308						CM[3,5] = self.standardization.covar[i2,k2]
2309						CM[5,3] = self.standardization.covar[k2,i2]
2310					except ValueError:
2311						pass
2312				except ValueError:
2313					pass
2314				try:
2315					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2316					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2317					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2318					try:
2319						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2320						CM[4,5] = self.standardization.covar[j2,k2]
2321						CM[5,4] = self.standardization.covar[k2,j2]
2322					except ValueError:
2323						pass
2324				except ValueError:
2325					pass
2326				try:
2327					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2328					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2329					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2330				except ValueError:
2331					pass
2332
2333				self.sessions[session]['CM'] = CM
2334
2335		elif self.standardization_method == 'indep_sessions':
2336			pass # Not implemented yet
2337
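	# Usage sketch for `consolidate_sessions()` (not part of the library source): with
	# 'pooled' standardization, the per-session parameters compiled above can be
	# inspected after processing, e.g.:
	#
	#     for session in mydata.sessions:
	#         s = mydata.sessions[session]
	#         print(session, s['a'], s['SE_a'], s['b'], s['c'], s['r_D47'])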
2338
2339	@make_verbal
2340	def repeatabilities(self):
2341		'''
2342		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2343		(for all samples, for anchors, and for unknowns).
2344		'''
2345		self.msg('Computing repeatabilities for all sessions')
2346
2347		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2348		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2349		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2350		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2351		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
2352
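	# Usage sketch for `repeatabilities()` (not part of the library source): the pooled
	# repeatabilities are then available from `self.repeatability`, e.g. for Δ47:
	#
	#     print(mydata.repeatability['r_D47'])   # all samples
	#     print(mydata.repeatability['r_D47a'])  # anchors only
	#     print(mydata.repeatability['r_D47u'])  # unknowns only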
2353
2354	@make_verbal
2355	def consolidate(self, tables = True, plots = True):
2356		'''
2357		Collect information about samples, sessions and repeatabilities.
2358		'''
2359		self.consolidate_samples()
2360		self.consolidate_sessions()
2361		self.repeatabilities()
2362
2363		if tables:
2364			self.summary()
2365			self.table_of_sessions()
2366			self.table_of_analyses()
2367			self.table_of_samples()
2368
2369		if plots:
2370			self.plot_sessions()
2371
2372
2373	@make_verbal
2374	def rmswd(self,
2375		samples = 'all samples',
2376		sessions = 'all sessions',
2377		):
2378		'''
2379		Compute the χ2, the root mean squared weighted deviation
2380		(i.e. the square root of the reduced χ2), and the corresponding degrees of freedom of the
2381		Δ4x values for samples in `samples` and sessions in `sessions`.
2382		
2383		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
2384		'''
2385		if samples == 'all samples':
2386			mysamples = [k for k in self.samples]
2387		elif samples == 'anchors':
2388			mysamples = [k for k in self.anchors]
2389		elif samples == 'unknowns':
2390			mysamples = [k for k in self.unknowns]
2391		else:
2392			mysamples = samples
2393
2394		if sessions == 'all sessions':
2395			sessions = [k for k in self.sessions]
2396
2397		chisq, Nf = 0, 0
2398		for sample in mysamples :
2399			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2400			if len(G) > 1 :
2401				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2402				Nf += (len(G) - 1)
2403				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2404		r = (chisq / Nf)**.5 if Nf > 0 else 0
2405		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2406		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
2407
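	# Usage sketch for `rmswd()` (not part of the library source): with
	# 'indep_sessions' standardization, the anchor goodness-of-fit can be checked as:
	#
	#     out = mydata.rmswd(samples = 'anchors')
	#     print(out['rmswd'], out['chisq'], out['Nf'])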
2408	
2409	@make_verbal
2410	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2411		'''
2412		Compute the repeatability of `[r[key] for r in self]`
2413		'''
2414
2415		if samples == 'all samples':
2416			mysamples = [k for k in self.samples]
2417		elif samples == 'anchors':
2418			mysamples = [k for k in self.anchors]
2419		elif samples == 'unknowns':
2420			mysamples = [k for k in self.unknowns]
2421		else:
2422			mysamples = samples
2423
2424		if sessions == 'all sessions':
2425			sessions = [k for k in self.sessions]
2426
2427		if key in ['D47', 'D48']:
2428			# Full disclosure: the definition of Nf is tricky/debatable
2429			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2430			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2431			Nf = len(G)
2432# 			print(f'len(G) = {Nf}')
2433			Nf -= len([s for s in mysamples if s in self.unknowns])
2434# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2435			for session in sessions:
2436				Np = len([
2437					_ for _ in self.standardization.params
2438					if (
2439						self.standardization.params[_].expr is not None
2440						and (
2441							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2442							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2443							)
2444						)
2445					])
2446# 				print(f'session {session}: {Np} parameters to consider')
2447				Na = len({
2448					r['Sample'] for r in self.sessions[session]['data']
2449					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2450					})
2451# 				print(f'session {session}: {Na} different anchors in that session')
2452				Nf -= min(Np, Na)
2453# 			print(f'Nf = {Nf}')
2454
2455# 			for sample in mysamples :
2456# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2457# 				if len(X) > 1 :
2458# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2459# 					if sample in self.unknowns:
2460# 						Nf += len(X) - 1
2461# 					else:
2462# 						Nf += len(X)
2463# 			if samples in ['anchors', 'all samples']:
2464# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2465			r = (chisq / Nf)**.5 if Nf > 0 else 0
2466
2467		else: # if key not in ['D47', 'D48']
2468			chisq, Nf = 0, 0
2469			for sample in mysamples :
2470				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2471				if len(X) > 1 :
2472					Nf += len(X) - 1
2473					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2474			r = (chisq / Nf)**.5 if Nf > 0 else 0
2475
2476		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2477		return r
2478
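	# Usage sketch for `compute_r()` (not part of the library source; the session name
	# is hypothetical): repeatability of any variable, for any subset of samples and
	# sessions, e.g. the δ13C repeatability of anchors within one session:
	#
	#     r = mydata.compute_r('d13C_VPDB', samples = 'anchors', sessions = ['Session01'])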
2479	def sample_average(self, samples, weights = 'equal', normalize = True):
2480		'''
2481		Weighted average Δ4x value of a group of samples, accounting for covariance.
2482
2483		Returns the weighted average Δ4x value and associated SE
2484		of a group of samples. Weights are equal by default. If `normalize` is
2485		true, `weights` will be rescaled so that their sum equals 1.
2486
2487		**Examples**
2488
2489		```python
2490		self.sample_average(['X','Y'], [1, 2])
2491		```
2492
2493		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2494		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2495		values of samples X and Y, respectively.
2496
2497		```python
2498		self.sample_average(['X','Y'], [1, -1], normalize = False)
2499		```
2500
2501		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2502		'''
2503		if weights == 'equal':
2504			weights = [1/len(samples)] * len(samples)
2505
2506		if normalize:
2507			s = sum(weights)
2508			if s:
2509				weights = [w/s for w in weights]
2510
2511		try:
2512# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2513# 			C = self.standardization.covar[indices,:][:,indices]
2514			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2515			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2516			return correlated_sum(X, C, weights)
2517		except ValueError:
2518			return (0., 0.)
2519
2520
2521	def sample_D4x_covar(self, sample1, sample2 = None):
2522		'''
2523		Covariance between Δ4x values of samples
2524
2525		Returns the error covariance between the average Δ4x values of two
2526		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2527		returns the Δ4x variance for that sample.
2528		'''
2529		if sample2 is None:
2530			sample2 = sample1
2531		if self.standardization_method == 'pooled':
2532			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2533			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2534			return self.standardization.covar[i, j]
2535		elif self.standardization_method == 'indep_sessions':
2536			if sample1 == sample2:
2537				return self.samples[sample1][f'SE_D{self._4x}']**2
2538			else:
2539				c = 0
2540				for session in self.sessions:
2541					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2542					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2543					if sdata1 and sdata2:
2544						a = self.sessions[session]['a']
2545						# !! TODO: CM below does not account for temporal changes in standardization parameters
2546						CM = self.sessions[session]['CM'][:3,:3]
2547						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2548						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2549						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2550						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2551						c += (
2552							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2553							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2554							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2555							@ CM
2556							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2557							) / a**2
2558				return float(c)
2559
2560	def sample_D4x_correl(self, sample1, sample2 = None):
2561		'''
2562		Correlation between Δ4x errors of samples
2563
2564		Returns the error correlation between the average Δ4x values of two samples.
2565		'''
2566		if sample2 is None or sample2 == sample1:
2567			return 1.
2568		return (
2569			self.sample_D4x_covar(sample1, sample2)
2570			/ self.unknowns[sample1][f'SE_D{self._4x}']
2571			/ self.unknowns[sample2][f'SE_D{self._4x}']
2572			)
2573
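	# Usage sketch for `sample_D4x_covar()` / `sample_D4x_correl()` (not part of the
	# library source; sample names are hypothetical): error covariance and correlation
	# between the final Δ4x values of two unknowns:
	#
	#     cov = mydata.sample_D4x_covar('FOO-1', 'BAR-2')
	#     rho = mydata.sample_D4x_correl('FOO-1', 'BAR-2')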
2574	def plot_single_session(self,
2575		session,
2576		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2577		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2578		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2579		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2580		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2581		xylimits = 'free', # | 'constant'
2582		x_label = None,
2583		y_label = None,
2584		error_contour_interval = 'auto',
2585		fig = 'new',
2586		):
2587		'''
2588		Generate plot for a single session
2589		'''
2590		if x_label is None:
2591			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2592		if y_label is None:
2593			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2594
2595		out = _SessionPlot()
2596		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2597		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2598		anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2599		anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2600		unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2601		unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2602		anchor_avg = (np.array([ np.array([
2603				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2604				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2605				]) for sample in anchors]).T,
2606			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
2607		unknown_avg = (np.array([ np.array([
2608				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2609				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2610				]) for sample in unknowns]).T,
2611			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)
2612		
2613		
2614		if fig == 'new':
2615			out.fig = ppl.figure(figsize = (6,6))
2616			ppl.subplots_adjust(.1,.1,.9,.9)
2617
2618		out.anchor_analyses, = ppl.plot(
2619			anchors_d,
2620			anchors_D,
2621			**kw_plot_anchors)
2622		out.unknown_analyses, = ppl.plot(
2623			unknowns_d,
2624			unknowns_D,
2625			**kw_plot_unknowns)
2626		out.anchor_avg = ppl.plot(
2627			*anchor_avg,
2628			**kw_plot_anchor_avg)
2629		out.unknown_avg = ppl.plot(
2630			*unknown_avg,
2631			**kw_plot_unknown_avg)
2632		if xylimits == 'constant':
2633			x = [r[f'd{self._4x}'] for r in self]
2634			y = [r[f'D{self._4x}'] for r in self]
2635			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2636			w, h = x2-x1, y2-y1
2637			x1 -= w/20
2638			x2 += w/20
2639			y1 -= h/20
2640			y2 += h/20
2641			ppl.axis([x1, x2, y1, y2])
2642		elif xylimits == 'free':
2643			x1, x2, y1, y2 = ppl.axis()
2644		else:
2645			x1, x2, y1, y2 = ppl.axis(xylimits)
2646		contour = None # ensure `contour` is defined even if no error contours are drawn below
2647		if error_contour_interval != 'none':
2648			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2649			XI,YI = np.meshgrid(xi, yi)
2650			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2651			if error_contour_interval == 'auto':
2652				rng = np.max(SI) - np.min(SI)
2653				if rng <= 0.01:
2654					cinterval = 0.001
2655				elif rng <= 0.03:
2656					cinterval = 0.004
2657				elif rng <= 0.1:
2658					cinterval = 0.01
2659				elif rng <= 0.3:
2660					cinterval = 0.03
2661				elif rng <= 1.:
2662					cinterval = 0.1
2663				else:
2664					cinterval = 0.5
2665			else:
2666				cinterval = error_contour_interval
2667
2668			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2669			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2670			out.clabel = ppl.clabel(out.contour)
2671			contour = (XI, YI, SI, cval, cinterval)
2672
2673		if fig == None:
2674			return {
2675			'anchors':anchors,
2676			'unknowns':unknowns,
2677			'anchors_d':anchors_d,
2678			'anchors_D':anchors_D,
2679			'unknowns_d':unknowns_d,
2680			'unknowns_D':unknowns_D,
2681			'anchor_avg':anchor_avg,
2682			'unknown_avg':unknown_avg,
2683			'contour':contour,
2684			}
2685
2686		ppl.xlabel(x_label)
2687		ppl.ylabel(y_label)
2688		ppl.title(session, weight = 'bold')
2689		ppl.grid(alpha = .2)
2690		out.ax = ppl.gca()		
2691
2692		return out
2693
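	# Usage sketch for `plot_single_session()` (not part of the library source; the
	# session name is hypothetical):
	#
	#     sp = mydata.plot_single_session('Session01', xylimits = 'constant')
	#     sp.fig.savefig('Session01.pdf')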
2694	def plot_residuals(
2695		self,
2696		kde = False,
2697		hist = False,
2698		binwidth = 2/3,
2699		dir = 'output',
2700		filename = None,
2701		highlight = [],
2702		colors = None,
2703		figsize = None,
2704		dpi = 100,
2705		yspan = None,
2706		):
2707		'''
2708		Plot residuals of each analysis as a function of time (actually, as a function of
2709		the order of analyses in the `D4xdata` object)
2710
2711		+ `kde`: whether to add a kernel density estimate of residuals
2712		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2713		+ `binwidth`: the width of each histogram bin, in units of the Δ4x repeatability (by default: 2/3)
2714		+ `dir`: the directory in which to save the plot
2715		+ `highlight`: a list of samples to highlight
2716		+ `colors`: a dict of `{<sample>: <color>}` for all samples
2717		+ `figsize`: (width, height) of figure
2718		+ `dpi`: resolution for PNG output
2719		+ `yspan`: factor controlling the range of y values shown in plot
2720		  (by default: `yspan = 1.5 if kde else 1.0`)
2721		'''
2722		
2723		from matplotlib import ticker
2724
2725		if yspan is None:
2726			if kde:
2727				yspan = 1.5
2728			else:
2729				yspan = 1.0
2730		
2731		# Layout
2732		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2733		if hist or kde:
2734			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2735			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2736		else:
2737			ppl.subplots_adjust(.08,.05,.78,.8)
2738			ax1 = ppl.subplot(111)
2739		
2740		# Colors
2741		N = len(self.anchors)
2742		if colors is None:
2743			if len(highlight) > 0:
2744				Nh = len(highlight)
2745				if Nh == 1:
2746					colors = {highlight[0]: (0,0,0)}
2747				elif Nh == 3:
2748					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2749				elif Nh == 4:
2750					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2751				else:
2752					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2753			else:
2754				if N == 3:
2755					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2756				elif N == 4:
2757					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2758				else:
2759					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2760
2761		ppl.sca(ax1)
2762		
2763		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2764
2765		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2766
2767		session = self[0]['Session']
2768		x1 = 0
2769# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2770		x_sessions = {}
2771		one_or_more_singlets = False
2772		one_or_more_multiplets = False
2773		multiplets = set()
2774		for k,r in enumerate(self):
2775			if r['Session'] != session:
2776				x2 = k-1
2777				x_sessions[session] = (x1+x2)/2
2778				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2779				session = r['Session']
2780				x1 = k
2781			singlet = len(self.samples[r['Sample']]['data']) == 1
2782			if not singlet:
2783				multiplets.add(r['Sample'])
2784			if r['Sample'] in self.unknowns:
2785				if singlet:
2786					one_or_more_singlets = True
2787				else:
2788					one_or_more_multiplets = True
2789			kw = dict(
2790				marker = 'x' if singlet else '+',
2791				ms = 4 if singlet else 5,
2792				ls = 'None',
2793				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2794				mew = 1,
2795				alpha = 0.2 if singlet else 1,
2796				)
2797			if highlight and r['Sample'] not in highlight:
2798				kw['alpha'] = 0.2
2799			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2800		x2 = k
2801		x_sessions[session] = (x1+x2)/2
2802
2803		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2804		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2805		if not (hist or kde):
2806			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2807			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2808
2809		xmin, xmax, ymin, ymax = ppl.axis()
2810		if yspan != 1:
2811			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2812		for s in x_sessions:
2813			ppl.text(
2814				x_sessions[s],
2815				ymax +1,
2816				s,
2817				va = 'bottom',
2818				**(
2819					dict(ha = 'center')
2820					if len(self.sessions[s]['data']) > (0.15 * len(self))
2821					else dict(ha = 'left', rotation = 45)
2822					)
2823				)
2824
2825		if hist or kde:
2826			ppl.sca(ax2)
2827
2828		for s in colors:
2829			kw['marker'] = '+'
2830			kw['ms'] = 5
2831			kw['mec'] = colors[s]
2832			kw['label'] = s
2833			kw['alpha'] = 1
2834			ppl.plot([], [], **kw)
2835
2836		kw['mec'] = (0,0,0)
2837
2838		if one_or_more_singlets:
2839			kw['marker'] = 'x'
2840			kw['ms'] = 4
2841			kw['alpha'] = .2
2842			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2843			ppl.plot([], [], **kw)
2844
2845		if one_or_more_multiplets:
2846			kw['marker'] = '+'
2847			kw['ms'] = 4
2848			kw['alpha'] = 1
2849			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2850			ppl.plot([], [], **kw)
2851
2852		if hist or kde:
2853			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2854		else:
2855			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2856		leg.set_zorder(-1000)
2857
2858		ppl.sca(ax1)
2859
2860		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2861		ppl.xticks([])
2862		ppl.axis([-1, len(self), None, None])
2863
2864		if hist or kde:
2865			ppl.sca(ax2)
2866			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2867
2868			if kde:
2869				from scipy.stats import gaussian_kde
2870				yi = np.linspace(ymin, ymax, 201)
2871				xi = gaussian_kde(X).evaluate(yi)
2872				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2873# 				ppl.plot(xi, yi, 'k-', lw = 1)
2874			elif hist:
2875				ppl.hist(
2876					X,
2877					orientation = 'horizontal',
2878					histtype = 'stepfilled',
2879					ec = [.4]*3,
2880					fc = [.25]*3,
2881					alpha = .25,
2882					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2883					)
2884			ppl.text(0, 0,
2885				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2886				size = 7.5,
2887				alpha = 1,
2888				va = 'center',
2889				ha = 'left',
2890				)
2891
2892			ppl.axis([0, None, ymin, ymax])
2893			ppl.xticks([])
2894			ppl.yticks([])
2895# 			ax2.spines['left'].set_visible(False)
2896			ax2.spines['right'].set_visible(False)
2897			ax2.spines['top'].set_visible(False)
2898			ax2.spines['bottom'].set_visible(False)
2899
2900		ax1.axis([None, None, ymin, ymax])
2901
2902		if not os.path.exists(dir):
2903			os.makedirs(dir)
2904		if filename is None:
2905			return fig
2906		elif filename == '':
2907			filename = f'D{self._4x}_residuals.pdf'
2908		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2909		ppl.close(fig)
2910				
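	# Usage sketch for `plot_residuals()` (not part of the library source; highlighted
	# sample names are hypothetical). Passing `filename = ''` saves under the default
	# file name instead of returning the figure:
	#
	#     mydata.plot_residuals(filename = '', kde = True, highlight = ['FOO-1', 'BAR-2'])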
2911
2912	def simulate(self, *args, **kwargs):
2913		'''
2914		Legacy function: raises a `DeprecationWarning` pointing to `virtual_data()`.
2915		'''
2916		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
2917
2918	def plot_distribution_of_analyses(
2919		self,
2920		dir = 'output',
2921		filename = None,
2922		vs_time = False,
2923		figsize = (6,4),
2924		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
2925		output = None,
2926		dpi = 100,
2927		):
2928		'''
2929		Plot temporal distribution of all analyses in the data set.
2930		
2931		**Parameters**
2932
2933		+ `dir`: the directory in which to save the plot
2934		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
2935		+ `filename`: the file name under which to save the plot (by default: `D4x_distribution_of_analyses.pdf`)
2936		+ `figsize`: (width, height) of figure
2937		+ `dpi`: resolution for PNG output
2938		'''
2939
2940		asamples = [s for s in self.anchors]
2941		usamples = [s for s in self.unknowns]
2942		if output is None or output == 'fig':
2943			fig = ppl.figure(figsize = figsize)
2944			ppl.subplots_adjust(*subplots_adjust)
2945		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2946		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2947		Xmax += (Xmax-Xmin)/40
2948		Xmin -= (Xmax-Xmin)/41
2949		for k, s in enumerate(asamples + usamples):
2950			if vs_time:
2951				X = [r['TimeTag'] for r in self if r['Sample'] == s]
2952			else:
2953				X = [x for x,r in enumerate(self) if r['Sample'] == s]
2954			Y = [-k for x in X]
2955			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
2956			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
2957			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
2958		ppl.axis([Xmin, Xmax, -k-1, 1])
2959		ppl.xlabel('\ntime')
2960		ppl.gca().annotate('',
2961			xy = (0.6, -0.02),
2962			xycoords = 'axes fraction',
2963			xytext = (.4, -0.02), 
2964			arrowprops = dict(arrowstyle = "->", color = 'k'),
2965			)
2966			
2967
2968		x2 = -1
2969		for session in self.sessions:
2970			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2971			if vs_time:
2972				ppl.axvline(x1, color = 'k', lw = .75)
2973			if x2 > -1:
2974				if not vs_time:
2975					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
2976			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2977# 			from xlrd import xldate_as_datetime
2978# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
2979			if vs_time:
2980				ppl.axvline(x2, color = 'k', lw = .75)
2981				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
2982			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
2983
2984		ppl.xticks([])
2985		ppl.yticks([])
2986
2987		if output is None:
2988			if not os.path.exists(dir):
2989				os.makedirs(dir)
2990			if filename == None:
2991				filename = f'D{self._4x}_distribution_of_analyses.pdf'
2992			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2993			ppl.close(fig)
2994		elif output == 'ax':
2995			return ppl.gca()
2996		elif output == 'fig':
2997			return fig
2998
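	# Usage sketch for `plot_distribution_of_analyses()` (not part of the library
	# source): plot against `TimeTag` rather than sequence order, and return the
	# figure instead of saving it:
	#
	#     fig = mydata.plot_distribution_of_analyses(vs_time = True, output = 'fig')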
2999
3000	def plot_bulk_compositions(
3001		self,
3002		samples = None,
3003		dir = 'output/bulk_compositions',
3004		figsize = (6,6),
3005		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
3006		show = False,
3007		sample_color = (0,.5,1),
3008		analysis_color = (.7,.7,.7),
3009		labeldist = 0.3,
3010		radius = 0.05,
3011		):
3012		'''
3013		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
3014		
3015		By default, creates a directory `./output/bulk_compositions` where plots for
3016		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
3017		
3018		
3019		**Parameters**
3020
3021		+ `samples`: Only these samples are processed (by default: all samples).
3022		+ `dir`: where to save the plots
3023		+ `figsize`: (width, height) of figure
3024		+ `subplots_adjust`: passed to `subplots_adjust()`
3025		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
3026		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
3027		+ `sample_color`: color used for sample markers/labels
3028		+ `analysis_color`: color used for replicate (analysis) markers/labels
3029		+ `labeldist`: distance (in inches) from replicate markers to replicate labels
3030		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
3031		'''
3032
3033		from matplotlib.patches import Ellipse
3034
3035		if samples is None:
3036			samples = [_ for _ in self.samples]
3037
3038		saved = {}
3039
3040		for s in samples:
3041
3042			fig = ppl.figure(figsize = figsize)
3043			fig.subplots_adjust(*subplots_adjust)
3044			ax = ppl.subplot(111)
3045			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3046			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3047			ppl.title(s)
3048
3049
3050			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
3051			UID = [_['UID'] for _ in self.samples[s]['data']]
3052			XY0 = XY.mean(0)
3053
3054			for xy in XY:
3055				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
3056				
3057			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
3058			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
3059			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3060			saved[s] = [XY, XY0]
3061			
3062			x1, x2, y1, y2 = ppl.axis()
3063			x0, dx = (x1+x2)/2, (x2-x1)/2
3064			y0, dy = (y1+y2)/2, (y2-y1)/2
3065			dx, dy = [max(max(dx, dy), radius)]*2
3066
3067			ppl.axis([
3068				x0 - 1.2*dx,
3069				x0 + 1.2*dx,
3070				y0 - 1.2*dy,
3071				y0 + 1.2*dy,
3072				])			
3073
3074			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3075
3076			for xy, uid in zip(XY, UID):
3077
3078				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3079				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3080
3081				if (vector_in_display_space**2).sum() > 0:
3082
3083					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3084					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3085					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3086					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3087
3088					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3089
3090				else:
3091
3092					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3093
3094			if radius:
3095				ax.add_artist(Ellipse(
3096					xy = XY0,
3097					width = radius*2,
3098					height = radius*2,
3099					ls = (0, (2,2)),
3100					lw = .7,
3101					ec = analysis_color,
3102					fc = 'None',
3103					))
3104				ppl.text(
3105					XY0[0],
3106					XY0[1]-radius,
3107					f'\n± {radius*1e3:.0f} ppm',
3108					color = analysis_color,
3109					va = 'top',
3110					ha = 'center',
3111					linespacing = 0.4,
3112					size = 8,
3113					)
3114
3115			if not os.path.exists(dir):
3116				os.makedirs(dir)
3117			fig.savefig(f'{dir}/{s}.pdf')
3118			ppl.close(fig)
3119
3120		fig = ppl.figure(figsize = figsize)
3121		fig.subplots_adjust(*subplots_adjust)
3122		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3123		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3124
3125		for s in saved:
3126			for xy in saved[s][0]:
3127				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3128			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3129			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3130			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3131
3132		x1, x2, y1, y2 = ppl.axis()
3133		ppl.axis([
3134			x1 - (x2-x1)/10,
3135			x2 + (x2-x1)/10,
3136			y1 - (y2-y1)/10,
3137			y2 + (y2-y1)/10,
3138			])			
3139
3140
3141		if not os.path.exists(dir):
3142			os.makedirs(dir)
3143		fig.savefig(f'{dir}/__all__.pdf')
3144		if show:
3145			ppl.show()
3146		ppl.close(fig)
3147		
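	# Usage sketch for `plot_bulk_compositions()` (not part of the library source;
	# sample names are hypothetical): interactive inspection of bulk compositions for
	# a subset of samples:
	#
	#     mydata.plot_bulk_compositions(samples = ['FOO-1', 'BAR-2'], show = True)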
3148
3149	def _save_D4x_correl(
3150		self,
3151		samples = None,
3152		dir = 'output',
3153		filename = None,
3154		D4x_precision = 4,
3155		correl_precision = 4,
3156		):
3157		'''
3158		Save D4x values along with their SE and correlation matrix.
3159
3160		**Parameters**
3161
3162		+ `samples`: Only these samples are output (by default: all samples).
3163		+ `dir`: the directory in which to save the file (by default: `output`)
3164		+ `filename`: the name of the csv file to write to (by default: `D4x_correl.csv`)
3165		+ `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4)
3166		+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
3167		'''
3168		if samples is None:
3169			samples = sorted([s for s in self.unknowns])
3170		
3171		out = [['Sample']] + [[s] for s in samples]
3172		out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl']
3173		for k,s in enumerate(samples):
3174			out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.{D4x_precision}f}', f'{self.samples[s][f"SE_D{self._4x}"]:.{D4x_precision}f}']
3175			for s2 in samples:
3176				out[k+1] += [f'{self.sample_D4x_correl(s,s2):.{correl_precision}f}']
3177		
3178		if not os.path.exists(dir):
3179			os.makedirs(dir)
3180		if filename is None:
3181			filename = f'D{self._4x}_correl.csv'
3182		with open(f'{dir}/{filename}', 'w') as fid:
3183			fid.write(make_csv(out))
3184		
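	# Usage sketch for `_save_D4x_correl()` (not part of the library source): this
	# private method is exposed through the public wrappers defined below, e.g.
	# `D47data.save_D47_correl()`; typical use after standardization:
	#
	#     mydata.save_D47_correl(dir = 'output', filename = 'D47_correl.csv')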
3185		
3186		
3187
3188class D47data(D4xdata):
3189	'''
3190	Store and process data for a large set of Δ47 analyses,
3191	usually comprising more than one analytical session.
3192	'''
3193
3194	Nominal_D4x = {
3195		'ETH-1':   0.2052,
3196		'ETH-2':   0.2085,
3197		'ETH-3':   0.6132,
3198		'ETH-4':   0.4511,
3199		'IAEA-C1': 0.3018,
3200		'IAEA-C2': 0.6409,
3201		'MERCK':   0.5135,
3202		} # I-CDES (Bernasconi et al., 2021)
3203	'''
3204	Nominal Δ47 values assigned to the Δ47 anchor samples, used by
3205	`D47data.standardize()` to normalize unknown samples to an absolute Δ47
3206	reference frame.
3207
3208	By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
3209	```py
3210	{
3211		'ETH-1'   : 0.2052,
3212		'ETH-2'   : 0.2085,
3213		'ETH-3'   : 0.6132,
3214		'ETH-4'   : 0.4511,
3215		'IAEA-C1' : 0.3018,
3216		'IAEA-C2' : 0.6409,
3217		'MERCK'   : 0.5135,
3218	}
3219	```
3220	'''
3221
3222
3223	@property
3224	def Nominal_D47(self):
3225		return self.Nominal_D4x
3226	
3227
3228	@Nominal_D47.setter
3229	def Nominal_D47(self, new):
3230		self.Nominal_D4x = dict(**new)
3231		self.refresh()
3232
3233
3234	def __init__(self, l = [], **kwargs):
3235		'''
3236		**Parameters:** same as `D4xdata.__init__()`
3237		'''
3238		D4xdata.__init__(self, l = l, mass = '47', **kwargs)
3239
3240
3241	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3242		'''
3243		Find all samples for which `Teq` is specified, compute equilibrium Δ47
3244		value for that temperature, and treat these samples as additional anchors.
3245
3246		**Parameters**
3247
3248		+ `fCo2eqD47`: Which CO2 equilibrium law to use
3249		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3250		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3251		+ `priority`: if `replace`: forget old anchors and only use the new ones;
3252		if `new`: keep pre-existing anchors but update them in case of conflict
3253		between old and new Δ47 values;
3254		if `old`: keep pre-existing anchors but preserve their original Δ47
3255		values in case of conflict.
3256		'''
3257		f = {
3258			'petersen': fCO2eqD47_Petersen,
3259			'wang': fCO2eqD47_Wang,
3260			}[fCo2eqD47]
3261		foo = {}
3262		for r in self:
3263			if 'Teq' in r:
3264				if r['Sample'] in foo:
3265					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3266				else:
3267					foo[r['Sample']] = f(r['Teq'])
3268			else:
3269				assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3270
3271		if priority == 'replace':
3272			self.Nominal_D47 = {}
3273		for s in foo:
3274			if priority != 'old' or s not in self.Nominal_D47:
3275				self.Nominal_D47[s] = foo[s]
3276	
3277	def save_D47_correl(self, *args, **kwargs):
3278		return self._save_D4x_correl(*args, **kwargs)
3279
3280	save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47')
3281
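	# Usage sketch (not part of the library source): restrict the Δ47 anchors to the
	# three ETH carbonates before standardizing; the `Nominal_D47` setter triggers
	# `refresh()`:
	#
	#     mydata = D47data()
	#     mydata.Nominal_D47 = {k: mydata.Nominal_D47[k] for k in ['ETH-1', 'ETH-2', 'ETH-3']}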
3282
3283class D48data(D4xdata):
3284	'''
3285	Store and process data for a large set of Δ48 analyses,
3286	usually comprising more than one analytical session.
3287	'''
3288
3289	Nominal_D4x = {
3290		'ETH-1':  0.138,
3291		'ETH-2':  0.138,
3292		'ETH-3':  0.270,
3293		'ETH-4':  0.223,
3294		'GU-1':  -0.419,
3295		} # (Fiebig et al., 2019, 2021)
3296	'''
3297	Nominal Δ48 values assigned to the Δ48 anchor samples, used by
3298	`D48data.standardize()` to normalize unknown samples to an absolute Δ48
3299	reference frame.
3300
3301	By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
3302	[Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)):
3303
3304	```py
3305	{
3306		'ETH-1' :  0.138,
3307		'ETH-2' :  0.138,
3308		'ETH-3' :  0.270,
3309		'ETH-4' :  0.223,
3310		'GU-1'  : -0.419,
3311	}
3312	```
3313	'''
3314
3315
3316	@property
3317	def Nominal_D48(self):
3318		return self.Nominal_D4x
3319
3320	
3321	@Nominal_D48.setter
3322	def Nominal_D48(self, new):
3323		self.Nominal_D4x = dict(**new)
3324		self.refresh()
3325
3326
3327	def __init__(self, l = [], **kwargs):
3328		'''
3329		**Parameters:** same as `D4xdata.__init__()`
3330		'''
3331		D4xdata.__init__(self, l = l, mass = '48', **kwargs)
3332
3333	def save_D48_correl(self, *args, **kwargs):
3334		return self._save_D4x_correl(*args, **kwargs)
3335
3336	save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48')
3337
3338
3339class D49data(D4xdata):
3340	'''
3341	Store and process data for a large set of Δ49 analyses,
3342	usually comprising more than one analytical session.
3343	'''
3344	
3345	Nominal_D4x = {"1000C": 0.0, "25C": 2.228}  # Wang 2004
3346	'''
3347	Nominal Δ49 values assigned to the Δ49 anchor samples, used by
3348	`D49data.standardize()` to normalize unknown samples to an absolute Δ49
3349	reference frame.
3350
3351	By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)):
3352
3353	```py
3354	{
3355		"1000C": 0.0,
3356		"25C": 2.228
3357	}
3358	```
3359	'''
3360	
3361	@property
3362	def Nominal_D49(self):
3363		return self.Nominal_D4x
3364	
3365	@Nominal_D49.setter
3366	def Nominal_D49(self, new):
3367		self.Nominal_D4x = dict(**new)
3368		self.refresh()
3369	
3370	def __init__(self, l=[], **kwargs):
3371		'''
3372		**Parameters:** same as `D4xdata.__init__()`
3373		'''
3374		D4xdata.__init__(self, l=l, mass='49', **kwargs)
3375	
3376	def save_D49_correl(self, *args, **kwargs):
3377		return self._save_D4x_correl(*args, **kwargs)
3378	
3379	save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49')
3380
3381class _SessionPlot():
3382	'''
3383	Simple placeholder class
3384	'''
3385	def __init__(self):
3386		pass
3387
3388_app = typer.Typer(
3389	add_completion = False,
3390	context_settings={'help_option_names': ['-h', '--help']},
3391	rich_markup_mode = 'rich',
3392	)
3393
3394@_app.command()
3395def _cli(
3396	rawdata: Annotated[str, typer.Argument(help = "Specify the path of a rawdata input file")],
3397	exclude: Annotated[str, typer.Option('--exclude', '-e', help = 'The path of a file specifying UIDs and/or Samples to exclude')] = 'none',
3398	anchors: Annotated[str, typer.Option('--anchors', '-a', help = 'The path of a file specifying custom anchors')] = 'none',
3399	output_dir: Annotated[str, typer.Option('--output-dir', '-o', help = 'Specify the output directory')] = 'output',
3400	run_D48: Annotated[bool, typer.Option('--D48', help = 'Also standardize D48')] = False,
3401	):
3402	"""
3403	Process raw D47 data and return standardized results.
3404	
3405	See [b]https://mdaeron.github.io/D47crunch/#3-command-line-interface-cli[/b] for more details.
3406	
3407	Reads raw data from an input file, optionally excluding some samples and/or analyses, then standardizes
3408	the data based either on the default [b]d13C_VPDB[/b], [b]d18O_VPDB[/b], [b]D47[/b], and [b]D48[/b] anchors or on different
3409	user-specified anchors. A new directory (named `output` by default) is created to store the results and
3410	the following sequence is applied:
3411	
3412	* [b]D47data.wg()[/b]
3413	* [b]D47data.crunch()[/b]
3414	* [b]D47data.standardize()[/b]
3415	* [b]D47data.summary()[/b]
3416	* [b]D47data.table_of_samples()[/b]
3417	* [b]D47data.table_of_sessions()[/b]
3418	* [b]D47data.plot_sessions()[/b]
3419	* [b]D47data.plot_residuals()[/b]
3420	* [b]D47data.table_of_analyses()[/b]
3421	* [b]D47data.plot_distribution_of_analyses()[/b]
3422	* [b]D47data.plot_bulk_compositions()[/b]
3423	* [b]D47data.save_D47_correl()[/b]
3424	
3425	Optionally, also apply similar methods for [b]D48[/b].
3426	
3427	[b]Example CSV file for --anchors option:[/b]	
3428	[i]
3429	Sample,  d13C_VPDB,  d18O_VPDB,     D47,    D48
3430	ETH-1,        2.02,      -2.19,  0.2052,  0.138
3431	ETH-2,      -10.17,     -18.69,  0.2085,  0.138
3432	ETH-3,        1.71,      -1.78,  0.6132,  0.270
3433	ETH-4,            ,           ,  0.4511,  0.223
3434	[/i]
3435	Except for [i]Sample[/i], none of the columns above are mandatory.
3436
3437	[b]Example CSV file for --exclude option:[/b]	
3438	[i]
3439	Sample,  UID
3440	 FOO-1,
3441	 BAR-2,
3442	      ,  A04
3443	      ,  A17
3444	      ,  A88
3445	[/i]
3446	This will exclude all analyses of samples [i]FOO-1[/i] and [i]BAR-2[/i],
3447	and the analyses with UIDs [i]A04[/i], [i]A17[/i], and [i]A88[/i].
3448	Neither column is mandatory.
3449	"""
3450
3451	data = D47data()
3452	data.read(rawdata)
3453
3454	if exclude != 'none':
3455		exclude = read_csv(exclude)
3456		exclude_uid = {r['UID'] for r in exclude if 'UID' in r}
3457		exclude_sample = {r['Sample'] for r in exclude if 'Sample' in r}
3458	else:
3459		exclude_uid = []
3460		exclude_sample = []
3461	
3462	data = D47data([r for r in data if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample])
3463
3464	if anchors != 'none':
3465		anchors = read_csv(anchors)
3466		if len([_ for _ in anchors if 'd13C_VPDB' in _]):
3467			data.Nominal_d13C_VPDB = {
3468				_['Sample']: _['d13C_VPDB']
3469				for _ in anchors
3470				if 'd13C_VPDB' in _
3471				}
3472		if len([_ for _ in anchors if 'd18O_VPDB' in _]):
3473			data.Nominal_d18O_VPDB = {
3474				_['Sample']: _['d18O_VPDB']
3475				for _ in anchors
3476				if 'd18O_VPDB' in _
3477				}
3478		if len([_ for _ in anchors if 'D47' in _]):
3479			data.Nominal_D4x = {
3480				_['Sample']: _['D47']
3481				for _ in anchors
3482				if 'D47' in _
3483				}
3484
3485	data.refresh()
3486	data.wg()
3487	data.crunch()
3488	data.standardize()
3489	data.summary(dir = output_dir)
3490	data.plot_residuals(dir = output_dir, filename = 'D47_residuals.pdf', kde = True)
3491	data.plot_bulk_compositions(dir = output_dir + '/bulk_compositions')
3492	data.plot_sessions(dir = output_dir)
3493	data.save_D47_correl(dir = output_dir)
3494	
3495	if not run_D48:
3496		data.table_of_samples(dir = output_dir)
3497		data.table_of_analyses(dir = output_dir)
3498		data.table_of_sessions(dir = output_dir)
3499
3500
3501	if run_D48:
3502		data2 = D48data()
3503		# (re-reads the same rawdata file for the D48 processing)
3504		data2.read(rawdata)
3505
3506		data2 = D48data([r for r in data2 if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample])
3507
3508		if anchors != 'none':
3509			if len([_ for _ in anchors if 'd13C_VPDB' in _]):
3510				data2.Nominal_d13C_VPDB = {
3511					_['Sample']: _['d13C_VPDB']
3512					for _ in anchors
3513					if 'd13C_VPDB' in _
3514					}
3515			if len([_ for _ in anchors if 'd18O_VPDB' in _]):
3516				data2.Nominal_d18O_VPDB = {
3517					_['Sample']: _['d18O_VPDB']
3518					for _ in anchors
3519					if 'd18O_VPDB' in _
3520					}
3521			if len([_ for _ in anchors if 'D48' in _]):
3522				data2.Nominal_D4x = {
3523					_['Sample']: _['D48']
3524					for _ in anchors
3525					if 'D48' in _
3526					}
3527
3528		data2.refresh()
3529		data2.wg()
3530		data2.crunch()
3531		data2.standardize()
3532		data2.summary(dir = output_dir)
3533		data2.plot_sessions(dir = output_dir)
3534		data2.plot_residuals(dir = output_dir, filename = 'D48_residuals.pdf', kde = True)
3535		data2.plot_distribution_of_analyses(dir = output_dir)
3536		data2.save_D48_correl(dir = output_dir)
3537
3538		table_of_analyses(data, data2, dir = output_dir)
3539		table_of_samples(data, data2, dir = output_dir)
3540		table_of_sessions(data, data2, dir = output_dir)
3541		
3542def __cli():
3543	_app()
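# Usage sketch (not part of the library source): assuming the package installs a
# `D47crunch` console script wired to `__cli()`, a typical invocation could be:
#
#     D47crunch rawdata.csv -e exclude.csv -a anchors.csv -o results --D48
#
# The array below maps temperature (first column, presumably in °C) to the nominal
# CO2 equilibrium Δ47 value (second column), and appears to be the lookup table
# behind `fCO2eqD47_Petersen` (Petersen et al., 2019).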
Petersen_etal_CO2eqD47 = array([[-1.20000000e+01, 1.14711357e+00], [-1.10000000e+01, 1.13996122e+00], [-1.00000000e+01, 1.13287286e+00], [-9.00000000e+00, 1.12584768e+00], [-8.00000000e+00, 1.11888489e+00], [-7.00000000e+00, 1.11198371e+00], [-6.00000000e+00, 1.10514337e+00], [-5.00000000e+00, 1.09836311e+00], [-4.00000000e+00, 1.09164218e+00], [-3.00000000e+00, 1.08497986e+00], [-2.00000000e+00, 1.07837542e+00], [-1.00000000e+00, 1.07182816e+00], [ 0.00000000e+00, 1.06533736e+00], [ 1.00000000e+00, 1.05890235e+00], [ 2.00000000e+00, 1.05252244e+00], [ 3.00000000e+00, 1.04619698e+00], [ 4.00000000e+00, 1.03992529e+00], [ 5.00000000e+00, 1.03370674e+00], [ 6.00000000e+00, 1.02754069e+00], [ 7.00000000e+00, 1.02142651e+00], [ 8.00000000e+00, 1.01536359e+00], [ 9.00000000e+00, 1.00935131e+00], [ 1.00000000e+01, 1.00338908e+00], [ 1.10000000e+01, 9.97476303e-01], [ 1.20000000e+01, 9.91612409e-01], [ 1.30000000e+01, 9.85796821e-01], [ 1.40000000e+01, 9.80028975e-01], [ 1.50000000e+01, 9.74308318e-01], [ 1.60000000e+01, 9.68634304e-01], [ 1.70000000e+01, 9.63006392e-01], [ 1.80000000e+01, 9.57424055e-01], [ 1.90000000e+01, 9.51886769e-01], [ 2.00000000e+01, 9.46394020e-01], [ 2.10000000e+01, 9.40945302e-01], [ 2.20000000e+01, 9.35540114e-01], [ 2.30000000e+01, 9.30177964e-01], [ 2.40000000e+01, 9.24858369e-01], [ 2.50000000e+01, 9.19580851e-01], [ 2.60000000e+01, 9.14344938e-01], [ 2.70000000e+01, 9.09150167e-01], [ 2.80000000e+01, 9.03996080e-01], [ 2.90000000e+01, 8.98882228e-01], [ 3.00000000e+01, 8.93808167e-01], [ 3.10000000e+01, 8.88773459e-01], [ 3.20000000e+01, 8.83777672e-01], [ 3.30000000e+01, 8.78820382e-01], [ 3.40000000e+01, 8.73901170e-01], [ 3.50000000e+01, 8.69019623e-01], [ 3.60000000e+01, 8.64175334e-01], [ 3.70000000e+01, 8.59367901e-01], [ 3.80000000e+01, 8.54596929e-01], [ 3.90000000e+01, 8.49862028e-01], [ 4.00000000e+01, 8.45162813e-01], [ 4.10000000e+01, 8.40498905e-01], [ 4.20000000e+01, 8.35869931e-01], [ 4.30000000e+01, 8.31275522e-01], [ 4.40000000e+01, 8.26715314e-01], [ 4.50000000e+01, 8.22188950e-01], [ 4.60000000e+01, 8.17696075e-01], [ 4.70000000e+01, 8.13236341e-01], [ 4.80000000e+01, 8.08809404e-01], [ 4.90000000e+01, 8.04414926e-01], [ 5.00000000e+01, 8.00052572e-01], [ 5.10000000e+01, 7.95722012e-01], [ 5.20000000e+01, 7.91422922e-01], [ 5.30000000e+01, 7.87154979e-01], [ 5.40000000e+01, 7.82917869e-01], [ 5.50000000e+01, 7.78711277e-01], [ 5.60000000e+01, 7.74534898e-01], [ 5.70000000e+01, 7.70388426e-01], [ 5.80000000e+01, 7.66271562e-01], [ 5.90000000e+01, 7.62184010e-01], [ 6.00000000e+01, 7.58125479e-01], [ 6.10000000e+01, 7.54095680e-01], [ 6.20000000e+01, 7.50094329e-01], [ 6.30000000e+01, 7.46121147e-01], [ 6.40000000e+01, 7.42175856e-01], [ 6.50000000e+01, 7.38258184e-01], [ 6.60000000e+01, 7.34367860e-01], [ 6.70000000e+01, 7.30504620e-01], [ 6.80000000e+01, 7.26668201e-01], [ 6.90000000e+01, 7.22858343e-01], [ 7.00000000e+01, 7.19074792e-01], [ 7.10000000e+01, 7.15317295e-01], [ 7.20000000e+01, 7.11585602e-01], [ 7.30000000e+01, 7.07879469e-01], [ 7.40000000e+01, 7.04198652e-01], [ 7.50000000e+01, 7.00542912e-01], [ 7.60000000e+01, 6.96912012e-01], [ 7.70000000e+01, 6.93305719e-01], [ 7.80000000e+01, 6.89723802e-01], [ 7.90000000e+01, 6.86166034e-01], [ 8.00000000e+01, 6.82632189e-01], [ 8.10000000e+01, 6.79122047e-01], [ 8.20000000e+01, 6.75635387e-01], [ 8.30000000e+01, 6.72171994e-01], [ 8.40000000e+01, 6.68731654e-01], [ 8.50000000e+01, 6.65314156e-01], [ 8.60000000e+01, 6.61919291e-01], [ 8.70000000e+01, 6.58546854e-01], [ 8.80000000e+01, 
[… remaining rows of a large lookup table of (T in degrees C, CO2 equilibrium Δ47) pairs, presumably the data underlying fCO2eqD47_Petersen() below, decreasing smoothly from Δ47 ≈ 0.652 at 89 °C to Δ47 ≈ 0.0211 at 1100 °C …]

def fCO2eqD47_Petersen(T):
	'''
	CO2 equilibrium Δ47 value as a function of T (in degrees C)
	according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
	'''
	return float(_fCO2eqD47_Petersen(T))

CO2 equilibrium Δ47 value as a function of T (in degrees C) according to Petersen et al. (2019).

Wang_etal_CO2eqD47 = array([[-83.0, 1.8954], [-73.0, 1.7530], …, [1087.0, 0.0223], [1097.0, 0.0218]])  # abridged: lookup table of (T in degrees C, CO2 equilibrium Δ47) pairs from Wang et al. (2004), in 10 °C steps from -83 °C to 1097 °C

def fCO2eqD47_Wang(T):
	'''
	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
	according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
	(supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
	'''
	return float(_fCO2eqD47_Wang(T))

CO2 equilibrium Δ47 value as a function of T (in degrees C) according to Wang et al. (2004) (supplementary data of Dennis et al., 2011).
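
Both functions evaluate temperature T through the private helpers _fCO2eqD47_Petersen and _fCO2eqD47_Wang (presumably interpolating within the tables above). For instance, as a minimal sketch:

fCO2eqD47_Petersen(25.) # equilibrium Δ47 of CO2 at 25 °C, after Petersen et al. (2019)
fCO2eqD47_Wang(25.)     # same quantity, after Wang et al. (2004)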

def correlated_sum(X, C, w = None):
	'''
	Compute covariance-aware linear combinations

	**Parameters**

	+ `X`: list or 1-D array of values to sum
	+ `C`: covariance matrix for the elements of `X`
	+ `w`: list or 1-D array of weights to apply to the elements of `X`
	       (all equal to 1 by default)

	Return the sum (and its SE) of the elements of `X`, with optional weights equal
	to the elements of `w`, accounting for covariances between the elements of `X`.
	'''
	if w is None:
		w = [1 for x in X]
	return np.dot(w, X), (np.dot(w, np.dot(C, w)))**.5

Compute covariance-aware linear combinations

Parameters

  • X: list or 1-D array of values to sum
  • C: covariance matrix for the elements of X
  • w: list or 1-D array of weights to apply to the elements of X (all equal to 1 by default)

Return the sum (and its SE) of the elements of X, with optional weights equal to the elements of w, accounting for covariances between the elements of X.
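
For instance, a minimal sketch (X and C are hypothetical values; with weights (1, -1) the combination is a difference, so the strong shared covariance largely cancels out of the SE):

import numpy as np
X = [0.5, 0.3]
C = np.array([[0.010, 0.008], [0.008, 0.010]])
print(correlated_sum(X, C, w = [1, -1])) # yields approximately: (0.2, 0.0632)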

def make_csv(x, hsep = ',', vsep = '\n'):
	'''
	Formats a list of lists of strings as a CSV

	**Parameters**

	+ `x`: the list of lists of strings to format
	+ `hsep`: the field separator (`,` by default)
	+ `vsep`: the line-ending convention to use (`\\n` by default)

	**Example**

	```py
	print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
	```

	outputs:

	```py
	a,b,c
	d,e,f
	```
	'''
	return vsep.join([hsep.join(l) for l in x])

Formats a list of lists of strings as a CSV

Parameters

  • x: the list of lists of strings to format
  • hsep: the field separator (, by default)
  • vsep: the line-ending convention to use (\n by default)

Example

print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))

outputs:

a,b,c
d,e,f

def pf(txt):
	'''
	Modify string `txt` to follow `lmfit.Parameter()` naming rules.
	'''
	return txt.replace('-','_').replace('.','_').replace(' ','_')

Modify string txt to follow lmfit.Parameter() naming rules.
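
For instance:

pf('ETH-1') # yields: 'ETH_1'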

def smart_type(x):
	'''
	Tries to convert string `x` to a float if it includes a decimal point, or
	to an integer if it does not. If the conversion to float fails, return the
	original string unchanged.
	'''
	try:
		y = float(x)
	except ValueError:
		return x
	if '.' not in x:
		return int(y)
	return y

Tries to convert string x to a float if it includes a decimal point, or to an integer if it does not. If the conversion to float fails, return the original string unchanged.
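
For instance:

smart_type('-4')   # yields the integer -4
smart_type('26.0') # yields the float 26.0
smart_type('foo')  # yields the string 'foo', unchanged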

D47crunch_defaults = <D47crunch._Defaults object> (module-wide default settings, e.g. D47crunch_defaults.PRETTY_TABLE_VSEP used by pretty_table() below)

def pretty_table(x, header = 1, hsep = '  ', vsep = None, align = '<'):
	'''
	Reads a list of lists of strings and outputs an ascii table

	**Parameters**

	+ `x`: a list of lists of strings
	+ `header`: the number of lines to treat as header lines
	+ `hsep`: the horizontal separator between columns
	+ `vsep`: the character to use as vertical separator
	+ `align`: string of left (`<`) or right (`>`) alignment characters.

	**Example**

	```py
	print(pretty_table([
		['A', 'B', 'C'],
		['1', '1.9999', 'foo'],
		['10', 'x', 'bar'],
	]))
	```
	yields:
	```
	——  ——————  ———
	A        B    C
	——  ——————  ———
	1   1.9999  foo
	10       x  bar
	——  ——————  ———
	```

	To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`:

	```py
	D47crunch_defaults.PRETTY_TABLE_VSEP = '='
	print(pretty_table([
		['A', 'B', 'C'],
		['1', '1.9999', 'foo'],
		['10', 'x', 'bar'],
	]))
	```
	yields:
	```
	==  ======  ===
	A        B    C
	==  ======  ===
	1   1.9999  foo
	10       x  bar
	==  ======  ===
	```
	'''

	if vsep is None:
		vsep = D47crunch_defaults.PRETTY_TABLE_VSEP

	txt = []
	widths = [np.max([len(e) for e in c]) for c in zip(*x)]

	if len(widths) > len(align):
		align += '>' * (len(widths) - len(align))
	sepline = hsep.join([vsep*w for w in widths])
	txt += [sepline]
	for k, l in enumerate(x):
		if k and k == header:
			txt += [sepline]
		txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
	txt += [sepline]
	txt += ['']
	return '\n'.join(txt)

Reads a list of lists of strings and outputs an ascii table

Parameters

  • x: a list of lists of strings
  • header: the number of lines to treat as header lines
  • hsep: the horizontal separator between columns
  • vsep: the character to use as vertical separator
  • align: string of left (<) or right (>) alignment characters.

Example

print(pretty_table([
        ['A', 'B', 'C'],
        ['1', '1.9999', 'foo'],
        ['10', 'x', 'bar'],
]))

yields:

——  ——————  ———
A        B    C
——  ——————  ———
1   1.9999  foo
10       x  bar
——  ——————  ———

To change the default vsep globally, redefine D47crunch_defaults.PRETTY_TABLE_VSEP:

D47crunch_defaults.PRETTY_TABLE_VSEP = '='
print(pretty_table([
        ['A', 'B', 'C'],
        ['1', '1.9999', 'foo'],
        ['10', 'x', 'bar'],
]))

yields:

==  ======  ===
A        B    C
==  ======  ===
1   1.9999  foo
10       x  bar
==  ======  ===

def transpose_table(x):
	'''
	Transpose a list of lists

	**Parameters**

	+ `x`: a list of lists

	**Example**

	```py
	x = [[1, 2], [3, 4]]
	print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
	```
	'''
	return [[e for e in c] for c in zip(*x)]

Transpose a list of lists

Parameters

  • x: a list of lists

Example

x = [[1, 2], [3, 4]]
print(transpose_table(x)) # yields: [[1, 3], [2, 4]]

def w_avg(X, sX):
	'''
	Compute variance-weighted average

	Returns the value and SE of the weighted average of the elements of `X`,
	with relative weights equal to their inverse variances (`1/sX**2`).

	**Parameters**

	+ `X`: array-like of elements to average
	+ `sX`: array-like of the corresponding SE values

	**Tip**

	If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
	they may be rearranged using `zip()`:

	```python
	foo = [(0, 1), (1, 0.5), (2, 0.5)]
	print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
	```
	'''
	X = [x for x in X]
	sX = [sx for sx in sX]
	W = [sx**-2 for sx in sX]
	W = [w/sum(W) for w in W]
	Xavg = sum([w*x for w, x in zip(W, X)])
	sXavg = sum([w**2*sx**2 for w, sx in zip(W, sX)])**.5
	return Xavg, sXavg

Compute variance-weighted average

Returns the value and SE of the weighted average of the elements of X, with relative weights equal to their inverse variances (1/sX**2).

Parameters

  • X: array-like of elements to average
  • sX: array-like of the corresponding SE values

Tip

If X and sX are initially arranged as a list of (x, sx) doublets, they may be rearranged using zip():

foo = [(0, 1), (1, 0.5), (2, 0.5)]
print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)

def read_csv(filename, sep = ''):
	'''
	Read contents of `filename` in csv format and return a list of dictionaries.

	In the csv string, spaces before and after field separators (`','` by default)
	are optional.

	**Parameters**

	+ `filename`: the csv file to read
	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
	whichever appears most often in the contents of `filename`.
	'''
	with open(filename) as fid:
		txt = fid.read()

	if sep == '':
		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
	return [{k: smart_type(v) for k, v in zip(txt[0], l) if v} for l in txt[1:]]

Read contents of filename in csv format and return a list of dictionaries.

In the csv string, spaces before and after field separators (',' by default) are optional.

Parameters

  • filename: the csv file to read
  • sep: csv separator delimiting the fields. By default, use ',', ';', or a tab character, whichever appears most often in the contents of filename.
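
For instance, a minimal sketch ('mytable.csv' is a hypothetical file whose header line Sample,D47 is followed by the rows FOO,0.3 and BAR,0.6):

data = read_csv('mytable.csv')
# data is now: [{'Sample': 'FOO', 'D47': 0.3}, {'Sample': 'BAR', 'D47': 0.6}]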

def simulate_single_analysis(
	sample = 'MYSAMPLE',
	d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
	d13C_VPDB = None, d18O_VPDB = None,
	D47 = None, D48 = None, D49 = 0., D17O = 0.,
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	Nominal_D47 = None,
	Nominal_D48 = None,
	Nominal_d13C_VPDB = None,
	Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	):
	'''
	Compute working-gas delta values for a single analysis, assuming a stochastic working
	gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

	**Parameters**

	+ `sample`: sample name
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(respectively –4 and +26 ‰ by default)
	+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
		of the carbonate sample
	+ `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and
		Δ48 values if `D47` or `D48` are not specified
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `D4xdata` default values)

	Returns a dictionary with fields
	`['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
	'''

	if Nominal_d13C_VPDB is None:
		Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB

	if Nominal_d18O_VPDB is None:
		Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB

	if ALPHA_18O_ACID_REACTION is None:
		ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION

	if R13_VPDB is None:
		R13_VPDB = D4xdata().R13_VPDB

	if R17_VSMOW is None:
		R17_VSMOW = D4xdata().R17_VSMOW

	if R18_VSMOW is None:
		R18_VSMOW = D4xdata().R18_VSMOW

	if LAMBDA_17 is None:
		LAMBDA_17 = D4xdata().LAMBDA_17

	if R18_VPDB is None:
		R18_VPDB = D4xdata().R18_VPDB

	R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17

	if Nominal_D47 is None:
		Nominal_D47 = D47data().Nominal_D47

	if Nominal_D48 is None:
		Nominal_D48 = D48data().Nominal_D48

	if d13C_VPDB is None:
		if sample in Nominal_d13C_VPDB:
			d13C_VPDB = Nominal_d13C_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")

	if d18O_VPDB is None:
		if sample in Nominal_d18O_VPDB:
			d18O_VPDB = Nominal_d18O_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")

	if D47 is None:
		if sample in Nominal_D47:
			D47 = Nominal_D47[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")

	if D48 is None:
		if sample in Nominal_D48:
			D48 = Nominal_D48[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")

	X = D4xdata()
	X.R13_VPDB = R13_VPDB
	X.R17_VSMOW = R17_VSMOW
	X.R18_VSMOW = R18_VSMOW
	X.LAMBDA_17 = LAMBDA_17
	X.R18_VPDB = R18_VPDB
	X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17

	R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000),
		R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
		)
	R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O = D17O, D47 = D47, D48 = D48, D49 = D49,
		)
	R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O = D17O,
		)

	d45 = 1000 * (R45/R45wg - 1)
	d46 = 1000 * (R46/R46wg - 1)
	d47 = 1000 * (R47/R47wg - 1)
	d48 = 1000 * (R48/R48wg - 1)
	d49 = 1000 * (R49/R49wg - 1)

	for k in range(3): # short fixed-point iteration to absorb the small feedback of d47/d48 on the raw ratios
		R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
		R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch
		d47 = 1000 * (R47raw/R47wg - 1)
		d48 = 1000 * (R48raw/R48wg - 1)

	return dict(
		Sample = sample,
		D17O = D17O,
		d13Cwg_VPDB = d13Cwg_VPDB,
		d18Owg_VSMOW = d18Owg_VSMOW,
		d45 = d45,
		d46 = d46,
		d47 = d47,
		d48 = d48,
		d49 = d49,
		)

Compute working-gas delta values for a single analysis, assuming a stochastic working gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

Parameters

  • sample: sample name
  • d13Cwg_VPDB, d18Owg_VSMOW: bulk composition of the working gas (respectively –4 and +26 ‰ by default)
  • d13C_VPDB, d18O_VPDB: bulk composition of the carbonate sample
  • D47, D48, D49, D17O: clumped-isotope and oxygen-17 anomalies of the carbonate sample
  • Nominal_D47, Nominal_D48: where to look up Δ47 and Δ48 values if D47 or D48 are not specified
  • Nominal_d13C_VPDB, Nominal_d18O_VPDB: where to look up δ13C and δ18O values if d13C_VPDB or d18O_VPDB are not specified
  • ALPHA_18O_ACID_REACTION: 18O/16O acid fractionation factor
  • R13_VPDB, R17_VSMOW, R18_VSMOW, LAMBDA_17, R18_VPDB: oxygen-17 correction parameters (by default equal to the D4xdata default values)

Returns a dictionary with fields ['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49'].
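
For instance, to simulate one “perfect” analysis of a sample whose nominal composition is already defined (a minimal sketch; ETH-3 draws its δ13C, δ18O, Δ47 and Δ48 values from the default nominal dictionaries):

a = simulate_single_analysis(sample = 'ETH-3')
print(a['d45'], a['d47']) # raw delta values of the simulated analysis, relative to the working gas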

def virtual_data(
	samples = [],
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	rd45 = 0.020, rd46 = 0.060,
	rD47 = 0.015, rD48 = 0.045,
	d13Cwg_VPDB = None, d18Owg_VSMOW = None,
	session = None,
	Nominal_D47 = None, Nominal_D48 = None,
	Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	seed = 0,
	shuffle = True,
	):
	'''
	Return a list of simulated analyses from a single session.

	**Parameters**

	+ `samples`: a list of entries; each entry is a dictionary with the following fields:
	    * `Sample`: the name of the sample
	    * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	    * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
	    * `N`: how many analyses to generate for this sample
	+ `a47`: scrambling factor for Δ47
	+ `b47`: compositional nonlinearity for Δ47
	+ `c47`: working gas offset for Δ47
	+ `a48`: scrambling factor for Δ48
	+ `b48`: compositional nonlinearity for Δ48
	+ `c48`: working gas offset for Δ48
	+ `rd45`: analytical repeatability of δ45
	+ `rd46`: analytical repeatability of δ46
	+ `rD47`: analytical repeatability of Δ47
	+ `rD48`: analytical repeatability of Δ48
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(by default equal to the `simulate_single_analysis` default values)
	+ `session`: name of the session (no name by default)
	+ `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and Δ48 values
		if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
		(by default equal to the `simulate_single_analysis` defaults)
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
		(by default equal to the `simulate_single_analysis` defaults)
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `simulate_single_analysis` default)
	+ `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
	+ `shuffle`: randomly reorder the sequence of analyses

	Here is an example of using this function to generate an arbitrary combination of
	anchors and unknowns for several sessions:

	```py
	.. include:: ../../code_examples/virtual_data/example.py
	```

	This should output something like:

	```
	.. include:: ../../code_examples/virtual_data/output.txt
	```
	'''

	kwargs = locals().copy()

	from numpy import random as nprandom
	if seed:
		nprandom.seed(seed)
		rng = nprandom.default_rng(seed)
	else:
		rng = nprandom.default_rng()

	N = sum([s['N'] for s in samples])
	errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors45 *= rd45 / stdev(errors45) # scale errors to rd45
	errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors46 *= rd46 / stdev(errors46) # scale errors to rd46
	errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors47 *= rD47 / stdev(errors47) # scale errors to rD47
	errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors48 *= rD48 / stdev(errors48) # scale errors to rD48

	k = 0
	out = []
	for s in samples:
		kw = {}
		kw['sample'] = s['Sample']
		kw = {
			**kw,
			**{var: kwargs[var]
				for var in [
					'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
					'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
					'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
					'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
					]
				if kwargs[var] is not None},
			**{var: s[var]
				for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
				if var in s},
			}

		sN = s['N']
		while sN:
			out.append(simulate_single_analysis(**kw))
			out[-1]['d45'] += errors45[k]
			out[-1]['d46'] += errors46[k]
			out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
			out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
			sN -= 1
			k += 1

		if session is not None:
			for r in out:
				r['Session'] = session

		if shuffle:
			nprandom.shuffle(out)

	return out

Return a list of simulated analyses from a single session.

Parameters

  • samples: a list of entries; each entry is a dictionary with the following fields:
    • Sample: the name of the sample
    • d13C_VPDB, d18O_VPDB: bulk composition of the carbonate sample
    • D47, D48, D49, D17O (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
    • N: how many analyses to generate for this sample
  • a47: scrambling factor for Δ47
  • b47: compositional nonlinearity for Δ47
  • c47: working gas offset for Δ47
  • a48: scrambling factor for Δ48
  • b48: compositional nonlinearity for Δ48
  • c48: working gas offset for Δ48
  • rd45: analytical repeatability of δ45
  • rd46: analytical repeatability of δ46
  • rD47: analytical repeatability of Δ47
  • rD48: analytical repeatability of Δ48
  • d13Cwg_VPDB, d18Owg_VSMOW: bulk composition of the working gas (by default equal to the simulate_single_analysis default values)
  • session: name of the session (no name by default)
  • Nominal_D47, Nominal_D48: where to look up Δ47 and Δ48 values if D47 or D48 are not specified (by default equal to the simulate_single_analysis defaults)
  • Nominal_d13C_VPDB, Nominal_d18O_VPDB: where to look up δ13C and δ18O values if d13C_VPDB or d18O_VPDB are not specified (by default equal to the simulate_single_analysis defaults)
  • ALPHA_18O_ACID_REACTION: 18O/16O acid fractionation factor (by default equal to the simulate_single_analysis defaults)
  • R13_VPDB, R17_VSMOW, R18_VSMOW, LAMBDA_17, R18_VPDB: oxygen-17 correction parameters (by default equal to the simulate_single_analysis default)
  • seed: explicitly set to a non-zero value to achieve random but repeatable simulations
  • shuffle: randomly reorder the sequence of analyses

Here is an example of using this function to generate an arbitrary combination of anchors and unknowns for several sessions:

from D47crunch import virtual_data, D47data

args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 3,
            d13C_VPDB = -15., d18O_VPDB = -2.,
            D47 = 0.6, D48 = 0.2),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)

D = D47data(session1 + session2 + session3 + session4)

D.crunch()
D.standardize()

D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)

This should output something like:

[table_of_sessions] 
——————————  ——  ——  ———————————  ————————————  ——————  ——————  ——————  —————————————  —————————————  ——————————————
Session     Na  Nu  d13Cwg_VPDB  d18Owg_VSMOW  r_d13C  r_d18O   r_D47         a ± SE   1e3 x b ± SE          c ± SE
——————————  ——  ——  ———————————  ————————————  ——————  ——————  ——————  —————————————  —————————————  ——————————————
Session_01   9   6       -4.000        26.000  0.0205  0.0633  0.0075  1.015 ± 0.015  0.427 ± 0.232  -0.909 ± 0.006
Session_02   9   6       -4.000        26.000  0.0210  0.0882  0.0082  0.990 ± 0.015  0.484 ± 0.232  -0.905 ± 0.006
Session_03   9   6       -4.000        26.000  0.0186  0.0505  0.0091  0.997 ± 0.015  0.167 ± 0.233  -0.901 ± 0.006
Session_04   9   6       -4.000        26.000  0.0192  0.0467  0.0070  1.017 ± 0.015  0.229 ± 0.232  -0.910 ± 0.006
——————————  ——  ——  ———————————  ————————————  ——————  ——————  ——————  —————————————  —————————————  ——————————————

[table_of_samples] 
——————  ——  —————————  ——————————  ——————  ——————  ————————  ——————  ————————
Sample   N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene
——————  ——  —————————  ——————————  ——————  ——————  ————————  ——————  ————————
ETH-1   12       2.02       37.01  0.2052                    0.0083          
ETH-2   12     -10.17       19.88  0.2085                    0.0090          
ETH-3   12       1.71       37.46  0.6132                    0.0083          
BAR     12     -15.02       37.22  0.6057  0.0042  ± 0.0085  0.0088     0.753
FOO     12      -5.00       28.89  0.3024  0.0031  ± 0.0062  0.0070     0.497
——————  ——  —————————  ——————————  ——————  ——————  ————————  ——————  ————————

[table_of_analyses] 
———  ——————————  ——————  ———————————  ————————————  —————————  —————————  ——————————  ——————————  ——————————  ——————————  ——————————  —————————  —————————  —————————  ————————
UID     Session  Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48         d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw     D49raw       D47
———  ——————————  ——————  ———————————  ————————————  —————————  —————————  ——————————  ——————————  ——————————  ——————————  ——————————  —————————  —————————  —————————  ————————
1    Session_01   ETH-1       -4.000        26.000   5.995601  10.755323   16.116087   21.285428   27.780042    1.998631   36.986704  -0.696924  -0.333640   0.008600  0.201787
2    Session_01     FOO       -4.000        26.000  -0.838118   2.819853    1.310384    5.326005    4.665655   -5.004629   28.895933  -0.593755  -0.319861   0.014956  0.309692
3    Session_01   ETH-3       -4.000        26.000   5.727341  11.211663   16.713472   22.364770   28.306614    1.695479   37.453503  -0.278056  -0.180158  -0.082015  0.614365
4    Session_01     BAR       -4.000        26.000  -9.959983  10.926995    0.053806   21.724901   10.707292  -15.041279   37.199026  -0.300066  -0.243252  -0.029371  0.599675
5    Session_01   ETH-1       -4.000        26.000   6.010276  10.840276   16.207960   21.475150   27.780042    2.011176   37.073454  -0.704188  -0.315986  -0.172089  0.194589
6    Session_01   ETH-1       -4.000        26.000   6.049381  10.706856   16.135579   21.196941   27.780042    2.057827   36.937067  -0.685751  -0.324384   0.045870  0.212791
7    Session_01   ETH-2       -4.000        26.000  -5.974124  -5.955517  -12.668784  -12.208184  -18.023381  -10.163274   19.943159  -0.694902  -0.336672  -0.063946  0.215880
8    Session_01   ETH-3       -4.000        26.000   5.755174  11.255104   16.792797   22.451660   28.306614    1.723596   37.497816  -0.270825  -0.181089  -0.195908  0.621458
9    Session_01     FOO       -4.000        26.000  -0.848028   2.874679    1.346196    5.439150    4.665655   -5.017230   28.951964  -0.601502  -0.316664  -0.081898  0.302042
10   Session_01     BAR       -4.000        26.000  -9.915975  10.968470    0.153453   21.749385   10.707292  -14.995822   37.241294  -0.286638  -0.301325  -0.157376  0.612868
11   Session_01     BAR       -4.000        26.000  -9.920507  10.903408    0.065076   21.704075   10.707292  -14.998270   37.174839  -0.307018  -0.216978  -0.026076  0.592818
12   Session_01     FOO       -4.000        26.000  -0.876454   2.906764    1.341194    5.490264    4.665655   -5.048760   28.984806  -0.608593  -0.329808  -0.114437  0.295055
13   Session_01   ETH-2       -4.000        26.000  -5.982229  -6.110437  -12.827036  -12.492272  -18.023381  -10.166188   19.784916  -0.693555  -0.312598   0.251040  0.217274
14   Session_01   ETH-2       -4.000        26.000  -5.991278  -5.995054  -12.741562  -12.184075  -18.023381  -10.180122   19.902809  -0.711697  -0.232746   0.032602  0.199357
15   Session_01   ETH-3       -4.000        26.000   5.734896  11.229855   16.740410   22.402091   28.306614    1.702875   37.472070  -0.276998  -0.179635  -0.125368  0.615396
16   Session_02   ETH-3       -4.000        26.000   5.716356  11.091821   16.582487   22.123857   28.306614    1.692901   37.370126  -0.279100  -0.178789   0.162540  0.624067
17   Session_02   ETH-2       -4.000        26.000  -5.950370  -5.959974  -12.650784  -12.197864  -18.023381  -10.143809   19.897777  -0.696916  -0.317263  -0.080604  0.216441
18   Session_02     BAR       -4.000        26.000  -9.957566  10.903888    0.031785   21.739434   10.707292  -15.048386   37.213724  -0.302139  -0.183327   0.012926  0.608897
19   Session_02   ETH-1       -4.000        26.000   6.030532  10.851030   16.245571   21.457100   27.780042    2.037466   37.122284  -0.698413  -0.354920  -0.214443  0.200795
20   Session_02     FOO       -4.000        26.000  -0.819742   2.826793    1.317044    5.330616    4.665655   -4.986618   28.903335  -0.612871  -0.329113  -0.018244  0.294481
21   Session_02     BAR       -4.000        26.000  -9.936020  10.862339    0.024660   21.563307   10.707292  -15.023836   37.171034  -0.291333  -0.273498   0.070452  0.619812
22   Session_02   ETH-3       -4.000        26.000   5.719281  11.207303   16.681693   22.370886   28.306614    1.691780   37.488633  -0.296801  -0.165556  -0.065004  0.606143
23   Session_02   ETH-1       -4.000        26.000   5.993918  10.617469   15.991900   21.070358   27.780042    2.006934   36.882679  -0.683329  -0.271476   0.278458  0.216152
24   Session_02   ETH-2       -4.000        26.000  -5.982371  -6.036210  -12.762399  -12.309944  -18.023381  -10.175178   19.819614  -0.701348  -0.277354   0.104418  0.212021
25   Session_02   ETH-1       -4.000        26.000   6.019963  10.773112   16.163825   21.331060   27.780042    2.029040   37.042346  -0.692234  -0.324161  -0.051788  0.207075
26   Session_02     BAR       -4.000        26.000  -9.963888  10.865863   -0.023549   21.615868   10.707292  -15.053743   37.174715  -0.313906  -0.229031   0.093637  0.597041
27   Session_02     FOO       -4.000        26.000  -0.835046   2.870518    1.355370    5.487896    4.665655   -5.004585   28.948243  -0.601666  -0.259900  -0.087592  0.305777
28   Session_02     FOO       -4.000        26.000  -0.848415   2.849823    1.308081    5.427767    4.665655   -5.018107   28.927036  -0.614791  -0.278426  -0.032784  0.292547
29   Session_02   ETH-3       -4.000        26.000   5.757137  11.232751   16.744567   22.398244   28.306614    1.731295   37.514660  -0.298533  -0.189123  -0.154557  0.604363
30   Session_02   ETH-2       -4.000        26.000  -5.993476  -5.944866  -12.696865  -12.149754  -18.023381  -10.190430   19.913381  -0.713779  -0.298963  -0.064251  0.199436
31   Session_03   ETH-3       -4.000        26.000   5.718991  11.146227   16.640814   22.243185   28.306614    1.689442   37.449023  -0.277332  -0.169668   0.053997  0.623187
32   Session_03   ETH-2       -4.000        26.000  -5.997147  -5.905858  -12.655382  -12.081612  -18.023381  -10.165400   19.891551  -0.706536  -0.308464  -0.137414  0.197550
33   Session_03   ETH-1       -4.000        26.000   6.040566  10.786620   16.205283   21.374963   27.780042    2.045244   37.077432  -0.685706  -0.307909  -0.099869  0.213609
34   Session_03   ETH-1       -4.000        26.000   5.994622  10.743980   16.116098   21.243734   27.780042    1.997857   37.033567  -0.684883  -0.352014   0.031692  0.214449
35   Session_03   ETH-3       -4.000        26.000   5.748546  11.079879   16.580826   22.120063   28.306614    1.723364   37.380534  -0.302133  -0.158882   0.151641  0.598318
36   Session_03   ETH-2       -4.000        26.000  -6.000290  -5.947172  -12.697463  -12.164602  -18.023381  -10.167221   19.848953  -0.705037  -0.309350  -0.052386  0.199061
37   Session_03     FOO       -4.000        26.000  -0.800284   2.851299    1.376828    5.379547    4.665655   -4.951581   28.910199  -0.597293  -0.329315  -0.087015  0.304784
38   Session_03     FOO       -4.000        26.000  -0.873798   2.820799    1.272165    5.370745    4.665655   -5.028782   28.878917  -0.596008  -0.277258   0.051165  0.306090
39   Session_03   ETH-2       -4.000        26.000  -6.008525  -5.909707  -12.647727  -12.075913  -18.023381  -10.177379   19.887608  -0.683183  -0.294956  -0.117608  0.220975
40   Session_03     BAR       -4.000        26.000  -9.928709  10.989665    0.148059   21.852677   10.707292  -14.976237   37.324152  -0.299358  -0.242185  -0.184835  0.603855
41   Session_03   ETH-1       -4.000        26.000   6.004078  10.683951   16.045192   21.214355   27.780042    2.010134   36.971642  -0.705956  -0.262026   0.138399  0.193323
42   Session_03     BAR       -4.000        26.000  -9.957114  10.898997    0.044946   21.602296   10.707292  -15.003175   37.230716  -0.284699  -0.307849   0.021944  0.618578
43   Session_03     BAR       -4.000        26.000  -9.952115  11.034508    0.169809   21.885915   10.707292  -15.002819   37.370451  -0.296804  -0.298351  -0.246731  0.606414
44   Session_03     FOO       -4.000        26.000  -0.823857   2.761300    1.258060    5.239992    4.665655   -4.973383   28.817444  -0.603327  -0.288652   0.114488  0.298751
45   Session_03   ETH-3       -4.000        26.000   5.753467  11.206589   16.719131   22.373244   28.306614    1.723960   37.511190  -0.294350  -0.161838  -0.099835  0.606103
46   Session_04     FOO       -4.000        26.000  -0.791191   2.708220    1.256167    5.145784    4.665655   -4.960004   28.750896  -0.586913  -0.276505   0.183674  0.317065
47   Session_04   ETH-1       -4.000        26.000   6.017312  10.735930   16.123043   21.270597   27.780042    2.005824   36.995214  -0.693479  -0.309795   0.023309  0.208980
48   Session_04   ETH-2       -4.000        26.000  -5.986501  -5.915157  -12.656583  -12.060382  -18.023381  -10.182247   19.889836  -0.709603  -0.268277  -0.130450  0.199604
49   Session_04     BAR       -4.000        26.000  -9.951025  10.951923    0.089386   21.738926   10.707292  -15.031949   37.254709  -0.298065  -0.278834  -0.087463  0.601230
50   Session_04   ETH-2       -4.000        26.000  -5.966627  -5.893789  -12.597717  -12.120719  -18.023381  -10.161842   19.911776  -0.691757  -0.372308  -0.193986  0.217132
51   Session_04   ETH-1       -4.000        26.000   6.029937  10.766997   16.151273   21.345479   27.780042    2.018148   37.027152  -0.708855  -0.297953  -0.050465  0.193862
52   Session_04     FOO       -4.000        26.000  -0.853969   2.805035    1.267571    5.353907    4.665655   -5.030523   28.850660  -0.605611  -0.262571   0.060903  0.298685
53   Session_04   ETH-3       -4.000        26.000   5.798016  11.254135   16.832228   22.432473   28.306614    1.752928   37.528936  -0.275047  -0.197935  -0.239408  0.620088
54   Session_04   ETH-1       -4.000        26.000   6.023822  10.730714   16.121184   21.235757   27.780042    2.012958   36.989833  -0.696908  -0.333582   0.026555  0.205610
55   Session_04   ETH-2       -4.000        26.000  -5.973623  -5.975018  -12.694278  -12.194472  -18.023381  -10.166297   19.828211  -0.701951  -0.283570  -0.025935  0.207135
56   Session_04   ETH-3       -4.000        26.000   5.739420  11.128582   16.641344   22.166106   28.306614    1.695046   37.399884  -0.280608  -0.210162   0.066645  0.614665
57   Session_04     BAR       -4.000        26.000  -9.931741  10.819830   -0.023748   21.529372   10.707292  -15.006533   37.118743  -0.302866  -0.222623   0.148462  0.596536
58   Session_04     FOO       -4.000        26.000  -0.848192   2.777763    1.251297    5.280272    4.665655   -5.023358   28.822585  -0.601094  -0.281419   0.108186  0.303128
59   Session_04   ETH-3       -4.000        26.000   5.751908  11.207110   16.726741   22.380392   28.306614    1.705481   37.480657  -0.285776  -0.155878  -0.099197  0.609567
60   Session_04     BAR       -4.000        26.000  -9.926078  10.884823    0.060864   21.650722   10.707292  -15.002880   37.185606  -0.287358  -0.232425   0.016044  0.611760
———  ——————————  ——————  ———————————  ————————————  —————————  —————————  ——————————  ——————————  ——————————  ——————————  ——————————  —————————  —————————  —————————  ————————


def table_of_samples(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of samples
	for a pair of `D47data` and `D48data` objects.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_samples.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n' + pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)

Print out, save to disk and/or return a combined table of samples for a pair of D47data and D48data objects.

Parameters

  • data47: D47data instance
  • data48: D48data instance
  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of lists of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
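
For instance, a minimal sketch (mydata47 and mydata48 stand for hypothetical D47data and D48data instances, built from the same analyses and already crunched and standardized):

table_of_samples(data47 = mydata47, data48 = mydata48, save_to_file = False)

The table_of_sessions() and table_of_analyses() functions below are used in exactly the same way.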

def table_of_sessions(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of sessions
	for a pair of `D47data` and `D48data` objects.
	***Only applicable if the sessions in `data47` and those in `data48`
	consist of the exact same sets of analyses.***

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_sessions(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_sessions(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
			for k, x in enumerate(out47[0]):
				if k > 7:
					out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
					out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_sessions.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n' + pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)

Print out, save to disk and/or return a combined table of sessions for a pair of D47data and D48data objects. Only applicable if the sessions in data47 and those in data48 consist of the exact same sets of analyses.

Parameters

  • data47: D47data instance
  • data48: D48data instance
  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of lists of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])

def table_of_analyses(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of analyses
	for a pair of `D47data` and `D48data` objects.

	If the sessions in `data47` and those in `data48` do not consist of
	the exact same sets of analyses, the table will have two columns
	`Session_47` and `Session_48` instead of a single `Session` column.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_analyses(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_analyses(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')

			if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
				out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
			else:
				out47[0][1] = 'Session_47'
				out48[0][1] = 'Session_48'
				out47 = transpose_table(out47)
				out48 = transpose_table(out48)
				out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_sessions.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n' + pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)

Print out, save to disk and/or return a combined table of analyses for a pair of D47data and D48data objects.

If the sessions in data47 and those in data48 do not consist of the exact same sets of analyses, the table will have two columns Session_47 and Session_48 instead of a single Session column.

Parameters

  • data47: D47data instance
  • data48: D48data instance
  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of lists of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
class D4xdata(builtins.list):
 834class D4xdata(list):
 835	'''
 836	Store and process data for a large set of Δ47 and/or Δ48
 837	analyses, usually comprising more than one analytical session.
 838	'''
 839
 840	### 17O CORRECTION PARAMETERS
 841	R13_VPDB = 0.01118  # (Chang & Li, 1990)
 842	'''
 843	Absolute (13C/12C) ratio of VPDB.
 844	By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
 845	'''
 846
 847	R18_VSMOW = 0.0020052  # (Baertschi, 1976)
 848	'''
 849	Absolute (18O/16C) ratio of VSMOW.
 850	By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
 851	'''
 852
 853	LAMBDA_17 = 0.528  # (Barkan & Luz, 2005)
 854	'''
 855	Mass-dependent exponent for triple oxygen isotopes.
 856	By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
 857	'''
 858
 859	R17_VSMOW = 0.00038475  # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
 860	'''
 861	Absolute (17O/16C) ratio of VSMOW.
 862	By default equal to 0.00038475
 863	([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
 864	rescaled to `R13_VPDB`)
 865	'''
 866
 867	R18_VPDB = R18_VSMOW * 1.03092
 868	'''
 869	Absolute (18O/16O) ratio of VPDB.
 870	By definition equal to `R18_VSMOW * 1.03092`.
 871	'''
 872
 873	R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
 874	'''
 875	Absolute (17O/16O) ratio of VPDB.
 876	By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
 877	'''
 878
 879	LEVENE_REF_SAMPLE = 'ETH-3'
 880	'''
 881	After the Δ4x standardization step, each sample is tested to
 882	assess whether the Δ4x variance within all analyses for that
 883	sample differs significantly from that observed for a given reference
 884	sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
 885	which yields a p-value corresponding to the null hypothesis that the
 886	underlying variances are equal).
 887
 888	`LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
 889	sample should be used as a reference for this test.
 890	'''
 891
 892	ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6)  # (Kim et al., 2007, calcite)
 893	'''
 894	Specifies the 18O/16O fractionation factor generally applicable
 895	to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
 896	`D4xdata.standardize_d13C()`, and `D4xdata.standardize_d18O()`.
 897
 898	By default equal to 1.008129 (calcite reacted at 90 °C,
 899	[Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
 900	'''
 901
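	# Example: a minimal sketch of overriding this factor for a different reaction
	# temperature (here 70 °C), assuming the same Kim et al. (2007) expression
	# applies at that temperature (requires `import numpy as np`):
	#
	# 	mydata.ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (70 + 273.15) - 1.79e-3), 6)  # ≈ 1.00871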
 902	Nominal_d13C_VPDB = {
 903		'ETH-1': 2.02,
 904		'ETH-2': -10.17,
 905		'ETH-3': 1.71,
 906		}	# (Bernasconi et al., 2018)
 907	'''
 908	Nominal δ13C_VPDB values assigned to carbonate standards, used by
 909	`D4xdata.standardize_d13C()`.
 910
 911	By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
 912	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 913	'''
 914
 915	Nominal_d18O_VPDB = {
 916		'ETH-1': -2.19,
 917		'ETH-2': -18.69,
 918		'ETH-3': -1.78,
 919		}	# (Bernasconi et al., 2018)
 920	'''
 921	Nominal δ18O_VPDB values assigned to carbonate standards, used by
 922	`D4xdata.standardize_d18O()`.
 923
 924	By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
 925	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 926	'''
 927
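	# Example: a minimal sketch of redefining the nominal composition of a standard
	# before processing; the value below is hypothetical:
	#
	# 	mydata.Nominal_d13C_VPDB['ETH-3'] = 1.70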
 928	d13C_STANDARDIZATION_METHOD = '2pt'
 929	'''
 930	Method by which to standardize δ13C values:
 931	
 932	+ `'none'`: do not apply any δ13C standardization.
 933	+ `'1pt'`: within each session, offset all initial δ13C values so as to
 934	minimize the difference between final δ13C_VPDB values and
 935	`Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
 936	+ `'2pt'`: within each session, apply an affine transformation to all δ13C
 937	values so as to minimize the difference between final δ13C_VPDB
 938	values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
 939	is defined).
 940	'''
 941
 942	d18O_STANDARDIZATION_METHOD = '2pt'
 943	'''
 944	Method by which to standardize δ18O values:
 945	
 946	+ `'none'`: do not apply any δ18O standardization.
 947	+ `'1pt'`: within each session, offset all initial δ18O values so as to
 948	minimize the difference between final δ18O_VPDB values and
 949	`Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
 950	+ `'2pt'`: within each session, apply an affine transformation to all δ18O
 951	values so as to minimize the difference between final δ18O_VPDB
 952	values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
 953	is defined).
 954	'''
 955
 956	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
 957		'''
 958		**Parameters**
 959
 960		+ `l`: a list of dictionaries, with each dictionary including at least the keys
 961		`Sample`, `d45`, `d46`, and `d47` or `d48`.
 962		+ `mass`: `'47'` or `'48'`
 963		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
 964		+ `session`: define session name for analyses without a `Session` key
 965		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
 966
 967		Returns a `D4xdata` object derived from `list`.
 968		'''
 969		self._4x = mass
 970		self.verbose = verbose
 971		self.prefix = 'D4xdata'
 972		self.logfile = logfile
 973		list.__init__(self, l)
 974		self.Nf = None
 975		self.repeatability = {}
 976		self.refresh(session = session)
 977
 978
 979	def make_verbal(oldfun):
 980		'''
 981		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
 982		'''
 983		@wraps(oldfun)
 984		def newfun(*args, verbose = '', **kwargs):
 985			myself = args[0]
 986			oldprefix = myself.prefix
 987			myself.prefix = oldfun.__name__
 988			if verbose != '':
 989				oldverbose = myself.verbose
 990				myself.verbose = verbose
 991			out = oldfun(*args, **kwargs)
 992			myself.prefix = oldprefix
 993			if verbose != '':
 994				myself.verbose = oldverbose
 995			return out
 996		return newfun
 997
 998
 999	def msg(self, txt):
1000		'''
1001		Log a message to `self.logfile`, and print it out if `verbose = True`
1002		'''
1003		self.log(txt)
1004		if self.verbose:
1005			print(f'{f"[{self.prefix}]":<16} {txt}')
1006
1007
1008	def vmsg(self, txt):
1009		'''
1010		Log a message to `self.logfile` and print it out
1011		'''
1012		self.log(txt)
1013		print(txt)
1014
1015
1016	def log(self, *txts):
1017		'''
1018		Log a message to `self.logfile`
1019		'''
1020		if self.logfile:
1021			with open(self.logfile, 'a') as fid:
1022				for txt in txts:
1023					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
1024
1025
1026	def refresh(self, session = 'mySession'):
1027		'''
1028		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
1029		'''
1030		self.fill_in_missing_info(session = session)
1031		self.refresh_sessions()
1032		self.refresh_samples()
1033
1034
1035	def refresh_sessions(self):
1036		'''
1037		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1038		to `False` for all sessions.
1039		'''
1040		self.sessions = {
1041			s: {'data': [r for r in self if r['Session'] == s]}
1042			for s in sorted({r['Session'] for r in self})
1043			}
1044		for s in self.sessions:
1045			self.sessions[s]['scrambling_drift'] = False
1046			self.sessions[s]['slope_drift'] = False
1047			self.sessions[s]['wg_drift'] = False
1048			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1049			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
1050
1051
1052	def refresh_samples(self):
1053		'''
1054		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1055		'''
1056		self.samples = {
1057			s: {'data': [r for r in self if r['Sample'] == s]}
1058			for s in sorted({r['Sample'] for r in self})
1059			}
1060		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1061		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
1062
1063
1064	def read(self, filename, sep = '', session = ''):
1065		'''
1066		Read file in csv format to load data into a `D47data` object.
1067
1068		In the csv file, spaces before and after field separators (`','` by default)
1069		are optional. Each line corresponds to a single analysis.
1070
1071		The required fields are:
1072
1073		+ `UID`: a unique identifier
1074		+ `Session`: an identifier for the analytical session
1075		+ `Sample`: a sample identifier
1076		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1077
1078		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1079		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Any of the working-gas
1080		deltas `d47`, `d48`, and `d49` that are not provided are set to NaN by default.
1081
1082		**Parameters**
1083
1084		+ `filename`: the path of the file to read
1085		+ `sep`: csv separator delimiting the fields
1086		+ `session`: set `Session` field to this string for all analyses
1087		'''
1088		with open(filename) as fid:
1089			self.input(fid.read(), sep = sep, session = session)
1090
1091
1092	def input(self, txt, sep = '', session = ''):
1093		'''
1094		Read `txt` string in csv format to load analysis data into a `D47data` object.
1095
1096		In the csv string, spaces before and after field separators (`','` by default)
1097		are optional. Each line corresponds to a single analysis.
1098
1099		The required fields are:
1100
1101		+ `UID`: a unique identifier
1102		+ `Session`: an identifier for the analytical session
1103		+ `Sample`: a sample identifier
1104		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1105
1106		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1107		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Any of the working-gas
1108		deltas `d47`, `d48`, and `d49` that are not provided are set to NaN by default.
1109
1110		**Parameters**
1111
1112		+ `txt`: the csv string to read
1113		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1114		whichever appears most often in `txt`.
1115		+ `session`: set `Session` field to this string for all analyses
1116		'''
1117		if sep == '':
1118			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1119		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1120		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1121
1122		if session != '':
1123			for r in data:
1124				r['Session'] = session
1125
1126		self += data
1127		self.refresh()
1128
1129
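	# Example: a minimal sketch of loading analyses from a csv string, with made-up
	# delta values and a hypothetical session name:
	#
	# 	mydata = D47data()
	# 	mydata.input('UID,Sample,d45,d46,d47\nA01,FOO-1,0.52,1.66,2.28', session = 'Session01')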
1130	@make_verbal
1131	def wg(self, samples = None, a18_acid = None):
1132		'''
1133		Compute bulk composition of the working gas for each session based on
1134		the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1135		`self.Nominal_d18O_VPDB`.
1136		'''
1137
1138		self.msg('Computing WG composition:')
1139
1140		if a18_acid is None:
1141			a18_acid = self.ALPHA_18O_ACID_REACTION
1142		if samples is None:
1143			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1144
1145		assert a18_acid, f'Acid fractionation factor should not be zero.'
1146
1147		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1148		R45R46_standards = {}
1149		for sample in samples:
1150			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1151			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1152			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1153			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1154			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1155
1156			C12_s = 1 / (1 + R13_s)
1157			C13_s = R13_s / (1 + R13_s)
1158			C16_s = 1 / (1 + R17_s + R18_s)
1159			C17_s = R17_s / (1 + R17_s + R18_s)
1160			C18_s = R18_s / (1 + R17_s + R18_s)
1161
1162			C626_s = C12_s * C16_s ** 2
1163			C627_s = 2 * C12_s * C16_s * C17_s
1164			C628_s = 2 * C12_s * C16_s * C18_s
1165			C636_s = C13_s * C16_s ** 2
1166			C637_s = 2 * C13_s * C16_s * C17_s
1167			C727_s = C12_s * C17_s ** 2
1168
1169			R45_s = (C627_s + C636_s) / C626_s
1170			R46_s = (C628_s + C637_s + C727_s) / C626_s
1171			R45R46_standards[sample] = (R45_s, R46_s)
1172		
1173		for s in self.sessions:
1174			db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
1175			assert db, f'No sample from {samples} found in session "{s}".'
1176# 			dbsamples = sorted({r['Sample'] for r in db})
1177
1178			X = [r['d45'] for r in db]
1179			Y = [R45R46_standards[r['Sample']][0] for r in db]
1180			x1, x2 = np.min(X), np.max(X)
1181
1182			if x1 < x2:
1183				wgcoord = x1/(x1-x2)
1184			else:
1185				wgcoord = 999
1186
1187			if wgcoord < -.5 or wgcoord > 1.5:
1188				# unreasonable to extrapolate to d45 = 0
1189				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1190			else :
1191				# d45 = 0 is reasonably well bracketed
1192				R45_wg = np.polyfit(X, Y, 1)[1]
1193
1194			X = [r['d46'] for r in db]
1195			Y = [R45R46_standards[r['Sample']][1] for r in db]
1196			x1, x2 = np.min(X), np.max(X)
1197
1198			if x1 < x2:
1199				wgcoord = x1/(x1-x2)
1200			else:
1201				wgcoord = 999
1202
1203			if wgcoord < -.5 or wgcoord > 1.5:
1204				# unreasonable to extrapolate to d46 = 0
1205				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1206			else :
1207				# d46 = 0 is reasonably well bracketed
1208				R46_wg = np.polyfit(X, Y, 1)[1]
1209
1210			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1211
1212			self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1213
1214			self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1215			self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1216			for r in self.sessions[s]['data']:
1217				r['d13Cwg_VPDB'] = d13Cwg_VPDB
1218				r['d18Owg_VSMOW'] = d18Owg_VSMOW
1219
1220
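	# Example: a minimal sketch of restricting the WG computation to a subset of the
	# standards and overriding the acid fractionation factor:
	#
	# 	mydata.wg(samples = ['ETH-1', 'ETH-2'], a18_acid = 1.008129)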
1221	def compute_bulk_delta(self, R45, R46, D17O = 0):
1222		'''
1223		Compute δ13C_VPDB and δ18O_VSMOW,
1224		by solving the generalized form of equation (17) from
1225		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1226		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
1227		solving the corresponding second-order Taylor polynomial.
1228		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
1229		'''
1230
1231		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1232
1233		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1234		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1235		C = 2 * self.R18_VSMOW
1236		D = -R46
1237
1238		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1239		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1240		cc = A + B + C + D
1241
1242		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1243
1244		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1245		R17 = K * R18 ** self.LAMBDA_17
1246		R13 = R45 - 2 * R17
1247
1248		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1249
1250		return d13C_VPDB, d18O_VSMOW
1251
1252
1253	@make_verbal
1254	def crunch(self, verbose = ''):
1255		'''
1256		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1257		'''
1258		for r in self:
1259			self.compute_bulk_and_clumping_deltas(r)
1260		self.standardize_d13C()
1261		self.standardize_d18O()
1262		self.msg(f"Crunched {len(self)} analyses.")
1263
1264
1265	def fill_in_missing_info(self, session = 'mySession'):
1266		'''
1267		Fill in optional fields with default values
1268		'''
1269		for i,r in enumerate(self):
1270			if 'D17O' not in r:
1271				r['D17O'] = 0.
1272			if 'UID' not in r:
1273				r['UID'] = f'{i+1}'
1274			if 'Session' not in r:
1275				r['Session'] = session
1276			for k in ['d47', 'd48', 'd49']:
1277				if k not in r:
1278					r[k] = np.nan
1279
1280
1281	def standardize_d13C(self):
1282		'''
1283		Perform δ13C standardization within each session `s` according to
1284		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1285		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1286		may be redefined arbitrarily at a later stage.
1287		'''
1288		for s in self.sessions:
1289			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1290				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1291				X,Y = zip(*XY)
1292				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1293					offset = np.mean(Y) - np.mean(X)
1294					for r in self.sessions[s]['data']:
1295						r['d13C_VPDB'] += offset				
1296				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1297					a,b = np.polyfit(X,Y,1)
1298					for r in self.sessions[s]['data']:
1299						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b
1300
1301	def standardize_d18O(self):
1302		'''
1303		Perform δ18O standardization within each session `s` according to
1304		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1305		which is defined by default by `D47data.refresh_sessions()` as equal to
1306		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1307		'''
1308		for s in self.sessions:
1309			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1310				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1311				X,Y = zip(*XY)
1312				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1313				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1314					offset = np.mean(Y) - np.mean(X)
1315					for r in self.sessions[s]['data']:
1316						r['d18O_VSMOW'] += offset				
1317				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1318					a,b = np.polyfit(X,Y,1)
1319					for r in self.sessions[s]['data']:
1320						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
1321	
1322
1323	def compute_bulk_and_clumping_deltas(self, r):
1324		'''
1325		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1326		'''
1327
1328		# Compute working gas R13, R18, and isobar ratios
1329		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1330		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1331		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1332
1333		# Compute analyte isobar ratios
1334		R45 = (1 + r['d45'] / 1000) * R45_wg
1335		R46 = (1 + r['d46'] / 1000) * R46_wg
1336		R47 = (1 + r['d47'] / 1000) * R47_wg
1337		R48 = (1 + r['d48'] / 1000) * R48_wg
1338		R49 = (1 + r['d49'] / 1000) * R49_wg
1339
1340		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1341		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1342		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1343
1344		# Compute stochastic isobar ratios of the analyte
1345		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1346			R13, R18, D17O = r['D17O']
1347		)
1348
1349		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1350		# and raise a warning if the corresponding anomalies exceed 0.05 ppm.
1351		if (R45 / R45stoch - 1) > 5e-8:
1352			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1353		if (R46 / R46stoch - 1) > 5e-8:
1354			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1355
1356		# Compute raw clumped isotope anomalies
1357		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1358		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1359		r['D49raw'] = 1000 * (R49 / R49stoch - 1)
1360
1361
1362	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1363		'''
1364		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1365		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1366		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1367		'''
1368
1369		# Compute R17
1370		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1371
1372		# Compute isotope concentrations
1373		C12 = (1 + R13) ** -1
1374		C13 = C12 * R13
1375		C16 = (1 + R17 + R18) ** -1
1376		C17 = C16 * R17
1377		C18 = C16 * R18
1378
1379		# Compute stochastic isotopologue concentrations
1380		C626 = C16 * C12 * C16
1381		C627 = C16 * C12 * C17 * 2
1382		C628 = C16 * C12 * C18 * 2
1383		C636 = C16 * C13 * C16
1384		C637 = C16 * C13 * C17 * 2
1385		C638 = C16 * C13 * C18 * 2
1386		C727 = C17 * C12 * C17
1387		C728 = C17 * C12 * C18 * 2
1388		C737 = C17 * C13 * C17
1389		C738 = C17 * C13 * C18 * 2
1390		C828 = C18 * C12 * C18
1391		C838 = C18 * C13 * C18
1392
1393		# Compute stochastic isobar ratios
1394		R45 = (C636 + C627) / C626
1395		R46 = (C628 + C637 + C727) / C626
1396		R47 = (C638 + C728 + C737) / C626
1397		R48 = (C738 + C828) / C626
1398		R49 = C838 / C626
1399
1400		# Account for stochastic anomalies
1401		R47 *= 1 + D47 / 1000
1402		R48 *= 1 + D48 / 1000
1403		R49 *= 1 + D49 / 1000
1404
1405		# Return isobar ratios
1406		return R45, R46, R47, R48, R49
1407
1408
1409	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1410		'''
1411		Split unknown samples by UID (treat all analyses as different samples)
1412		or by session (treat analyses of a given sample in different sessions as
1413		different samples).
1414
1415		**Parameters**
1416
1417		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
1418		+ `grouping`: `by_uid` | `by_session`
1419		'''
1420		if samples_to_split == 'all':
1421			samples_to_split = [s for s in self.unknowns]
1422		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1423		self.grouping = grouping.lower()
1424		if self.grouping in gkeys:
1425			gkey = gkeys[self.grouping]
1426		for r in self:
1427			if r['Sample'] in samples_to_split:
1428				r['Sample_original'] = r['Sample']
1429				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1430			elif r['Sample'] in self.unknowns:
1431				r['Sample_original'] = r['Sample']
1432		self.refresh_samples()
1433
1434
1435	def unsplit_samples(self, tables = False):
1436		'''
1437		Reverse the effects of `D47data.split_samples()`.
1438		
1439		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1440		
1441		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1442		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1443		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1444		effects of `D47data.split_samples()` with `grouping='by_session'` (because in
1445		that case session-averaged Δ4x values are statistically independent).
1446		'''
1447		unknowns_old = sorted({s for s in self.unknowns})
1448		CM_old = self.standardization.covar[:,:]
1449		VD_old = self.standardization.params.valuesdict().copy()
1450		vars_old = self.standardization.var_names
1451
1452		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1453
1454		Ns = len(vars_old) - len(unknowns_old)
1455		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1456		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1457
1458		W = np.zeros((len(vars_new), len(vars_old)))
1459		W[:Ns,:Ns] = np.eye(Ns)
1460		for u in unknowns_new:
1461			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1462			if self.grouping == 'by_session':
1463				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1464			elif self.grouping == 'by_uid':
1465				weights = [1 for s in splits]
1466			sw = sum(weights)
1467			weights = [w/sw for w in weights]
1468			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1469
1470		CM_new = W @ CM_old @ W.T
1471		V = W @ np.array([[VD_old[k]] for k in vars_old])
1472		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1473
1474		self.standardization.covar = CM_new
1475		self.standardization.params.valuesdict = lambda : VD_new
1476		self.standardization.var_names = vars_new
1477
1478		for r in self:
1479			if r['Sample'] in self.unknowns:
1480				r['Sample_split'] = r['Sample']
1481				r['Sample'] = r['Sample_original']
1482
1483		self.refresh_samples()
1484		self.consolidate_samples()
1485		self.repeatabilities()
1486
1487		if tables:
1488			self.table_of_analyses()
1489			self.table_of_samples()
1490
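	# Example: a minimal sketch of checking between-session consistency of the
	# unknowns by splitting them by session, standardizing, then merging back
	# (assumes `mydata` has already been read and crunched):
	#
	# 	mydata.split_samples(grouping = 'by_session')
	# 	mydata.standardize(method = 'pooled')
	# 	mydata.unsplit_samples()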
1491	def assign_timestamps(self):
1492		'''
1493		Assign a time field `t` of type `float` to each analysis.
1494
1495		If `TimeTag` is one of the data fields, `t` is equal within a given session
1496		to `TimeTag` minus the mean value of `TimeTag` for that session.
1497		Otherwise, `TimeTag` defaults to the index of each analysis within its
1498		session, and `t` is defined as above.
1499		'''
1500		for session in self.sessions:
1501			sdata = self.sessions[session]['data']
1502			try:
1503				t0 = np.mean([r['TimeTag'] for r in sdata])
1504				for r in sdata:
1505					r['t'] = r['TimeTag'] - t0
1506			except KeyError:
1507				t0 = (len(sdata)-1)/2
1508				for t,r in enumerate(sdata):
1509					r['t'] = t - t0
1510
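	# For example, a session of five analyses without a `TimeTag` field ends up with
	# t values of [-2, -1, 0, 1, 2], which average to zero within that session.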
1511
1512	def report(self):
1513		'''
1514		Prints a report on the standardization fit.
1515		Only applicable after `D4xdata.standardize(method='pooled')`.
1516		'''
1517		report_fit(self.standardization)
1518
1519
1520	def combine_samples(self, sample_groups):
1521		'''
1522		Combine analyses of different samples to compute weighted average Δ4x
1523		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1524		dictionary.
1525		
1526		Caution: samples are weighted by number of replicate analyses, which is a
1527		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1528		correlated analytical errors for one or more samples).
1529		
1530		Returns a tuple of:
1531		
1532		+ the list of group names
1533		+ an array of the corresponding Δ4x values
1534		+ the corresponding (co)variance matrix
1535		
1536		**Parameters**
1537
1538		+ `sample_groups`: a dictionary of the form:
1539		```py
1540		{'group1': ['sample_1', 'sample_2'],
1541		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1542		```
1543		'''
1544		
1545		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1546		groups = sorted(sample_groups.keys())
1547		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1548		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1549		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1550		W = np.array([
1551			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1552			for j in groups])
1553		D4x_new = W @ D4x_old
1554		CM_new = W @ CM_old @ W.T
1555
1556		return groups, D4x_new[:,0], CM_new
1557		
1558
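	# Example: a minimal sketch, using hypothetical sample names, only applicable
	# after standardization:
	#
	# 	groups, D47_avg, CM = mydata.combine_samples({'FOO': ['FOO-1', 'FOO-2']})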
1559	@make_verbal
1560	def standardize(self,
1561		method = 'pooled',
1562		weighted_sessions = [],
1563		consolidate = True,
1564		consolidate_tables = False,
1565		consolidate_plots = False,
1566		constraints = {},
1567		):
1568		'''
1569		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1570		If `method` argument is set to `'pooled'`, the standardization processes all sessions
1571		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1572		i.e. that their true Δ4x value does not change between sessions,
1573		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
1574		`'indep_sessions'`, the standardization processes each session independently, based only
1575		on anchors analyses.
1576		'''
1577
1578		self.standardization_method = method
1579		self.assign_timestamps()
1580
1581		if method == 'pooled':
1582			if weighted_sessions:
1583				for session_group in weighted_sessions:
1584					if self._4x == '47':
1585						X = D47data([r for r in self if r['Session'] in session_group])
1586					elif self._4x == '48':
1587						X = D48data([r for r in self if r['Session'] in session_group])
1588					X.Nominal_D4x = self.Nominal_D4x.copy()
1589					X.refresh()
1590					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1591					w = np.sqrt(result.redchi)
1592					self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
1593					for r in X:
1594						r[f'wD{self._4x}raw'] *= w
1595			else:
1596				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1597				for r in self:
1598					r[f'wD{self._4x}raw'] = 1.
1599
1600			params = Parameters()
1601			for k,session in enumerate(self.sessions):
1602				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1603				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1604				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1605				s = pf(session)
1606				params.add(f'a_{s}', value = 0.9)
1607				params.add(f'b_{s}', value = 0.)
1608				params.add(f'c_{s}', value = -0.9)
1609				params.add(f'a2_{s}', value = 0.,
1610# 					vary = self.sessions[session]['scrambling_drift'],
1611					)
1612				params.add(f'b2_{s}', value = 0.,
1613# 					vary = self.sessions[session]['slope_drift'],
1614					)
1615				params.add(f'c2_{s}', value = 0.,
1616# 					vary = self.sessions[session]['wg_drift'],
1617					)
1618				if not self.sessions[session]['scrambling_drift']:
1619					params[f'a2_{s}'].expr = '0'
1620				if not self.sessions[session]['slope_drift']:
1621					params[f'b2_{s}'].expr = '0'
1622				if not self.sessions[session]['wg_drift']:
1623					params[f'c2_{s}'].expr = '0'
1624
1625			for sample in self.unknowns:
1626				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1627
1628			for k in constraints:
1629				params[k].expr = constraints[k]
1630
1631			def residuals(p):
1632				R = []
1633				for r in self:
1634					session = pf(r['Session'])
1635					sample = pf(r['Sample'])
1636					if r['Sample'] in self.Nominal_D4x:
1637						R += [ (
1638							r[f'D{self._4x}raw'] - (
1639								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1640								+ p[f'b_{session}'] * r[f'd{self._4x}']
1641								+	p[f'c_{session}']
1642								+ r['t'] * (
1643									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1644									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1645									+	p[f'c2_{session}']
1646									)
1647								)
1648							) / r[f'wD{self._4x}raw'] ]
1649					else:
1650						R += [ (
1651							r[f'D{self._4x}raw'] - (
1652								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1653								+ p[f'b_{session}'] * r[f'd{self._4x}']
1654								+	p[f'c_{session}']
1655								+ r['t'] * (
1656									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1657									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1658									+	p[f'c2_{session}']
1659									)
1660								)
1661							) / r[f'wD{self._4x}raw'] ]
1662				return R
1663
1664			M = Minimizer(residuals, params)
1665			result = M.least_squares()
1666			self.Nf = result.nfree
1667			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1668			new_names, new_covar, new_se = _fullcovar(result)[:3]
1669			result.var_names = new_names
1670			result.covar = new_covar
1671
1672			for r in self:
1673				s = pf(r["Session"])
1674				a = result.params.valuesdict()[f'a_{s}']
1675				b = result.params.valuesdict()[f'b_{s}']
1676				c = result.params.valuesdict()[f'c_{s}']
1677				a2 = result.params.valuesdict()[f'a2_{s}']
1678				b2 = result.params.valuesdict()[f'b2_{s}']
1679				c2 = result.params.valuesdict()[f'c2_{s}']
1680				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1681				
1682
1683			self.standardization = result
1684
1685			for session in self.sessions:
1686				self.sessions[session]['Np'] = 3
1687				for k in ['scrambling', 'slope', 'wg']:
1688					if self.sessions[session][f'{k}_drift']:
1689						self.sessions[session]['Np'] += 1
1690
1691			if consolidate:
1692				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1693			return result
1694
1695
1696		elif method == 'indep_sessions':
1697
1698			if weighted_sessions:
1699				for session_group in weighted_sessions:
1700					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1701					X.Nominal_D4x = self.Nominal_D4x.copy()
1702					X.refresh()
1703					# This is only done to assign r['wD47raw'] for r in X:
1704					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1705					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1706			else:
1707				self.msg('All weights set to 1 ‰')
1708				for r in self:
1709					r[f'wD{self._4x}raw'] = 1
1710
1711			for session in self.sessions:
1712				s = self.sessions[session]
1713				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1714				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1715				s['Np'] = sum(p_active)
1716				sdata = s['data']
1717
1718				A = np.array([
1719					[
1720						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1721						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1722						1 / r[f'wD{self._4x}raw'],
1723						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1724						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1725						r['t'] / r[f'wD{self._4x}raw']
1726						]
1727					for r in sdata if r['Sample'] in self.anchors
1728					])[:,p_active] # only keep columns for the active parameters
1729				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1730				s['Na'] = Y.size
1731				CM = linalg.inv(A.T @ A)
1732				bf = (CM @ A.T @ Y).T[0,:]
1733				k = 0
1734				for n,a in zip(p_names, p_active):
1735					if a:
1736						s[n] = bf[k]
1737# 						self.msg(f'{n} = {bf[k]}')
1738						k += 1
1739					else:
1740						s[n] = 0.
1741# 						self.msg(f'{n} = 0.0')
1742
1743				for r in sdata :
1744					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1745					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1746					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1747
1748				s['CM'] = np.zeros((6,6))
1749				i = 0
1750				k_active = [j for j,a in enumerate(p_active) if a]
1751				for j,a in enumerate(p_active):
1752					if a:
1753						s['CM'][j,k_active] = CM[i,:]
1754						i += 1
1755
1756			if not weighted_sessions:
1757				w = self.rmswd()['rmswd']
1758				for r in self:
1759						r[f'wD{self._4x}'] *= w
1760						r[f'wD{self._4x}raw'] *= w
1761				for session in self.sessions:
1762					self.sessions[session]['CM'] *= w**2
1763
1764			for session in self.sessions:
1765				s = self.sessions[session]
1766				s['SE_a'] = s['CM'][0,0]**.5
1767				s['SE_b'] = s['CM'][1,1]**.5
1768				s['SE_c'] = s['CM'][2,2]**.5
1769				s['SE_a2'] = s['CM'][3,3]**.5
1770				s['SE_b2'] = s['CM'][4,4]**.5
1771				s['SE_c2'] = s['CM'][5,5]**.5
1772
1773			if not weighted_sessions:
1774				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1775			else:
1776				self.Nf = 0
1777				for sg in weighted_sessions:
1778					self.Nf += self.rmswd(sessions = sg)['Nf']
1779
1780			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1781
1782			avgD4x = {
1783				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1784				for sample in self.samples
1785				}
1786			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1787			rD4x = (chi2/self.Nf)**.5
1788			self.repeatability[f'sigma_{self._4x}'] = rD4x
1789
1790			if consolidate:
1791				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1792
1793
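	# Example: a minimal sketch of allowing the WG offset of one session (here named
	# 'Session01', a hypothetical name) to drift linearly with time before a pooled
	# standardization:
	#
	# 	mydata.sessions['Session01']['wg_drift'] = True
	# 	mydata.standardize(method = 'pooled')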
1794	def standardization_error(self, session, d4x, D4x, t = 0):
1795		'''
1796		Compute standardization error for a given session and
1797		(δ47, Δ47) composition.
1798		'''
1799		a = self.sessions[session]['a']
1800		b = self.sessions[session]['b']
1801		c = self.sessions[session]['c']
1802		a2 = self.sessions[session]['a2']
1803		b2 = self.sessions[session]['b2']
1804		c2 = self.sessions[session]['c2']
1805		CM = self.sessions[session]['CM']
1806
1807		x, y = D4x, d4x
1808		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1809# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1810		dxdy = -(b+b2*t) / (a+a2*t)
1811		dxdz = 1. / (a+a2*t)
1812		dxda = -x / (a+a2*t)
1813		dxdb = -y / (a+a2*t)
1814		dxdc = -1. / (a+a2*t)
1815		dxda2 = -x * t / (a+a2*t)
1816		dxdb2 = -y * t / (a+a2*t)
1817		dxdc2 = -t / (a+a2*t)
1818		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1819		sx = (V @ CM @ V.T) ** .5
1820		return sx
1821
1822
1823	@make_verbal
1824	def summary(self,
1825		dir = 'output',
1826		filename = None,
1827		save_to_file = True,
1828		print_out = True,
1829		):
1830		'''
1831		Print out and/or save to disk a summary of the standardization results.
1832
1833		**Parameters**
1834
1835		+ `dir`: the directory in which to save the table
1836		+ `filename`: the name of the csv file to write to
1837		+ `save_to_file`: whether to save the table to disk
1838		+ `print_out`: whether to print out the table
1839		'''
1840
1841		out = []
1842		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1843		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1844		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1845		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1846		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1847		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1848		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1849		out += [['Model degrees of freedom', f"{self.Nf}"]]
1850		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1851		out += [['Standardization method', self.standardization_method]]
1852
1853		if save_to_file:
1854			if not os.path.exists(dir):
1855				os.makedirs(dir)
1856			if filename is None:
1857				filename = f'D{self._4x}_summary.csv'
1858			with open(f'{dir}/{filename}', 'w') as fid:
1859				fid.write(make_csv(out))
1860		if print_out:
1861			self.msg('\n' + pretty_table(out, header = 0))
1862
1863
1864	@make_verbal
1865	def table_of_sessions(self,
1866		dir = 'output',
1867		filename = None,
1868		save_to_file = True,
1869		print_out = True,
1870		output = None,
1871		):
1872		'''
1873		Print out and/or save to disk a table of sessions.
1874
1875		**Parameters**
1876
1877		+ `dir`: the directory in which to save the table
1878		+ `filename`: the name of the csv file to write to
1879		+ `save_to_file`: whether to save the table to disk
1880		+ `print_out`: whether to print out the table
1881		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1882		    if set to `'raw'`: return a list of lists of strings
1883		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1884		'''
1885		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1886		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1887		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1888
1889		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1890		if include_a2:
1891			out[-1] += ['a2 ± SE']
1892		if include_b2:
1893			out[-1] += ['b2 ± SE']
1894		if include_c2:
1895			out[-1] += ['c2 ± SE']
1896		for session in self.sessions:
1897			out += [[
1898				session,
1899				f"{self.sessions[session]['Na']}",
1900				f"{self.sessions[session]['Nu']}",
1901				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1902				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1903				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1904				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1905				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1906				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1907				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1908				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1909				]]
1910			if include_a2:
1911				if self.sessions[session]['scrambling_drift']:
1912					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1913				else:
1914					out[-1] += ['']
1915			if include_b2:
1916				if self.sessions[session]['slope_drift']:
1917					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1918				else:
1919					out[-1] += ['']
1920			if include_c2:
1921				if self.sessions[session]['wg_drift']:
1922					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1923				else:
1924					out[-1] += ['']
1925
1926		if save_to_file:
1927			if not os.path.exists(dir):
1928				os.makedirs(dir)
1929			if filename is None:
1930				filename = f'D{self._4x}_sessions.csv'
1931			with open(f'{dir}/{filename}', 'w') as fid:
1932				fid.write(make_csv(out))
1933		if print_out:
1934			self.msg('\n' + pretty_table(out))
1935		if output == 'raw':
1936			return out
1937		elif output == 'pretty':
1938			return pretty_table(out)
1939
1940
1941	@make_verbal
1942	def table_of_analyses(
1943		self,
1944		dir = 'output',
1945		filename = None,
1946		save_to_file = True,
1947		print_out = True,
1948		output = None,
1949		):
1950		'''
1951		Print out and/or save to disk a table of analyses.
1952
1953		**Parameters**
1954
1955		+ `dir`: the directory in which to save the table
1956		+ `filename`: the name of the csv file to write to
1957		+ `save_to_file`: whether to save the table to disk
1958		+ `print_out`: whether to print out the table
1959		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1960		    if set to `'raw'`: return a list of lists of strings
1961		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1962		'''
1963
1964		out = [['UID','Session','Sample']]
1965		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1966		for f in extra_fields:
1967			out[-1] += [f[0]]
1968		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1969		for r in self:
1970			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
1971			for f in extra_fields:
1972				out[-1] += [f"{r[f[0]]:{f[1]}}"]
1973			out[-1] += [
1974				f"{r['d13Cwg_VPDB']:.3f}",
1975				f"{r['d18Owg_VSMOW']:.3f}",
1976				f"{r['d45']:.6f}",
1977				f"{r['d46']:.6f}",
1978				f"{r['d47']:.6f}",
1979				f"{r['d48']:.6f}",
1980				f"{r['d49']:.6f}",
1981				f"{r['d13C_VPDB']:.6f}",
1982				f"{r['d18O_VSMOW']:.6f}",
1983				f"{r['D47raw']:.6f}",
1984				f"{r['D48raw']:.6f}",
1985				f"{r['D49raw']:.6f}",
1986				f"{r[f'D{self._4x}']:.6f}"
1987				]
1988		if save_to_file:
1989			if not os.path.exists(dir):
1990				os.makedirs(dir)
1991			if filename is None:
1992				filename = f'D{self._4x}_analyses.csv'
1993			with open(f'{dir}/{filename}', 'w') as fid:
1994				fid.write(make_csv(out))
1995		if print_out:
1996			self.msg('\n' + pretty_table(out))
1997		return out
1998
1999	@make_verbal
2000	def covar_table(
2001		self,
2002		correl = False,
2003		dir = 'output',
2004		filename = None,
2005		save_to_file = True,
2006		print_out = True,
2007		output = None,
2008		):
2009		'''
2010		Print out, save to disk and/or return the variance-covariance matrix of D4x
2011		for all unknown samples.
2012
2013		**Parameters**
2014
2015		+ `dir`: the directory in which to save the csv
2016		+ `filename`: the name of the csv file to write to
2017		+ `save_to_file`: whether to save the csv
2018		+ `print_out`: whether to print out the matrix
2019		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
2020		    if set to `'raw'`: return a list of lists of strings
2021		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2022		'''
2023		samples = sorted([u for u in self.unknowns])
2024		out = [[''] + samples]
2025		for s1 in samples:
2026			out.append([s1])
2027			for s2 in samples:
2028				if correl:
2029					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
2030				else:
2031					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
2032
2033		if save_to_file:
2034			if not os.path.exists(dir):
2035				os.makedirs(dir)
2036			if filename is None:
2037				if correl:
2038					filename = f'D{self._4x}_correl.csv'
2039				else:
2040					filename = f'D{self._4x}_covar.csv'
2041			with open(f'{dir}/{filename}', 'w') as fid:
2042				fid.write(make_csv(out))
2043		if print_out:
2044			self.msg('\n'+pretty_table(out))
2045		if output == 'raw':
2046			return out
2047		elif output == 'pretty':
2048			return pretty_table(out)
2049
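	# Example: a minimal sketch of retrieving the correlation matrix of the unknowns
	# as a pretty-printed string, without writing anything to disk:
	#
	# 	txt = mydata.covar_table(correl = True, save_to_file = False, print_out = False, output = 'pretty')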
2050	@make_verbal
2051	def table_of_samples(
2052		self,
2053		dir = 'output',
2054		filename = None,
2055		save_to_file = True,
2056		print_out = True,
2057		output = None,
2058		):
2059		'''
2060		Print out, save to disk and/or return a table of samples.
2061
2062		**Parameters**
2063
2064		+ `dir`: the directory in which to save the csv
2065		+ `filename`: the name of the csv file to write to
2066		+ `save_to_file`: whether to save the csv
2067		+ `print_out`: whether to print out the table
2068		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2069		    if set to `'raw'`: return a list of lists of strings
2070		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2071		'''
2072
2073		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2074		for sample in self.anchors:
2075			out += [[
2076				f"{sample}",
2077				f"{self.samples[sample]['N']}",
2078				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2079				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2080				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2081				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2082				]]
2083		for sample in self.unknowns:
2084			out += [[
2085				f"{sample}",
2086				f"{self.samples[sample]['N']}",
2087				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2088				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2089				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2090				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2091				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2092				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2093				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2094				]]
2095		if save_to_file:
2096			if not os.path.exists(dir):
2097				os.makedirs(dir)
2098			if filename is None:
2099				filename = f'D{self._4x}_samples.csv'
2100			with open(f'{dir}/{filename}', 'w') as fid:
2101				fid.write(make_csv(out))
2102		if print_out:
2103			self.msg('\n'+pretty_table(out))
2104		if output == 'raw':
2105			return out
2106		elif output == 'pretty':
2107			return pretty_table(out)
2108
2109
2110	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2111		'''
2112		Generate session plots and save them to disk.
2113
2114		**Parameters**
2115
2116		+ `dir`: the directory in which to save the plots
2117		+ `figsize`: the width and height (in inches) of each plot
2118		+ `filetype`: 'pdf' or 'png'
2119		+ `dpi`: resolution for PNG output
2120		'''
2121		if not os.path.exists(dir):
2122			os.makedirs(dir)
2123
2124		for session in self.sessions:
2125			sp = self.plot_single_session(session, xylimits = 'constant')
2126			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2127			ppl.close(sp.fig)
2128			
2129
2130
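	# Example: a minimal sketch of saving the session plots as PNG files at 300 dpi:
	#
	# 	mydata.plot_sessions(filetype = 'png', dpi = 300)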
2131	@make_verbal
2132	def consolidate_samples(self):
2133		'''
2134		Compile various statistics for each sample.
2135
2136		For each anchor sample:
2137
2138		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2139		+ `SE_D47` or `SE_D48`: set to zero by definition
2140
2141		For each unknown sample:
2142
2143		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2144		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2145
2146		For each anchor and unknown:
2147
2148		+ `N`: the total number of analyses of this sample
2149		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2150		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2151		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2152		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2153		variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2154		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
2155		'''
2156		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2157		for sample in self.samples:
2158			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2159			if self.samples[sample]['N'] > 1:
2160				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2161
2162			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2163			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2164
2165			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2166			if len(D4x_pop) > 2:
2167				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2168			
2169		if self.standardization_method == 'pooled':
2170			for sample in self.anchors:
2171				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2172				self.samples[sample][f'SE_D{self._4x}'] = 0.
2173			for sample in self.unknowns:
2174				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2175				try:
2176					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2177				except ValueError:
2178					# when `sample` is constrained by self.standardize(constraints = {...}),
2179					# it is no longer listed in self.standardization.var_names.
2180					# Temporary fix: define SE as zero for now
2181					self.samples[sample][f'SE_D{self._4x}'] = 0.
2182
2183		elif self.standardization_method == 'indep_sessions':
2184			for sample in self.anchors:
2185				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2186				self.samples[sample][f'SE_D{self._4x}'] = 0.
2187			for sample in self.unknowns:
2188				self.msg(f'Consolidating sample {sample}')
2189				self.unknowns[sample][f'session_D{self._4x}'] = {}
2190				session_avg = []
2191				for session in self.sessions:
2192					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2193					if sdata:
2194						self.msg(f'{sample} found in session {session}')
2195						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2196						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2197						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2198						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2199						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2200						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2201						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2202				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2203				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2204				wsum = sum([weights[s] for s in weights])
2205				for s in weights:
2206					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2207
2208		for r in self:
2209			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
2210
2211
2212
2213	def consolidate_sessions(self):
2214		'''
2215		Compute various statistics for each session.
2216
2217		+ `Na`: Number of anchor analyses in the session
2218		+ `Nu`: Number of unknown analyses in the session
2219		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2220		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2221		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2222		+ `a`: scrambling factor
2223		+ `b`: compositional slope
2224		+ `c`: WG offset
2225		+ `SE_a`: Model standard error of `a`
2226		+ `SE_b`: Model standard error of `b`
2227		+ `SE_c`: Model standard error of `c`
2228		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2229		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2230		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2231		+ `a2`: scrambling factor drift
2232		+ `b2`: compositional slope drift
2233		+ `c2`: WG offset drift
2234		+ `Np`: Number of standardization parameters to fit
2235		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2236		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2237		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
2238		'''
2239		for session in self.sessions:
2240			if 'd13Cwg_VPDB' not in self.sessions[session]:
2241				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2242			if 'd18Owg_VSMOW' not in self.sessions[session]:
2243				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2244			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2245			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2246
2247			self.msg(f'Computing repeatabilities for session {session}')
2248			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2249			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2250			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2251
2252		if self.standardization_method == 'pooled':
2253			for session in self.sessions:
2254
2255				# different (better?) computation of D4x repeatability for each session:
2256				sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
2257				self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5
2258
2259				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2260				i = self.standardization.var_names.index(f'a_{pf(session)}')
2261				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2262
2263				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2264				i = self.standardization.var_names.index(f'b_{pf(session)}')
2265				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2266
2267				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2268				i = self.standardization.var_names.index(f'c_{pf(session)}')
2269				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2270
2271				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2272				if self.sessions[session]['scrambling_drift']:
2273					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2274					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2275				else:
2276					self.sessions[session]['SE_a2'] = 0.
2277
2278				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2279				if self.sessions[session]['slope_drift']:
2280					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2281					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2282				else:
2283					self.sessions[session]['SE_b2'] = 0.
2284
2285				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2286				if self.sessions[session]['wg_drift']:
2287					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2288					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2289				else:
2290					self.sessions[session]['SE_c2'] = 0.
2291
2292				i = self.standardization.var_names.index(f'a_{pf(session)}')
2293				j = self.standardization.var_names.index(f'b_{pf(session)}')
2294				k = self.standardization.var_names.index(f'c_{pf(session)}')
2295				CM = np.zeros((6,6))
2296				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2297				try:
2298					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2299					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2300					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2301					try:
2302						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2303						CM[3,4] = self.standardization.covar[i2,j2]
2304						CM[4,3] = self.standardization.covar[j2,i2]
2305					except ValueError:
2306						pass
2307					try:
2308						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2309						CM[3,5] = self.standardization.covar[i2,k2]
2310						CM[5,3] = self.standardization.covar[k2,i2]
2311					except ValueError:
2312						pass
2313				except ValueError:
2314					pass
2315				try:
2316					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2317					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2318					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2319					try:
2320						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2321						CM[4,5] = self.standardization.covar[j2,k2]
2322						CM[5,4] = self.standardization.covar[k2,j2]
2323					except ValueError:
2324						pass
2325				except ValueError:
2326					pass
2327				try:
2328					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2329					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2330					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2331				except ValueError:
2332					pass
2333
2334				self.sessions[session]['CM'] = CM
2335
2336		elif self.standardization_method == 'indep_sessions':
2337			pass # Not implemented yet
2338
2339
2340	@make_verbal
2341	def repeatabilities(self):
2342		'''
2343		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2344		(for all samples, for anchors, and for unknowns).
2345		'''
2346		self.msg('Computing repeatabilities for all sessions')
2347
2348		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2349		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2350		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2351		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2352		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
2353
2354
2355	@make_verbal
2356	def consolidate(self, tables = True, plots = True):
2357		'''
2358		Collect information about samples, sessions and repeatabilities.
2359		'''
2360		self.consolidate_samples()
2361		self.consolidate_sessions()
2362		self.repeatabilities()
2363
2364		if tables:
2365			self.summary()
2366			self.table_of_sessions()
2367			self.table_of_analyses()
2368			self.table_of_samples()
2369
2370		if plots:
2371			self.plot_sessions()
2372
2373
2374	@make_verbal
2375	def rmswd(self,
2376		samples = 'all samples',
2377		sessions = 'all sessions',
2378		):
2379		'''
2380		Compute the χ2, the root mean squared weighted deviation
2381		(i.e. the square root of the reduced χ2), and the corresponding degrees
2382		of freedom of the Δ4x values for samples in `samples` and sessions in `sessions`.
2383		
2384		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
2385		'''
2386		if samples == 'all samples':
2387			mysamples = [k for k in self.samples]
2388		elif samples == 'anchors':
2389			mysamples = [k for k in self.anchors]
2390		elif samples == 'unknowns':
2391			mysamples = [k for k in self.unknowns]
2392		else:
2393			mysamples = samples
2394
2395		if sessions == 'all sessions':
2396			sessions = [k for k in self.sessions]
2397
2398		chisq, Nf = 0, 0
2399		for sample in mysamples :
2400			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2401			if len(G) > 1 :
2402				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2403				Nf += (len(G) - 1)
2404				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2405		r = (chisq / Nf)**.5 if Nf > 0 else 0
2406		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2407		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
2408
2409	
2410	@make_verbal
2411	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2412		'''
2413		Compute the repeatability of `[r[key] for r in self]`
2414		'''
2415
2416		if samples == 'all samples':
2417			mysamples = [k for k in self.samples]
2418		elif samples == 'anchors':
2419			mysamples = [k for k in self.anchors]
2420		elif samples == 'unknowns':
2421			mysamples = [k for k in self.unknowns]
2422		else:
2423			mysamples = samples
2424
2425		if sessions == 'all sessions':
2426			sessions = [k for k in self.sessions]
2427
2428		if key in ['D47', 'D48']:
2429			# Full disclosure: the definition of Nf is tricky/debatable
2430			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2431			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2432			Nf = len(G)
2433# 			print(f'len(G) = {Nf}')
2434			Nf -= len([s for s in mysamples if s in self.unknowns])
2435# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2436			for session in sessions:
2437				Np = len([
2438					_ for _ in self.standardization.params
2439					if (
2440						self.standardization.params[_].expr is not None
2441						and (
2442							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2443							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2444							)
2445						)
2446					])
2447# 				print(f'session {session}: {Np} parameters to consider')
2448				Na = len({
2449					r['Sample'] for r in self.sessions[session]['data']
2450					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2451					})
2452# 				print(f'session {session}: {Na} different anchors in that session')
2453				Nf -= min(Np, Na)
2454# 			print(f'Nf = {Nf}')
2455
2456# 			for sample in mysamples :
2457# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2458# 				if len(X) > 1 :
2459# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2460# 					if sample in self.unknowns:
2461# 						Nf += len(X) - 1
2462# 					else:
2463# 						Nf += len(X)
2464# 			if samples in ['anchors', 'all samples']:
2465# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2466			r = (chisq / Nf)**.5 if Nf > 0 else 0
2467
2468		else: # if key not in ['D47', 'D48']
2469			chisq, Nf = 0, 0
2470			for sample in mysamples :
2471				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2472				if len(X) > 1 :
2473					Nf += len(X) - 1
2474					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2475			r = (chisq / Nf)**.5 if Nf > 0 else 0
2476
2477		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2478		return r
2479
2480	def sample_average(self, samples, weights = 'equal', normalize = True):
2481		'''
2482		Weighted average Δ4x value of a group of samples, accounting for covariance.
2483
2484		Returns the weighted average Δ4x value and associated SE
2485		of a group of samples. Weights are equal by default. If `normalize` is
2486		true, `weights` will be rescaled so that their sum equals 1.
2487
2488		**Examples**
2489
2490		```python
2491		self.sample_average(['X','Y'], [1, 2])
2492		```
2493
2494		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2495		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2496		values of samples X and Y, respectively.
2497
2498		```python
2499		self.sample_average(['X','Y'], [1, -1], normalize = False)
2500		```
2501
2502		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2503		'''
2504		if weights == 'equal':
2505			weights = [1/len(samples)] * len(samples)
2506
2507		if normalize:
2508			s = sum(weights)
2509			if s:
2510				weights = [w/s for w in weights]
2511
2512		try:
2513# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2514# 			C = self.standardization.covar[indices,:][:,indices]
2515			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2516			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2517			return correlated_sum(X, C, weights)
2518		except ValueError:
2519			return (0., 0.)
2520
2521
2522	def sample_D4x_covar(self, sample1, sample2 = None):
2523		'''
2524		Covariance between Δ4x values of samples
2525
2526		Returns the error covariance between the average Δ4x values of two
2527		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2528		returns the Δ4x variance for that sample.
2529		'''
2530		if sample2 is None:
2531			sample2 = sample1
2532		if self.standardization_method == 'pooled':
2533			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2534			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2535			return self.standardization.covar[i, j]
2536		elif self.standardization_method == 'indep_sessions':
2537			if sample1 == sample2:
2538				return self.samples[sample1][f'SE_D{self._4x}']**2
2539			else:
2540				c = 0
2541				for session in self.sessions:
2542					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2543					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2544					if sdata1 and sdata2:
2545						a = self.sessions[session]['a']
2546						# !! TODO: CM below does not account for temporal changes in standardization parameters
2547						CM = self.sessions[session]['CM'][:3,:3]
2548						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2549						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2550						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2551						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2552						c += (
2553							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2554							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2555							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2556							@ CM
2557							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2558							) / a**2
2559				return float(c)
2560
2561	def sample_D4x_correl(self, sample1, sample2 = None):
2562		'''
2563		Correlation between Δ4x errors of samples
2564
2565		Returns the error correlation between the average Δ4x values of two samples.
2566		'''
2567		if sample2 is None or sample2 == sample1:
2568			return 1.
2569		return (
2570			self.sample_D4x_covar(sample1, sample2)
2571			/ self.unknowns[sample1][f'SE_D{self._4x}']
2572			/ self.unknowns[sample2][f'SE_D{self._4x}']
2573			)
2574
2575	def plot_single_session(self,
2576		session,
2577		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2578		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2579		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2580		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2581		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2582		xylimits = 'free', # | 'constant'
2583		x_label = None,
2584		y_label = None,
2585		error_contour_interval = 'auto',
2586		fig = 'new',
2587		):
2588		'''
2589		Generate plot for a single session
2590		'''
2591		if x_label is None:
2592			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2593		if y_label is None:
2594			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2595
2596		out = _SessionPlot()
2597		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2598		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2599		anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2600		anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2601		unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2602		unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2603		anchor_avg = (np.array([ np.array([
2604				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2605				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2606				]) for sample in anchors]).T,
2607			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
2608		unknown_avg = (np.array([ np.array([
2609				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2610				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2611				]) for sample in unknowns]).T,
2612			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)
2613		
2614		
2615		if fig == 'new':
2616			out.fig = ppl.figure(figsize = (6,6))
2617			ppl.subplots_adjust(.1,.1,.9,.9)
2618
2619		out.anchor_analyses, = ppl.plot(
2620			anchors_d,
2621			anchors_D,
2622			**kw_plot_anchors)
2623		out.unknown_analyses, = ppl.plot(
2624			unknowns_d,
2625			unknowns_D,
2626			**kw_plot_unknowns)
2627		out.anchor_avg = ppl.plot(
2628			*anchor_avg,
2629			**kw_plot_anchor_avg)
2630		out.unknown_avg = ppl.plot(
2631			*unknown_avg,
2632			**kw_plot_unknown_avg)
2633		if xylimits == 'constant':
2634			x = [r[f'd{self._4x}'] for r in self]
2635			y = [r[f'D{self._4x}'] for r in self]
2636			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2637			w, h = x2-x1, y2-y1
2638			x1 -= w/20
2639			x2 += w/20
2640			y1 -= h/20
2641			y2 += h/20
2642			ppl.axis([x1, x2, y1, y2])
2643		elif xylimits == 'free':
2644			x1, x2, y1, y2 = ppl.axis()
2645		else:
2646			x1, x2, y1, y2 = ppl.axis(xylimits)
2647				
2648		if error_contour_interval != 'none':
2649			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2650			XI,YI = np.meshgrid(xi, yi)
2651			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2652			if error_contour_interval == 'auto':
2653				rng = np.max(SI) - np.min(SI)
2654				if rng <= 0.01:
2655					cinterval = 0.001
2656				elif rng <= 0.03:
2657					cinterval = 0.004
2658				elif rng <= 0.1:
2659					cinterval = 0.01
2660				elif rng <= 0.3:
2661					cinterval = 0.03
2662				elif rng <= 1.:
2663					cinterval = 0.1
2664				else:
2665					cinterval = 0.5
2666			else:
2667				cinterval = error_contour_interval
2668
2669			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2670			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2671			out.clabel = ppl.clabel(out.contour)
2672			contour = (XI, YI, SI, cval, cinterval)
2673
2674		if fig == None:
2675			return {
2676			'anchors':anchors,
2677			'unknowns':unknowns,
2678			'anchors_d':anchors_d,
2679			'anchors_D':anchors_D,
2680			'unknowns_d':unknowns_d,
2681			'unknowns_D':unknowns_D,
2682			'anchor_avg':anchor_avg,
2683			'unknown_avg':unknown_avg,
2684			'contour':contour,
2685			}
2686
2687		ppl.xlabel(x_label)
2688		ppl.ylabel(y_label)
2689		ppl.title(session, weight = 'bold')
2690		ppl.grid(alpha = .2)
2691		out.ax = ppl.gca()		
2692
2693		return out
2694
2695	def plot_residuals(
2696		self,
2697		kde = False,
2698		hist = False,
2699		binwidth = 2/3,
2700		dir = 'output',
2701		filename = None,
2702		highlight = [],
2703		colors = None,
2704		figsize = None,
2705		dpi = 100,
2706		yspan = None,
2707		):
2708		'''
2709		Plot residuals of each analysis as a function of time (actually, as a function of
2710		the order of analyses in the `D4xdata` object)
2711
2712		+ `kde`: whether to add a kernel density estimate of residuals
2713		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2714		+ `binwidth`: width of the histogram bins, expressed as a multiple of the Δ4x repeatability
2715		+ `dir`: the directory in which to save the plot
2716		+ `highlight`: a list of samples to highlight
2717		+ `colors`: a dict of `{<sample>: <color>}` for all samples
2718		+ `figsize`: (width, height) of figure
2719		+ `dpi`: resolution for PNG output
2720		+ `yspan`: factor controlling the range of y values shown in plot
2721		  (by default: `yspan = 1.5 if kde else 1.0`)
2722		'''
2723		
2724		from matplotlib import ticker
2725
2726		if yspan is None:
2727			if kde:
2728				yspan = 1.5
2729			else:
2730				yspan = 1.0
2731		
2732		# Layout
2733		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2734		if hist or kde:
2735			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2736			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2737		else:
2738			ppl.subplots_adjust(.08,.05,.78,.8)
2739			ax1 = ppl.subplot(111)
2740		
2741		# Colors
2742		N = len(self.anchors)
2743		if colors is None:
2744			if len(highlight) > 0:
2745				Nh = len(highlight)
2746				if Nh == 1:
2747					colors = {highlight[0]: (0,0,0)}
2748				elif Nh == 3:
2749					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2750				elif Nh == 4:
2751					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2752				else:
2753					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2754			else:
2755				if N == 3:
2756					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2757				elif N == 4:
2758					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2759				else:
2760					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2761
2762		ppl.sca(ax1)
2763		
2764		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2765
2766		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2767
2768		session = self[0]['Session']
2769		x1 = 0
2770# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2771		x_sessions = {}
2772		one_or_more_singlets = False
2773		one_or_more_multiplets = False
2774		multiplets = set()
2775		for k,r in enumerate(self):
2776			if r['Session'] != session:
2777				x2 = k-1
2778				x_sessions[session] = (x1+x2)/2
2779				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2780				session = r['Session']
2781				x1 = k
2782			singlet = len(self.samples[r['Sample']]['data']) == 1
2783			if not singlet:
2784				multiplets.add(r['Sample'])
2785			if r['Sample'] in self.unknowns:
2786				if singlet:
2787					one_or_more_singlets = True
2788				else:
2789					one_or_more_multiplets = True
2790			kw = dict(
2791				marker = 'x' if singlet else '+',
2792				ms = 4 if singlet else 5,
2793				ls = 'None',
2794				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2795				mew = 1,
2796				alpha = 0.2 if singlet else 1,
2797				)
2798			if highlight and r['Sample'] not in highlight:
2799				kw['alpha'] = 0.2
2800			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2801		x2 = k
2802		x_sessions[session] = (x1+x2)/2
2803
2804		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2805		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2806		if not (hist or kde):
2807			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2808			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2809
2810		xmin, xmax, ymin, ymax = ppl.axis()
2811		if yspan != 1:
2812			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2813		for s in x_sessions:
2814			ppl.text(
2815				x_sessions[s],
2816				ymax +1,
2817				s,
2818				va = 'bottom',
2819				**(
2820					dict(ha = 'center')
2821					if len(self.sessions[s]['data']) > (0.15 * len(self))
2822					else dict(ha = 'left', rotation = 45)
2823					)
2824				)
2825
2826		if hist or kde:
2827			ppl.sca(ax2)
2828
2829		for s in colors:
2830			kw['marker'] = '+'
2831			kw['ms'] = 5
2832			kw['mec'] = colors[s]
2833			kw['label'] = s
2834			kw['alpha'] = 1
2835			ppl.plot([], [], **kw)
2836
2837		kw['mec'] = (0,0,0)
2838
2839		if one_or_more_singlets:
2840			kw['marker'] = 'x'
2841			kw['ms'] = 4
2842			kw['alpha'] = .2
2843			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2844			ppl.plot([], [], **kw)
2845
2846		if one_or_more_multiplets:
2847			kw['marker'] = '+'
2848			kw['ms'] = 4
2849			kw['alpha'] = 1
2850			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2851			ppl.plot([], [], **kw)
2852
2853		if hist or kde:
2854			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2855		else:
2856			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2857		leg.set_zorder(-1000)
2858
2859		ppl.sca(ax1)
2860
2861		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2862		ppl.xticks([])
2863		ppl.axis([-1, len(self), None, None])
2864
2865		if hist or kde:
2866			ppl.sca(ax2)
2867			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2868
2869			if kde:
2870				from scipy.stats import gaussian_kde
2871				yi = np.linspace(ymin, ymax, 201)
2872				xi = gaussian_kde(X).evaluate(yi)
2873				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2874# 				ppl.plot(xi, yi, 'k-', lw = 1)
2875			elif hist:
2876				ppl.hist(
2877					X,
2878					orientation = 'horizontal',
2879					histtype = 'stepfilled',
2880					ec = [.4]*3,
2881					fc = [.25]*3,
2882					alpha = .25,
2883					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2884					)
2885			ppl.text(0, 0,
2886				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2887				size = 7.5,
2888				alpha = 1,
2889				va = 'center',
2890				ha = 'left',
2891				)
2892
2893			ppl.axis([0, None, ymin, ymax])
2894			ppl.xticks([])
2895			ppl.yticks([])
2896# 			ax2.spines['left'].set_visible(False)
2897			ax2.spines['right'].set_visible(False)
2898			ax2.spines['top'].set_visible(False)
2899			ax2.spines['bottom'].set_visible(False)
2900
2901		ax1.axis([None, None, ymin, ymax])
2902
2903		if not os.path.exists(dir):
2904			os.makedirs(dir)
2905		if filename is None:
2906			return fig
2907		elif filename == '':
2908			filename = f'D{self._4x}_residuals.pdf'
2909		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2910		ppl.close(fig)
2911				
2912
2913	def simulate(self, *args, **kwargs):
2914		'''
2915		Legacy function with warning message pointing to `virtual_data()`
2916		'''
2917		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
2918
2919	def plot_distribution_of_analyses(
2920		self,
2921		dir = 'output',
2922		filename = None,
2923		vs_time = False,
2924		figsize = (6,4),
2925		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
2926		output = None,
2927		dpi = 100,
2928		):
2929		'''
2930		Plot temporal distribution of all analyses in the data set.
2931		
2932		**Parameters**
2933
2934		+ `dir`: the directory in which to save the plot
2935		+ `filename`: the name of the file to save the plot to (by default: `D4x_distribution_of_analyses.pdf`)
2936		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
2937		+ `figsize`: (width, height) of figure
2938		+ `dpi`: resolution for PNG output
2939		'''
2940
2941		asamples = [s for s in self.anchors]
2942		usamples = [s for s in self.unknowns]
2943		if output is None or output == 'fig':
2944			fig = ppl.figure(figsize = figsize)
2945			ppl.subplots_adjust(*subplots_adjust)
2946		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2947		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2948		Xmax += (Xmax-Xmin)/40
2949		Xmin -= (Xmax-Xmin)/41
2950		for k, s in enumerate(asamples + usamples):
2951			if vs_time:
2952				X = [r['TimeTag'] for r in self if r['Sample'] == s]
2953			else:
2954				X = [x for x,r in enumerate(self) if r['Sample'] == s]
2955			Y = [-k for x in X]
2956			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
2957			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
2958			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
2959		ppl.axis([Xmin, Xmax, -k-1, 1])
2960		ppl.xlabel('\ntime')
2961		ppl.gca().annotate('',
2962			xy = (0.6, -0.02),
2963			xycoords = 'axes fraction',
2964			xytext = (.4, -0.02), 
2965            arrowprops = dict(arrowstyle = "->", color = 'k'),
2966            )
2967			
2968
2969		x2 = -1
2970		for session in self.sessions:
2971			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2972			if vs_time:
2973				ppl.axvline(x1, color = 'k', lw = .75)
2974			if x2 > -1:
2975				if not vs_time:
2976					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
2977			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2978# 			from xlrd import xldate_as_datetime
2979# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
2980			if vs_time:
2981				ppl.axvline(x2, color = 'k', lw = .75)
2982				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
2983			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
2984
2985		ppl.xticks([])
2986		ppl.yticks([])
2987
2988		if output is None:
2989			if not os.path.exists(dir):
2990				os.makedirs(dir)
2991			if filename == None:
2992				filename = f'D{self._4x}_distribution_of_analyses.pdf'
2993			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2994			ppl.close(fig)
2995		elif output == 'ax':
2996			return ppl.gca()
2997		elif output == 'fig':
2998			return fig
2999
3000
3001	def plot_bulk_compositions(
3002		self,
3003		samples = None,
3004		dir = 'output/bulk_compositions',
3005		figsize = (6,6),
3006		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
3007		show = False,
3008		sample_color = (0,.5,1),
3009		analysis_color = (.7,.7,.7),
3010		labeldist = 0.3,
3011		radius = 0.05,
3012		):
3013		'''
3014		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
3015		
3016		By default, creates a directory `./output/bulk_compositions` where plots for
3017		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
3018		
3019		
3020		**Parameters**
3021
3022		+ `samples`: Only these samples are processed (by default: all samples).
3023		+ `dir`: where to save the plots
3024		+ `figsize`: (width, height) of figure
3025		+ `subplots_adjust`: passed to `subplots_adjust()`
3026		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
3027		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
3028		+ `sample_color`: color used for sample markers/labels
3029		+ `analysis_color`: color used for replicate markers/labels
3030		+ `labeldist`: distance (in inches) from replicate markers to replicate labels
3031		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
3032		'''
3033
3034		from matplotlib.patches import Ellipse
3035
3036		if samples is None:
3037			samples = [_ for _ in self.samples]
3038
3039		saved = {}
3040
3041		for s in samples:
3042
3043			fig = ppl.figure(figsize = figsize)
3044			fig.subplots_adjust(*subplots_adjust)
3045			ax = ppl.subplot(111)
3046			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3047			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3048			ppl.title(s)
3049
3050
3051			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
3052			UID = [_['UID'] for _ in self.samples[s]['data']]
3053			XY0 = XY.mean(0)
3054
3055			for xy in XY:
3056				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
3057				
3058			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
3059			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
3060			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3061			saved[s] = [XY, XY0]
3062			
3063			x1, x2, y1, y2 = ppl.axis()
3064			x0, dx = (x1+x2)/2, (x2-x1)/2
3065			y0, dy = (y1+y2)/2, (y2-y1)/2
3066			dx, dy = [max(max(dx, dy), radius)]*2
3067
3068			ppl.axis([
3069				x0 - 1.2*dx,
3070				x0 + 1.2*dx,
3071				y0 - 1.2*dy,
3072				y0 + 1.2*dy,
3073				])			
3074
3075			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3076
3077			for xy, uid in zip(XY, UID):
3078
3079				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3080				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3081
3082				if (vector_in_display_space**2).sum() > 0:
3083
3084					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3085					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3086					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3087					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3088
3089					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3090
3091				else:
3092
3093					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3094
3095			if radius:
3096				ax.add_artist(Ellipse(
3097					xy = XY0,
3098					width = radius*2,
3099					height = radius*2,
3100					ls = (0, (2,2)),
3101					lw = .7,
3102					ec = analysis_color,
3103					fc = 'None',
3104					))
3105				ppl.text(
3106					XY0[0],
3107					XY0[1]-radius,
3108					f'\n± {radius*1e3:.0f} ppm',
3109					color = analysis_color,
3110					va = 'top',
3111					ha = 'center',
3112					linespacing = 0.4,
3113					size = 8,
3114					)
3115
3116			if not os.path.exists(dir):
3117				os.makedirs(dir)
3118			fig.savefig(f'{dir}/{s}.pdf')
3119			ppl.close(fig)
3120
3121		fig = ppl.figure(figsize = figsize)
3122		fig.subplots_adjust(*subplots_adjust)
3123		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3124		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3125
3126		for s in saved:
3127			for xy in saved[s][0]:
3128				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3129			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3130			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3131			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3132
3133		x1, x2, y1, y2 = ppl.axis()
3134		ppl.axis([
3135			x1 - (x2-x1)/10,
3136			x2 + (x2-x1)/10,
3137			y1 - (y2-y1)/10,
3138			y2 + (y2-y1)/10,
3139			])			
3140
3141
3142		if not os.path.exists(dir):
3143			os.makedirs(dir)
3144		fig.savefig(f'{dir}/__all__.pdf')
3145		if show:
3146			ppl.show()
3147		ppl.close(fig)
3148		
3149
3150	def _save_D4x_correl(
3151		self,
3152		samples = None,
3153		dir = 'output',
3154		filename = None,
3155		D4x_precision = 4,
3156		correl_precision = 4,
3157		):
3158		'''
3159		Save D4x values along with their SE and correlation matrix.
3160
3161		**Parameters**
3162
3163		+ `samples`: Only these samples are output (by default: all samples).
3164		+ `dir`: the directory in which to save the file (by default: `output`)
3165		+ `filename`: the name of the csv file to write to (by default: `D4x_correl.csv`)
3166		+ `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4)
3167		+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
3168		'''
3169		if samples is None:
3170			samples = sorted([s for s in self.unknowns])
3171		
3172		out = [['Sample']] + [[s] for s in samples]
3173		out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl']
3174		for k,s in enumerate(samples):
3175		out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.{D4x_precision}f}', f'{self.samples[s][f"SE_D{self._4x}"]:.{D4x_precision}f}']
3176			for s2 in samples:
3177			out[k+1] += [f'{self.sample_D4x_correl(s,s2):.{correl_precision}f}']
3178		
3179		if not os.path.exists(dir):
3180			os.makedirs(dir)
3181		if filename is None:
3182			filename = f'D{self._4x}_correl.csv'
3183		with open(f'{dir}/{filename}', 'w') as fid:
3184			fid.write(make_csv(out))
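
A minimal usage sketch for the error-propagation helpers above, assuming `mydata` is a standardized `D47data` object whose unknowns include two hypothetical samples `FOO` and `BAR`:

```python
# Δ47 (co)variances and error correlation between two unknowns:
var_FOO = mydata.sample_D4x_covar('FOO')         # Δ47 variance of FOO
covar = mydata.sample_D4x_covar('FOO', 'BAR')    # Δ47 error covariance
correl = mydata.sample_D4x_correl('FOO', 'BAR')  # Δ47 error correlation

# value and SE of the difference Δ47(FOO) - Δ47(BAR),
# accounting for the covariance computed above:
D, sD = mydata.sample_average(['FOO', 'BAR'], [1, -1], normalize = False)

# save Δ47 values, SE, and the full correlation matrix of all unknowns:
mydata._save_D4x_correl()
```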

Store and process data for a large set of Δ47 and/or Δ48 analyses, usually comprising more than one analytical session.

D4xdata(l=[], mass='47', logfile='', session='mySession', verbose=False)
956	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
957		'''
958		**Parameters**
959
960		+ `l`: a list of dictionaries, with each dictionary including at least the keys
961		`Sample`, `d45`, `d46`, and `d47` or `d48`.
962		+ `mass`: `'47'` or `'48'`
963		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
964		+ `session`: define session name for analyses without a `Session` key
965		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
966
967		Returns a `D4xdata` object derived from `list`.
968		'''
969		self._4x = mass
970		self.verbose = verbose
971		self.prefix = 'D4xdata'
972		self.logfile = logfile
973		list.__init__(self, l)
974		self.Nf = None
975		self.repeatability = {}
976		self.refresh(session = session)

Parameters

  • l: a list of dictionaries, with each dictionary including at least the keys Sample, d45, d46, and d47 or d48.
  • mass: '47' or '48'
  • logfile: if specified, write detailed logs to this file path when calling D4xdata methods.
  • session: define session name for analyses without a Session key
  • verbose: if True, print out detailed logs when calling D4xdata methods.

Returns a D4xdata object derived from list.
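
For instance, a minimal sketch (with hypothetical delta values) building a `D4xdata` object directly from a list of dictionaries instead of reading a csv file:

```python
import D47crunch

# two hypothetical analyses; missing optional fields (UID, D17O, d48, d49)
# are filled in with default values by refresh():
mydata = D47crunch.D4xdata([
    {'Sample': 'ETH-1',      'd45': 5.795, 'd46': 11.628, 'd47': 16.894},
    {'Sample': 'MYSAMPLE-1', 'd45': 6.219, 'd46': 11.491, 'd47': 17.277},
    ], mass = '47', session = 'Session01')
```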

R13_VPDB = 0.01118

Absolute (13C/12C) ratio of VPDB. By default equal to 0.01118 (Chang & Li, 1990)

R18_VSMOW = 0.0020052

Absolute (18O/16O) ratio of VSMOW. By default equal to 0.0020052 (Baertschi, 1976)

LAMBDA_17 = 0.528

Mass-dependent exponent for triple oxygen isotopes. By default equal to 0.528 (Barkan & Luz, 2005)

R17_VSMOW = 0.00038475

Absolute (17O/16O) ratio of VSMOW. By default equal to 0.00038475 (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)

R18_VPDB = 0.0020672007840000003

Absolute (18O/16O) ratio of VPDB. By definition equal to R18_VSMOW * 1.03092.

R17_VPDB = 0.0003909861828790272

Absolute (17O/16O) ratio of VPDB. By definition equal to R17_VSMOW * 1.03092 ** LAMBDA_17.

LEVENE_REF_SAMPLE = 'ETH-3'

After the Δ4x standardization step, each sample is tested to assess whether the Δ4x variance within all analyses for that sample differs significantly from that observed for a given reference sample (using Levene's test, which yields a p-value corresponding to the null hypothesis that the underlying variances are equal).

LEVENE_REF_SAMPLE (by default equal to 'ETH-3') specifies which sample should be used as a reference for this test.

ALPHA_18O_ACID_REACTION = np.float64(1.008129)

Specifies the 18O/16O fractionation factor generally applicable to acid reactions in the dataset. Currently used by D4xdata.wg() and D4xdata.standardize_d18O().

By default equal to 1.008129 (calcite reacted at 90 °C, Kim et al., 2007).
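
Both of these class-level attributes may be overridden on a given instance before standardization; a minimal sketch, where the values chosen below are hypothetical:

```python
mydata = D47crunch.D47data()
mydata.LEVENE_REF_SAMPLE = 'ETH-1'        # use ETH-1 as the Levene reference
mydata.ALPHA_18O_ACID_REACTION = 1.00871  # hypothetical fractionation factor
```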

Nominal_d13C_VPDB = {'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}

Nominal δ13C_VPDB values assigned to carbonate standards, used by D4xdata.standardize_d13C().

By default equal to {'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71} after Bernasconi et al. (2018).

Nominal_d18O_VPDB = {'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}

Nominal δ18O_VPDB values assigned to carbonate standards, used by D4xdata.standardize_d18O().

By default equal to {'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78} after Bernasconi et al. (2018).
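
Both dictionaries may be extended to cover additional carbonate standards; a minimal sketch, where `MYREF` and its nominal values are purely hypothetical:

```python
# register a hypothetical in-house standard alongside the ETH defaults:
mydata.Nominal_d13C_VPDB = {**mydata.Nominal_d13C_VPDB, 'MYREF': 1.23}
mydata.Nominal_d18O_VPDB = {**mydata.Nominal_d18O_VPDB, 'MYREF': -4.56}
```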

d13C_STANDARDIZATION_METHOD = '2pt'

Method by which to standardize δ13C values:

  • 'none': do not apply any δ13C standardization.
  • '1pt': within each session, offset all initial δ13C values so as to minimize the difference between final δ13C_VPDB values and Nominal_d13C_VPDB (averaged over all analyses for which Nominal_d13C_VPDB is defined).
  • '2pt': within each session, apply an affine transformation to all δ13C values so as to minimize the difference between final δ13C_VPDB values and Nominal_d13C_VPDB (averaged over all analyses for which Nominal_d13C_VPDB is defined).

d18O_STANDARDIZATION_METHOD = '2pt'

Method by which to standardize δ18O values:

  • 'none': do not apply any δ18O standardization.
  • '1pt': within each session, offset all initial δ18O values so as to minimize the difference between final δ18O_VPDB values and Nominal_d18O_VPDB (averaged over all analyses for which Nominal_d18O_VPDB is defined).
  • '2pt': within each session, apply an affine transformation to all δ18O values so as to minimize the difference between final δ18O_VPDB values and Nominal_d18O_VPDB (averaged over all analyses for which Nominal_d18O_VPDB is defined).
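
Because refresh_sessions() copies these class-level defaults into each session, the standardization method may also be changed for individual sessions once the data are loaded; a minimal sketch with a hypothetical session name:

```python
# switch a single session to one-point offset corrections:
mydata.sessions['Session01']['d13C_standardization_method'] = '1pt'
mydata.sessions['Session01']['d18O_standardization_method'] = '1pt'
```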
verbose
prefix
logfile
Nf
repeatability
def make_verbal(oldfun):
979	def make_verbal(oldfun):
980		'''
981		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
982		'''
983		@wraps(oldfun)
984		def newfun(*args, verbose = '', **kwargs):
985			myself = args[0]
986			oldprefix = myself.prefix
987			myself.prefix = oldfun.__name__
988			if verbose != '':
989				oldverbose = myself.verbose
990				myself.verbose = verbose
991			out = oldfun(*args, **kwargs)
992			myself.prefix = oldprefix
993			if verbose != '':
994				myself.verbose = oldverbose
995			return out
996		return newfun

Decorator: allow temporarily changing self.prefix and overriding self.verbose.
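
In practice, any method decorated with `make_verbal` thus accepts a temporary `verbose` keyword; a minimal sketch:

```python
# print detailed logs for this call only, regardless of mydata.verbose:
mydata.crunch(verbose = True)
```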

def msg(self, txt):
 999	def msg(self, txt):
1000		'''
1001		Log a message to `self.logfile`, and print it out if `verbose = True`
1002		'''
1003		self.log(txt)
1004		if self.verbose:
1005			print(f'{f"[{self.prefix}]":<16} {txt}')

Log a message to self.logfile, and print it out if verbose = True

def vmsg(self, txt):
1008	def vmsg(self, txt):
1009		'''
1010		Log a message to `self.logfile` and print it out
1011		'''
1012		self.log(txt)
1013		print(txt)

Log a message to self.logfile and print it out

def log(self, *txts):
1016	def log(self, *txts):
1017		'''
1018		Log a message to `self.logfile`
1019		'''
1020		if self.logfile:
1021			with open(self.logfile, 'a') as fid:
1022				for txt in txts:
1023					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')

Log a message to self.logfile

def refresh(self, session='mySession'):
1026	def refresh(self, session = 'mySession'):
1027		'''
1028		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
1029		'''
1030		self.fill_in_missing_info(session = session)
1031		self.refresh_sessions()
1032		self.refresh_samples()

Update self.sessions, self.samples, self.anchors, and self.unknowns.

def refresh_sessions(self):
1035	def refresh_sessions(self):
1036		'''
1037		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1038		to `False` for all sessions.
1039		'''
1040		self.sessions = {
1041			s: {'data': [r for r in self if r['Session'] == s]}
1042			for s in sorted({r['Session'] for r in self})
1043			}
1044		for s in self.sessions:
1045			self.sessions[s]['scrambling_drift'] = False
1046			self.sessions[s]['slope_drift'] = False
1047			self.sessions[s]['wg_drift'] = False
1048			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1049			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD

Update self.sessions and set scrambling_drift, slope_drift, and wg_drift to False for all sessions.

def refresh_samples(self):
1052	def refresh_samples(self):
1053		'''
1054		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1055		'''
1056		self.samples = {
1057			s: {'data': [r for r in self if r['Sample'] == s]}
1058			for s in sorted({r['Sample'] for r in self})
1059			}
1060		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1061		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}

Define self.samples, self.anchors, and self.unknowns.

def read(self, filename, sep='', session=''):
1064	def read(self, filename, sep = '', session = ''):
1065		'''
1066		Read file in csv format to load data into a `D47data` object.
1067
1068		In the csv file, spaces before and after field separators (`','` by default)
1069		are optional. Each line corresponds to a single analysis.
1070
1071		The required fields are:
1072
1073		+ `UID`: a unique identifier
1074		+ `Session`: an identifier for the analytical session
1075		+ `Sample`: a sample identifier
1076		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1077
1078		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1079		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
1080		and `d49` are optional, and set to NaN by default.
1081
1082		**Parameters**
1083
1084		+ `filename`: the path of the file to read
1085		+ `sep`: csv separator delimiting the fields
1086		+ `session`: set `Session` field to this string for all analyses
1087		'''
1088		with open(filename) as fid:
1089			self.input(fid.read(), sep = sep, session = session)

Read file in csv format to load data into a D47data object.

In the csv file, spaces before and after field separators (',' by default) are optional. Each line corresponds to a single analysis.

The required fields are:

  • UID: a unique identifier
  • Session: an identifier for the analytical session
  • Sample: a sample identifier
  • d45, d46, and at least one of d47 or d48: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as D17O (in ‰ relative to VSMOW, λ = self.LAMBDA_17), and are otherwise assumed to be zero. Working-gas deltas d47, d48 and d49 are optional, and set to NaN by default.

Parameters

  • filename: the path of the file to read
  • sep: csv separator delimiting the fields
  • session: set Session field to this string for all analyses
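
A minimal sketch, reading a hypothetical csv file and forcing a single session name for all of its analyses:

```python
mydata = D47crunch.D47data()
mydata.read('mydata_2024-01.csv', session = 'Session01')  # hypothetical file
```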
def input(self, txt, sep='', session=''):
1092	def input(self, txt, sep = '', session = ''):
1093		'''
1094		Read `txt` string in csv format to load analysis data into a `D47data` object.
1095
1096		In the csv string, spaces before and after field separators (`','` by default)
1097		are optional. Each line corresponds to a single analysis.
1098
1099		The required fields are:
1100
1101		+ `UID`: a unique identifier
1102		+ `Session`: an identifier for the analytical session
1103		+ `Sample`: a sample identifier
1104		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1105
1106		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1107		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
1108		and `d49` are optional, and set to NaN by default.
1109
1110		**Parameters**
1111
1112		+ `txt`: the csv string to read
1113		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1114		whichever appears most often in `txt`.
1115		+ `session`: set `Session` field to this string for all analyses
1116		'''
1117		if sep == '':
1118			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1119		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1120		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1121
1122		if session != '':
1123			for r in data:
1124				r['Session'] = session
1125
1126		self += data
1127		self.refresh()

Read txt string in csv format to load analysis data into a D47data object.

In the csv string, spaces before and after field separators (',' by default) are optional. Each line corresponds to a single analysis.

The required fields are:

  • UID: a unique identifier
  • Session: an identifier for the analytical session
  • Sample: a sample identifier
  • d45, d46, and at least one of d47 or d48: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as D17O (in ‰ relative to VSMOW, λ = self.LAMBDA_17), and are otherwise assumed to be zero. Working-gas deltas d47, d48 and d49 are optional, and set to NaN by default.

Parameters

  • txt: the csv string to read
  • sep: csv separator delimiting the fields. By default, use ',', ';', or tab ('\t'), whichever appears most often in txt.
  • session: set Session field to this string for all analyses
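
A minimal sketch, loading two analyses (with hypothetical values) from a semicolon-separated string; here the separator is detected automatically:

```python
mydata = D47crunch.D47data()
mydata.input('''UID; Session; Sample; d45; d46; d47
A01; S01; ETH-1;  5.795; 11.628;  16.894
A02; S01; ETH-2; -6.059; -4.817; -11.635''')
```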
@make_verbal
def wg(self, samples=None, a18_acid=None):
1130	@make_verbal
1131	def wg(self, samples = None, a18_acid = None):
1132		'''
1133		Compute bulk composition of the working gas for each session based on
1134		the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1135		`self.Nominal_d18O_VPDB`.
1136		'''
1137
1138		self.msg('Computing WG composition:')
1139
1140		if a18_acid is None:
1141			a18_acid = self.ALPHA_18O_ACID_REACTION
1142		if samples is None:
1143			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1144
1145		assert a18_acid, f'Acid fractionation factor should not be zero.'
1146
1147		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1148		R45R46_standards = {}
1149		for sample in samples:
1150			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1151			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1152			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1153			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1154			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1155
1156			C12_s = 1 / (1 + R13_s)
1157			C13_s = R13_s / (1 + R13_s)
1158			C16_s = 1 / (1 + R17_s + R18_s)
1159			C17_s = R17_s / (1 + R17_s + R18_s)
1160			C18_s = R18_s / (1 + R17_s + R18_s)
1161
1162			C626_s = C12_s * C16_s ** 2
1163			C627_s = 2 * C12_s * C16_s * C17_s
1164			C628_s = 2 * C12_s * C16_s * C18_s
1165			C636_s = C13_s * C16_s ** 2
1166			C637_s = 2 * C13_s * C16_s * C17_s
1167			C727_s = C12_s * C17_s ** 2
1168
1169			R45_s = (C627_s + C636_s) / C626_s
1170			R46_s = (C628_s + C637_s + C727_s) / C626_s
1171			R45R46_standards[sample] = (R45_s, R46_s)
1172		
1173		for s in self.sessions:
1174			db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
1175			assert db, f'No sample from {samples} found in session "{s}".'
1176# 			dbsamples = sorted({r['Sample'] for r in db})
1177
1178			X = [r['d45'] for r in db]
1179			Y = [R45R46_standards[r['Sample']][0] for r in db]
1180			x1, x2 = np.min(X), np.max(X)
1181
1182			if x1 < x2:
1183				wgcoord = x1/(x1-x2)
1184			else:
1185				wgcoord = 999
1186
1187			if wgcoord < -.5 or wgcoord > 1.5:
1188				# unreasonable to extrapolate to d45 = 0
1189				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1190			else :
1191				# d45 = 0 is reasonably well bracketed
1192				R45_wg = np.polyfit(X, Y, 1)[1]
1193
1194			X = [r['d46'] for r in db]
1195			Y = [R45R46_standards[r['Sample']][1] for r in db]
1196			x1, x2 = np.min(X), np.max(X)
1197
1198			if x1 < x2:
1199				wgcoord = x1/(x1-x2)
1200			else:
1201				wgcoord = 999
1202
1203			if wgcoord < -.5 or wgcoord > 1.5:
1204				# unreasonable to extrapolate to d46 = 0
1205				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1206			else :
1207				# d46 = 0 is reasonably well bracketed
1208				R46_wg = np.polyfit(X, Y, 1)[1]
1209
1210			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1211
1212			self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1213
1214			self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1215			self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1216			for r in self.sessions[s]['data']:
1217				r['d13Cwg_VPDB'] = d13Cwg_VPDB
1218				r['d18Owg_VSMOW'] = d18Owg_VSMOW

Compute bulk composition of the working gas for each session based on the carbonate standards defined in both self.Nominal_d13C_VPDB and self.Nominal_d18O_VPDB.
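
A minimal sketch, restricting the WG computation to a subset of standards and overriding the default acid fractionation factor (the value below is hypothetical):

```python
mydata.wg(samples = ['ETH-1', 'ETH-2'], a18_acid = 1.00871)
```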

def compute_bulk_delta(self, R45, R46, D17O=0):
1221	def compute_bulk_delta(self, R45, R46, D17O = 0):
1222		'''
1223		Compute δ13C_VPDB and δ18O_VSMOW,
1224		by solving the generalized form of equation (17) from
1225		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1226		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
1227		solving the corresponding second-order Taylor polynomial.
1228		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
1229		'''
1230
1231		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1232
1233		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1234		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1235		C = 2 * self.R18_VSMOW
1236		D = -R46
1237
1238		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1239		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1240		cc = A + B + C + D
1241
1242		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1243
1244		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1245		R17 = K * R18 ** self.LAMBDA_17
1246		R13 = R45 - 2 * R17
1247
1248		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1249
1250		return d13C_VPDB, d18O_VSMOW

Compute δ13C_VPDB and δ18O_VSMOW, by solving the generalized form of equation (17) from Brand et al. (2010), assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and solving the corresponding second-order Taylor polynomial. (Appendix A of Daëron et al., 2016)

@make_verbal
def crunch(self, verbose=''):
1253	@make_verbal
1254	def crunch(self, verbose = ''):
1255		'''
1256		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1257		'''
1258		for r in self:
1259			self.compute_bulk_and_clumping_deltas(r)
1260		self.standardize_d13C()
1261		self.standardize_d18O()
1262		self.msg(f"Crunched {len(self)} analyses.")

Compute bulk composition and raw clumped isotope anomalies for all analyses.
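
After crunching, each analysis record carries the newly computed fields; a minimal sketch:

```python
mydata.crunch()
r = mydata[0]  # analysis records are plain dictionaries
print(r['d13C_VPDB'], r['d18O_VSMOW'], r['D47raw'])
```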

def fill_in_missing_info(self, session='mySession'):
1265	def fill_in_missing_info(self, session = 'mySession'):
1266		'''
1267		Fill in optional fields with default values
1268		'''
1269		for i,r in enumerate(self):
1270			if 'D17O' not in r:
1271				r['D17O'] = 0.
1272			if 'UID' not in r:
1273				r['UID'] = f'{i+1}'
1274			if 'Session' not in r:
1275				r['Session'] = session
1276			for k in ['d47', 'd48', 'd49']:
1277				if k not in r:
1278					r[k] = np.nan

Fill in optional fields with default values

def standardize_d13C(self):
1281	def standardize_d13C(self):
1282		'''
1283		Perform δ13C standardization within each session `s` according to
1284		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1285		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1286		may be redefined arbitrarily at a later stage.
1287		'''
1288		for s in self.sessions:
1289			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1290				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1291				X,Y = zip(*XY)
1292				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1293					offset = np.mean(Y) - np.mean(X)
1294					for r in self.sessions[s]['data']:
1295						r['d13C_VPDB'] += offset				
1296				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1297					a,b = np.polyfit(X,Y,1)
1298					for r in self.sessions[s]['data']:
1299						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b

Perform δ13C standardization within each session s according to self.sessions[s]['d13C_standardization_method'], which is defined by default by D47data.refresh_sessions() as equal to self.d13C_STANDARDIZATION_METHOD, but may be redefined arbitrarily at a later stage.

def standardize_d18O(self):
1301	def standardize_d18O(self):
1302		'''
1303		Perform δ18O standardization within each session `s` according to
1304		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1305		which is defined by default by `D47data.refresh_sessions()` as equal to
1306		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1307		'''
1308		for s in self.sessions:
1309			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1310				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1311				X,Y = zip(*XY)
1312				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1313				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1314					offset = np.mean(Y) - np.mean(X)
1315					for r in self.sessions[s]['data']:
1316						r['d18O_VSMOW'] += offset				
1317				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1318					a,b = np.polyfit(X,Y,1)
1319					for r in self.sessions[s]['data']:
1320						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b

Perform δ18O standardization within each session s according to self.ALPHA_18O_ACID_REACTION and self.sessions[s]['d18O_standardization_method'], which is defined by default by D47data.refresh_sessions() as equal to self.d18O_STANDARDIZATION_METHOD, but may be redefined arbitrarily at a later stage.

def compute_bulk_and_clumping_deltas(self, r):
1323	def compute_bulk_and_clumping_deltas(self, r):
1324		'''
1325		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1326		'''
1327
1328		# Compute working gas R13, R18, and isobar ratios
1329		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1330		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1331		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1332
1333		# Compute analyte isobar ratios
1334		R45 = (1 + r['d45'] / 1000) * R45_wg
1335		R46 = (1 + r['d46'] / 1000) * R46_wg
1336		R47 = (1 + r['d47'] / 1000) * R47_wg
1337		R48 = (1 + r['d48'] / 1000) * R48_wg
1338		R49 = (1 + r['d49'] / 1000) * R49_wg
1339
1340		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1341		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1342		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1343
1344		# Compute stochastic isobar ratios of the analyte
1345		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1346			R13, R18, D17O = r['D17O']
1347		)
1348
1349		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1350		# and raise a warning if the corresponding anomalies exceed 0.05 ppm.
1351		if (R45 / R45stoch - 1) > 5e-8:
1352			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1353		if (R46 / R46stoch - 1) > 5e-8:
1354			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1355
1356		# Compute raw clumped isotope anomalies
1357		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1358		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1359		r['D49raw'] = 1000 * (R49 / R49stoch - 1)

Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis r.

def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1362	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1363		'''
1364		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1365		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1366		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1367		'''
1368
1369		# Compute R17
1370		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1371
1372		# Compute isotope concentrations
1373		C12 = (1 + R13) ** -1
1374		C13 = C12 * R13
1375		C16 = (1 + R17 + R18) ** -1
1376		C17 = C16 * R17
1377		C18 = C16 * R18
1378
1379		# Compute stochastic isotopologue concentrations
1380		C626 = C16 * C12 * C16
1381		C627 = C16 * C12 * C17 * 2
1382		C628 = C16 * C12 * C18 * 2
1383		C636 = C16 * C13 * C16
1384		C637 = C16 * C13 * C17 * 2
1385		C638 = C16 * C13 * C18 * 2
1386		C727 = C17 * C12 * C17
1387		C728 = C17 * C12 * C18 * 2
1388		C737 = C17 * C13 * C17
1389		C738 = C17 * C13 * C18 * 2
1390		C828 = C18 * C12 * C18
1391		C838 = C18 * C13 * C18
1392
1393		# Compute stochastic isobar ratios
1394		R45 = (C636 + C627) / C626
1395		R46 = (C628 + C637 + C727) / C626
1396		R47 = (C638 + C728 + C737) / C626
1397		R48 = (C738 + C828) / C626
1398		R49 = C838 / C626
1399
1400		# Account for stochastic anomalies
1401		R47 *= 1 + D47 / 1000
1402		R48 *= 1 + D48 / 1000
1403		R49 *= 1 + D49 / 1000
1404
1405		# Return isobar ratios
1406		return R45, R46, R47, R48, R49

Compute isobar ratios for a sample with isotopic ratios R13 and R18, optionally accounting for non-zero values of Δ17O (D17O) and clumped isotope anomalies (D47, D48, D49), all expressed in permil.
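
For example, a minimal sketch computing the stochastic isobar ratios of a hypothetical sample with VPDB-like carbon and VSMOW-like oxygen:

R45, R46, R47, R48, R49 = mydata.compute_isobar_ratios(
    R13 = mydata.R13_VPDB,
    R18 = mydata.R18_VSMOW,
    )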

def split_samples(self, samples_to_split='all', grouping='by_session'):
1409	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1410		'''
1411		Split unknown samples by UID (treat all analyses as different samples)
1412		or by session (treat analyses of a given sample in different sessions as
1413		different samples).
1414
1415		**Parameters**
1416
1417		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
1418		+ `grouping`: `by_uid` | `by_session`
1419		'''
1420		if samples_to_split == 'all':
1421			samples_to_split = [s for s in self.unknowns]
1422		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1423		self.grouping = grouping.lower()
1424		if self.grouping in gkeys:
1425			gkey = gkeys[self.grouping]
1426		for r in self:
1427			if r['Sample'] in samples_to_split:
1428				r['Sample_original'] = r['Sample']
1429				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1430			elif r['Sample'] in self.unknowns:
1431				r['Sample_original'] = r['Sample']
1432		self.refresh_samples()

Split unknown samples by UID (treat all analyses as different samples) or by session (treat analyses of a given sample in different sessions as different samples).

Parameters

  • samples_to_split: a list of samples to split, e.g., ['IAEA-C1', 'IAEA-C2']
  • grouping: by_uid | by_session
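
For example, a minimal sketch treating each analysis of the tutorial unknown MYSAMPLE-1 as a separate sample:

mydata.split_samples(['MYSAMPLE-1'], grouping = 'by_uid')
mydata.standardize()
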
def unsplit_samples(self, tables=False):
1435	def unsplit_samples(self, tables = False):
1436		'''
1437		Reverse the effects of `D47data.split_samples()`.
1438		
1439		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1440		
1441		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1442		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1443		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1444		effects of `D47data.split_samples()` with `grouping='by_session'` (because in
1445		that case session-averaged Δ4x values are statistically independent).
1446		'''
1447		unknowns_old = sorted({s for s in self.unknowns})
1448		CM_old = self.standardization.covar[:,:]
1449		VD_old = self.standardization.params.valuesdict().copy()
1450		vars_old = self.standardization.var_names
1451
1452		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1453
1454		Ns = len(vars_old) - len(unknowns_old)
1455		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1456		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1457
1458		W = np.zeros((len(vars_new), len(vars_old)))
1459		W[:Ns,:Ns] = np.eye(Ns)
1460		for u in unknowns_new:
1461			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1462			if self.grouping == 'by_session':
1463				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1464			elif self.grouping == 'by_uid':
1465				weights = [1 for s in splits]
1466			sw = sum(weights)
1467			weights = [w/sw for w in weights]
1468			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1469
1470		CM_new = W @ CM_old @ W.T
1471		V = W @ np.array([[VD_old[k]] for k in vars_old])
1472		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1473
1474		self.standardization.covar = CM_new
1475		self.standardization.params.valuesdict = lambda : VD_new
1476		self.standardization.var_names = vars_new
1477
1478		for r in self:
1479			if r['Sample'] in self.unknowns:
1480				r['Sample_split'] = r['Sample']
1481				r['Sample'] = r['Sample_original']
1482
1483		self.refresh_samples()
1484		self.consolidate_samples()
1485		self.repeatabilities()
1486
1487		if tables:
1488			self.table_of_analyses()
1489			self.table_of_samples()

Reverse the effects of D47data.split_samples().

This should only be used after D4xdata.standardize() with method='pooled'.

After D4xdata.standardize() with method='indep_sessions', one should probably use D4xdata.combine_samples() instead to reverse the effects of D47data.split_samples() with grouping='by_uid', or w_avg() to reverse the effects of D47data.split_samples() with grouping='by_session' (because in that case session-averaged Δ4x values are statistically independent).
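
A typical round trip, splitting all unknowns by session, standardizing with the pooled model, then recombining the split unknowns:

mydata.split_samples(grouping = 'by_session')
mydata.standardize(method = 'pooled')
mydata.unsplit_samples(tables = True)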

def assign_timestamps(self):
1491	def assign_timestamps(self):
1492		'''
1493		Assign a time field `t` of type `float` to each analysis.
1494
1495		If `TimeTag` is one of the data fields, `t` is equal within a given session
1496		to `TimeTag` minus the mean value of `TimeTag` for that session.
1497		Otherwise, `TimeTag` is by default equal to the index of each analysis
1498		in the dataset and `t` is defined as above.
1499		'''
1500		for session in self.sessions:
1501			sdata = self.sessions[session]['data']
1502			try:
1503				t0 = np.mean([r['TimeTag'] for r in sdata])
1504				for r in sdata:
1505					r['t'] = r['TimeTag'] - t0
1506			except KeyError:
1507				t0 = (len(sdata)-1)/2
1508				for t,r in enumerate(sdata):
1509					r['t'] = t - t0

Assign a time field t of type float to each analysis.

If TimeTag is one of the data fields, t is equal within a given session to TimeTag minus the mean value of TimeTag for that session. Otherwise, TimeTag is by default equal to the index of each analysis in the dataset and t is defined as above.
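
For example, if a session holds three analyses without a TimeTag field, then t0 = (3-1)/2 = 1 and the analyses are assigned t values of -1, 0, and +1.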

def report(self):
1512	def report(self):
1513		'''
1514		Prints a report on the standardization fit.
1515		Only applicable after `D4xdata.standardize(method='pooled')`.
1516		'''
1517		report_fit(self.standardization)

Prints a report on the standardization fit. Only applicable after D4xdata.standardize(method='pooled').

def combine_samples(self, sample_groups):
1520	def combine_samples(self, sample_groups):
1521		'''
1522		Combine analyses of different samples to compute weighted average Δ4x
1523		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1524		dictionary.
1525		
1526		Caution: samples are weighted by number of replicate analyses, which is a
1527		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1528		correlated analytical errors for one or more samples).
1529		
1530		Returns a tuple of:
1531		
1532		+ the list of group names
1533		+ an array of the corresponding Δ4x values
1534		+ the corresponding (co)variance matrix
1535		
1536		**Parameters**
1537
1538		+ `sample_groups`: a dictionary of the form:
1539		```py
1540		{'group1': ['sample_1', 'sample_2'],
1541		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1542		```
1543		'''
1544		
1545		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1546		groups = sorted(sample_groups.keys())
1547		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1548		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1549		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1550		W = np.array([
1551			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1552			for j in groups])
1553		D4x_new = W @ D4x_old
1554		CM_new = W @ CM_old @ W.T
1555
1556		return groups, D4x_new[:,0], CM_new

Combine analyses of different samples to compute weighted average Δ4x and new error (co)variances corresponding to the groups defined by the sample_groups dictionary.

Caution: samples are weighted by number of replicate analyses, which is a reasonable default behavior but is not always optimal (e.g., in the case of strongly correlated analytical errors for one or more samples).

Returns a tuple of:

  • the list of group names
  • an array of the corresponding Δ4x values
  • the corresponding (co)variance matrix

Parameters

  • sample_groups: a dictionary of the form:
{'group1': ['sample_1', 'sample_2'],
 'group2': ['sample_3', 'sample_4', 'sample_5']}
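
A minimal sketch, using the hypothetical sample and group names from the dictionary above:

groups, D47_avg, CM = mydata.combine_samples(
    {'group1': ['sample_1', 'sample_2'],
     'group2': ['sample_3', 'sample_4', 'sample_5']})

Here D47_avg[k] is the weighted-average Δ47 value of groups[k], and CM is the corresponding (co)variance matrix.
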
@make_verbal
def standardize( self, method='pooled', weighted_sessions=[], consolidate=True, consolidate_tables=False, consolidate_plots=False, constraints={}):
1559	@make_verbal
1560	def standardize(self,
1561		method = 'pooled',
1562		weighted_sessions = [],
1563		consolidate = True,
1564		consolidate_tables = False,
1565		consolidate_plots = False,
1566		constraints = {},
1567		):
1568		'''
1569		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1570		If `method` argument is set to `'pooled'`, the standardization processes all sessions
1571		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1572		i.e. that their true Δ4x value does not change between sessions,
1573		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
1574		`'indep_sessions'`, the standardization processes each session independently, based only
1575		on anchors analyses.
1576		'''
1577
1578		self.standardization_method = method
1579		self.assign_timestamps()
1580
1581		if method == 'pooled':
1582			if weighted_sessions:
1583				for session_group in weighted_sessions:
1584					if self._4x == '47':
1585						X = D47data([r for r in self if r['Session'] in session_group])
1586					elif self._4x == '48':
1587						X = D48data([r for r in self if r['Session'] in session_group])
1588					X.Nominal_D4x = self.Nominal_D4x.copy()
1589					X.refresh()
1590					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1591					w = np.sqrt(result.redchi)
1592					self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
1593					for r in X:
1594						r[f'wD{self._4x}raw'] *= w
1595			else:
1596				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1597				for r in self:
1598					r[f'wD{self._4x}raw'] = 1.
1599
1600			params = Parameters()
1601			for k,session in enumerate(self.sessions):
1602				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1603				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1604				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1605				s = pf(session)
1606				params.add(f'a_{s}', value = 0.9)
1607				params.add(f'b_{s}', value = 0.)
1608				params.add(f'c_{s}', value = -0.9)
1609				params.add(f'a2_{s}', value = 0.,
1610# 					vary = self.sessions[session]['scrambling_drift'],
1611					)
1612				params.add(f'b2_{s}', value = 0.,
1613# 					vary = self.sessions[session]['slope_drift'],
1614					)
1615				params.add(f'c2_{s}', value = 0.,
1616# 					vary = self.sessions[session]['wg_drift'],
1617					)
1618				if not self.sessions[session]['scrambling_drift']:
1619					params[f'a2_{s}'].expr = '0'
1620				if not self.sessions[session]['slope_drift']:
1621					params[f'b2_{s}'].expr = '0'
1622				if not self.sessions[session]['wg_drift']:
1623					params[f'c2_{s}'].expr = '0'
1624
1625			for sample in self.unknowns:
1626				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1627
1628			for k in constraints:
1629				params[k].expr = constraints[k]
1630
1631			def residuals(p):
1632				R = []
1633				for r in self:
1634					session = pf(r['Session'])
1635					sample = pf(r['Sample'])
1636					if r['Sample'] in self.Nominal_D4x:
1637						R += [ (
1638							r[f'D{self._4x}raw'] - (
1639								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1640								+ p[f'b_{session}'] * r[f'd{self._4x}']
1641								+	p[f'c_{session}']
1642								+ r['t'] * (
1643									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1644									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1645									+	p[f'c2_{session}']
1646									)
1647								)
1648							) / r[f'wD{self._4x}raw'] ]
1649					else:
1650						R += [ (
1651							r[f'D{self._4x}raw'] - (
1652								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1653								+ p[f'b_{session}'] * r[f'd{self._4x}']
1654								+	p[f'c_{session}']
1655								+ r['t'] * (
1656									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1657									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1658									+	p[f'c2_{session}']
1659									)
1660								)
1661							) / r[f'wD{self._4x}raw'] ]
1662				return R
1663
1664			M = Minimizer(residuals, params)
1665			result = M.least_squares()
1666			self.Nf = result.nfree
1667			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1668			new_names, new_covar, new_se = _fullcovar(result)[:3]
1669			result.var_names = new_names
1670			result.covar = new_covar
1671
1672			for r in self:
1673				s = pf(r["Session"])
1674				a = result.params.valuesdict()[f'a_{s}']
1675				b = result.params.valuesdict()[f'b_{s}']
1676				c = result.params.valuesdict()[f'c_{s}']
1677				a2 = result.params.valuesdict()[f'a2_{s}']
1678				b2 = result.params.valuesdict()[f'b2_{s}']
1679				c2 = result.params.valuesdict()[f'c2_{s}']
1680				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1681				
1682
1683			self.standardization = result
1684
1685			for session in self.sessions:
1686				self.sessions[session]['Np'] = 3
1687				for k in ['scrambling', 'slope', 'wg']:
1688					if self.sessions[session][f'{k}_drift']:
1689						self.sessions[session]['Np'] += 1
1690
1691			if consolidate:
1692				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1693			return result
1694
1695
1696		elif method == 'indep_sessions':
1697
1698			if weighted_sessions:
1699				for session_group in weighted_sessions:
1700					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1701					X.Nominal_D4x = self.Nominal_D4x.copy()
1702					X.refresh()
1703					# This is only done to assign r['wD47raw'] for r in X:
1704					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1705					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1706			else:
1707				self.msg('All weights set to 1 ‰')
1708				for r in self:
1709					r[f'wD{self._4x}raw'] = 1
1710
1711			for session in self.sessions:
1712				s = self.sessions[session]
1713				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1714				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1715				s['Np'] = sum(p_active)
1716				sdata = s['data']
1717
1718				A = np.array([
1719					[
1720						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1721						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1722						1 / r[f'wD{self._4x}raw'],
1723						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1724						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1725						r['t'] / r[f'wD{self._4x}raw']
1726						]
1727					for r in sdata if r['Sample'] in self.anchors
1728					])[:,p_active] # only keep columns for the active parameters
1729				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1730				s['Na'] = Y.size
1731				CM = linalg.inv(A.T @ A)
1732				bf = (CM @ A.T @ Y).T[0,:]
1733				k = 0
1734				for n,a in zip(p_names, p_active):
1735					if a:
1736						s[n] = bf[k]
1737# 						self.msg(f'{n} = {bf[k]}')
1738						k += 1
1739					else:
1740						s[n] = 0.
1741# 						self.msg(f'{n} = 0.0')
1742
1743				for r in sdata :
1744					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1745					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1746					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1747
1748				s['CM'] = np.zeros((6,6))
1749				i = 0
1750				k_active = [j for j,a in enumerate(p_active) if a]
1751				for j,a in enumerate(p_active):
1752					if a:
1753						s['CM'][j,k_active] = CM[i,:]
1754						i += 1
1755
1756			if not weighted_sessions:
1757				w = self.rmswd()['rmswd']
1758				for r in self:
1759						r[f'wD{self._4x}'] *= w
1760						r[f'wD{self._4x}raw'] *= w
1761				for session in self.sessions:
1762					self.sessions[session]['CM'] *= w**2
1763
1764			for session in self.sessions:
1765				s = self.sessions[session]
1766				s['SE_a'] = s['CM'][0,0]**.5
1767				s['SE_b'] = s['CM'][1,1]**.5
1768				s['SE_c'] = s['CM'][2,2]**.5
1769				s['SE_a2'] = s['CM'][3,3]**.5
1770				s['SE_b2'] = s['CM'][4,4]**.5
1771				s['SE_c2'] = s['CM'][5,5]**.5
1772
1773			if not weighted_sessions:
1774				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1775			else:
1776				self.Nf = 0
1777				for sg in weighted_sessions:
1778					self.Nf += self.rmswd(sessions = sg)['Nf']
1779
1780			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1781
1782			avgD4x = {
1783				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1784				for sample in self.samples
1785				}
1786			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1787			rD4x = (chi2/self.Nf)**.5
1788			self.repeatability[f'sigma_{self._4x}'] = rD4x
1789
1790			if consolidate:
1791				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)

Compute absolute Δ4x values for all replicate analyses and for sample averages. If method argument is set to 'pooled', the standardization processes all sessions in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous, i.e. that their true Δ4x value does not change between sessions, (Daëron, 2021). If method argument is set to 'indep_sessions', the standardization processes each session independently, based only on anchors analyses.
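
For example (session names below are hypothetical):

# standardize each session independently:
mydata.standardize(method = 'indep_sessions')

# or pool all sessions, weighting two groups of sessions separately:
mydata.standardize(method = 'pooled', weighted_sessions = [['Session01', 'Session02'], ['Session03']])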

def standardization_error(self, session, d4x, D4x, t=0):
1794	def standardization_error(self, session, d4x, D4x, t = 0):
1795		'''
1796		Compute standardization error for a given session and
1797		(δ47, Δ47) composition.
1798		'''
1799		a = self.sessions[session]['a']
1800		b = self.sessions[session]['b']
1801		c = self.sessions[session]['c']
1802		a2 = self.sessions[session]['a2']
1803		b2 = self.sessions[session]['b2']
1804		c2 = self.sessions[session]['c2']
1805		CM = self.sessions[session]['CM']
1806
1807		x, y = D4x, d4x
1808		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1809# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1810		dxdy = -(b+b2*t) / (a+a2*t)
1811		dxdz = 1. / (a+a2*t)
1812		dxda = -x / (a+a2*t)
1813		dxdb = -y / (a+a2*t)
1814		dxdc = -1. / (a+a2*t)
1815		dxda2 = -x * t / (a+a2*t)
1816		dxdb2 = -y * t / (a+a2*t)
1817		dxdc2 = -t / (a+a2*t)
1818		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1819		sx = (V @ CM @ V.T) ** .5
1820		return sx

Compute standardization error for a given session and (δ47, Δ47) composition.
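
For example, a minimal sketch (assuming a session named Session01 whose covariance matrix CM has been populated by standardization) estimating the standardization error at a given (δ47, Δ47) composition:

SE = mydata.standardization_error('Session01', d4x = 20., D4x = 0.6)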

@make_verbal
def summary(self, dir='output', filename=None, save_to_file=True, print_out=True):
1823	@make_verbal
1824	def summary(self,
1825		dir = 'output',
1826		filename = None,
1827		save_to_file = True,
1828		print_out = True,
1829		):
1830		'''
1831		Print out and/or save to disk a summary of the standardization results.
1832
1833		**Parameters**
1834
1835		+ `dir`: the directory in which to save the table
1836		+ `filename`: the name of the csv file to write to
1837		+ `save_to_file`: whether to save the table to disk
1838		+ `print_out`: whether to print out the table
1839		'''
1840
1841		out = []
1842		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1843		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1844		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1845		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1846		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1847		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1848		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1849		out += [['Model degrees of freedom', f"{self.Nf}"]]
1850		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1851		out += [['Standardization method', self.standardization_method]]
1852
1853		if save_to_file:
1854			if not os.path.exists(dir):
1855				os.makedirs(dir)
1856			if filename is None:
1857				filename = f'D{self._4x}_summary.csv'
1858			with open(f'{dir}/{filename}', 'w') as fid:
1859				fid.write(make_csv(out))
1860		if print_out:
1861			self.msg('\n' + pretty_table(out, header = 0))

Print out and/or save to disk a summary of the standardization results.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
@make_verbal
def table_of_sessions( self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
1864	@make_verbal
1865	def table_of_sessions(self,
1866		dir = 'output',
1867		filename = None,
1868		save_to_file = True,
1869		print_out = True,
1870		output = None,
1871		):
1872		'''
1873		Print out and/or save to disk a table of sessions.
1874
1875		**Parameters**
1876
1877		+ `dir`: the directory in which to save the table
1878		+ `filename`: the name of the csv file to write to
1879		+ `save_to_file`: whether to save the table to disk
1880		+ `print_out`: whether to print out the table
1881		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1882		    if set to `'raw'`: return a list of list of strings
1883		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1884		'''
1885		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1886		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1887		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1888
1889		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1890		if include_a2:
1891			out[-1] += ['a2 ± SE']
1892		if include_b2:
1893			out[-1] += ['b2 ± SE']
1894		if include_c2:
1895			out[-1] += ['c2 ± SE']
1896		for session in self.sessions:
1897			out += [[
1898				session,
1899				f"{self.sessions[session]['Na']}",
1900				f"{self.sessions[session]['Nu']}",
1901				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1902				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1903				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1904				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1905				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1906				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1907				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1908				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1909				]]
1910			if include_a2:
1911				if self.sessions[session]['scrambling_drift']:
1912					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1913				else:
1914					out[-1] += ['']
1915			if include_b2:
1916				if self.sessions[session]['slope_drift']:
1917					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1918				else:
1919					out[-1] += ['']
1920			if include_c2:
1921				if self.sessions[session]['wg_drift']:
1922					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1923				else:
1924					out[-1] += ['']
1925
1926		if save_to_file:
1927			if not os.path.exists(dir):
1928				os.makedirs(dir)
1929			if filename is None:
1930				filename = f'D{self._4x}_sessions.csv'
1931			with open(f'{dir}/{filename}', 'w') as fid:
1932				fid.write(make_csv(out))
1933		if print_out:
1934			self.msg('\n' + pretty_table(out))
1935		if output == 'raw':
1936			return out
1937		elif output == 'pretty':
1938			return pretty_table(out)

Print out and/or save to disk a table of sessions.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
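
For example, to retrieve the table as a text string without writing anything to disk:

txt = mydata.table_of_sessions(save_to_file = False, print_out = False, output = 'pretty')
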
@make_verbal
def table_of_analyses( self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
1941	@make_verbal
1942	def table_of_analyses(
1943		self,
1944		dir = 'output',
1945		filename = None,
1946		save_to_file = True,
1947		print_out = True,
1948		output = None,
1949		):
1950		'''
1951		Print out and/or save to disk a table of analyses.
1952
1953		**Parameters**
1954
1955		+ `dir`: the directory in which to save the table
1956		+ `filename`: the name of the csv file to write to
1957		+ `save_to_file`: whether to save the table to disk
1958		+ `print_out`: whether to print out the table
1959		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1960		    if set to `'raw'`: return a list of list of strings
1961		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1962		'''
1963
1964		out = [['UID','Session','Sample']]
1965		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1966		for f in extra_fields:
1967			out[-1] += [f[0]]
1968		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1969		for r in self:
1970			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
1971			for f in extra_fields:
1972				out[-1] += [f"{r[f[0]]:{f[1]}}"]
1973			out[-1] += [
1974				f"{r['d13Cwg_VPDB']:.3f}",
1975				f"{r['d18Owg_VSMOW']:.3f}",
1976				f"{r['d45']:.6f}",
1977				f"{r['d46']:.6f}",
1978				f"{r['d47']:.6f}",
1979				f"{r['d48']:.6f}",
1980				f"{r['d49']:.6f}",
1981				f"{r['d13C_VPDB']:.6f}",
1982				f"{r['d18O_VSMOW']:.6f}",
1983				f"{r['D47raw']:.6f}",
1984				f"{r['D48raw']:.6f}",
1985				f"{r['D49raw']:.6f}",
1986				f"{r[f'D{self._4x}']:.6f}"
1987				]
1988		if save_to_file:
1989			if not os.path.exists(dir):
1990				os.makedirs(dir)
1991			if filename is None:
1992				filename = f'D{self._4x}_analyses.csv'
1993			with open(f'{dir}/{filename}', 'w') as fid:
1994				fid.write(make_csv(out))
1995		if print_out:
1996			self.msg('\n' + pretty_table(out))
1997		return out

Print out and/or save to disk a table of analyses.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
@make_verbal
def covar_table( self, correl=False, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
1999	@make_verbal
2000	def covar_table(
2001		self,
2002		correl = False,
2003		dir = 'output',
2004		filename = None,
2005		save_to_file = True,
2006		print_out = True,
2007		output = None,
2008		):
2009		'''
2010		Print out, save to disk and/or return the variance-covariance matrix of D4x
2011		for all unknown samples.
2012
2013		**Parameters**
2014
2015		+ `dir`: the directory in which to save the csv
2016		+ `filename`: the name of the csv file to write to
2017		+ `save_to_file`: whether to save the csv
2018		+ `print_out`: whether to print out the matrix
2019		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
2020		    if set to `'raw'`: return a list of list of strings
2021		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2022		'''
2023		samples = sorted([u for u in self.unknowns])
2024		out = [[''] + samples]
2025		for s1 in samples:
2026			out.append([s1])
2027			for s2 in samples:
2028				if correl:
2029					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
2030				else:
2031					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
2032
2033		if save_to_file:
2034			if not os.path.exists(dir):
2035				os.makedirs(dir)
2036			if filename is None:
2037				if correl:
2038					filename = f'D{self._4x}_correl.csv'
2039				else:
2040					filename = f'D{self._4x}_covar.csv'
2041			with open(f'{dir}/{filename}', 'w') as fid:
2042				fid.write(make_csv(out))
2043		if print_out:
2044			self.msg('\n'+pretty_table(out))
2045		if output == 'raw':
2046			return out
2047		elif output == 'pretty':
2048			return pretty_table(out)

Print out, save to disk and/or return the variance-covariance matrix of D4x for all unknown samples.

Parameters

  • dir: the directory in which to save the csv
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the csv
  • print_out: whether to print out the matrix
  • output: if set to 'pretty': return a pretty text matrix (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
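
For example, to save both matrices to the default output directory (written as D47_covar.csv and D47_correl.csv for a D47data object):

mydata.covar_table()
mydata.covar_table(correl = True)
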
@make_verbal
def table_of_samples( self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
2050	@make_verbal
2051	def table_of_samples(
2052		self,
2053		dir = 'output',
2054		filename = None,
2055		save_to_file = True,
2056		print_out = True,
2057		output = None,
2058		):
2059		'''
2060		Print out, save to disk and/or return a table of samples.
2061
2062		**Parameters**
2063
2064		+ `dir`: the directory in which to save the csv
2065		+ `filename`: the name of the csv file to write to
2066		+ `save_to_file`: whether to save the csv
2067		+ `print_out`: whether to print out the table
2068		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2069		    if set to `'raw'`: return a list of list of strings
2070		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2071		'''
2072
2073		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2074		for sample in self.anchors:
2075			out += [[
2076				f"{sample}",
2077				f"{self.samples[sample]['N']}",
2078				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2079				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2080				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2081				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2082				]]
2083		for sample in self.unknowns:
2084			out += [[
2085				f"{sample}",
2086				f"{self.samples[sample]['N']}",
2087				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2088				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2089				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2090				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2091				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2092				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2093				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2094				]]
2095		if save_to_file:
2096			if not os.path.exists(dir):
2097				os.makedirs(dir)
2098			if filename is None:
2099				filename = f'D{self._4x}_samples.csv'
2100			with open(f'{dir}/{filename}', 'w') as fid:
2101				fid.write(make_csv(out))
2102		if print_out:
2103			self.msg('\n'+pretty_table(out))
2104		if output == 'raw':
2105			return out
2106		elif output == 'pretty':
2107			return pretty_table(out)

Print out, save to disk and/or return a table of samples.

Parameters

  • dir: the directory in which to save the csv
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the csv
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
def plot_sessions(self, dir='output', figsize=(8, 8), filetype='pdf', dpi=100):
2110	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2111		'''
2112		Generate session plots and save them to disk.
2113
2114		**Parameters**
2115
2116		+ `dir`: the directory in which to save the plots
2117		+ `figsize`: the width and height (in inches) of each plot
2118		+ `filetype`: 'pdf' or 'png'
2119		+ `dpi`: resolution for PNG output
2120		'''
2121		if not os.path.exists(dir):
2122			os.makedirs(dir)
2123
2124		for session in self.sessions:
2125			sp = self.plot_single_session(session, xylimits = 'constant')
2126			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2127			ppl.close(sp.fig)

Generate session plots and save them to disk.

Parameters

  • dir: the directory in which to save the plots
  • figsize: the width and height (in inches) of each plot
  • filetype: 'pdf' or 'png'
  • dpi: resolution for PNG output
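
For example, to save the session plots as PNG files at 200 dpi instead of the default PDF output:

mydata.plot_sessions(filetype = 'png', dpi = 200)
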
@make_verbal
def consolidate_samples(self):
2131	@make_verbal
2132	def consolidate_samples(self):
2133		'''
2134		Compile various statistics for each sample.
2135
2136		For each anchor sample:
2137
2138		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2139		+ `SE_D47` or `SE_D48`: set to zero by definition
2140
2141		For each unknown sample:
2142
2143		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2144		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2145
2146		For each anchor and unknown:
2147
2148		+ `N`: the total number of analyses of this sample
2149		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2150		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2151		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2152		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2153		variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2154		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
2155		'''
2156		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2157		for sample in self.samples:
2158			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2159			if self.samples[sample]['N'] > 1:
2160				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2161
2162			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2163			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2164
2165			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2166			if len(D4x_pop) > 2:
2167				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2168			
2169		if self.standardization_method == 'pooled':
2170			for sample in self.anchors:
2171				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2172				self.samples[sample][f'SE_D{self._4x}'] = 0.
2173			for sample in self.unknowns:
2174				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2175				try:
2176					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2177				except ValueError:
2178					# when `sample` is constrained by self.standardize(constraints = {...}),
2179					# it is no longer listed in self.standardization.var_names.
2180					# Temporary fix: define SE as zero for now
2181					self.samples[sample][f'SE_D{self._4x}'] = 0.
2182
2183		elif self.standardization_method == 'indep_sessions':
2184			for sample in self.anchors:
2185				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2186				self.samples[sample][f'SE_D{self._4x}'] = 0.
2187			for sample in self.unknowns:
2188				self.msg(f'Consolidating sample {sample}')
2189				self.unknowns[sample][f'session_D{self._4x}'] = {}
2190				session_avg = []
2191				for session in self.sessions:
2192					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2193					if sdata:
2194						self.msg(f'{sample} found in session {session}')
2195						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2196						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2197						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2198						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2199						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2200						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2201						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2202				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2203				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2204				wsum = sum([weights[s] for s in weights])
2205				for s in weights:
2206					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2207
2208		for r in self:
2209			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']

Compile various statistics for each sample.

For each anchor sample:

  • D47 or D48: the nominal Δ4x value for this anchor, specified by self.Nominal_D4x
  • SE_D47 or SE_D48: set to zero by definition

For each unknown sample:

  • D47 or D48: the standardized Δ4x value for this unknown
  • SE_D47 or SE_D48: the standard error of Δ4x for this unknown

For each anchor and unknown:

  • N: the total number of analyses of this sample
  • SD_D47 or SD_D48: the “sample” (in the statistical sense) standard deviation for this sample
  • d13C_VPDB: the average δ13C_VPDB value for this sample
  • d18O_VSMOW: the average δ18O_VSMOW value for this sample (as CO2)
  • p_Levene: the p-value from a Levene test of equal variance, indicating whether the Δ4x repeatability of this sample differs significantly from that observed for the reference sample specified by self.LEVENE_REF_SAMPLE.
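
These statistics are stored in self.samples and may be read back after standardization, e.g., for the tutorial unknown MYSAMPLE-1:

sample = mydata.samples['MYSAMPLE-1']
print(sample['D47'], sample['SE_D47'], sample['N'])
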
def consolidate_sessions(self):
2213	def consolidate_sessions(self):
2214		'''
2215		Compute various statistics for each session.
2216
2217		+ `Na`: Number of anchor analyses in the session
2218		+ `Nu`: Number of unknown analyses in the session
2219		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2220		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2221		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2222		+ `a`: scrambling factor
2223		+ `b`: compositional slope
2224		+ `c`: WG offset
2225		+ `SE_a`: Model standard error of `a`
2226		+ `SE_b`: Model standard error of `b`
2227		+ `SE_c`: Model standard error of `c`
2228		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2229		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2230		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2231		+ `a2`: scrambling factor drift
2232		+ `b2`: compositional slope drift
2233		+ `c2`: WG offset drift
2234		+ `Np`: Number of standardization parameters to fit
2235		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2236		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2237		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
2238		'''
2239		for session in self.sessions:
2240			if 'd13Cwg_VPDB' not in self.sessions[session]:
2241				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2242			if 'd18Owg_VSMOW' not in self.sessions[session]:
2243				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2244			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2245			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2246
2247			self.msg(f'Computing repeatabilities for session {session}')
2248			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2249			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2250			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2251
2252		if self.standardization_method == 'pooled':
2253			for session in self.sessions:
2254
2255				# different (better?) computation of D4x repeatability for each session:
2256				sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
2257				self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5
2258
2259				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2260				i = self.standardization.var_names.index(f'a_{pf(session)}')
2261				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2262
2263				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2264				i = self.standardization.var_names.index(f'b_{pf(session)}')
2265				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2266
2267				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2268				i = self.standardization.var_names.index(f'c_{pf(session)}')
2269				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2270
2271				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2272				if self.sessions[session]['scrambling_drift']:
2273					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2274					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2275				else:
2276					self.sessions[session]['SE_a2'] = 0.
2277
2278				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2279				if self.sessions[session]['slope_drift']:
2280					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2281					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2282				else:
2283					self.sessions[session]['SE_b2'] = 0.
2284
2285				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2286				if self.sessions[session]['wg_drift']:
2287					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2288					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2289				else:
2290					self.sessions[session]['SE_c2'] = 0.
2291
2292				i = self.standardization.var_names.index(f'a_{pf(session)}')
2293				j = self.standardization.var_names.index(f'b_{pf(session)}')
2294				k = self.standardization.var_names.index(f'c_{pf(session)}')
2295				CM = np.zeros((6,6))
2296				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2297				try:
2298					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2299					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2300					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2301					try:
2302						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2303						CM[3,4] = self.standardization.covar[i2,j2]
2304						CM[4,3] = self.standardization.covar[j2,i2]
2305					except ValueError:
2306						pass
2307					try:
2308						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2309						CM[3,5] = self.standardization.covar[i2,k2]
2310						CM[5,3] = self.standardization.covar[k2,i2]
2311					except ValueError:
2312						pass
2313				except ValueError:
2314					pass
2315				try:
2316					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2317					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2318					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2319					try:
2320						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2321						CM[4,5] = self.standardization.covar[j2,k2]
2322						CM[5,4] = self.standardization.covar[k2,j2]
2323					except ValueError:
2324						pass
2325				except ValueError:
2326					pass
2327				try:
2328					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2329					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2330					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2331				except ValueError:
2332					pass
2333
2334				self.sessions[session]['CM'] = CM
2335
2336		elif self.standardization_method == 'indep_sessions':
2337			pass # Not implemented yet

Compute various statistics for each session.

  • Na: Number of anchor analyses in the session
  • Nu: Number of unknown analyses in the session
  • r_d13C_VPDB: δ13C_VPDB repeatability of analyses within the session
  • r_d18O_VSMOW: δ18O_VSMOW repeatability of analyses within the session
  • r_D47 or r_D48: Δ4x repeatability of analyses within the session
  • a: scrambling factor
  • b: compositional slope
  • c: WG offset
  • SE_a: Model standard error of a
  • SE_b: Model standard error of b
  • SE_c: Model standard error of c
  • scrambling_drift (boolean): whether to allow a temporal drift in the scrambling factor (a)
  • slope_drift (boolean): whether to allow a temporal drift in the compositional slope (b)
  • wg_drift (boolean): whether to allow a temporal drift in the WG offset (c)
  • a2: scrambling factor drift
  • b2: compositional slope drift
  • c2: WG offset drift
  • Np: Number of standardization parameters to fit
  • CM: model covariance matrix for (a, b, c, a2, b2, c2)
  • d13Cwg_VPDB: δ13C_VPDB of WG
  • d18Owg_VSMOW: δ18O_VSMOW of WG
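
These fields are stored in self.sessions and may be inspected directly (assuming a session named Session01):

s = mydata.sessions['Session01']
print(s['a'], s['b'], s['c'])
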
@make_verbal
def repeatabilities(self):
2340	@make_verbal
2341	def repeatabilities(self):
2342		'''
2343		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2344		(for all samples, for anchors, and for unknowns).
2345		'''
2346		self.msg('Computing repeatabilities for all sessions')
2347
2348		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2349		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2350		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2351		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2352		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')

Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x (for all samples, for anchors, and for unknowns).

@make_verbal
def consolidate(self, tables=True, plots=True):
2355	@make_verbal
2356	def consolidate(self, tables = True, plots = True):
2357		'''
2358		Collect information about samples, sessions and repeatabilities.
2359		'''
2360		self.consolidate_samples()
2361		self.consolidate_sessions()
2362		self.repeatabilities()
2363
2364		if tables:
2365			self.summary()
2366			self.table_of_sessions()
2367			self.table_of_analyses()
2368			self.table_of_samples()
2369
2370		if plots:
2371			self.plot_sessions()

Collect information about samples, sessions and repeatabilities.

@make_verbal
def rmswd(self, samples='all samples', sessions='all sessions'):
2374	@make_verbal
2375	def rmswd(self,
2376		samples = 'all samples',
2377		sessions = 'all sessions',
2378		):
2379		'''
2380		Compute the χ2, root mean squared weighted deviation
2381		(i.e. the square root of the reduced χ2), and corresponding degrees of
2382		freedom of the Δ4x values for samples in `samples` and sessions in `sessions`.
2383		
2384		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
2385		'''
2386		if samples == 'all samples':
2387			mysamples = [k for k in self.samples]
2388		elif samples == 'anchors':
2389			mysamples = [k for k in self.anchors]
2390		elif samples == 'unknowns':
2391			mysamples = [k for k in self.unknowns]
2392		else:
2393			mysamples = samples
2394
2395		if sessions == 'all sessions':
2396			sessions = [k for k in self.sessions]
2397
2398		chisq, Nf = 0, 0
2399		for sample in mysamples :
2400			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2401			if len(G) > 1 :
2402				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2403				Nf += (len(G) - 1)
2404				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2405		r = (chisq / Nf)**.5 if Nf > 0 else 0
2406		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2407		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}

Compute the χ2, root mean squared weighted deviation (i.e. the square root of the reduced χ2), and corresponding degrees of freedom of the Δ4x values for samples in samples and sessions in sessions.

Only used in D4xdata.standardize() with method='indep_sessions'.

@make_verbal
def compute_r(self, key, samples='all samples', sessions='all sessions'):
2410	@make_verbal
2411	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2412		'''
2413		Compute the repeatability of `[r[key] for r in self]`
2414		'''
2415
2416		if samples == 'all samples':
2417			mysamples = [k for k in self.samples]
2418		elif samples == 'anchors':
2419			mysamples = [k for k in self.anchors]
2420		elif samples == 'unknowns':
2421			mysamples = [k for k in self.unknowns]
2422		else:
2423			mysamples = samples
2424
2425		if sessions == 'all sessions':
2426			sessions = [k for k in self.sessions]
2427
2428		if key in ['D47', 'D48']:
2429			# Full disclosure: the definition of Nf is tricky/debatable
2430			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2431			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2432			Nf = len(G)
2433# 			print(f'len(G) = {Nf}')
2434			Nf -= len([s for s in mysamples if s in self.unknowns])
2435# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2436			for session in sessions:
2437				Np = len([
2438					_ for _ in self.standardization.params
2439					if (
2440						self.standardization.params[_].expr is not None
2441						and (
2442							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2443							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2444							)
2445						)
2446					])
2447# 				print(f'session {session}: {Np} parameters to consider')
2448				Na = len({
2449					r['Sample'] for r in self.sessions[session]['data']
2450					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2451					})
2452# 				print(f'session {session}: {Na} different anchors in that session')
2453				Nf -= min(Np, Na)
2454# 			print(f'Nf = {Nf}')
2455
2456# 			for sample in mysamples :
2457# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2458# 				if len(X) > 1 :
2459# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2460# 					if sample in self.unknowns:
2461# 						Nf += len(X) - 1
2462# 					else:
2463# 						Nf += len(X)
2464# 			if samples in ['anchors', 'all samples']:
2465# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2466			r = (chisq / Nf)**.5 if Nf > 0 else 0
2467
2468		else: # if key not in ['D47', 'D48']
2469			chisq, Nf = 0, 0
2470			for sample in mysamples :
2471				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2472				if len(X) > 1 :
2473					Nf += len(X) - 1
2474					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2475			r = (chisq / Nf)**.5 if Nf > 0 else 0
2476
2477		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2478		return r

Compute the repeatability of [r[key] for r in self]
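
For example, the Δ47 repeatability computed from anchor analyses only:

r47_anchors = mydata.compute_r('D47', samples = 'anchors')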

def sample_average(self, samples, weights='equal', normalize=True):
2480	def sample_average(self, samples, weights = 'equal', normalize = True):
2481		'''
2482		Weighted average Δ4x value of a group of samples, accounting for covariance.
2483
2484		Returns the weighted average Δ4x value and associated SE
2485		of a group of samples. Weights are equal by default. If `normalize` is
2486		true, `weights` will be rescaled so that their sum equals 1.
2487
2488		**Examples**
2489
2490		```python
2491		self.sample_average(['X','Y'], [1, 2])
2492		```
2493
2494		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2495		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2496		values of samples X and Y, respectively.
2497
2498		```python
2499		self.sample_average(['X','Y'], [1, -1], normalize = False)
2500		```
2501
2502		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2503		'''
2504		if weights == 'equal':
2505			weights = [1/len(samples)] * len(samples)
2506
2507		if normalize:
2508			s = sum(weights)
2509			if s:
2510				weights = [w/s for w in weights]
2511
2512		try:
2513# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2514# 			C = self.standardization.covar[indices,:][:,indices]
2515			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2516			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2517			return correlated_sum(X, C, weights)
2518		except ValueError:
2519			return (0., 0.)

Weighted average Δ4x value of a group of samples, accounting for covariance.

Returns the weighted average Δ4x value and associated SE of a group of samples. Weights are equal by default. If normalize is true, weights will be rescaled so that their sum equals 1.

Examples

self.sample_average(['X','Y'], [1, 2])

returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3, where Δ4x(X) and Δ4x(Y) are the average Δ4x values of samples X and Y, respectively.

self.sample_average(['X','Y'], [1, -1], normalize = False)

returns the value and SE of the difference Δ4x(X) - Δ4x(Y).

def sample_D4x_covar(self, sample1, sample2=None):
2522	def sample_D4x_covar(self, sample1, sample2 = None):
2523		'''
2524		Covariance between Δ4x values of samples
2525
2526		Returns the error covariance between the average Δ4x values of two
2527		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2528		returns the Δ4x variance for that sample.
2529		'''
2530		if sample2 is None:
2531			sample2 = sample1
2532		if self.standardization_method == 'pooled':
2533			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2534			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2535			return self.standardization.covar[i, j]
2536		elif self.standardization_method == 'indep_sessions':
2537			if sample1 == sample2:
2538				return self.samples[sample1][f'SE_D{self._4x}']**2
2539			else:
2540				c = 0
2541				for session in self.sessions:
2542					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2543					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2544					if sdata1 and sdata2:
2545						a = self.sessions[session]['a']
2546						# !! TODO: CM below does not account for temporal changes in standardization parameters
2547						CM = self.sessions[session]['CM'][:3,:3]
2548						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2549						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2550						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2551						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2552						c += (
2553							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2554							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2555							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2556							@ CM
2557							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2558							) / a**2
2559				return float(c)

Covariance between Δ4x values of samples

Returns the error covariance between the average Δ4x values of two samples. If only sample1 is specified, or if sample1 == sample2, returns the Δ4x variance for that sample.
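
For example, the standard error of the difference between two unknowns (named as in the tutorial data) follows directly from these (co)variances:

var_diff = (
    mydata.sample_D4x_covar('MYSAMPLE-1')
    + mydata.sample_D4x_covar('MYSAMPLE-2')
    - 2 * mydata.sample_D4x_covar('MYSAMPLE-1', 'MYSAMPLE-2')
    )
SE_diff = var_diff ** 0.5

This is equivalent to calling sample_average(['MYSAMPLE-1', 'MYSAMPLE-2'], [1, -1], normalize = False).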

def sample_D4x_correl(self, sample1, sample2=None):
2561	def sample_D4x_correl(self, sample1, sample2 = None):
2562		'''
2563		Correlation between Δ4x errors of samples
2564
2565		Returns the error correlation between the average Δ4x values of two samples.
2566		'''
2567		if sample2 is None or sample2 == sample1:
2568			return 1.
2569		return (
2570			self.sample_D4x_covar(sample1, sample2)
2571			/ self.unknowns[sample1][f'SE_D{self._4x}']
2572			/ self.unknowns[sample2][f'SE_D{self._4x}']
2573			)

Correlation between Δ4x errors of samples

Returns the error correlation between the average Δ4x values of two samples.
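
For instance (a sketch under the same assumptions), checking how strongly the standardization errors of two unknowns are correlated:

```python
r = mydata.sample_D4x_correl('MYSAMPLE-1', 'MYSAMPLE-2')
# r close to +1 means the two Δ47 errors are far from independent,
# so the SE of the difference between the two samples is smaller
# than naive (uncorrelated) error propagation would suggest.
```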

def plot_single_session( self, session, kw_plot_anchors={'ls': 'None', 'marker': 'x', 'mec': (0.75, 0, 0), 'mew': 0.75, 'ms': 4}, kw_plot_unknowns={'ls': 'None', 'marker': 'x', 'mec': (0, 0, 0.75), 'mew': 0.75, 'ms': 4}, kw_plot_anchor_avg={'ls': '-', 'marker': 'None', 'color': (0.75, 0, 0), 'lw': 0.75}, kw_plot_unknown_avg={'ls': '-', 'marker': 'None', 'color': (0, 0, 0.75), 'lw': 0.75}, kw_contour_error={'colors': [[0, 0, 0]], 'alpha': 0.5, 'linewidths': 0.75}, xylimits='free', x_label=None, y_label=None, error_contour_interval='auto', fig='new'):
2575	def plot_single_session(self,
2576		session,
2577		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2578		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2579		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2580		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2581		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2582		xylimits = 'free', # | 'constant'
2583		x_label = None,
2584		y_label = None,
2585		error_contour_interval = 'auto',
2586		fig = 'new',
2587		):
2588		'''
2589		Generate plot for a single session
2590		'''
2591		if x_label is None:
2592			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2593		if y_label is None:
2594			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2595
2596		out = _SessionPlot()
2597		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2598		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2599		anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2600		anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2601		unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2602		unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2603		anchor_avg = (np.array([ np.array([
2604				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2605				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2606				]) for sample in anchors]).T,
2607			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
2608		unknown_avg = (np.array([ np.array([
2609				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2610				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2611				]) for sample in unknowns]).T,
2612			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)
2613		
2614		
2615		if fig == 'new':
2616			out.fig = ppl.figure(figsize = (6,6))
2617			ppl.subplots_adjust(.1,.1,.9,.9)
2618
2619		out.anchor_analyses, = ppl.plot(
2620			anchors_d,
2621			anchors_D,
2622			**kw_plot_anchors)
2623		out.unknown_analyses, = ppl.plot(
2624			unknowns_d,
2625			unknowns_D,
2626			**kw_plot_unknowns)
2627		out.anchor_avg = ppl.plot(
2628			*anchor_avg,
2629			**kw_plot_anchor_avg)
2630		out.unknown_avg = ppl.plot(
2631			*unknown_avg,
2632			**kw_plot_unknown_avg)
2633		if xylimits == 'constant':
2634			x = [r[f'd{self._4x}'] for r in self]
2635			y = [r[f'D{self._4x}'] for r in self]
2636			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2637			w, h = x2-x1, y2-y1
2638			x1 -= w/20
2639			x2 += w/20
2640			y1 -= h/20
2641			y2 += h/20
2642			ppl.axis([x1, x2, y1, y2])
2643		elif xylimits == 'free':
2644			x1, x2, y1, y2 = ppl.axis()
2645		else:
2646			x1, x2, y1, y2 = ppl.axis(xylimits)
2647				
2648		if error_contour_interval != 'none':
2649			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2650			XI,YI = np.meshgrid(xi, yi)
2651			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2652			if error_contour_interval == 'auto':
2653				rng = np.max(SI) - np.min(SI)
2654				if rng <= 0.01:
2655					cinterval = 0.001
2656				elif rng <= 0.03:
2657					cinterval = 0.004
2658				elif rng <= 0.1:
2659					cinterval = 0.01
2660				elif rng <= 0.3:
2661					cinterval = 0.03
2662				elif rng <= 1.:
2663					cinterval = 0.1
2664				else:
2665					cinterval = 0.5
2666			else:
2667				cinterval = error_contour_interval
2668
2669			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2670			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2671			out.clabel = ppl.clabel(out.contour)
2672			contour = (XI, YI, SI, cval, cinterval)
2673
2674		if fig == None:
2675			return {
2676			'anchors':anchors,
2677			'unknowns':unknowns,
2678			'anchors_d':anchors_d,
2679			'anchors_D':anchors_D,
2680			'unknowns_d':unknowns_d,
2681			'unknowns_D':unknowns_D,
2682			'anchor_avg':anchor_avg,
2683			'unknown_avg':unknown_avg,
2684			'contour':contour,
2685			}
2686
2687		ppl.xlabel(x_label)
2688		ppl.ylabel(y_label)
2689		ppl.title(session, weight = 'bold')
2690		ppl.grid(alpha = .2)
2691		out.ax = ppl.gca()		
2692
2693		return out

Generate plot for a single session
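
A minimal usage sketch (assuming `mydata` holds at least one standardized session; the output file name is arbitrary):

```python
from matplotlib import pyplot as ppl

session = list(mydata.sessions)[0]   # name of the first session
sp = mydata.plot_single_session(session)
sp.fig.savefig(f'{session}.pdf')     # hypothetical output file name
ppl.close(sp.fig)
```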

def plot_residuals( self, kde=False, hist=False, binwidth=0.6666666666666666, dir='output', filename=None, highlight=[], colors=None, figsize=None, dpi=100, yspan=None):
2695	def plot_residuals(
2696		self,
2697		kde = False,
2698		hist = False,
2699		binwidth = 2/3,
2700		dir = 'output',
2701		filename = None,
2702		highlight = [],
2703		colors = None,
2704		figsize = None,
2705		dpi = 100,
2706		yspan = None,
2707		):
2708		'''
2709		Plot residuals of each analysis as a function of time (actually, as a function of
2710		the order of analyses in the `D4xdata` object)
2711
2712		+ `kde`: whether to add a kernel density estimate of residuals
2713		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2714		+ `binwidth`: the width of histogram bins, in units of the Δ4x repeatability (SD)
2715		+ `dir`: the directory in which to save the plot
2716		+ `highlight`: a list of samples to highlight
2717		+ `colors`: a dict of `{<sample>: <color>}` for all samples
2718		+ `figsize`: (width, height) of figure
2719		+ `dpi`: resolution for PNG output
2720		+ `yspan`: factor controlling the range of y values shown in plot
2721		  (by default: `yspan = 1.5 if kde else 1.0`)
2722		'''
2723		
2724		from matplotlib import ticker
2725
2726		if yspan is None:
2727			if kde:
2728				yspan = 1.5
2729			else:
2730				yspan = 1.0
2731		
2732		# Layout
2733		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2734		if hist or kde:
2735			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2736			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2737		else:
2738			ppl.subplots_adjust(.08,.05,.78,.8)
2739			ax1 = ppl.subplot(111)
2740		
2741		# Colors
2742		N = len(self.anchors)
2743		if colors is None:
2744			if len(highlight) > 0:
2745				Nh = len(highlight)
2746				if Nh == 1:
2747					colors = {highlight[0]: (0,0,0)}
2748				elif Nh == 3:
2749					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2750				elif Nh == 4:
2751					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2752				else:
2753					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2754			else:
2755				if N == 3:
2756					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2757				elif N == 4:
2758					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2759				else:
2760					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2761
2762		ppl.sca(ax1)
2763		
2764		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2765
2766		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2767
2768		session = self[0]['Session']
2769		x1 = 0
2770# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2771		x_sessions = {}
2772		one_or_more_singlets = False
2773		one_or_more_multiplets = False
2774		multiplets = set()
2775		for k,r in enumerate(self):
2776			if r['Session'] != session:
2777				x2 = k-1
2778				x_sessions[session] = (x1+x2)/2
2779				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2780				session = r['Session']
2781				x1 = k
2782			singlet = len(self.samples[r['Sample']]['data']) == 1
2783			if not singlet:
2784				multiplets.add(r['Sample'])
2785			if r['Sample'] in self.unknowns:
2786				if singlet:
2787					one_or_more_singlets = True
2788				else:
2789					one_or_more_multiplets = True
2790			kw = dict(
2791				marker = 'x' if singlet else '+',
2792				ms = 4 if singlet else 5,
2793				ls = 'None',
2794				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2795				mew = 1,
2796				alpha = 0.2 if singlet else 1,
2797				)
2798			if highlight and r['Sample'] not in highlight:
2799				kw['alpha'] = 0.2
2800			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2801		x2 = k
2802		x_sessions[session] = (x1+x2)/2
2803
2804		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2805		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2806		if not (hist or kde):
2807			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2808			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2809
2810		xmin, xmax, ymin, ymax = ppl.axis()
2811		if yspan != 1:
2812			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2813		for s in x_sessions:
2814			ppl.text(
2815				x_sessions[s],
2816				ymax +1,
2817				s,
2818				va = 'bottom',
2819				**(
2820					dict(ha = 'center')
2821					if len(self.sessions[s]['data']) > (0.15 * len(self))
2822					else dict(ha = 'left', rotation = 45)
2823					)
2824				)
2825
2826		if hist or kde:
2827			ppl.sca(ax2)
2828
2829		for s in colors:
2830			kw['marker'] = '+'
2831			kw['ms'] = 5
2832			kw['mec'] = colors[s]
2833			kw['label'] = s
2834			kw['alpha'] = 1
2835			ppl.plot([], [], **kw)
2836
2837		kw['mec'] = (0,0,0)
2838
2839		if one_or_more_singlets:
2840			kw['marker'] = 'x'
2841			kw['ms'] = 4
2842			kw['alpha'] = .2
2843			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2844			ppl.plot([], [], **kw)
2845
2846		if one_or_more_multiplets:
2847			kw['marker'] = '+'
2848			kw['ms'] = 4
2849			kw['alpha'] = 1
2850			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2851			ppl.plot([], [], **kw)
2852
2853		if hist or kde:
2854			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2855		else:
2856			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2857		leg.set_zorder(-1000)
2858
2859		ppl.sca(ax1)
2860
2861		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2862		ppl.xticks([])
2863		ppl.axis([-1, len(self), None, None])
2864
2865		if hist or kde:
2866			ppl.sca(ax2)
2867			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2868
2869			if kde:
2870				from scipy.stats import gaussian_kde
2871				yi = np.linspace(ymin, ymax, 201)
2872				xi = gaussian_kde(X).evaluate(yi)
2873				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2874# 				ppl.plot(xi, yi, 'k-', lw = 1)
2875			elif hist:
2876				ppl.hist(
2877					X,
2878					orientation = 'horizontal',
2879					histtype = 'stepfilled',
2880					ec = [.4]*3,
2881					fc = [.25]*3,
2882					alpha = .25,
2883					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2884					)
2885			ppl.text(0, 0,
2886				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2887				size = 7.5,
2888				alpha = 1,
2889				va = 'center',
2890				ha = 'left',
2891				)
2892
2893			ppl.axis([0, None, ymin, ymax])
2894			ppl.xticks([])
2895			ppl.yticks([])
2896# 			ax2.spines['left'].set_visible(False)
2897			ax2.spines['right'].set_visible(False)
2898			ax2.spines['top'].set_visible(False)
2899			ax2.spines['bottom'].set_visible(False)
2900
2901		ax1.axis([None, None, ymin, ymax])
2902
2903		if not os.path.exists(dir):
2904			os.makedirs(dir)
2905		if filename is None:
2906			return fig
2907		elif filename == '':
2908			filename = f'D{self._4x}_residuals.pdf'
2909		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2910		ppl.close(fig)

Plot residuals of each analysis as a function of time (actually, as a function of the order of analyses in the D4xdata object)

  • kde: whether to add a kernel density estimate of residuals
  • hist: whether to add a histogram of residuals (incompatible with kde)
  • binwidth: the width of histogram bins, in units of the Δ4x repeatability (SD)
  • dir: the directory in which to save the plot
  • highlight: a list of samples to highlight
  • colors: a dict of {<sample>: <color>} for all samples
  • figsize: (width, height) of figure
  • dpi: resolution for PNG output
  • yspan: factor controlling the range of y values shown in plot (by default: yspan = 1.5 if kde else 1.0)
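
Typical usage (a sketch, based on the `filename` behavior in the source above):

```python
# return the figure object without saving, for further customization:
fig = mydata.plot_residuals(kde = True)

# or save directly under the default name (e.g. 'output/D47_residuals.pdf'):
mydata.plot_residuals(kde = True, filename = '')
```
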
def simulate(self, *args, **kwargs):
2913	def simulate(self, *args, **kwargs):
2914		'''
2915		Legacy function with warning message pointing to `virtual_data()`
2916		'''
2917		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')

Legacy function with warning message pointing to virtual_data()

def plot_distribution_of_analyses( self, dir='output', filename=None, vs_time=False, figsize=(6, 4), subplots_adjust=(0.02, 0.13, 0.85, 0.8), output=None, dpi=100):
2919	def plot_distribution_of_analyses(
2920		self,
2921		dir = 'output',
2922		filename = None,
2923		vs_time = False,
2924		figsize = (6,4),
2925		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
2926		output = None,
2927		dpi = 100,
2928		):
2929		'''
2930		Plot temporal distribution of all analyses in the data set.
2931		
2932		**Parameters**
2933
2934		+ `dir`: the directory in which to save the plot
2935		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
2936		+ `filename`: the name of the file to save the plot to (by default: `D4x_distribution_of_analyses.pdf`)
2937		+ `figsize`: (width, height) of figure
2938		+ `dpi`: resolution for PNG output
2939		'''
2940
2941		asamples = [s for s in self.anchors]
2942		usamples = [s for s in self.unknowns]
2943		if output is None or output == 'fig':
2944			fig = ppl.figure(figsize = figsize)
2945			ppl.subplots_adjust(*subplots_adjust)
2946		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2947		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2948		Xmax += (Xmax-Xmin)/40
2949		Xmin -= (Xmax-Xmin)/41
2950		for k, s in enumerate(asamples + usamples):
2951			if vs_time:
2952				X = [r['TimeTag'] for r in self if r['Sample'] == s]
2953			else:
2954				X = [x for x,r in enumerate(self) if r['Sample'] == s]
2955			Y = [-k for x in X]
2956			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
2957			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
2958			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
2959		ppl.axis([Xmin, Xmax, -k-1, 1])
2960		ppl.xlabel('\ntime')
2961		ppl.gca().annotate('',
2962			xy = (0.6, -0.02),
2963			xycoords = 'axes fraction',
2964			xytext = (.4, -0.02), 
2965            arrowprops = dict(arrowstyle = "->", color = 'k'),
2966            )
2967			
2968
2969		x2 = -1
2970		for session in self.sessions:
2971			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2972			if vs_time:
2973				ppl.axvline(x1, color = 'k', lw = .75)
2974			if x2 > -1:
2975				if not vs_time:
2976					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
2977			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2978# 			from xlrd import xldate_as_datetime
2979# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
2980			if vs_time:
2981				ppl.axvline(x2, color = 'k', lw = .75)
2982				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
2983			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
2984
2985		ppl.xticks([])
2986		ppl.yticks([])
2987
2988		if output is None:
2989			if not os.path.exists(dir):
2990				os.makedirs(dir)
2991			if filename == None:
2992				filename = f'D{self._4x}_distribution_of_analyses.pdf'
2993			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2994			ppl.close(fig)
2995		elif output == 'ax':
2996			return ppl.gca()
2997		elif output == 'fig':
2998			return fig

Plot temporal distribution of all analyses in the data set.

Parameters

  • dir: the directory in which to save the plot
  • vs_time: if True, plot as a function of TimeTag rather than sequentially.
  • filename: the name of the file to save the plot to (by default: D4x_distribution_of_analyses.pdf)
  • figsize: (width, height) of figure
  • dpi: resolution for PNG output
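
For example (a sketch, assuming each analysis record carries a `TimeTag` field):

```python
# save 'output/D47_distribution_of_analyses.pdf', plotting analyses
# against TimeTag values rather than in sequential order:
mydata.plot_distribution_of_analyses(vs_time = True)
```
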
def plot_bulk_compositions( self, samples=None, dir='output/bulk_compositions', figsize=(6, 6), subplots_adjust=(0.15, 0.12, 0.95, 0.92), show=False, sample_color=(0, 0.5, 1), analysis_color=(0.7, 0.7, 0.7), labeldist=0.3, radius=0.05):
3001	def plot_bulk_compositions(
3002		self,
3003		samples = None,
3004		dir = 'output/bulk_compositions',
3005		figsize = (6,6),
3006		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
3007		show = False,
3008		sample_color = (0,.5,1),
3009		analysis_color = (.7,.7,.7),
3010		labeldist = 0.3,
3011		radius = 0.05,
3012		):
3013		'''
3014		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
3015		
3016		By default, creates a directory `./output/bulk_compositions` where plots for
3017		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
3018		
3019		
3020		**Parameters**
3021
3022		+ `samples`: Only these samples are processed (by default: all samples).
3023		+ `dir`: where to save the plots
3024		+ `figsize`: (width, height) of figure
3025		+ `subplots_adjust`: passed to `subplots_adjust()`
3026		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
3027		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
3028		+ `sample_color`: color used for sample markers/labels
3029		+ `analysis_color`: color used for analysis (replicate) markers/labels
3030		+ `labeldist`: distance (in inches) from replicate markers to replicate labels
3031		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
3032		'''
3033
3034		from matplotlib.patches import Ellipse
3035
3036		if samples is None:
3037			samples = [_ for _ in self.samples]
3038
3039		saved = {}
3040
3041		for s in samples:
3042
3043			fig = ppl.figure(figsize = figsize)
3044			fig.subplots_adjust(*subplots_adjust)
3045			ax = ppl.subplot(111)
3046			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3047			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3048			ppl.title(s)
3049
3050
3051			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
3052			UID = [_['UID'] for _ in self.samples[s]['data']]
3053			XY0 = XY.mean(0)
3054
3055			for xy in XY:
3056				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
3057				
3058			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
3059			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
3060			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3061			saved[s] = [XY, XY0]
3062			
3063			x1, x2, y1, y2 = ppl.axis()
3064			x0, dx = (x1+x2)/2, (x2-x1)/2
3065			y0, dy = (y1+y2)/2, (y2-y1)/2
3066			dx, dy = [max(max(dx, dy), radius)]*2
3067
3068			ppl.axis([
3069				x0 - 1.2*dx,
3070				x0 + 1.2*dx,
3071				y0 - 1.2*dy,
3072				y0 + 1.2*dy,
3073				])			
3074
3075			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3076
3077			for xy, uid in zip(XY, UID):
3078
3079				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3080				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3081
3082				if (vector_in_display_space**2).sum() > 0:
3083
3084					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3085					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3086					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3087					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3088
3089					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3090
3091				else:
3092
3093					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3094
3095			if radius:
3096				ax.add_artist(Ellipse(
3097					xy = XY0,
3098					width = radius*2,
3099					height = radius*2,
3100					ls = (0, (2,2)),
3101					lw = .7,
3102					ec = analysis_color,
3103					fc = 'None',
3104					))
3105				ppl.text(
3106					XY0[0],
3107					XY0[1]-radius,
3108					f'\n± {radius*1e3:.0f} ppm',
3109					color = analysis_color,
3110					va = 'top',
3111					ha = 'center',
3112					linespacing = 0.4,
3113					size = 8,
3114					)
3115
3116			if not os.path.exists(dir):
3117				os.makedirs(dir)
3118			fig.savefig(f'{dir}/{s}.pdf')
3119			ppl.close(fig)
3120
3121		fig = ppl.figure(figsize = figsize)
3122		fig.subplots_adjust(*subplots_adjust)
3123		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3124		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3125
3126		for s in saved:
3127			for xy in saved[s][0]:
3128				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3129			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3130			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3131			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3132
3133		x1, x2, y1, y2 = ppl.axis()
3134		ppl.axis([
3135			x1 - (x2-x1)/10,
3136			x2 + (x2-x1)/10,
3137			y1 - (y2-y1)/10,
3138			y2 + (y2-y1)/10,
3139			])			
3140
3141
3142		if not os.path.exists(dir):
3143			os.makedirs(dir)
3144		fig.savefig(f'{dir}/__all__.pdf')
3145		if show:
3146			ppl.show()
3147		ppl.close(fig)

Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

By default, creates a directory ./output/bulk_compositions where plots for each sample are saved. Another plot named __all__.pdf shows all analyses together.

Parameters

  • samples: Only these samples are processed (by default: all samples).
  • dir: where to save the plots
  • figsize: (width, height) of figure
  • subplots_adjust: passed to subplots_adjust()
  • show: whether to call matplotlib.pyplot.show() on the plot with all samples, allowing for interactive visualization/exploration in (δ13C, δ18O) space.
  • sample_color: color used for sample markers/labels
  • analysis_color: color used for analysis (replicate) markers/labels
  • labeldist: distance (in inches) from replicate markers to replicate labels
  • radius: radius of the dashed circle providing scale. No circle if radius = 0.
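
For example (a sketch; the sample names and the 30 ppm scale circle are arbitrary choices):

```python
# plot only two samples, with a dashed circle of ±0.03 ‰ (30 ppm)
# around each sample average to provide a sense of scale:
mydata.plot_bulk_compositions(
	samples = ['MYSAMPLE-1', 'MYSAMPLE-2'],
	radius = 0.03)
```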
Inherited Members

builtins.list: clear, copy, append, insert, extend, pop, remove, index, count, reverse, sort
class D47data(D4xdata):
3189class D47data(D4xdata):
3190	'''
3191	Store and process data for a large set of Δ47 analyses,
3192	usually comprising more than one analytical session.
3193	'''
3194
3195	Nominal_D4x = {
3196		'ETH-1':   0.2052,
3197		'ETH-2':   0.2085,
3198		'ETH-3':   0.6132,
3199		'ETH-4':   0.4511,
3200		'IAEA-C1': 0.3018,
3201		'IAEA-C2': 0.6409,
3202		'MERCK':   0.5135,
3203		} # I-CDES (Bernasconi et al., 2021)
3204	'''
3205	Nominal Δ47 values assigned to the Δ47 anchor samples, used by
3206	`D47data.standardize()` to normalize unknown samples to an absolute Δ47
3207	reference frame.
3208
3209	By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
3210	```py
3211	{
3212		'ETH-1'   : 0.2052,
3213		'ETH-2'   : 0.2085,
3214		'ETH-3'   : 0.6132,
3215		'ETH-4'   : 0.4511,
3216		'IAEA-C1' : 0.3018,
3217		'IAEA-C2' : 0.6409,
3218		'MERCK'   : 0.5135,
3219	}
3220	```
3221	'''
3222
3223
3224	@property
3225	def Nominal_D47(self):
3226		return self.Nominal_D4x
3227	
3228
3229	@Nominal_D47.setter
3230	def Nominal_D47(self, new):
3231		self.Nominal_D4x = dict(**new)
3232		self.refresh()
3233
3234
3235	def __init__(self, l = [], **kwargs):
3236		'''
3237		**Parameters:** same as `D4xdata.__init__()`
3238		'''
3239		D4xdata.__init__(self, l = l, mass = '47', **kwargs)
3240
3241
3242	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3243		'''
3244		Find all samples for which `Teq` is specified, compute equilibrium Δ47
3245		value for that temperature, and treat these samples as additional anchors.
3246
3247		**Parameters**
3248
3249		+ `fCo2eqD47`: Which CO2 equilibrium law to use
3250		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3251		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3252		+ `priority`: if `replace`: forget old anchors and only use the new ones;
3253		if `new`: keep pre-existing anchors but update them in case of conflict
3254		between old and new Δ47 values;
3255		if `old`: keep pre-existing anchors but preserve their original Δ47
3256		values in case of conflict.
3257		'''
3258		f = {
3259			'petersen': fCO2eqD47_Petersen,
3260			'wang': fCO2eqD47_Wang,
3261			}[fCo2eqD47]
3262		foo = {}
3263		for r in self:
3264			if 'Teq' in r:
3265				if r['Sample'] in foo:
3266					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3267				else:
3268					foo[r['Sample']] = f(r['Teq'])
3269			else:
3270					assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3271
3272		if priority == 'replace':
3273			self.Nominal_D47 = {}
3274		for s in foo:
3275			if priority != 'old' or s not in self.Nominal_D47:
3276				self.Nominal_D47[s] = foo[s]
3277	
3278	def save_D47_correl(self, *args, **kwargs):
3279		return self._save_D4x_correl(*args, **kwargs)
3280
3281	save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47')

Store and process data for a large set of Δ47 analyses, usually comprising more than one analytical session.

D47data(l=[], **kwargs)
3235	def __init__(self, l = [], **kwargs):
3236		'''
3237		**Parameters:** same as `D4xdata.__init__()`
3238		'''
3239		D4xdata.__init__(self, l = l, mass = '47', **kwargs)

Parameters: same as D4xdata.__init__()

Nominal_D4x = {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6132, 'ETH-4': 0.4511, 'IAEA-C1': 0.3018, 'IAEA-C2': 0.6409, 'MERCK': 0.5135}

Nominal Δ47 values assigned to the Δ47 anchor samples, used by D47data.standardize() to normalize unknown samples to an absolute Δ47 reference frame.

By default equal to (after Bernasconi et al. (2021)):

{
        'ETH-1'   : 0.2052,
        'ETH-2'   : 0.2085,
        'ETH-3'   : 0.6132,
        'ETH-4'   : 0.4511,
        'IAEA-C1' : 0.3018,
        'IAEA-C2' : 0.6409,
        'MERCK'   : 0.5135,
}
Nominal_D47
3224	@property
3225	def Nominal_D47(self):
3226		return self.Nominal_D4x
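
Because `Nominal_D47` is settable, the anchor set may be redefined before standardization. A sketch restricting the anchors to the three ETH standards (assuming `D47crunch` has been imported):

```python
mydata = D47crunch.D47data()
# keep only the three ETH anchors:
mydata.Nominal_D47 = {
	k: mydata.Nominal_D47[k]
	for k in ('ETH-1', 'ETH-2', 'ETH-3')
	}
```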
def D47fromTeq(self, fCo2eqD47='petersen', priority='new'):
3242	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3243		'''
3244		Find all samples for which `Teq` is specified, compute equilibrium Δ47
3245		value for that temperature, and treat these samples as additional anchors.
3246
3247		**Parameters**
3248
3249		+ `fCo2eqD47`: Which CO2 equilibrium law to use
3250		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3251		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3252		+ `priority`: if `replace`: forget old anchors and only use the new ones;
3253		if `new`: keep pre-existing anchors but update them in case of conflict
3254		between old and new Δ47 values;
3255		if `old`: keep pre-existing anchors but preserve their original Δ47
3256		values in case of conflict.
3257		'''
3258		f = {
3259			'petersen': fCO2eqD47_Petersen,
3260			'wang': fCO2eqD47_Wang,
3261			}[fCo2eqD47]
3262		foo = {}
3263		for r in self:
3264			if 'Teq' in r:
3265				if r['Sample'] in foo:
3266					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3267				else:
3268					foo[r['Sample']] = f(r['Teq'])
3269			else:
3270					assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3271
3272		if priority == 'replace':
3273			self.Nominal_D47 = {}
3274		for s in foo:
3275			if priority != 'old' or s not in self.Nominal_D47:
3276				self.Nominal_D47[s] = foo[s]

Find all samples for which Teq is specified, compute equilibrium Δ47 value for that temperature, and treat these samples as additional anchors.

Parameters

  • fCo2eqD47: Which CO2 equilibrium law to use (petersen: Petersen et al. (2019); wang: Wang et al. (2004)).
  • priority: if replace: forget old anchors and only use the new ones; if new: keep pre-existing anchors but update them in case of conflict between old and new Δ47 values; if old: keep pre-existing anchors but preserve their original Δ47 values in case of conflict.
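
A usage sketch, assuming some analyses of a hypothetical equilibrated-gas sample `EQ-GAS-25C` carry a `Teq` field (the temperature expected by the chosen equilibrium law):

```python
# tag all analyses of the equilibrated gas with its known temperature:
for r in mydata:
	if r['Sample'] == 'EQ-GAS-25C':
		r['Teq'] = 25.0

# compute its equilibrium Δ47 value and promote it to anchor status:
mydata.D47fromTeq(fCo2eqD47 = 'petersen', priority = 'new')
mydata.standardize()
```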
def save_D47_correl(self, *args, **kwargs):
3278	def save_D47_correl(self, *args, **kwargs):
3279		return self._save_D4x_correl(*args, **kwargs)

Save D47 values along with their SE and correlation matrix.

Parameters

  • samples: Only these samples are output (by default: all samples).
  • dir: the directory in which to save the file (by default: output)
  • filename: the name of the csv file to write to (by default: D47_correl.csv)
  • D47_precision: the precision to use when writing D47 and D47_SE values (by default: 4)
  • correl_precision: the precision to use when writing correlation factor values (by default: 4)
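
A minimal sketch (sample names are assumptions), writing the correlation matrix for the unknowns only:

```python
# export Δ47 values, SE, and the error correlation matrix
# to 'output/D47_correl.csv':
mydata.save_D47_correl(samples = ['MYSAMPLE-1', 'MYSAMPLE-2'])
```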
class D48data(D4xdata):
3284class D48data(D4xdata):
3285	'''
3286	Store and process data for a large set of Δ48 analyses,
3287	usually comprising more than one analytical session.
3288	'''
3289
3290	Nominal_D4x = {
3291		'ETH-1':  0.138,
3292		'ETH-2':  0.138,
3293		'ETH-3':  0.270,
3294		'ETH-4':  0.223,
3295		'GU-1':  -0.419,
3296		} # (Fiebig et al., 2019, 2021)
3297	'''
3298	Nominal Δ48 values assigned to the Δ48 anchor samples, used by
3299	`D48data.standardize()` to normalize unknown samples to an absolute Δ48
3300	reference frame.
3301
3302	By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
3303	[Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)):
3304
3305	```py
3306	{
3307		'ETH-1' :  0.138,
3308		'ETH-2' :  0.138,
3309		'ETH-3' :  0.270,
3310		'ETH-4' :  0.223,
3311		'GU-1'  : -0.419,
3312	}
3313	```
3314	'''
3315
3316
3317	@property
3318	def Nominal_D48(self):
3319		return self.Nominal_D4x
3320
3321	
3322	@Nominal_D48.setter
3323	def Nominal_D48(self, new):
3324		self.Nominal_D4x = dict(**new)
3325		self.refresh()
3326
3327
3328	def __init__(self, l = [], **kwargs):
3329		'''
3330		**Parameters:** same as `D4xdata.__init__()`
3331		'''
3332		D4xdata.__init__(self, l = l, mass = '48', **kwargs)
3333
3334	def save_D48_correl(self, *args, **kwargs):
3335		return self._save_D4x_correl(*args, **kwargs)
3336
3337	save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48')

Store and process data for a large set of Δ48 analyses, usually comprising more than one analytical session.

D48data(l=[], **kwargs)
3328	def __init__(self, l = [], **kwargs):
3329		'''
3330		**Parameters:** same as `D4xdata.__init__()`
3331		'''
3332		D4xdata.__init__(self, l = l, mass = '48', **kwargs)

Parameters: same as D4xdata.__init__()

Nominal_D4x = {'ETH-1': 0.138, 'ETH-2': 0.138, 'ETH-3': 0.27, 'ETH-4': 0.223, 'GU-1': -0.419}

Nominal Δ48 values assigned to the Δ48 anchor samples, used by D48data.standardize() to normalize unknown samples to an absolute Δ48 reference frame.

By default equal to (after Fiebig et al. (2019), Fiebig et al. (2021)):

{
        'ETH-1' :  0.138,
        'ETH-2' :  0.138,
        'ETH-3' :  0.270,
        'ETH-4' :  0.223,
        'GU-1'  : -0.419,
}
Nominal_D48
3317	@property
3318	def Nominal_D48(self):
3319		return self.Nominal_D4x
def save_D48_correl(self, *args, **kwargs):
3334	def save_D48_correl(self, *args, **kwargs):
3335		return self._save_D4x_correl(*args, **kwargs)

Save D48 values along with their SE and correlation matrix.

Parameters

  • samples: Only these samples are output (by default: all samples).
  • dir: the directory in which to save the file (by default: output)
  • filename: the name of the csv file to write to (by default: D48_correl.csv)
  • D48_precision: the precision to use when writing D48 and D48_SE values (by default: 4)
  • correl_precision: the precision to use when writing correlation factor values (by default: 4)
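
Δ48 processing follows the same workflow as Δ47 (a sketch, assuming `D47crunch` has been imported and the raw data file provides usable δ48 signals for the relevant anchors):

```python
mydata48 = D47crunch.D48data()
mydata48.read('rawdata.csv')
mydata48.wg()
mydata48.crunch()
mydata48.standardize()
mydata48.save_D48_correl()
```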
class D49data(D4xdata):
3340class D49data(D4xdata):
3341	'''
3342	Store and process data for a large set of Δ49 analyses,
3343	usually comprising more than one analytical session.
3344	'''
3345	
3346	Nominal_D4x = {"1000C": 0.0, "25C": 2.228}  # Wang 2004
3347	'''
3348	Nominal Δ49 values assigned to the Δ49 anchor samples, used by
3349	`D49data.standardize()` to normalize unknown samples to an absolute Δ49
3350	reference frame.
3351
3352	By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)):
3353
3354	```py
3355	{
3356		"1000C": 0.0,
3357		"25C": 2.228
3358	}
3359	```
3360	'''
3361	
3362	@property
3363	def Nominal_D49(self):
3364		return self.Nominal_D4x
3365	
3366	@Nominal_D49.setter
3367	def Nominal_D49(self, new):
3368		self.Nominal_D4x = dict(**new)
3369		self.refresh()
3370	
3371	def __init__(self, l=[], **kwargs):
3372		'''
3373		**Parameters:** same as `D4xdata.__init__()`
3374		'''
3375		D4xdata.__init__(self, l=l, mass='49', **kwargs)
3376	
3377	def save_D49_correl(self, *args, **kwargs):
3378		return self._save_D4x_correl(*args, **kwargs)
3379	
3380	save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49')

Store and process data for a large set of Δ49 analyses, usually comprising more than one analytical session.

D49data(l=[], **kwargs)
3371	def __init__(self, l=[], **kwargs):
3372		'''
3373		**Parameters:** same as `D4xdata.__init__()`
3374		'''
3375		D4xdata.__init__(self, l=l, mass='49', **kwargs)

Parameters: same as D4xdata.__init__()

Nominal_D4x = {'1000C': 0.0, '25C': 2.228}

Nominal Δ49 values assigned to the Δ49 anchor samples, used by D49data.standardize() to normalize unknown samples to an absolute Δ49 reference frame.

By default equal to (after Wang et al. (2004)):

{
        "1000C": 0.0,
        "25C": 2.228
}
Nominal_D49
3362	@property
3363	def Nominal_D49(self):
3364		return self.Nominal_D4x
def save_D49_correl(self, *args, **kwargs):
3377	def save_D49_correl(self, *args, **kwargs):
3378		return self._save_D4x_correl(*args, **kwargs)

Save D49 values along with their SE and correlation matrix.

Parameters

  • samples: Only these samples are output (by default: all samples).
  • dir: the directory in which to save the file (by default: output)
  • filename: the name of the csv file to write to (by default: D49_correl.csv)
  • D49_precision: the precision to use when writing D49 and D49_SE values (by default: 4)
  • correl_precision: the precision to use when writing correlation factor values (by default: 4)
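
And similarly for Δ49 (a sketch; the input file name is hypothetical and must include analyses of the `1000C` and/or `25C` anchor samples):

```python
mydata49 = D47crunch.D49data()
mydata49.read('rawdata49.csv')   # hypothetical file name
mydata49.wg()
mydata49.crunch()
mydata49.standardize()
mydata49.save_D49_correl()
```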