D47crunch

Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements

Process and standardize carbonate and/or CO2 clumped-isotope analyses, from low-level data out of a dual-inlet mass spectrometer to final, “absolute” Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates (Daëron, 2021).

The tutorial section takes you through a series of simple steps to import/process data and print out the results. The how-to section provides instructions applicable to various specific tasks.

1. Tutorial

1.1 Installation

The easy option is to use pip; open a shell terminal and simply type:

python -m pip install D47crunch
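
You can check that the installation worked by printing out the version number of the installed package:

python -c "import D47crunch; print(D47crunch.__version__)"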

To experiment with the bleeding-edge development version instead, go through the following steps:

  1. Download the dev branch source code here and rename it to D47crunch.py.
  2. Do any of the following:
    • copy D47crunch.py to somewhere in your Python path
    • copy D47crunch.py to a working directory (import D47crunch will only work if called within that directory)
    • copy D47crunch.py to any other location (e.g., /foo/bar) and then use the following code snippet in your own code to import D47crunch:
import sys
sys.path.append('/foo/bar')
import D47crunch

Documentation for the development version can be downloaded here (save the HTML file and open it locally).

1.2 Usage

Start by creating a file named rawdata.csv with the following contents:

UID,  Sample,           d45,       d46,        d47,        d48,       d49
A01,  ETH-1,        5.79502,  11.62767,   16.89351,   24.56708,   0.79486
A02,  MYSAMPLE-1,   6.21907,  11.49107,   17.27749,   24.58270,   1.56318
A03,  ETH-2,       -6.05868,  -4.81718,  -11.63506,  -10.32578,   0.61352
A04,  MYSAMPLE-2,  -3.86184,   4.94184,    0.60612,   10.52732,   0.57118
A05,  ETH-3,        5.54365,  12.05228,   17.40555,   25.96919,   0.74608
A06,  ETH-2,       -6.06706,  -4.87710,  -11.69927,  -10.64421,   1.61234
A07,  ETH-1,        5.78821,  11.55910,   16.80191,   24.56423,   1.47963
A08,  MYSAMPLE-2,  -3.87692,   4.86889,    0.52185,   10.40390,   1.07032

Then instantiate a D47data object which will store and process this data:

import D47crunch
mydata = D47crunch.D47data()

For now, this object is empty:

>>> print(mydata)
[]

To load the analyses saved in rawdata.csv into our D47data object and process the data:

mydata.read('rawdata.csv')

# compute δ13C, δ18O of working gas:
mydata.wg()

# compute δ13C, δ18O, raw Δ47 values for each analysis:
mydata.crunch()

# compute absolute Δ47 values for each analysis
# as well as average Δ47 values for each sample:
mydata.standardize()

We can now print a summary of the data processing:

>>> mydata.summary(verbose = True, save_to_file = False)
[summary]        
–––––––––––––––––––––––––––––––  –––––––––
N samples (anchors + unknowns)   5 (3 + 2)
N analyses (anchors + unknowns)  8 (5 + 3)
Repeatability of δ13C_VPDB         4.2 ppm
Repeatability of δ18O_VSMOW       47.5 ppm
Repeatability of Δ47 (anchors)    13.4 ppm
Repeatability of Δ47 (unknowns)    2.5 ppm
Repeatability of Δ47 (all)         9.6 ppm
Model degrees of freedom                 3
Student's 95% t-factor                3.18
Standardization method              pooled
–––––––––––––––––––––––––––––––  –––––––––

This tells us that our data set contains 5 different samples: 3 anchors (ETH-1, ETH-2, ETH-3) and 2 unknowns (MYSAMPLE-1, MYSAMPLE-2). The total number of analyses is 8, with 5 anchor analyses and 3 unknown analyses. We get an estimate of the analytical repeatability (i.e. the overall, pooled standard deviation) for δ13C, δ18O and Δ47, as well as the number of degrees of freedom (here, 3) that these estimated standard deviations are based on, along with the corresponding Student's t-factor (here, 3.18) for 95 % confidence limits. Finally, the summary indicates that we used a “pooled” standardization approach (see [Daëron, 2021]).
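
As a sanity check, the t-factor quoted above is simply the two-tailed Student's t quantile for the listed number of degrees of freedom; a minimal sketch using scipy (which D47crunch itself imports):

from scipy.stats import t as tstudent

print(tstudent.ppf(1 - 0.05/2, 3))     # ~3.18 for 3 degrees of freedom
print(tstudent.ppf(1 - 0.05/2, 10000)) # ~1.96 in the large-N limit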

To see the actual results:

>>> mydata.table_of_samples(verbose = True, save_to_file = False)
[table_of_samples] 
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
Sample      N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
ETH-1       2       2.01       37.01  0.2052                    0.0131          
ETH-2       2     -10.17       19.88  0.2085                    0.0026          
ETH-3       1       1.73       37.49  0.6132                                    
MYSAMPLE-1  1       2.48       36.90  0.2996  0.0091  ± 0.0291                  
MYSAMPLE-2  2      -8.17       30.05  0.6600  0.0115  ± 0.0366  0.0025          
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––

This table lists, for each sample, the number of analytical replicates, the average δ13C and δ18O values (for the analyte CO2, not for the carbonate itself), the average Δ47 value, and the Δ47 SD over all replicates of this sample. For unknown samples, the SE and 95 % confidence limits of the mean Δ47 value are also listed. These 95 % CL take into account the number of degrees of freedom of the regression model, so that in large data sets the 95 % CL will tend to 1.96 times the SE, whereas in this small data set the applicable t-factor (3.18) is much larger.

We can also generate a table of all analyses in the data set (again, note that d18O_VSMOW is the composition of the CO2 analyte):

>>> mydata.table_of_analyses(verbose = True, save_to_file = False)
[table_of_analyses] 
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
UID    Session      Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48       d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw      D49raw       D47
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
A01  mySession       ETH-1       -3.807        24.921   5.795020  11.627670   16.893510   24.567080  0.794860    2.014086   37.041843  -0.574686   1.149684  -27.690250  0.214454
A02  mySession  MYSAMPLE-1       -3.807        24.921   6.219070  11.491070   17.277490   24.582700  1.563180    2.476827   36.898281  -0.499264   1.435380  -27.122614  0.299589
A03  mySession       ETH-2       -3.807        24.921  -6.058680  -4.817180  -11.635060  -10.325780  0.613520  -10.166796   19.907706  -0.685979  -0.721617   16.716901  0.206693
A04  mySession  MYSAMPLE-2       -3.807        24.921  -3.861840   4.941840    0.606120   10.527320  0.571180   -8.159927   30.087230  -0.248531   0.613099   -4.979413  0.658270
A05  mySession       ETH-3       -3.807        24.921   5.543650  12.052280   17.405550   25.969190  0.746080    1.727029   37.485567  -0.226150   1.678699  -28.280301  0.613200
A06  mySession       ETH-2       -3.807        24.921  -6.067060  -4.877100  -11.699270  -10.644210  1.612340  -10.173599   19.845192  -0.683054  -0.922832   17.861363  0.210328
A07  mySession       ETH-1       -3.807        24.921   5.788210  11.559100   16.801910   24.564230  1.479630    2.009281   36.970298  -0.591129   1.282632  -26.888335  0.195926
A08  mySession  MYSAMPLE-2       -3.807        24.921  -3.876920   4.868890    0.521850   10.403900  1.070320   -8.173486   30.011134  -0.245768   0.636159   -4.324964  0.661803
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
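
Both tables can also be written to disk; a minimal sketch, assuming the instance methods accept the same dir and filename arguments as the module-level table functions documented in the API section below:

mydata.table_of_samples(dir = 'output', filename = 'samples.csv', save_to_file = True)
mydata.table_of_analyses(dir = 'output', filename = 'analyses.csv', save_to_file = True)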

2. How-to

2.1 Simulate a virtual data set to play with

It is sometimes convenient to quickly build a virtual data set of analyses, for instance to assess the final analytical precision achievable for a given combination of anchor and unknown analyses (see also Fig. 6 of Daëron, 2021).

This can be achieved with virtual_data(). The example below creates a dataset with four sessions, each of which comprises three analyses of anchor ETH-1, three of ETH-2, three of ETH-3, and three analyses each of two unknown samples named FOO and BAR with an arbitrarily defined isotopic composition. Analytical repeatabilities for Δ47 and Δ48 are also specified arbitrarily. See the virtual_data() documentation for additional configuration parameters.

from D47crunch import virtual_data, D47data

args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 3,
            d13C_VPDB = -15., d18O_VPDB = -2.,
            D47 = 0.6, D48 = 0.2),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)

D = D47data(session1 + session2 + session3 + session4)

D.crunch()
D.standardize()

D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)

2.2 Control data quality

D47crunch offers several tools to visualize processed data. The examples below use the same virtual data set, generated with:

from D47crunch import *
from random import shuffle

# generate virtual data:
args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 8),
        dict(Sample = 'ETH-2', N = 8),
        dict(Sample = 'ETH-3', N = 8),
        dict(Sample = 'FOO', N = 4,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 4,
            d13C_VPDB = -15., d18O_VPDB = -15.,
            D47 = 0.5, D48 = 0.2),
        ])

sessions = [
    virtual_data(session = f'Session_{k+1:02.0f}', seed = 123456+k, **args)
    for k in range(10)]

# shuffle the data:
data = [r for s in sessions for r in s]
shuffle(data)
data = sorted(data, key = lambda r: r['Session'])

# create D47data instance:
data47 = D47data(data)

# process D47data instance:
data47.crunch()
data47.standardize()

2.2.1 Plotting the distribution of analyses through time

data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf')

time_distribution.png

The plot above shows the succession of analyses as if they were all distributed at regular time intervals. See D4xdata.plot_distribution_of_analyses() for how to plot analyses as a function of “true” time (based on the TimeTag for each analysis).
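
For instance, here is a minimal sketch, assuming the vs_time keyword documented in D4xdata.plot_distribution_of_analyses(), with arbitrary TimeTag values (one analysis every 10 time units):

for k, r in enumerate(data47):
    r['TimeTag'] = 10. * k

data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf', vs_time = True)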

2.2.2 Generating session plots

data47.plot_sessions()

Below is one of the resulting session plots. Each cross marker is an analysis. Anchors are in red and unknowns in blue. Short horizontal lines show the nominal Δ47 value for anchors, in red, or the average Δ47 value for unknowns, in blue (overall average for all sessions). Curved grey contours correspond to Δ47 standardization errors in this session.

D47_plot_Session_03.png
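
Assuming plot_sessions() takes the same dir argument as the table methods (an assumption worth checking against the API documentation below), the session plots can be saved to a custom directory:

data47.plot_sessions(dir = 'session_plots')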

2.2.3 Plotting Δ47 or Δ48 residuals

data47.plot_residuals(filename = 'residuals.pdf', kde = True)

residuals.png

Again, note that this plot only shows the succession of analyses as if they were all distributed at regular time intervals.

2.2.4 Checking δ13C and δ18O dispersion

mydata = D47data(virtual_data(
    session = 'mysession',
    samples = [
        dict(Sample = 'ETH-1', N = 4),
        dict(Sample = 'ETH-2', N = 4),
        dict(Sample = 'ETH-3', N = 4),
        dict(Sample = 'MYSAMPLE', N = 8, D47 = 0.6, D48 = 0.1, d13C_VPDB = -4.0, d18O_VPDB = -12.0),
    ], seed = 123))

mydata.refresh()
mydata.wg()
mydata.crunch()
mydata.plot_bulk_compositions()

D4xdata.plot_bulk_compositions() produces a series of plots, one for each sample, and an additional plot with all samples together. For example, here is the plot for sample MYSAMPLE:

bulk_compositions.png

2.3 Use a different set of anchors, change anchor nominal values, and/or change oxygen-17 correction parameters

Nominal values for various carbonate standards are defined in four places:

    • D4xdata.Nominal_d13C_VPDB
    • D4xdata.Nominal_d18O_VPDB
    • D47data.Nominal_D47
    • D48data.Nominal_D48

17O correction parameters are defined by:

    • D4xdata.R13_VPDB
    • D4xdata.R18_VSMOW
    • D4xdata.R17_VSMOW
    • D4xdata.R18_VPDB
    • D4xdata.LAMBDA_17
    • D4xdata.R17_VPDB

When creating a new instance of D47data or D48data, the current values of these variables are copied as properties of the new object. Applying custom values for, e.g., R17_VSMOW and Nominal_D47 can thus be done in several ways:

Option 1: by redefining D4xdata.R17_VSMOW and D47data.Nominal_D47 _before_ creating a D47data object:

from D47crunch import D4xdata, D47data

# redefine R17_VSMOW:
D4xdata.R17_VSMOW = 0.00037 # new value

# redefine R17_VPDB for consistency:
D4xdata.R17_VPDB = D4xdata.R17_VSMOW * (D4xdata.R18_VPDB/D4xdata.R18_VSMOW) ** D4xdata.LAMBDA_17

# edit Nominal_D47 to only include ETH-1/2/3:
D47data.Nominal_D4x = {
    a: D47data.Nominal_D4x[a]
    for a in ['ETH-1', 'ETH-2', 'ETH-3']
    }
# redefine ETH-3:
D47data.Nominal_D4x['ETH-3'] = 0.600

# only now create D47data object:
mydata = D47data()

# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)
# NB: mydata.Nominal_D47 is just an alias for mydata.Nominal_D4x

# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}

Option 2: by redefining R17_VSMOW and Nominal_D47 _after_ creating a D47data object:

from D47crunch import D47data

# first create D47data object:
mydata = D47data()

# redefine R17_VSMOW:
mydata.R17_VSMOW = 0.00037 # new value

# redefine R17_VPDB for consistency:
mydata.R17_VPDB = mydata.R17_VSMOW * (mydata.R18_VPDB/mydata.R18_VSMOW) ** mydata.LAMBDA_17

# edit Nominal_D47 to only include ETH-1/2/3:
mydata.Nominal_D47 = {
    a: mydata.Nominal_D47[a]
    for a in ['ETH-1', 'ETH-2', 'ETH-3']
    }
# redefine ETH-3:
mydata.Nominal_D47['ETH-3'] = 0.600

# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)

# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}

The two options above are equivalent, but the latter provides a simple way to compare different data processing choices:

from D47crunch import D47data

# create two D47data objects:
foo = D47data()
bar = D47data()

# modify foo in various ways:
foo.LAMBDA_17 = 0.52
foo.R17_VSMOW = 0.00037 # new value
foo.R17_VPDB = foo.R17_VSMOW * (foo.R18_VPDB/foo.R18_VSMOW) ** foo.LAMBDA_17
foo.Nominal_D47 = {
    'ETH-1': foo.Nominal_D47['ETH-1'],
    'ETH-2': foo.Nominal_D47['ETH-2'],
    'IAEA-C2': foo.Nominal_D47['IAEA-C2'],
    'INLAB_REF_MATERIAL': 0.666,
    }

# now import the same raw data into foo and bar:
foo.read('rawdata.csv')
foo.wg()          # compute δ13C, δ18O of working gas
foo.crunch()      # compute all δ13C, δ18O and raw Δ47 values
foo.standardize() # compute absolute Δ47 values

bar.read('rawdata.csv')
bar.wg()          # compute δ13C, δ18O of working gas
bar.crunch()      # compute all δ13C, δ18O and raw Δ47 values
bar.standardize() # compute absolute Δ47 values

# and compare the final results:
foo.table_of_samples(verbose = True, save_to_file = False)
bar.table_of_samples(verbose = True, save_to_file = False)

2.4 Process paired Δ47 and Δ48 values

Purely in terms of data processing, it is not obvious why Δ47 and Δ48 data should not be handled separately. For now, D47crunch uses two independent classes — D47data and D48data — which crunch numbers and deal with standardization in very similar ways. The following example demonstrates how to print out combined outputs for D47data and D48data.

from D47crunch import *

# generate virtual data:
args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args)
session2 = virtual_data(session = 'Session_02', **args)

# create D47data instance:
data47 = D47data(session1 + session2)

# process D47data instance:
data47.crunch()
data47.standardize()

# create D48data instance:
data48 = D48data(data47) # alternatively: data48 = D48data(session1 + session2)

# process D48data instance:
data48.crunch()
data48.standardize()

# output combined results:
table_of_sessions(data47, data48)
table_of_samples(data47, data48)
table_of_analyses(data47, data48)

Expected output:

––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––
Session     Na  Nu  d13Cwg_VPDB  d18Owg_VSMOW  r_d13C  r_d18O   r_D47      a_47 ± SE  1e3 x b_47 ± SE       c_47 ± SE   r_D48      a_48 ± SE  1e3 x b_48 ± SE       c_48 ± SE
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––
Session_01   9   3       -4.000        26.000  0.0000  0.0000  0.0098  1.021 ± 0.019   -0.398 ± 0.260  -0.903 ± 0.006  0.0486  0.540 ± 0.151    1.235 ± 0.607  -0.390 ± 0.025
Session_02   9   3       -4.000        26.000  0.0000  0.0000  0.0090  1.015 ± 0.019    0.376 ± 0.260  -0.905 ± 0.006  0.0186  1.350 ± 0.156   -0.871 ± 0.608  -0.504 ± 0.027
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––


––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
Sample  N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene     D48      SE    95% CL      SD  p_Levene
––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
ETH-1   6       2.02       37.02  0.2052                    0.0078            0.1380                    0.0223          
ETH-2   6     -10.17       19.88  0.2085                    0.0036            0.1380                    0.0482          
ETH-3   6       1.71       37.45  0.6132                    0.0080            0.2700                    0.0176          
FOO     6      -5.00       28.91  0.3026  0.0044  ± 0.0093  0.0121     0.164  0.1397  0.0121  ± 0.0255  0.0267     0.127
––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––


–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––
UID     Session  Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48         d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw     D49raw       D47       D48
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––
1    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.120787   21.286237   27.780042    2.020000   37.024281  -0.708176  -0.316435  -0.000013  0.197297  0.087763
2    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.132240   21.307795   27.780042    2.020000   37.024281  -0.696913  -0.295333  -0.000013  0.208328  0.126791
3    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.132438   21.313884   27.780042    2.020000   37.024281  -0.696718  -0.289374  -0.000013  0.208519  0.137813
4    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.700300  -12.210735  -18.023381  -10.170000   19.875825  -0.683938  -0.297902  -0.000002  0.209785  0.198705
5    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.707421  -12.270781  -18.023381  -10.170000   19.875825  -0.691145  -0.358673  -0.000002  0.202726  0.086308
6    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.700061  -12.278310  -18.023381  -10.170000   19.875825  -0.683696  -0.366292  -0.000002  0.210022  0.072215
7    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.684379   22.225827   28.306614    1.710000   37.450394  -0.273094  -0.216392  -0.000014  0.623472  0.270873
8    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.660163   22.233729   28.306614    1.710000   37.450394  -0.296906  -0.208664  -0.000014  0.600150  0.285167
9    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.675191   22.215632   28.306614    1.710000   37.450394  -0.282128  -0.226363  -0.000014  0.614623  0.252432
10   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.328380    5.374933    4.665655   -5.000000   28.907344  -0.582131  -0.288924  -0.000006  0.314928  0.175105
11   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.302220    5.384454    4.665655   -5.000000   28.907344  -0.608241  -0.279457  -0.000006  0.289356  0.192614
12   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.322530    5.372841    4.665655   -5.000000   28.907344  -0.587970  -0.291004  -0.000006  0.309209  0.171257
13   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.140853   21.267202   27.780042    2.020000   37.024281  -0.688442  -0.335067  -0.000013  0.207730  0.138730
14   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.127087   21.256983   27.780042    2.020000   37.024281  -0.701980  -0.345071  -0.000013  0.194396  0.131311
15   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.148253   21.287779   27.780042    2.020000   37.024281  -0.681165  -0.314926  -0.000013  0.214898  0.153668
16   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.715859  -12.204791  -18.023381  -10.170000   19.875825  -0.699685  -0.291887  -0.000002  0.207349  0.149128
17   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.709763  -12.188685  -18.023381  -10.170000   19.875825  -0.693516  -0.275587  -0.000002  0.213426  0.161217
18   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.715427  -12.253049  -18.023381  -10.170000   19.875825  -0.699249  -0.340727  -0.000002  0.207780  0.112907
19   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.685994   22.249463   28.306614    1.710000   37.450394  -0.271506  -0.193275  -0.000014  0.618328  0.244431
20   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.681351   22.298166   28.306614    1.710000   37.450394  -0.276071  -0.145641  -0.000014  0.613831  0.279758
21   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.676169   22.306848   28.306614    1.710000   37.450394  -0.281167  -0.137150  -0.000014  0.608813  0.286056
22   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.324359    5.339497    4.665655   -5.000000   28.907344  -0.586144  -0.324160  -0.000006  0.314015  0.136535
23   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.297658    5.325854    4.665655   -5.000000   28.907344  -0.612794  -0.337727  -0.000006  0.287767  0.126473
24   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.310185    5.339898    4.665655   -5.000000   28.907344  -0.600291  -0.323761  -0.000006  0.300082  0.136830
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––

3. Command-Line Interface (CLI)

Instead of writing Python code, you may directly use the CLI to process raw Δ47 and Δ48 data using reasonable defaults. The simplest way is to call:

D47crunch rawdata.csv

This will create a directory named output and populate it with the processing results: the working-gas, crunching and standardization steps are run first (wg(), crunch(), standardize()), then the summary, sample, session and analysis tables are saved to that directory, along with the corresponding plots.

You may specify a custom set of anchors instead of the default ones using the --anchors or -a option:

D47crunch -a anchors.csv rawdata.csv

In this case, the anchors.csv file (you may use any other file name) must have the following format:

Sample, d13C_VPDB, d18O_VPDB,    D47
 ETH-1,      2.02,     -2.19, 0.2052
 ETH-2,    -10.17,    -18.69, 0.2085
 ETH-3,      1.71,     -1.78, 0.6132
 ETH-4,          ,          , 0.4511

The samples with non-empty d13C_VPDB, d18O_VPDB, and D47 values are used to standardize δ13C, δ18O, and Δ47 values respectively.

You may also provide a list of analyses and/or samples to exclude from the input. This is done with the --exclude or -e option:

D47crunch -e badbatch.csv rawdata.csv

In this case, the badbatch.csv file (again, you may use a different file name) must have the following format:

UID, Sample
A03
A09
B06
   , MYBADSAMPLE-1
   , MYBADSAMPLE-2

This will exclude (ignore) analyses with the UIDs A03, A09, and B06, and those of samples MYBADSAMPLE-1 and MYBADSAMPLE-2. The exclude file may have only the UID column, only the Sample column, or both, in any order.
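
For instance, an exclude file listing only analyses (and no samples) would simply be:

UID
A03
A09
B06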

The --output-dir or -o option may be used to specify a custom directory name for the output. For example, in unix-like shells the following command will create a time-stamped output directory:

D47crunch -o `date "+%Y-%m-%d-%Hh%M"` rawdata.csv

To process Δ48 as well as Δ47, just add the --D48 option.
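
These options can be combined; for example, the following call standardizes both Δ47 and Δ48 using custom anchors, excludes a bad batch of analyses, and writes the results to a custom directory:

D47crunch --D48 -a anchors.csv -e badbatch.csv -o myresults rawdata.csv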

API Documentation

'''
Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements

Process and standardize carbonate and/or CO2 clumped-isotope analyses,
from low-level data out of a dual-inlet mass spectrometer to final, “absolute”
Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates
([Daëron, 2021](https://doi.org/10.1029/2020GC009592)).

The **tutorial** section takes you through a series of simple steps to import/process data and print out the results.
The **how-to** section provides instructions applicable to various specific tasks.

.. include:: ../../docpages/tutorial.md
.. include:: ../../docpages/howto.md
.. include:: ../../docpages/cli.md

<h1>API Documentation</h1>
'''

__docformat__ = "restructuredtext"
__author__    = 'Mathieu Daëron'
__contact__   = 'daeron@lsce.ipsl.fr'
__copyright__ = 'Copyright (c) Mathieu Daëron'
__license__   = 'MIT License - https://opensource.org/licenses/MIT'
__date__      = '2025-12-15'
__version__   = '2.5.3'

import os
import numpy as np
import typer
from typing_extensions import Annotated
from statistics import stdev
from scipy.stats import t as tstudent
from scipy.stats import levene
from scipy.interpolate import interp1d
from numpy import linalg
from lmfit import Minimizer, Parameters, report_fit
from matplotlib import pyplot as ppl
from datetime import datetime as dt
from functools import wraps
from colorsys import hls_to_rgb
from matplotlib import rcParams
from typer import rich_utils

rich_utils.STYLE_HELPTEXT = ''

rcParams['font.family'] = 'sans-serif'
rcParams['font.sans-serif'] = 'Helvetica'
rcParams['font.size'] = 10
rcParams['mathtext.fontset'] = 'custom'
rcParams['mathtext.rm'] = 'sans'
rcParams['mathtext.bf'] = 'sans:bold'
rcParams['mathtext.it'] = 'sans:italic'
rcParams['mathtext.cal'] = 'sans:italic'
rcParams['mathtext.default'] = 'rm'
rcParams['xtick.major.size'] = 4
rcParams['xtick.major.width'] = 1
rcParams['ytick.major.size'] = 4
rcParams['ytick.major.width'] = 1
rcParams['axes.grid'] = False
rcParams['axes.linewidth'] = 1
rcParams['grid.linewidth'] = .75
rcParams['grid.linestyle'] = '-'
rcParams['grid.alpha'] = .15
rcParams['savefig.dpi'] = 150

Petersen_etal_CO2eqD47 = np.array([[-12, 1.147113572], [-11, 1.139961218], [-10, 1.132872856], [-9, 1.125847677], [-8, 1.118884889], [-7, 1.111983708], [-6, 1.105143366], [-5, 1.098363105], [-4, 1.091642182], [-3, 1.084979862], [-2, 1.078375423], [-1, 1.071828156], [0, 1.065337360], [1, 1.058902349], [2, 1.052522443], [3, 1.046196976], [4, 1.039925291], [5, 1.033706741], [6, 1.027540690], [7, 1.021426510], [8, 1.015363585], [9, 1.009351306], [10, 1.003389075], [11, 0.997476303], [12, 0.991612409], [13, 0.985796821], [14, 0.980028975], [15, 0.974308318], [16, 0.968634304], [17, 0.963006392], [18, 0.957424055], [19, 0.951886769], [20, 0.946394020], [21, 0.940945302], [22, 0.935540114], [23, 0.930177964], [24, 0.924858369], [25, 0.919580851], [26, 0.914344938], [27, 0.909150167], [28, 0.903996080], [29, 0.898882228], [30, 0.893808167], [31, 0.888773459], [32, 0.883777672], [33, 0.878820382], [34, 0.873901170], [35, 0.869019623], [36, 0.864175334], [37, 0.859367901], [38, 0.854596929], [39, 0.849862028], [40, 0.845162813], [41, 0.840498905], [42, 0.835869931], [43, 0.831275522], [44, 0.826715314], [45, 0.822188950], [46, 0.817696075], [47, 0.813236341], [48, 0.808809404], [49, 0.804414926], [50, 0.800052572], [51, 0.795722012], [52, 0.791422922], [53, 0.787154979], [54, 0.782917869], [55, 0.778711277], [56, 0.774534898], [57, 0.770388426], [58, 0.766271562], [59, 0.762184010], [60, 0.758125479], [61, 0.754095680], [62, 0.750094329], [63, 0.746121147], [64, 0.742175856], [65, 0.738258184], [66, 0.734367860], [67, 0.730504620], [68, 0.726668201], [69, 0.722858343], [70, 0.719074792], [71, 0.715317295], [72, 0.711585602], [73, 0.707879469], [74, 0.704198652], [75, 0.700542912], [76, 0.696912012], [77, 0.693305719], [78, 0.689723802], [79, 0.686166034], [80, 0.682632189], [81, 0.679122047], [82, 0.675635387], [83, 0.672171994], [84, 0.668731654], [85, 0.665314156], [86, 0.661919291], [87, 0.658546854], [88, 0.655196641], [89, 0.651868451], [90, 0.648562087], [91, 0.645277352], [92, 0.642014054], [93, 0.638771999], [94, 0.635551001], [95, 0.632350872], [96, 0.629171428], [97, 0.626012487], [98, 0.622873870], [99, 0.619755397], [100, 0.616656895], [102, 0.610519107], [104, 0.604459143], [106, 0.598475670], [108, 0.592567388], [110, 0.586733026], [112, 0.580971342], [114, 0.575281125], [116, 0.569661187], [118, 0.564110371], [120, 0.558627545], [122, 0.553211600], [124, 0.547861454], [126, 0.542576048], [128, 0.537354347], [130, 0.532195337], [132, 0.527098028], [134, 0.522061450], [136, 0.517084654], [138, 0.512166711], [140, 0.507306712], [142, 0.502503768], [144, 0.497757006], [146, 0.493065573], [148, 0.488428634], [150, 0.483845370], [152, 0.479314980], [154, 0.474836677], [156, 0.470409692], [158, 0.466033271], [160, 0.461706674], [162, 0.457429176], [164, 0.453200067], [166, 0.449018650], [168, 0.444884242], [170, 0.440796174], [172, 0.436753787], [174, 0.432756438], [176, 0.428803494], [178, 0.424894334], [180, 0.421028350], [182, 0.417204944], [184, 0.413423530], [186, 0.409683531], [188, 0.405984383], [190, 0.402325531], [192, 0.398706429], [194, 0.395126543], [196, 0.391585347], [198, 0.388082324], [200, 0.384616967], [202, 0.381188778], [204, 0.377797268], [206, 0.374441954], [208, 0.371122364], [210, 0.367838033], [212, 0.364588505], [214, 0.361373329], [216, 0.358192065], [218, 0.355044277], [220, 0.351929540], [222, 0.348847432], [224, 0.345797540], [226, 0.342779460], [228, 0.339792789], [230, 0.336837136], [232, 0.333912113], [234, 0.331017339], [236, 0.328152439], [238, 0.325317046], [240, 0.322510795], [242, 0.319733329], [244, 0.316984297], [246, 0.314263352], [248, 0.311570153], [250, 0.308904364], [252, 0.306265654], [254, 0.303653699], [256, 0.301068176], [258, 0.298508771], [260, 0.295975171], [262, 0.293467070], [264, 0.290984167], [266, 0.288526163], [268, 0.286092765], [270, 0.283683684], [272, 0.281298636], [274, 0.278937339], [276, 0.276599517], [278, 0.274284898], [280, 0.271993211], [282, 0.269724193], [284, 0.267477582], [286, 0.265253121], [288, 0.263050554], [290, 0.260869633], [292, 0.258710110], [294, 0.256571741], [296, 0.254454286], [298, 0.252357508], [300, 0.250281174], [302, 0.248225053], [304, 0.246188917], [306, 0.244172542], [308, 0.242175707], [310, 0.240198194], [312, 0.238239786], [314, 0.236300272], [316, 0.234379441], [318, 0.232477087], [320, 0.230593005], [322, 0.228726993], [324, 0.226878853], [326, 0.225048388], [328, 0.223235405], [330, 0.221439711], [332, 0.219661118], [334, 0.217899439], [336, 0.216154491], [338, 0.214426091], [340, 0.212714060], [342, 0.211018220], [344, 0.209338398], [346, 0.207674420], [348, 0.206026115], [350, 0.204393315], [355, 0.200378063], [360, 0.196456139], [365, 0.192625077], [370, 0.188882487], [375, 0.185226048], [380, 0.181653511], [385, 0.178162694], [390, 0.174751478], [395, 0.171417807], [400, 0.168159686], [405, 0.164975177], [410, 0.161862398], [415, 0.158819521], [420, 0.155844772], [425, 0.152936426], [430, 0.150092806], [435, 0.147312286], [440, 0.144593281], [445, 0.141934254], [450, 0.139333710], [455, 0.136790195], [460, 0.134302294], [465, 0.131868634], [470, 0.129487876], [475, 0.127158722], [480, 0.124879906], [485, 0.122650197], [490, 0.120468398], [495, 0.118333345], [500, 0.116243903], [505, 0.114198970], [510, 0.112197471], [515, 0.110238362], [520, 0.108320625], [525, 0.106443271], [530, 0.104605335], [535, 0.102805877], [540, 0.101043985], [545, 0.099318768], [550, 0.097629359], [555, 0.095974915], [560, 0.094354612], [565, 0.092767650], [570, 0.091213248], [575, 0.089690648], [580, 0.088199108], [585, 0.086737906], [590, 0.085306341], [595, 0.083903726], [600, 0.082529395], [605, 0.081182697], [610, 0.079862998], [615, 0.078569680], [620, 0.077302141], [625, 0.076059794], [630, 0.074842066], [635, 0.073648400], [640, 0.072478251], [645, 0.071331090], [650, 0.070206399], [655, 0.069103674], [660, 0.068022424], [665, 0.066962168], [670, 0.065922439], [675, 0.064902780], [680, 0.063902748], [685, 0.062921909], [690, 0.061959837], [695, 0.061016122], [700, 0.060090360], [705, 0.059182157], [710, 0.058291131], [715, 0.057416907], [720, 0.056559120], [725, 0.055717414], [730, 0.054891440], [735, 0.054080860], [740, 0.053285343], [745, 0.052504565], [750, 0.051738210], [755, 0.050985971], [760, 0.050247546], [765, 0.049522643], [770, 0.048810974], [775, 0.048112260], [780, 0.047426227], [785, 0.046752609], [790, 0.046091145], [795, 0.045441581], [800, 0.044803668], [805, 0.044177164], [810, 0.043561831], [815, 0.042957438], [820, 0.042363759], [825, 0.041780573], [830, 0.041207664], [835, 0.040644822], [840, 0.040091839], [845, 0.039548516], [850, 0.039014654], [855, 0.038490063], [860, 0.037974554], [865, 0.037467944], [870, 0.036970054], [875, 0.036480707], [880, 0.035999734], [885, 0.035526965], [890, 0.035062238], [895, 0.034605393], [900, 0.034156272], [905, 0.033714724], [910, 0.033280598], [915, 0.032853749], [920, 0.032434032], [925, 0.032021309], [930, 0.031615443], [935, 0.031216300], [940, 0.030823749], [945, 0.030437663], [950, 0.030057915], [955, 0.029684385], [960, 0.029316951], [965, 0.028955498], [970, 0.028599910], [975, 0.028250075], [980, 0.027905884], [985, 0.027567229], [990, 0.027234006], [995, 0.026906112], [1000, 0.026583445], [1005, 0.026265908], [1010, 0.025953405], [1015, 0.025645841], [1020, 0.025343124], [1025, 0.025045163], [1030, 0.024751871], [1035, 0.024463160], [1040, 0.024178947], [1045, 0.023899147], [1050, 0.023623680], [1055, 0.023352467], [1060, 0.023085429], [1065, 0.022822491], [1070, 0.022563577], [1075, 0.022308615], [1080, 0.022057533], [1085, 0.021810260], [1090, 0.021566729], [1095, 0.021326872], [1100, 0.021090622]])
_fCO2eqD47_Petersen = interp1d(Petersen_etal_CO2eqD47[:,0], Petersen_etal_CO2eqD47[:,1])
def fCO2eqD47_Petersen(T):
	'''
	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
	according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
	'''
	return float(_fCO2eqD47_Petersen(T))


Wang_etal_CO2eqD47 = np.array([[-83., 1.8954], [-73., 1.7530], [-63., 1.6261], [-53., 1.5126], [-43., 1.4104], [-33., 1.3182], [-23., 1.2345], [-13., 1.1584], [-3., 1.0888], [7., 1.0251], [17., 0.9665], [27., 0.9125], [37., 0.8626], [47., 0.8164], [57., 0.7734], [67., 0.7334], [87., 0.6612], [97., 0.6286], [107., 0.5980], [117., 0.5693], [127., 0.5423], [137., 0.5169], [147., 0.4930], [157., 0.4704], [167., 0.4491], [177., 0.4289], [187., 0.4098], [197., 0.3918], [207., 0.3747], [217., 0.3585], [227., 0.3431], [237., 0.3285], [247., 0.3147], [257., 0.3015], [267., 0.2890], [277., 0.2771], [287., 0.2657], [297., 0.2550], [307., 0.2447], [317., 0.2349], [327., 0.2256], [337., 0.2167], [347., 0.2083], [357., 0.2002], [367., 0.1925], [377., 0.1851], [387., 0.1781], [397., 0.1714], [407., 0.1650], [417., 0.1589], [427., 0.1530], [437., 0.1474], [447., 0.1421], [457., 0.1370], [467., 0.1321], [477., 0.1274], [487., 0.1229], [497., 0.1186], [507., 0.1145], [517., 0.1105], [527., 0.1068], [537., 0.1031], [547., 0.0997], [557., 0.0963], [567., 0.0931], [577., 0.0901], [587., 0.0871], [597., 0.0843], [607., 0.0816], [617., 0.0790], [627., 0.0765], [637., 0.0741], [647., 0.0718], [657., 0.0695], [667., 0.0674], [677., 0.0654], [687., 0.0634], [697., 0.0615], [707., 0.0597], [717., 0.0579], [727., 0.0562], [737., 0.0546], [747., 0.0530], [757., 0.0515], [767., 0.0500], [777., 0.0486], [787., 0.0472], [797., 0.0459], [807., 0.0447], [817., 0.0435], [827., 0.0423], [837., 0.0411], [847., 0.0400], [857., 0.0390], [867., 0.0380], [877., 0.0370], [887., 0.0360], [897., 0.0351], [907., 0.0342], [917., 0.0333], [927., 0.0325], [937., 0.0317], [947., 0.0309], [957., 0.0302], [967., 0.0294], [977., 0.0287], [987., 0.0281], [997., 0.0274], [1007., 0.0268], [1017., 0.0261], [1027., 0.0255], [1037., 0.0249], [1047., 0.0244], [1057., 0.0238], [1067., 0.0233], [1077., 0.0228], [1087., 0.0223], [1097., 0.0218]])
_fCO2eqD47_Wang = interp1d(Wang_etal_CO2eqD47[:,0] - 0.15, Wang_etal_CO2eqD47[:,1])
def fCO2eqD47_Wang(T):
	'''
	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
	according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
	(supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
	'''
	return float(_fCO2eqD47_Wang(T))


def correlated_sum(X, C, w = None):
	'''
	Compute covariance-aware linear combinations

	**Parameters**

	+ `X`: list or 1-D array of values to sum
	+ `C`: covariance matrix for the elements of `X`
	+ `w`: list or 1-D array of weights to apply to the elements of `X`
	       (all equal to 1 by default)

	Return the sum (and its SE) of the elements of `X`, with optional weights equal
	to the elements of `w`, accounting for covariances between the elements of `X`.
	'''
	if w is None:
		w = [1 for x in X]
	return np.dot(w,X), (np.dot(w,np.dot(C,w)))**.5

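# Example of correlated_sum() usage (illustrative comment; values are assumed):
# summing X = [0.5, 0.3] with covariance matrix C = [[0.010, 0.004], [0.004, 0.010]]
# yields 0.8 with SE = sqrt(0.010 + 0.010 + 2 * 0.004) ≈ 0.167:
#
# >>> correlated_sum([0.5, 0.3], [[0.010, 0.004], [0.004, 0.010]])
# (0.8, 0.167...)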

def make_csv(x, hsep = ',', vsep = '\n'):
	'''
	Formats a list of lists of strings as a CSV

	**Parameters**

	+ `x`: the list of lists of strings to format
	+ `hsep`: the field separator (`,` by default)
	+ `vsep`: the line-ending convention to use (`\\n` by default)

	**Example**

	```py
	print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
	```

	outputs:

	```py
	a,b,c
	d,e,f
	```
	'''
	return vsep.join([hsep.join(l) for l in x])


def pf(txt):
	'''
	Modify string `txt` to follow `lmfit.Parameter()` naming rules.
	'''
	return txt.replace('-','_').replace('.','_').replace(' ','_')


def smart_type(x):
	'''
	Tries to convert string `x` to a float if it includes a decimal point, or
	to an integer if it does not. If both attempts fail, return the original
	string unchanged.
	'''
	try:
		y = float(x)
	except ValueError:
		return x
	if '.' not in x:
		return int(y)
	return y

class _Defaults():
	def __init__(self):
		pass

D47crunch_defaults = _Defaults()
D47crunch_defaults.PRETTY_TABLE_VSEP = '—'

def pretty_table(x, header = 1, hsep = '  ', vsep = None, align = '<'):
	'''
	Reads a list of lists of strings and outputs an ascii table

	**Parameters**

	+ `x`: a list of lists of strings
	+ `header`: the number of lines to treat as header lines
	+ `hsep`: the horizontal separator between columns
	+ `vsep`: the character to use as vertical separator
	+ `align`: string of left (`<`) or right (`>`) alignment characters.

	**Example**

	```py
	print(pretty_table([
		['A', 'B', 'C'],
		['1', '1.9999', 'foo'],
		['10', 'x', 'bar'],
	]))
	```
	yields:
	```
	——  ——————  ———
	A        B    C
	——  ——————  ———
	1   1.9999  foo
	10       x  bar
	——  ——————  ———
	```

	To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`:

	```py
	D47crunch_defaults.PRETTY_TABLE_VSEP = '='
	print(pretty_table([
		['A', 'B', 'C'],
		['1', '1.9999', 'foo'],
		['10', 'x', 'bar'],
	]))
	```
	yields:
	```
	==  ======  ===
	A        B    C
	==  ======  ===
	1   1.9999  foo
	10       x  bar
	==  ======  ===
	```
	'''

	if vsep is None:
		vsep = D47crunch_defaults.PRETTY_TABLE_VSEP

	txt = []
	widths = [np.max([len(e) for e in c]) for c in zip(*x)]

	if len(widths) > len(align):
		align += '>' * (len(widths)-len(align))
	sepline = hsep.join([vsep*w for w in widths])
	txt += [sepline]
	for k,l in enumerate(x):
		if k and k == header:
			txt += [sepline]
		txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
	txt += [sepline]
	txt += ['']
	return '\n'.join(txt)


def transpose_table(x):
	'''
	Transpose a list of lists

	**Parameters**

	+ `x`: a list of lists

	**Example**

	```py
	x = [[1, 2], [3, 4]]
	print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
	```
	'''
	return [[e for e in c] for c in zip(*x)]


def w_avg(X, sX) :
	'''
	Compute variance-weighted average

	Returns the value and SE of the weighted average of the elements of `X`,
	with relative weights equal to their inverse variances (`1/sX**2`).

	**Parameters**

	+ `X`: array-like of elements to average
	+ `sX`: array-like of the corresponding SE values

	**Tip**

	If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
	they may be rearranged using `zip()`:

	```python
	foo = [(0, 1), (1, 0.5), (2, 0.5)]
	print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
	```
	'''
	X = [ x for x in X ]
	sX = [ sx for sx in sX ]
	W = [ sx**-2 for sx in sX ]
	W = [ w/sum(W) for w in W ]
	Xavg = sum([ w*x for w,x in zip(W,X) ])
	sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5
	return Xavg, sXavg


def read_csv(filename, sep = ''):
	'''
	Read contents of `filename` in csv format and return a list of dictionaries.

	In the csv string, spaces before and after field separators (`','` by default)
	are optional.

	**Parameters**

	+ `filename`: the csv file to read
	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
	whichever appears most often in the contents of `filename`.
	'''
	with open(filename) as fid:
		txt = fid.read()

	if sep == '':
		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
	return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]]


def simulate_single_analysis(
	sample = 'MYSAMPLE',
	d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
	d13C_VPDB = None, d18O_VPDB = None,
	D47 = None, D48 = None, D49 = 0., D17O = 0.,
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	Nominal_D47 = None,
	Nominal_D48 = None,
	Nominal_d13C_VPDB = None,
	Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	):
	'''
	Compute working-gas delta values for a single analysis, assuming a stochastic working
	gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

	**Parameters**

	+ `sample`: sample name
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(respectively –4 and +26 ‰ by default)
	+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
		of the carbonate sample
	+ `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and
		Δ48 values if `D47` or `D48` are not specified
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `D4xdata` default values)

	Returns a dictionary with fields
	`['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
	'''

	if Nominal_d13C_VPDB is None:
		Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB

	if Nominal_d18O_VPDB is None:
		Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB

	if ALPHA_18O_ACID_REACTION is None:
		ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION

	if R13_VPDB is None:
		R13_VPDB = D4xdata().R13_VPDB

	if R17_VSMOW is None:
		R17_VSMOW = D4xdata().R17_VSMOW

	if R18_VSMOW is None:
		R18_VSMOW = D4xdata().R18_VSMOW

	if LAMBDA_17 is None:
		LAMBDA_17 = D4xdata().LAMBDA_17

	if R18_VPDB is None:
		R18_VPDB = D4xdata().R18_VPDB

	R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17

	if Nominal_D47 is None:
		Nominal_D47 = D47data().Nominal_D47

	if Nominal_D48 is None:
		Nominal_D48 = D48data().Nominal_D48

	if d13C_VPDB is None:
		if sample in Nominal_d13C_VPDB:
			d13C_VPDB = Nominal_d13C_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")

	if d18O_VPDB is None:
		if sample in Nominal_d18O_VPDB:
			d18O_VPDB = Nominal_d18O_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")

	if D47 is None:
		if sample in Nominal_D47:
			D47 = Nominal_D47[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")

	if D48 is None:
		if sample in Nominal_D48:
			D48 = Nominal_D48[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")

	X = D4xdata()
	X.R13_VPDB = R13_VPDB
	X.R17_VSMOW = R17_VSMOW
	X.R18_VSMOW = R18_VSMOW
	X.LAMBDA_17 = LAMBDA_17
	X.R18_VPDB = R18_VPDB
	X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17

	R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000),
		R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
		)
	R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O=D17O, D47=D47, D48=D48, D49=D49,
		)
	R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O=D17O,
		)

	d45 = 1000 * (R45/R45wg - 1)
	d46 = 1000 * (R46/R46wg - 1)
	d47 = 1000 * (R47/R47wg - 1)
	d48 = 1000 * (R48/R48wg - 1)
	d49 = 1000 * (R49/R49wg - 1)

	for k in range(3): # dumb iteration to adjust for small changes in d47
		R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
		R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch
		d47 = 1000 * (R47raw/R47wg - 1)
		d48 = 1000 * (R48raw/R48wg - 1)

	return dict(
		Sample = sample,
		D17O = D17O,
		d13Cwg_VPDB = d13Cwg_VPDB,
		d18Owg_VSMOW = d18Owg_VSMOW,
		d45 = d45,
		d46 = d46,
		d47 = d47,
		d48 = d48,
		d49 = d49,
		)


def virtual_data(
	samples = [],
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	rd45 = 0.020, rd46 = 0.060,
	rD47 = 0.015, rD48 = 0.045,
	d13Cwg_VPDB = None, d18Owg_VSMOW = None,
	session = None,
	Nominal_D47 = None, Nominal_D48 = None,
	Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	seed = 0,
	shuffle = True,
	):
	'''
	Return list with simulated analyses from a single session.

	**Parameters**

	+ `samples`: a list of entries; each entry is a dictionary with the following fields:
	    * `Sample`: the name of the sample
	    * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	    * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
	    * `N`: how many analyses to generate for this sample
	+ `a47`: scrambling factor for Δ47
	+ `b47`: compositional nonlinearity for Δ47
	+ `c47`: working gas offset for Δ47
	+ `a48`: scrambling factor for Δ48
	+ `b48`: compositional nonlinearity for Δ48
	+ `c48`: working gas offset for Δ48
	+ `rd45`: analytical repeatability of δ45
	+ `rd46`: analytical repeatability of δ46
	+ `rD47`: analytical repeatability of Δ47
	+ `rD48`: analytical repeatability of Δ48
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(by default equal to the `simulate_single_analysis` default values)
	+ `session`: name of the session (no name by default)
	+ `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and Δ48 values
		if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
		(by default equal to the `simulate_single_analysis` defaults)
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
		(by default equal to the `simulate_single_analysis` defaults)
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `simulate_single_analysis` default)
	+ `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
	+ `shuffle`: randomly reorder the sequence of analyses

	Here is an example of using this method to generate an arbitrary combination of
	anchors and unknowns for a bunch of sessions:

	```py
	.. include:: ../../code_examples/virtual_data/example.py
	```

	This should output something like:

	```
	.. include:: ../../code_examples/virtual_data/output.txt
	```
	'''

	kwargs = locals().copy()

	from numpy import random as nprandom
	if seed:
		nprandom.seed(seed)
		rng = nprandom.default_rng(seed)
	else:
		rng = nprandom.default_rng()

	N = sum([s['N'] for s in samples])
	errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors45 *= rd45 / stdev(errors45) # scale errors to rd45
	errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors46 *= rd46 / stdev(errors46) # scale errors to rd46
	errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors47 *= rD47 / stdev(errors47) # scale errors to rD47
	errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors48 *= rD48 / stdev(errors48) # scale errors to rD48

	k = 0
	out = []
	for s in samples:
		kw = {}
		kw['sample'] = s['Sample']
		kw = {
			**kw,
			**{var: kwargs[var]
				for var in [
					'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
					'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
					'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
					'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
					]
				if kwargs[var] is not None},
			**{var: s[var]
				for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
				if var in s},
			}

		sN = s['N']
		while sN:
			out.append(simulate_single_analysis(**kw))
			out[-1]['d45'] += errors45[k]
			out[-1]['d46'] += errors46[k]
			out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
			out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
			sN -= 1
			k += 1

	# session assignment and shuffling apply to the full list of analyses,
	# and therefore belong outside the per-sample loop:
	if session is not None:
		for r in out:
			r['Session'] = session

	if shuffle:
		nprandom.shuffle(out)

	return out

 576def table_of_samples(
 577	data47 = None,
 578	data48 = None,
 579	dir = 'output',
 580	filename = None,
 581	save_to_file = True,
 582	print_out = True,
 583	output = None,
 584	):
 585	'''
 586	Print out, save to disk and/or return a combined table of samples
 587	for a pair of `D47data` and `D48data` objects.
 588
 589	**Parameters**
 590
 591	+ `data47`: `D47data` instance
 592	+ `data48`: `D48data` instance
 593	+ `dir`: the directory in which to save the table
 594	+ `filename`: the name to the csv file to write to
 595	+ `save_to_file`: whether to save the table to disk
 596	+ `print_out`: whether to print out the table
 597	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
 598		if set to `'raw'`: return a list of list of strings
 599		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
 600	'''
 601	if data47 is None:
 602		if data48 is None:
 603			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
 604		else:
 605			return data48.table_of_samples(
 606				dir = dir,
 607				filename = filename,
 608				save_to_file = save_to_file,
 609				print_out = print_out,
 610				output = output
 611				)
 612	else:
 613		if data48 is None:
 614			return data47.table_of_samples(
 615				dir = dir,
 616				filename = filename,
 617				save_to_file = save_to_file,
 618				print_out = print_out,
 619				output = output
 620				)
 621		else:
 622			samples = (
 623				sorted([a for a in data47.anchors if a in data48.anchors])
 624				+ sorted([a for a in data47.anchors if a not in data48.anchors])
 625				+ sorted([a for a in data48.anchors if a not in data47.anchors])
 626				+ sorted([a for a in data47.unknowns if a in data48.unknowns])
 627			)
 628
 629			out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
 630			out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
 631			
 632			out47 = {l[0]: l for l in out47}
 633			out48 = {l[0]: l for l in out48}
 634
 635			out = [out47['Sample'] + out48['Sample'][4:]]
 636			for s in samples:
 637				out.append(out47[s] + out48[s][4:])
 638
 639			if save_to_file:
 640				if not os.path.exists(dir):
 641					os.makedirs(dir)
 642				if filename is None:
 643					filename = f'D47D48_samples.csv'
 644				with open(f'{dir}/{filename}', 'w') as fid:
 645					fid.write(make_csv(out))
 646			if print_out:
 647				print('\n'+pretty_table(out))
 648			if output == 'raw':
 649				return out
 650			elif output == 'pretty':
 651				return pretty_table(out)
 652
 653
 654def table_of_sessions(
 655	data47 = None,
 656	data48 = None,
 657	dir = 'output',
 658	filename = None,
 659	save_to_file = True,
 660	print_out = True,
 661	output = None,
 662	):
 663	'''
 664	Print out, save to disk and/or return a combined table of sessions
 665	for a pair of `D47data` and `D48data` objects.
 666	***Only applicable if the sessions in `data47` and those in `data48`
 667	consist of the exact same sets of analyses.***
 668
 669	**Parameters**
 670
 671	+ `data47`: `D47data` instance
 672	+ `data48`: `D48data` instance
 673	+ `dir`: the directory in which to save the table
 674	+ `filename`: the name of the csv file to write to
 675	+ `save_to_file`: whether to save the table to disk
 676	+ `print_out`: whether to print out the table
 677	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
 678		if set to `'raw'`: return a list of list of strings
 679		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
 680	'''
 681	if data47 is None:
 682		if data48 is None:
 683			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
 684		else:
 685			return data48.table_of_sessions(
 686				dir = dir,
 687				filename = filename,
 688				save_to_file = save_to_file,
 689				print_out = print_out,
 690				output = output
 691				)
 692	else:
 693		if data48 is None:
 694			return data47.table_of_sessions(
 695				dir = dir,
 696				filename = filename,
 697				save_to_file = save_to_file,
 698				print_out = print_out,
 699				output = output
 700				)
 701		else:
 702			out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
 703			out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
 704			for k,x in enumerate(out47[0]):
 705				if k>7:
 706					out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
 707					out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
 708			out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])
 709
 710			if save_to_file:
 711				if not os.path.exists(dir):
 712					os.makedirs(dir)
 713				if filename is None:
 714					filename = f'D47D48_sessions.csv'
 715				with open(f'{dir}/{filename}', 'w') as fid:
 716					fid.write(make_csv(out))
 717			if print_out:
 718				print('\n'+pretty_table(out))
 719			if output == 'raw':
 720				return out
 721			elif output == 'pretty':
 722				return pretty_table(out)
 723
 724
 725def table_of_analyses(
 726	data47 = None,
 727	data48 = None,
 728	dir = 'output',
 729	filename = None,
 730	save_to_file = True,
 731	print_out = True,
 732	output = None,
 733	):
 734	'''
 735	Print out, save to disk and/or return a combined table of analyses
 736	for a pair of `D47data` and `D48data` objects.
 737
 738	If the sessions in `data47` and those in `data48` do not consist of
 739	the exact same sets of analyses, the table will have two columns
 740	`Session_47` and `Session_48` instead of a single `Session` column.
 741
 742	**Parameters**
 743
 744	+ `data47`: `D47data` instance
 745	+ `data48`: `D48data` instance
 746	+ `dir`: the directory in which to save the table
 747	+ `filename`: the name of the csv file to write to
 748	+ `save_to_file`: whether to save the table to disk
 749	+ `print_out`: whether to print out the table
 750	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
 751		if set to `'raw'`: return a list of list of strings
 752		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
 753	'''
 754	if data47 is None:
 755		if data48 is None:
 756			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
 757		else:
 758			return data48.table_of_analyses(
 759				dir = dir,
 760				filename = filename,
 761				save_to_file = save_to_file,
 762				print_out = print_out,
 763				output = output
 764				)
 765	else:
 766		if data48 is None:
 767			return data47.table_of_analyses(
 768				dir = dir,
 769				filename = filename,
 770				save_to_file = save_to_file,
 771				print_out = print_out,
 772				output = output
 773				)
 774		else:
 775			out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
 776			out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
 777			
 778			if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
 779				out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
 780			else:
 781				out47[0][1] = 'Session_47'
 782				out48[0][1] = 'Session_48'
 783				out47 = transpose_table(out47)
 784				out48 = transpose_table(out48)
 785				out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])
 786
 787			if save_to_file:
 788				if not os.path.exists(dir):
 789					os.makedirs(dir)
 790				if filename is None:
 791					filename = f'D47D48_analyses.csv'
 792				with open(f'{dir}/{filename}', 'w') as fid:
 793					fid.write(make_csv(out))
 794			if print_out:
 795				print('\n'+pretty_table(out))
 796			if output == 'raw':
 797				return out
 798			elif output == 'pretty':
 799				return pretty_table(out)
 800
 801
 802def _fullcovar(minresult, epsilon = 0.01, named = False):
 803	'''
 804	Construct full covariance matrix in the case of constrained parameters
 805	'''
 806	
 807	import asteval
 808	
 809	def f(values):
 810		interp = asteval.Interpreter()
 811		for n,v in zip(minresult.var_names, values):
 812			interp(f'{n} = {v}')
 813		for q in minresult.params:
 814			if minresult.params[q].expr:
 815				interp(f'{q} = {minresult.params[q].expr}')
 816		return np.array([interp.symtable[q] for q in minresult.params])
 817
 818	# construct Jacobian
 819	J = np.zeros((minresult.nvarys, len(minresult.params)))
 820	X = np.array([minresult.params[p].value for p in minresult.var_names])
 821	sX = np.array([minresult.params[p].stderr for p in minresult.var_names])
 822
 823	for j in range(minresult.nvarys):
 824		x1 = [_ for _ in X]
 825		x1[j] += epsilon * sX[j]
 826		x2 = [_ for _ in X]
 827		x2[j] -= epsilon * sX[j]
 828		J[j,:] = (f(x1) - f(x2)) / (2 * epsilon * sX[j])
 829
 830	_names = [q for q in minresult.params]
 831	_covar = J.T @ minresult.covar @ J
 832	_se = np.diag(_covar)**.5
 833	_correl = _covar.copy()
 834	for k,s in enumerate(_se):
 835		if s:
 836			_correl[k,:] /= s
 837			_correl[:,k] /= s
 838
 839	if named: # return dicts indexed by parameter names instead of arrays
 840		_covar = {ni: {nj: _covar[i,j] for j,nj in enumerate(_names)} for i,ni in enumerate(_names)}
 841		_se = {ni: _se[i] for i,ni in enumerate(_names)}
 842		_correl = {ni: {nj: _correl[i,j] for j,nj in enumerate(_names)} for i,ni in enumerate(_names)}
 843
 844	return _names, _covar, _se, _correl
 845
 846
 847class D4xdata(list):
 848	'''
 849	Store and process data for a large set of Δ47 and/or Δ48
 850	analyses, usually comprising more than one analytical session.
 851	'''
 852
 853	### 17O CORRECTION PARAMETERS
 854	R13_VPDB = 0.01118  # (Chang & Li, 1990)
 855	'''
 856	Absolute (13C/12C) ratio of VPDB.
 857	By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
 858	'''
 859
 860	R18_VSMOW = 0.0020052  # (Baertschi, 1976)
 861	'''
 862	Absolute (18O/16O) ratio of VSMOW.
 863	By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
 864	'''
 865
 866	LAMBDA_17 = 0.528  # (Barkan & Luz, 2005)
 867	'''
 868	Mass-dependent exponent for triple oxygen isotopes.
 869	By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
 870	'''
 871
 872	R17_VSMOW = 0.00038475  # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
 873	'''
 874	Absolute (17O/16O) ratio of VSMOW.
 875	By default equal to 0.00038475
 876	([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
 877	rescaled to `R13_VPDB`)
 878	'''
 879
 880	R18_VPDB = R18_VSMOW * 1.03092
 881	'''
 882	Absolute (18O/16O) ratio of VPDB.
 883	By definition equal to `R18_VSMOW * 1.03092`.
 884	'''
 885
 886	R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
 887	'''
 888	Absolute (17O/16O) ratio of VPDB.
 889	By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
 890	'''
 891
 892	LEVENE_REF_SAMPLE = 'ETH-3'
 893	'''
 894	After the Δ4x standardization step, each sample is tested to
 895	assess whether the Δ4x variance within all analyses for that
 896	sample differs significantly from that observed for a given reference
 897	sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
 898	which yields a p-value corresponding to the null hypothesis that the
 899	underlying variances are equal).
 900
 901	`LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
 902	sample should be used as a reference for this test.
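
    	For example, to use ETH-1 as the reference sample instead (a minimal sketch,
    	assuming `mydata` is a `D47data` instance):

    	```py
    	mydata.LEVENE_REF_SAMPLE = 'ETH-1'
    	```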
 903	'''
 904
 905	ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6)  # (Kim et al., 2007, calcite)
 906	'''
 907	Specifies the 18O/16O fractionation factor generally applicable
 908	to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
 909	`D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`.
 910
 911	By default equal to 1.008129 (calcite reacted at 90 °C,
 912	[Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
 913	'''
 914
 915	Nominal_d13C_VPDB = {
 916		'ETH-1': 2.02,
 917		'ETH-2': -10.17,
 918		'ETH-3': 1.71,
 919		}	# (Bernasconi et al., 2018)
 920	'''
 921	Nominal δ13C_VPDB values assigned to carbonate standards, used by
 922	`D4xdata.standardize_d13C()`.
 923
 924	By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
 925	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 926	'''
 927
 928	Nominal_d18O_VPDB = {
 929		'ETH-1': -2.19,
 930		'ETH-2': -18.69,
 931		'ETH-3': -1.78,
 932		}	# (Bernasconi et al., 2018)
 933	'''
 934	Nominal δ18O_VPDB values assigned to carbonate standards, used by
 935	`D4xdata.standardize_d18O()`.
 936
 937	By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
 938	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 939	'''
 940
 941	d13C_STANDARDIZATION_METHOD = '2pt'
 942	'''
 943	Method by which to standardize δ13C values:
 944	
 945	+ `'none'`: do not apply any δ13C standardization.
 946	+ `'1pt'`: within each session, offset all initial δ13C values so as to
 947	minimize the difference between final δ13C_VPDB values and
 948	`Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
 949	+ `'2pt'`: within each session, apply an affine transformation to all δ13C
 950	values so as to minimize the difference between final δ13C_VPDB
 951	values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
 952	is defined).
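
    	The class-wide default may also be overridden on a per-session basis after
    	`refresh_sessions()` has run, e.g. (a sketch with a hypothetical session name):

    	```py
    	mydata.sessions['Session_1']['d13C_standardization_method'] = 'none'
    	```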
 953	'''
 954
 955	d18O_STANDARDIZATION_METHOD = '2pt'
 956	'''
 957	Method by which to standardize δ18O values:
 958	
 959	+ `'none'`: do not apply any δ18O standardization.
 960	+ `'1pt'`: within each session, offset all initial δ18O values so as to
 961	minimize the difference between final δ18O_VPDB values and
 962	`Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
 963	+ `'2pt'`: within each session, apply an affine transformation to all δ18O
 964	values so as to minimize the difference between final δ18O_VPDB
 965	values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
 966	is defined).
 967	'''
 968
 969	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
 970		'''
 971		**Parameters**
 972
 973		+ `l`: a list of dictionaries, with each dictionary including at least the keys
 974		`Sample`, `d45`, `d46`, and `d47` or `d48`.
 975		+ `mass`: `'47'` or `'48'`
 976		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
 977		+ `session`: define session name for analyses without a `Session` key
 978		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
 979
 980		Returns a `D4xdata` object derived from `list`.
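
    		For example (a minimal sketch using the `D47data` subclass, with made-up
    		working-gas delta values):

    		```py
    		mydata = D47data([
    			{'Sample': 'ETH-1', 'd45': 5.795, 'd46': 11.628, 'd47': 16.894},
    			{'Sample': 'ETH-2', 'd45': -6.059, 'd46': -4.817, 'd47': -11.635},
    			], session = 'Session_1')
    		```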
 981		'''
 982		self._4x = mass
 983		self.verbose = verbose
 984		self.prefix = 'D4xdata'
 985		self.logfile = logfile
 986		list.__init__(self, l)
 987		self.Nf = None
 988		self.repeatability = {}
 989		self.refresh(session = session)
 990
 991
 992	def make_verbal(oldfun):
 993		'''
 994		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
 995		'''
 996		@wraps(oldfun)
 997		def newfun(*args, verbose = '', **kwargs):
 998			myself = args[0]
 999			oldprefix = myself.prefix
1000			myself.prefix = oldfun.__name__
1001			if verbose != '':
1002				oldverbose = myself.verbose
1003				myself.verbose = verbose
1004			out = oldfun(*args, **kwargs)
1005			myself.prefix = oldprefix
1006			if verbose != '':
1007				myself.verbose = oldverbose
1008			return out
1009		return newfun
1010
1011
1012	def msg(self, txt):
1013		'''
1014		Log a message to `self.logfile`, and print it out if `verbose = True`
1015		'''
1016		self.log(txt)
1017		if self.verbose:
1018			print(f'{f"[{self.prefix}]":<16} {txt}')
1019
1020
1021	def vmsg(self, txt):
1022		'''
1023		Log a message to `self.logfile` and print it out
1024		'''
1025		self.log(txt)
1026		print(txt)
1027
1028
1029	def log(self, *txts):
1030		'''
1031		Log a message to `self.logfile`
1032		'''
1033		if self.logfile:
1034			with open(self.logfile, 'a') as fid:
1035				for txt in txts:
1036					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
1037
1038
1039	def refresh(self, session = 'mySession'):
1040		'''
1041		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
1042		'''
1043		self.fill_in_missing_info(session = session)
1044		self.refresh_sessions()
1045		self.refresh_samples()
1046
1047
1048	def refresh_sessions(self):
1049		'''
1050		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1051		to `False` for all sessions.
1052		'''
1053		self.sessions = {
1054			s: {'data': [r for r in self if r['Session'] == s]}
1055			for s in sorted({r['Session'] for r in self})
1056			}
1057		for s in self.sessions:
1058			self.sessions[s]['scrambling_drift'] = False
1059			self.sessions[s]['slope_drift'] = False
1060			self.sessions[s]['wg_drift'] = False
1061			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1062			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
1063
1064
1065	def refresh_samples(self):
1066		'''
1067		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1068		'''
1069		self.samples = {
1070			s: {'data': [r for r in self if r['Sample'] == s]}
1071			for s in sorted({r['Sample'] for r in self})
1072			}
1073		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1074		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
1075
1076
1077	def read(self, filename, sep = '', session = ''):
1078		'''
1079		Read file in csv format to load data into a `D47data` object.
1080
1081		In the csv file, spaces before and after field separators (`','` by default)
1082		are optional. Each line corresponds to a single analysis.
1083
1084		The required fields are:
1085
1086		+ `UID`: a unique identifier
1087		+ `Session`: an identifier for the analytical session
1088		+ `Sample`: a sample identifier
1089		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1090
1091		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1092		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Any remaining working-gas
1093		deltas among `d47`, `d48` and `d49` are optional, and set to NaN by default.
1094
1095		**Parameters**
1096
1097		+ `filename`: the path of the file to read
1098		+ `sep`: csv separator delimiting the fields (auto-detected by default, as in `D4xdata.input()`)
1099		+ `session`: set `Session` field to this string for all analyses
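
    		For example (a minimal sketch; the file name is hypothetical):

    		```py
    		mydata.read('rawdata_tabs.csv', sep = '\t', session = 'Session_1')
    		```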
1100		'''
1101		with open(filename) as fid:
1102			self.input(fid.read(), sep = sep, session = session)
1103
1104
1105	def input(self, txt, sep = '', session = ''):
1106		'''
1107		Read `txt` string in csv format to load analysis data into a `D47data` object.
1108
1109		In the csv string, spaces before and after field separators (`','` by default)
1110		are optional. Each line corresponds to a single analysis.
1111
1112		The required fields are:
1113
1114		+ `UID`: a unique identifier
1115		+ `Session`: an identifier for the analytical session
1116		+ `Sample`: a sample identifier
1117		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1118
1119		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1120		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Any remaining working-gas
1121		deltas among `d47`, `d48` and `d49` are optional, and set to NaN by default.
1122
1123		**Parameters**
1124
1125		+ `txt`: the csv string to read
1126		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1127		whichever appears most often in `txt`.
1128		+ `session`: set `Session` field to this string for all analyses
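
    		For example (a minimal sketch with an inline, two-analysis csv string):

    		```py
    		mydata.input(
    			'UID,Session,Sample,d45,d46,d47\n'
    			'A01,S1,ETH-1,5.795,11.628,16.894\n'
    			'A02,S1,ETH-2,-6.059,-4.817,-11.635'
    			)
    		```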
1129		'''
1130		if sep == '':
1131			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1132		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1133		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1134
1135		if session != '':
1136			for r in data:
1137				r['Session'] = session
1138
1139		self += data
1140		self.refresh()
1141
1142
1143	@make_verbal
1144	def wg(self,
1145		samples = None,
1146		session_groups = None,
1147	):
1148		'''
1149		Compute bulk composition of the working gas for each session based (by default)
1150		on the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1151		`self.Nominal_d18O_VPDB`.
1152
1153		**Parameters**
1154
1155		+ `samples`: A list of samples specifying the subset of samples (defined in both
1156		`self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`) which will be considered
1157		when computing the working gas. By default, use all samples defined both in
1158		`self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.
1159		+ `session_groups`: a list of lists of sessions
1160		(e.g., `[['session1', 'session2'], ['session3', 'session4', 'session5']]`)
1161		specifying which session groups, if any, have the exact same WG composition.
1162		If set to `'all'`, force all sessions to have the same WG composition (use with
1163		caution and only on short time scales, since the WG composition may drift slowly over longer time scales).
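
    		For example, to force two hypothetical sessions to share a single WG
    		composition (a minimal sketch):

    		```py
    		mydata.wg(session_groups = [['Session_1', 'Session_2']])
    		```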
1164		'''
1165
1166		self.msg('Computing WG composition:')
1167
1168		a18_acid = self.ALPHA_18O_ACID_REACTION
1169		
1170		if samples is None:
1171			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1172		if session_groups is None:
1173			session_groups = [[s] for s in self.sessions]
1174		elif session_groups == 'all':
1175			session_groups = [[s for s in self.sessions]]
1176
1177		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1178		R45R46_standards = {}
1179		for sample in samples:
1180			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1181			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1182			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1183			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1184			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1185
1186			C12_s = 1 / (1 + R13_s)
1187			C13_s = R13_s / (1 + R13_s)
1188			C16_s = 1 / (1 + R17_s + R18_s)
1189			C17_s = R17_s / (1 + R17_s + R18_s)
1190			C18_s = R18_s / (1 + R17_s + R18_s)
1191
1192			C626_s = C12_s * C16_s ** 2
1193			C627_s = 2 * C12_s * C16_s * C17_s
1194			C628_s = 2 * C12_s * C16_s * C18_s
1195			C636_s = C13_s * C16_s ** 2
1196			C637_s = 2 * C13_s * C16_s * C17_s
1197			C727_s = C12_s * C17_s ** 2
1198
1199			R45_s = (C627_s + C636_s) / C626_s
1200			R46_s = (C628_s + C637_s + C727_s) / C626_s
1201			R45R46_standards[sample] = (R45_s, R46_s)
1202		
1203		for sg in session_groups:
1204			db = [r for s in sg for r in self.sessions[s]['data'] if r['Sample'] in samples]
1205			assert db, f'No sample from {samples} found in session group {sg}.'
1206
1207			X = [r['d45'] for r in db]
1208			Y = [R45R46_standards[r['Sample']][0] for r in db]
1209			x1, x2 = np.min(X), np.max(X)
1210
1211			if x1 < x2:
1212				wgcoord = x1/(x1-x2)
1213			else:
1214				wgcoord = 999
1215
1216			if wgcoord < -.5 or wgcoord > 1.5:
1217				# unreasonable to extrapolate to d45 = 0
1218				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1219			else :
1220				# d45 = 0 is reasonably well bracketed
1221				R45_wg = np.polyfit(X, Y, 1)[1]
1222
1223			X = [r['d46'] for r in db]
1224			Y = [R45R46_standards[r['Sample']][1] for r in db]
1225			x1, x2 = np.min(X), np.max(X)
1226
1227			if x1 < x2:
1228				wgcoord = x1/(x1-x2)
1229			else:
1230				wgcoord = 999
1231
1232			if wgcoord < -.5 or wgcoord > 1.5:
1233				# unreasonable to extrapolate to d46 = 0
1234				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1235			else :
1236				# d46 = 0 is reasonably well bracketed
1237				R46_wg = np.polyfit(X, Y, 1)[1]
1238
1239			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1240
1241			for s in sg:
1242				self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1243	
1244				self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1245				self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1246				for r in self.sessions[s]['data']:
1247					r['d13Cwg_VPDB'] = d13Cwg_VPDB
1248					r['d18Owg_VSMOW'] = d18Owg_VSMOW
1249
1250
1251	def compute_bulk_delta(self, R45, R46, D17O = 0):
1252		'''
1253		Compute δ13C_VPDB and δ18O_VSMOW,
1254		by solving the generalized form of equation (17) from
1255		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1256		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
1257		solving the corresponding second-order Taylor polynomial.
1258		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
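
    		In practice this amounts to solving the quadratic `aa·x² + bb·x + cc = 0`
    		(with `x = δ18O_VSMOW / 1000`) defined in the code below, then computing
    		δ13C_VPDB from the resulting R18 and R17 values.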
1259		'''
1260
1261		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1262
1263		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1264		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1265		C = 2 * self.R18_VSMOW
1266		D = -R46
1267
1268		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1269		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1270		cc = A + B + C + D
1271
1272		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1273
1274		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1275		R17 = K * R18 ** self.LAMBDA_17
1276		R13 = R45 - 2 * R17
1277
1278		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1279
1280		return d13C_VPDB, d18O_VSMOW
1281
1282
1283	@make_verbal
1284	def crunch(self, verbose = ''):
1285		'''
1286		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1287		'''
1288		for r in self:
1289			self.compute_bulk_and_clumping_deltas(r)
1290		self.standardize_d13C()
1291		self.standardize_d18O()
1292		self.msg(f"Crunched {len(self)} analyses.")
1293
1294
1295	def fill_in_missing_info(self, session = 'mySession'):
1296		'''
1297		Fill in optional fields with default values
1298		'''
1299		for i,r in enumerate(self):
1300			if 'D17O' not in r:
1301				r['D17O'] = 0.
1302			if 'UID' not in r:
1303				r['UID'] = f'{i+1}'
1304			if 'Session' not in r:
1305				r['Session'] = session
1306			for k in ['d47', 'd48', 'd49']:
1307				if k not in r:
1308					r[k] = np.nan
1309
1310
1311	def standardize_d13C(self):
1312		'''
1313		Perform δ13C standardization within each session `s` according to
1314		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1315		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1316		may be redefined arbitrarily at a later stage.
1317		'''
1318		for s in self.sessions:
1319			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1320				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1321				X,Y = zip(*XY)
1322				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1323					offset = np.mean(Y) - np.mean(X)
1324					for r in self.sessions[s]['data']:
1325						r['d13C_VPDB'] += offset				
1326				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1327					a,b = np.polyfit(X,Y,1)
1328					for r in self.sessions[s]['data']:
1329						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b
1330
1331	def standardize_d18O(self):
1332		'''
1333		Perform δ18O standardization within each session `s` according to
1334		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1335		which is defined by default by `D47data.refresh_sessions()` as equal to
1336		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1337		'''
1338		for s in self.sessions:
1339			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1340				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1341				X,Y = zip(*XY)
1342				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1343				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1344					offset = np.mean(Y) - np.mean(X)
1345					for r in self.sessions[s]['data']:
1346						r['d18O_VSMOW'] += offset				
1347				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1348					a,b = np.polyfit(X,Y,1)
1349					for r in self.sessions[s]['data']:
1350						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
1351	
1352
1353	def compute_bulk_and_clumping_deltas(self, r):
1354		'''
1355		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1356		'''
1357
1358		# Compute working gas R13, R18, and isobar ratios
1359		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1360		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1361		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1362
1363		# Compute analyte isobar ratios
1364		R45 = (1 + r['d45'] / 1000) * R45_wg
1365		R46 = (1 + r['d46'] / 1000) * R46_wg
1366		R47 = (1 + r['d47'] / 1000) * R47_wg
1367		R48 = (1 + r['d48'] / 1000) * R48_wg
1368		R49 = (1 + r['d49'] / 1000) * R49_wg
1369
1370		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1371		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1372		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1373
1374		# Compute stochastic isobar ratios of the analyte
1375		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1376			R13, R18, D17O = r['D17O']
1377		)
1378
1379		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1380		# and raise a warning if the corresponding anomalies exceed 0.05 ppm (5e-8).
1381		if (R45 / R45stoch - 1) > 5e-8:
1382			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1383		if (R46 / R46stoch - 1) > 5e-8:
1384			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1385
1386		# Compute raw clumped isotope anomalies
1387		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1388		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1389		r['D49raw'] = 1000 * (R49 / R49stoch - 1)
1390
1391
1392	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1393		'''
1394		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1395		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1396		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1397		'''
1398
1399		# Compute R17
1400		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1401
1402		# Compute isotope concentrations
1403		C12 = (1 + R13) ** -1
1404		C13 = C12 * R13
1405		C16 = (1 + R17 + R18) ** -1
1406		C17 = C16 * R17
1407		C18 = C16 * R18
1408
1409		# Compute stochastic isotopologue concentrations
1410		C626 = C16 * C12 * C16
1411		C627 = C16 * C12 * C17 * 2
1412		C628 = C16 * C12 * C18 * 2
1413		C636 = C16 * C13 * C16
1414		C637 = C16 * C13 * C17 * 2
1415		C638 = C16 * C13 * C18 * 2
1416		C727 = C17 * C12 * C17
1417		C728 = C17 * C12 * C18 * 2
1418		C737 = C17 * C13 * C17
1419		C738 = C17 * C13 * C18 * 2
1420		C828 = C18 * C12 * C18
1421		C838 = C18 * C13 * C18
1422
1423		# Compute stochastic isobar ratios
1424		R45 = (C636 + C627) / C626
1425		R46 = (C628 + C637 + C727) / C626
1426		R47 = (C638 + C728 + C737) / C626
1427		R48 = (C738 + C828) / C626
1428		R49 = C838 / C626
1429
1430		# Account for stochastic anomalies
1431		R47 *= 1 + D47 / 1000
1432		R48 *= 1 + D48 / 1000
1433		R49 *= 1 + D49 / 1000
1434
1435		# Return isobar ratios
1436		return R45, R46, R47, R48, R49
1437
1438
1439	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1440		'''
1441		Split unknown samples by UID (treat all analyses as different samples)
1442		or by session (treat analyses of a given sample in different sessions as
1443		different samples).
1444
1445		**Parameters**
1446
1447		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
1448		+ `grouping`: `by_uid` | `by_session`
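
    		A typical sequence (a sketch, assuming the default pooled standardization) is:

    		```py
    		mydata.split_samples(grouping = 'by_session')
    		mydata.standardize()
    		mydata.unsplit_samples()
    		```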
1449		'''
1450		if samples_to_split == 'all':
1451			samples_to_split = [s for s in self.unknowns]
1452		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1453		self.grouping = grouping.lower()
1454		if self.grouping in gkeys:
1455			gkey = gkeys[self.grouping]
1456		for r in self:
1457			if r['Sample'] in samples_to_split:
1458				r['Sample_original'] = r['Sample']
1459				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1460			elif r['Sample'] in self.unknowns:
1461				r['Sample_original'] = r['Sample']
1462		self.refresh_samples()
1463
1464
1465	def unsplit_samples(self, tables = False):
1466		'''
1467		Reverse the effects of `D47data.split_samples()`.
1468		
1469		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1470		
1471		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1472		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1473		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1474		effects of `D47data.split_samples()` with `grouping='by_sessions'` (because in
1475		that case session-averaged Δ4x values are statistically independent).
1476		'''
1477		unknowns_old = sorted({s for s in self.unknowns})
1478		CM_old = self.standardization.covar[:,:]
1479		VD_old = self.standardization.params.valuesdict().copy()
1480		vars_old = self.standardization.var_names
1481
1482		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1483
1484		Ns = len(vars_old) - len(unknowns_old)
1485		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1486		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1487
1488		W = np.zeros((len(vars_new), len(vars_old)))
1489		W[:Ns,:Ns] = np.eye(Ns)
1490		for u in unknowns_new:
1491			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1492			if self.grouping == 'by_session':
1493				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1494			elif self.grouping == 'by_uid':
1495				weights = [1 for s in splits]
1496			sw = sum(weights)
1497			weights = [w/sw for w in weights]
1498			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1499
1500		CM_new = W @ CM_old @ W.T
1501		V = W @ np.array([[VD_old[k]] for k in vars_old])
1502		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1503
1504		self.standardization.covar = CM_new
1505		self.standardization.params.valuesdict = lambda : VD_new
1506		self.standardization.var_names = vars_new
1507
1508		for r in self:
1509			if r['Sample'] in self.unknowns:
1510				r['Sample_split'] = r['Sample']
1511				r['Sample'] = r['Sample_original']
1512
1513		self.refresh_samples()
1514		self.consolidate_samples()
1515		self.repeatabilities()
1516
1517		if tables:
1518			self.table_of_analyses()
1519			self.table_of_samples()
1520
1521	def assign_timestamps(self):
1522		'''
1523		Assign a time field `t` of type `float` to each analysis.
1524
1525		If `TimeTag` is one of the data fields, `t` is equal within a given session
1526		to `TimeTag` minus the mean value of `TimeTag` for that session.
1527		Otherwise, `t` is computed from the position of each analysis within
1528		its session, also centered so that the mean `t` of each session is zero.
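
    		For example, a session of five analyses without `TimeTag` fields is assigned
    		`t` values of -2, -1, 0, 1, and 2.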
1529		'''
1530		for session in self.sessions:
1531			sdata = self.sessions[session]['data']
1532			try:
1533				t0 = np.mean([r['TimeTag'] for r in sdata])
1534				for r in sdata:
1535					r['t'] = r['TimeTag'] - t0
1536			except KeyError:
1537				t0 = (len(sdata)-1)/2
1538				for t,r in enumerate(sdata):
1539					r['t'] = t - t0
1540
1541
1542	def report(self):
1543		'''
1544		Prints a report on the standardization fit.
1545		Only applicable after `D4xdata.standardize(method='pooled')`.
1546		'''
1547		report_fit(self.standardization)
1548
1549
1550	def combine_samples(self, sample_groups):
1551		'''
1552		Combine analyses of different samples to compute weighted average Δ4x
1553		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1554		dictionary.
1555		
1556		Caution: samples are weighted by number of replicate analyses, which is a
1557		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1558		correlated analytical errors for one or more samples).
1559		
1560		Returns a tuple of:
1561		
1562		+ the list of group names
1563		+ an array of the corresponding Δ4x values
1564		+ the corresponding (co)variance matrix
1565		
1566		**Parameters**
1567
1568		+ `sample_groups`: a dictionary of the form:
1569		```py
1570		{'group1': ['sample_1', 'sample_2'],
1571		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1572		```
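
    		A minimal usage sketch (with hypothetical sample names, after standardization):

    		```py
    		groups, D47_combined, CM_combined = mydata.combine_samples(
    			{'group1': ['MYSAMPLE-1', 'MYSAMPLE-2']}
    			)
    		```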
1573		'''
1574		
1575		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1576		groups = sorted(sample_groups.keys())
1577		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1578		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1579		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1580		W = np.array([
1581			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1582			for j in groups])
1583		D4x_new = W @ D4x_old
1584		CM_new = W @ CM_old @ W.T
1585
1586		return groups, D4x_new[:,0], CM_new
1587		
1588
1589	@make_verbal
1590	def standardize(self,
1591		method = 'pooled',
1592		weighted_sessions = [],
1593		consolidate = True,
1594		consolidate_tables = False,
1595		consolidate_plots = False,
1596		constraints = {},
1597		):
1598		'''
1599		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1600		If `method` argument is set to `'pooled'`, the standardization processes all sessions
1601		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1602		i.e. that their true Δ4x value does not change between sessions,
1603		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
1604		`'indep_sessions'`, the standardization processes each session independently, based only
1605		on anchor analyses.
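
    		For example (a sketch; both calls assume the data were read and crunched first):

    		```py
    		mydata.standardize()                           # pooled standardization (default)
    		mydata.standardize(method = 'indep_sessions')  # session-by-session standardization
    		```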
1606		'''
1607
1608		self.standardization_method = method
1609		self.assign_timestamps()
1610
1611		if method == 'pooled':
1612			if weighted_sessions:
1613				for session_group in weighted_sessions:
1614					if self._4x == '47':
1615						X = D47data([r for r in self if r['Session'] in session_group])
1616					elif self._4x == '48':
1617						X = D48data([r for r in self if r['Session'] in session_group])
1618					X.Nominal_D4x = self.Nominal_D4x.copy()
1619					X.refresh()
1620					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1621					w = np.sqrt(result.redchi)
1622					self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
1623					for r in X:
1624						r[f'wD{self._4x}raw'] *= w
1625			else:
1626				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1627				for r in self:
1628					r[f'wD{self._4x}raw'] = 1.
1629
1630			params = Parameters()
1631			for k,session in enumerate(self.sessions):
1632				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1633				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1634				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1635				s = pf(session)
1636				params.add(f'a_{s}', value = 0.9)
1637				params.add(f'b_{s}', value = 0.)
1638				params.add(f'c_{s}', value = -0.9)
1639				params.add(f'a2_{s}', value = 0.,
1640# 					vary = self.sessions[session]['scrambling_drift'],
1641					)
1642				params.add(f'b2_{s}', value = 0.,
1643# 					vary = self.sessions[session]['slope_drift'],
1644					)
1645				params.add(f'c2_{s}', value = 0.,
1646# 					vary = self.sessions[session]['wg_drift'],
1647					)
1648				if not self.sessions[session]['scrambling_drift']:
1649					params[f'a2_{s}'].expr = '0'
1650				if not self.sessions[session]['slope_drift']:
1651					params[f'b2_{s}'].expr = '0'
1652				if not self.sessions[session]['wg_drift']:
1653					params[f'c2_{s}'].expr = '0'
1654
1655			for sample in self.unknowns:
1656				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1657
1658			for k in constraints:
1659				params[k].expr = constraints[k]
1660
1661			def residuals(p):
1662				R = []
1663				for r in self:
1664					session = pf(r['Session'])
1665					sample = pf(r['Sample'])
1666					if r['Sample'] in self.Nominal_D4x:
1667						R += [ (
1668							r[f'D{self._4x}raw'] - (
1669								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1670								+ p[f'b_{session}'] * r[f'd{self._4x}']
1671								+	p[f'c_{session}']
1672								+ r['t'] * (
1673									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1674									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1675									+	p[f'c2_{session}']
1676									)
1677								)
1678							) / r[f'wD{self._4x}raw'] ]
1679					else:
1680						R += [ (
1681							r[f'D{self._4x}raw'] - (
1682								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1683								+ p[f'b_{session}'] * r[f'd{self._4x}']
1684								+	p[f'c_{session}']
1685								+ r['t'] * (
1686									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1687									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1688									+	p[f'c2_{session}']
1689									)
1690								)
1691							) / r[f'wD{self._4x}raw'] ]
1692				return R
1693
1694			M = Minimizer(residuals, params)
1695			result = M.least_squares()
1696			self.Nf = result.nfree
1697			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1698			new_names, new_covar, new_se = _fullcovar(result)[:3]
1699			result.var_names = new_names
1700			result.covar = new_covar
1701
1702			for r in self:
1703				s = pf(r["Session"])
1704				a = result.params.valuesdict()[f'a_{s}']
1705				b = result.params.valuesdict()[f'b_{s}']
1706				c = result.params.valuesdict()[f'c_{s}']
1707				a2 = result.params.valuesdict()[f'a2_{s}']
1708				b2 = result.params.valuesdict()[f'b2_{s}']
1709				c2 = result.params.valuesdict()[f'c2_{s}']
1710				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1711				
1712
1713			self.standardization = result
1714
1715			for session in self.sessions:
1716				self.sessions[session]['Np'] = 3
1717				for k in ['scrambling', 'slope', 'wg']:
1718					if self.sessions[session][f'{k}_drift']:
1719						self.sessions[session]['Np'] += 1
1720
1721			if consolidate:
1722				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1723			return result
1724
1725
1726		elif method == 'indep_sessions':
1727
1728			if weighted_sessions:
1729				for session_group in weighted_sessions:
1730					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1731					X.Nominal_D4x = self.Nominal_D4x.copy()
1732					X.refresh()
1733					# This is only done to assign r['wD47raw'] for r in X:
1734					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1735					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1736			else:
1737				self.msg('All weights set to 1 ‰')
1738				for r in self:
1739					r[f'wD{self._4x}raw'] = 1
1740
1741			for session in self.sessions:
1742				s = self.sessions[session]
1743				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1744				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1745				s['Np'] = sum(p_active)
1746				sdata = s['data']
1747
1748				A = np.array([
1749					[
1750						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1751						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1752						1 / r[f'wD{self._4x}raw'],
1753						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1754						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1755						r['t'] / r[f'wD{self._4x}raw']
1756						]
1757					for r in sdata if r['Sample'] in self.anchors
1758					])[:,p_active] # only keep columns for the active parameters
1759				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1760				s['Na'] = Y.size
1761				CM = linalg.inv(A.T @ A)
1762				bf = (CM @ A.T @ Y).T[0,:]
1763				k = 0
1764				for n,a in zip(p_names, p_active):
1765					if a:
1766						s[n] = bf[k]
1767# 						self.msg(f'{n} = {bf[k]}')
1768						k += 1
1769					else:
1770						s[n] = 0.
1771# 						self.msg(f'{n} = 0.0')
1772
1773				for r in sdata :
1774					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1775					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1776					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1777
1778				s['CM'] = np.zeros((6,6))
1779				i = 0
1780				k_active = [j for j,a in enumerate(p_active) if a]
1781				for j,a in enumerate(p_active):
1782					if a:
1783						s['CM'][j,k_active] = CM[i,:]
1784						i += 1
1785
1786			if not weighted_sessions:
1787				w = self.rmswd()['rmswd']
1788				for r in self:
1789						r[f'wD{self._4x}'] *= w
1790						r[f'wD{self._4x}raw'] *= w
1791				for session in self.sessions:
1792					self.sessions[session]['CM'] *= w**2
1793
1794			for session in self.sessions:
1795				s = self.sessions[session]
1796				s['SE_a'] = s['CM'][0,0]**.5
1797				s['SE_b'] = s['CM'][1,1]**.5
1798				s['SE_c'] = s['CM'][2,2]**.5
1799				s['SE_a2'] = s['CM'][3,3]**.5
1800				s['SE_b2'] = s['CM'][4,4]**.5
1801				s['SE_c2'] = s['CM'][5,5]**.5
1802
1803			if not weighted_sessions:
1804				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1805			else:
1806				self.Nf = 0
1807				for sg in weighted_sessions:
1808					self.Nf += self.rmswd(sessions = sg)['Nf']
1809
1810			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1811
1812			avgD4x = {
1813				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1814				for sample in self.samples
1815				}
1816			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1817			rD4x = (chi2/self.Nf)**.5
1818			self.repeatability[f'sigma_{self._4x}'] = rD4x
1819
1820			if consolidate:
1821				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1822
1823
1824	def standardization_error(self, session, d4x, D4x, t = 0):
1825		'''
1826		Compute the standardization error for a given session and
1827		(δ4x, Δ4x) composition, optionally at time `t` within the session.
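
    		A minimal sketch (hypothetical session name and composition, assuming the
    		session parameters and their covariance matrix `CM` have been computed):

    		```py
    		sx = mydata.standardization_error('Session_1', d4x = 20.0, D4x = 0.6)
    		```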
1828		'''
1829		a = self.sessions[session]['a']
1830		b = self.sessions[session]['b']
1831		c = self.sessions[session]['c']
1832		a2 = self.sessions[session]['a2']
1833		b2 = self.sessions[session]['b2']
1834		c2 = self.sessions[session]['c2']
1835		CM = self.sessions[session]['CM']
1836
1837		x, y = D4x, d4x
1838		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1839# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1840		dxdy = -(b+b2*t) / (a+a2*t)
1841		dxdz = 1. / (a+a2*t)
1842		dxda = -x / (a+a2*t)
1843		dxdb = -y / (a+a2*t)
1844		dxdc = -1. / (a+a2*t)
1845		dxda2 = -x * t / (a+a2*t)
1846		dxdb2 = -y * t / (a+a2*t)
1847		dxdc2 = -t / (a+a2*t)
1848		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1849		sx = (V @ CM @ V.T) ** .5
1850		return sx
1851
1852
1853	@make_verbal
1854	def summary(self,
1855		dir = 'output',
1856		filename = None,
1857		save_to_file = True,
1858		print_out = True,
1859		):
1860		'''
1861		Print out and/or save to disk a summary of the standardization results.
1862
1863		**Parameters**
1864
1865		+ `dir`: the directory in which to save the table
1866		+ `filename`: the name of the csv file to write to
1867		+ `save_to_file`: whether to save the table to disk
1868		+ `print_out`: whether to print out the table
1869		'''
1870
1871		out = []
1872		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1873		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1874		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1875		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1876		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1877		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1878		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1879		out += [['Model degrees of freedom', f"{self.Nf}"]]
1880		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1881		out += [['Standardization method', self.standardization_method]]
1882
1883		if save_to_file:
1884			if not os.path.exists(dir):
1885				os.makedirs(dir)
1886			if filename is None:
1887				filename = f'D{self._4x}_summary.csv'
1888			with open(f'{dir}/{filename}', 'w') as fid:
1889				fid.write(make_csv(out))
1890		if print_out:
1891			self.msg('\n' + pretty_table(out, header = 0))
1892
1893
1894	@make_verbal
1895	def table_of_sessions(self,
1896		dir = 'output',
1897		filename = None,
1898		save_to_file = True,
1899		print_out = True,
1900		output = None,
1901		):
1902		'''
1903		Print out and/or save to disk a table of sessions.
1904
1905		**Parameters**
1906
1907		+ `dir`: the directory in which to save the table
1908		+ `filename`: the name of the csv file to write to
1909		+ `save_to_file`: whether to save the table to disk
1910		+ `print_out`: whether to print out the table
1911		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1912		    if set to `'raw'`: return a list of list of strings
1913		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1914		'''
1915		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1916		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1917		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1918
1919		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1920		if include_a2:
1921			out[-1] += ['a2 ± SE']
1922		if include_b2:
1923			out[-1] += ['b2 ± SE']
1924		if include_c2:
1925			out[-1] += ['c2 ± SE']
1926		for session in self.sessions:
1927			out += [[
1928				session,
1929				f"{self.sessions[session]['Na']}",
1930				f"{self.sessions[session]['Nu']}",
1931				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1932				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1933				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1934				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1935				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1936				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1937				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1938				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1939				]]
1940			if include_a2:
1941				if self.sessions[session]['scrambling_drift']:
1942					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1943				else:
1944					out[-1] += ['']
1945			if include_b2:
1946				if self.sessions[session]['slope_drift']:
1947					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1948				else:
1949					out[-1] += ['']
1950			if include_c2:
1951				if self.sessions[session]['wg_drift']:
1952					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1953				else:
1954					out[-1] += ['']
1955
1956		if save_to_file:
1957			if not os.path.exists(dir):
1958				os.makedirs(dir)
1959			if filename is None:
1960				filename = f'D{self._4x}_sessions.csv'
1961			with open(f'{dir}/{filename}', 'w') as fid:
1962				fid.write(make_csv(out))
1963		if print_out:
1964			self.msg('\n' + pretty_table(out))
1965		if output == 'raw':
1966			return out
1967		elif output == 'pretty':
1968			return pretty_table(out)
1969
1970
1971	@make_verbal
1972	def table_of_analyses(
1973		self,
1974		dir = 'output',
1975		filename = None,
1976		save_to_file = True,
1977		print_out = True,
1978		output = None,
1979		):
1980		'''
1981		Print out and/or save to disk a table of analyses.
1982
1983		**Parameters**
1984
1985		+ `dir`: the directory in which to save the table
1986		+ `filename`: the name of the csv file to write to
1987		+ `save_to_file`: whether to save the table to disk
1988		+ `print_out`: whether to print out the table
1989		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1990		    if set to `'raw'`: return a list of list of strings
1991		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1992		'''
1993
1994		out = [['UID','Session','Sample']]
1995		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1996		for f in extra_fields:
1997			out[-1] += [f[0]]
1998		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1999		for r in self:
2000			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
2001			for f in extra_fields:
2002				out[-1] += [f"{r[f[0]]:{f[1]}}"]
2003			out[-1] += [
2004				f"{r['d13Cwg_VPDB']:.3f}",
2005				f"{r['d18Owg_VSMOW']:.3f}",
2006				f"{r['d45']:.6f}",
2007				f"{r['d46']:.6f}",
2008				f"{r['d47']:.6f}",
2009				f"{r['d48']:.6f}",
2010				f"{r['d49']:.6f}",
2011				f"{r['d13C_VPDB']:.6f}",
2012				f"{r['d18O_VSMOW']:.6f}",
2013				f"{r['D47raw']:.6f}",
2014				f"{r['D48raw']:.6f}",
2015				f"{r['D49raw']:.6f}",
2016				f"{r[f'D{self._4x}']:.6f}"
2017				]
2018		if save_to_file:
2019			if not os.path.exists(dir):
2020				os.makedirs(dir)
2021			if filename is None:
2022				filename = f'D{self._4x}_analyses.csv'
2023			with open(f'{dir}/{filename}', 'w') as fid:
2024				fid.write(make_csv(out))
2025		if print_out:
2026			self.msg('\n' + pretty_table(out))
2027		return out
2028
2029	@make_verbal
2030	def covar_table(
2031		self,
2032		correl = False,
2033		dir = 'output',
2034		filename = None,
2035		save_to_file = True,
2036		print_out = True,
2037		output = None,
2038		):
2039		'''
2040		Print out, save to disk and/or return the variance-covariance matrix of D4x
2041		for all unknown samples.
2042
2043		**Parameters**
2044
2045		+ `dir`: the directory in which to save the csv
2046		+ `filename`: the name of the csv file to write to
2047		+ `save_to_file`: whether to save the csv
2048		+ `print_out`: whether to print out the matrix
2049		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
2050		    if set to `'raw'`: return a list of list of strings
2051		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
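
    		For example, to return the correlation matrix without writing it to disk
    		(a minimal sketch):

    		```py
    		correl = mydata.covar_table(correl = True, save_to_file = False, output = 'pretty')
    		```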
2052		'''
2053		samples = sorted([u for u in self.unknowns])
2054		out = [[''] + samples]
2055		for s1 in samples:
2056			out.append([s1])
2057			for s2 in samples:
2058				if correl:
2059					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
2060				else:
2061					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
2062
2063		if save_to_file:
2064			if not os.path.exists(dir):
2065				os.makedirs(dir)
2066			if filename is None:
2067				if correl:
2068					filename = f'D{self._4x}_correl.csv'
2069				else:
2070					filename = f'D{self._4x}_covar.csv'
2071			with open(f'{dir}/{filename}', 'w') as fid:
2072				fid.write(make_csv(out))
2073		if print_out:
2074			self.msg('\n'+pretty_table(out))
2075		if output == 'raw':
2076			return out
2077		elif output == 'pretty':
2078			return pretty_table(out)
2079
2080	@make_verbal
2081	def table_of_samples(
2082		self,
2083		dir = 'output',
2084		filename = None,
2085		save_to_file = True,
2086		print_out = True,
2087		output = None,
2088		):
2089		'''
2090		Print out, save to disk and/or return a table of samples.
2091
2092		**Parameters**
2093
2094		+ `dir`: the directory in which to save the csv
2095		+ `filename`: the name of the csv file to write to
2096		+ `save_to_file`: whether to save the csv
2097		+ `print_out`: whether to print out the table
2098		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2099		    if set to `'raw'`: return a list of list of strings
2100		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2101		'''
2102
2103		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2104		for sample in self.anchors:
2105			out += [[
2106				f"{sample}",
2107				f"{self.samples[sample]['N']}",
2108				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2109				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2110				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2111				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2112				]]
2113		for sample in self.unknowns:
2114			out += [[
2115				f"{sample}",
2116				f"{self.samples[sample]['N']}",
2117				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2118				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2119				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2120				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2121				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2122				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2123				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2124				]]
2125		if save_to_file:
2126			if not os.path.exists(dir):
2127				os.makedirs(dir)
2128			if filename is None:
2129				filename = f'D{self._4x}_samples.csv'
2130			with open(f'{dir}/{filename}', 'w') as fid:
2131				fid.write(make_csv(out))
2132		if print_out:
2133			self.msg('\n'+pretty_table(out))
2134		if output == 'raw':
2135			return out
2136		elif output == 'pretty':
2137			return pretty_table(out)
2138
2139
2140	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2141		'''
2142		Generate session plots and save them to disk.
2143
2144		**Parameters**
2145
2146		+ `dir`: the directory in which to save the plots
2147		+ `figsize`: the width and height (in inches) of each plot
2148		+ `filetype`: `'pdf'` or `'png'`
2149		+ `dpi`: resolution (dots per inch) for PNG output
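
    		For example, to save PNG plots at a higher resolution (a minimal sketch):

    		```py
    		mydata.plot_sessions(filetype = 'png', dpi = 300)
    		```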
2150		'''
2151		if not os.path.exists(dir):
2152			os.makedirs(dir)
2153
2154		for session in self.sessions:
2155			sp = self.plot_single_session(session, xylimits = 'constant')
2156			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2157			ppl.close(sp.fig)
2158			
2159
2160
2161	@make_verbal
2162	def consolidate_samples(self):
2163		'''
2164		Compile various statistics for each sample.
2165
2166		For each anchor sample:
2167
2168		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2169		+ `SE_D47` or `SE_D48`: set to zero by definition
2170
2171		For each unknown sample:
2172
2173		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2174		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2175
2176		For each anchor and unknown:
2177
2178		+ `N`: the total number of analyses of this sample
2179		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2180		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2181		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2182		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2183		variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2184		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
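
		**Example** (a minimal sketch for a `D47data` instance;
		the sample name below is hypothetical):

		```python
		# after calling standardize(), inspect the compiled statistics:
		s = self.samples['MYSAMPLE-1']
		print(s['N'], s['D47'], s['SE_D47'], s.get('p_Levene'))
		```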
2185		'''
2186		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2187		for sample in self.samples:
2188			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2189			if self.samples[sample]['N'] > 1:
2190				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2191
2192			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2193			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2194
2195			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2196			if len(D4x_pop) > 2:
2197				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2198			
2199		if self.standardization_method == 'pooled':
2200			for sample in self.anchors:
2201				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2202				self.samples[sample][f'SE_D{self._4x}'] = 0.
2203			for sample in self.unknowns:
2204				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2205				try:
2206					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2207				except ValueError:
2208					# when `sample` is constrained by self.standardize(constraints = {...}),
2209					# it is no longer listed in self.standardization.var_names.
2210					# Temporary fix: define SE as zero for now
2211				self.samples[sample][f'SE_D{self._4x}'] = 0.
2212
2213		elif self.standardization_method == 'indep_sessions':
2214			for sample in self.anchors:
2215				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2216				self.samples[sample][f'SE_D{self._4x}'] = 0.
2217			for sample in self.unknowns:
2218				self.msg(f'Consolidating sample {sample}')
2219				self.unknowns[sample][f'session_D{self._4x}'] = {}
2220				session_avg = []
2221				for session in self.sessions:
2222					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2223					if sdata:
2224						self.msg(f'{sample} found in session {session}')
2225						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2226						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2227						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2228						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2229						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2230						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2231						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2232				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2233				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2234				wsum = sum([weights[s] for s in weights])
2235				for s in weights:
2236					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2237
2238		for r in self:
2239			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
2240
2241
2242
2243	def consolidate_sessions(self):
2244		'''
2245		Compute various statistics for each session.
2246
2247		+ `Na`: Number of anchor analyses in the session
2248		+ `Nu`: Number of unknown analyses in the session
2249		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2250		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2251		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2252		+ `a`: scrambling factor
2253		+ `b`: compositional slope
2254		+ `c`: WG offset
2255	+ `SE_a`: Model standard error of `a`
2256	+ `SE_b`: Model standard error of `b`
2257	+ `SE_c`: Model standard error of `c`
2258		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2259		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2260		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2261		+ `a2`: scrambling factor drift
2262		+ `b2`: compositional slope drift
2263		+ `c2`: WG offset drift
2264		+ `Np`: Number of standardization parameters to fit
2265		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2266		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2267		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
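
	**Example** (a minimal sketch, assuming the default `'pooled'` standardization;
	the session name below is hypothetical):

	```python
	session = self.sessions['Session01']
	print(session['a'], session['b'], session['c']) # standardization parameters
	print(session['CM'])                            # their covariance matrix
	```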
2268		'''
2269		for session in self.sessions:
2270			if 'd13Cwg_VPDB' not in self.sessions[session]:
2271				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2272			if 'd18Owg_VSMOW' not in self.sessions[session]:
2273				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2274			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2275			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2276
2277			self.msg(f'Computing repeatabilities for session {session}')
2278			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2279			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2280			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2281
2282		if self.standardization_method == 'pooled':
2283			for session in self.sessions:
2284
2285				# different (better?) computation of D4x repeatability for each session:
2286				sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
2287				self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5
2288
2289				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2290				i = self.standardization.var_names.index(f'a_{pf(session)}')
2291				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2292
2293				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2294				i = self.standardization.var_names.index(f'b_{pf(session)}')
2295				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2296
2297				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2298				i = self.standardization.var_names.index(f'c_{pf(session)}')
2299				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2300
2301				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2302				if self.sessions[session]['scrambling_drift']:
2303					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2304					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2305				else:
2306					self.sessions[session]['SE_a2'] = 0.
2307
2308				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2309				if self.sessions[session]['slope_drift']:
2310					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2311					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2312				else:
2313					self.sessions[session]['SE_b2'] = 0.
2314
2315				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2316				if self.sessions[session]['wg_drift']:
2317					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2318					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2319				else:
2320					self.sessions[session]['SE_c2'] = 0.
2321
2322				i = self.standardization.var_names.index(f'a_{pf(session)}')
2323				j = self.standardization.var_names.index(f'b_{pf(session)}')
2324				k = self.standardization.var_names.index(f'c_{pf(session)}')
2325				CM = np.zeros((6,6))
2326				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2327				try:
2328					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2329					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2330					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2331					try:
2332						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2333						CM[3,4] = self.standardization.covar[i2,j2]
2334						CM[4,3] = self.standardization.covar[j2,i2]
2335					except ValueError:
2336						pass
2337					try:
2338						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2339						CM[3,5] = self.standardization.covar[i2,k2]
2340						CM[5,3] = self.standardization.covar[k2,i2]
2341					except ValueError:
2342						pass
2343				except ValueError:
2344					pass
2345				try:
2346					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2347					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2348					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2349					try:
2350						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2351						CM[4,5] = self.standardization.covar[j2,k2]
2352						CM[5,4] = self.standardization.covar[k2,j2]
2353					except ValueError:
2354						pass
2355				except ValueError:
2356					pass
2357				try:
2358					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2359					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2360					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2361				except ValueError:
2362					pass
2363
2364				self.sessions[session]['CM'] = CM
2365
2366		elif self.standardization_method == 'indep_sessions':
2367			pass # Not implemented yet
2368
2369
2370	@make_verbal
2371	def repeatabilities(self):
2372		'''
2373		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2374		(for all samples, for anchors, and for unknowns).
2375		'''
2376		self.msg('Computing repeatabilities for all sessions')
2377
2378		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2379		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2380		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2381		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2382		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
2383
2384
2385	@make_verbal
2386	def consolidate(self, tables = True, plots = True):
2387		'''
2388		Collect information about samples, sessions and repeatabilities.
2389		'''
2390		self.consolidate_samples()
2391		self.consolidate_sessions()
2392		self.repeatabilities()
2393
2394		if tables:
2395			self.summary()
2396			self.table_of_sessions()
2397			self.table_of_analyses()
2398			self.table_of_samples()
2399
2400		if plots:
2401			self.plot_sessions()
2402
2403
2404	@make_verbal
2405	def rmswd(self,
2406		samples = 'all samples',
2407		sessions = 'all sessions',
2408		):
2409		'''
2410		Compute the χ2, the root mean squared weighted deviation
2411		(i.e. the square root of the reduced χ2), and the corresponding degrees of
2412		freedom of the Δ4x values for samples in `samples` and sessions in `sessions`.
2413		
2414		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
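
		**Example** (a minimal sketch, assuming standardization was performed
		with `method = 'indep_sessions'`, so that per-analysis weights are defined):

		```python
		stats = self.rmswd(samples = 'anchors')
		print(stats['rmswd'], stats['chisq'], stats['Nf'])
		```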
2415		'''
2416		if samples == 'all samples':
2417			mysamples = [k for k in self.samples]
2418		elif samples == 'anchors':
2419			mysamples = [k for k in self.anchors]
2420		elif samples == 'unknowns':
2421			mysamples = [k for k in self.unknowns]
2422		else:
2423			mysamples = samples
2424
2425		if sessions == 'all sessions':
2426			sessions = [k for k in self.sessions]
2427
2428		chisq, Nf = 0, 0
2429		for sample in mysamples :
2430			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2431			if len(G) > 1 :
2432				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2433				Nf += (len(G) - 1)
2434				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2435		r = (chisq / Nf)**.5 if Nf > 0 else 0
2436		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2437		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
2438
2439	
2440	@make_verbal
2441	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2442		'''
2443		Compute the repeatability of `[r[key] for r in self]`
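
		**Example** (a minimal sketch, to be run after `standardize()`;
		the session name below is hypothetical):

		```python
		# pooled SD of Δ47 for anchor analyses only, in permil:
		r47 = self.compute_r('D47', samples = 'anchors')

		# repeatability of δ13C_VPDB within a single session:
		r13 = self.compute_r('d13C_VPDB', sessions = ['Session01'])
		```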
2444		'''
2445
2446		if samples == 'all samples':
2447			mysamples = [k for k in self.samples]
2448		elif samples == 'anchors':
2449			mysamples = [k for k in self.anchors]
2450		elif samples == 'unknowns':
2451			mysamples = [k for k in self.unknowns]
2452		else:
2453			mysamples = samples
2454
2455		if sessions == 'all sessions':
2456			sessions = [k for k in self.sessions]
2457
2458		if key in ['D47', 'D48']:
2459			# Full disclosure: the definition of Nf is tricky/debatable
2460			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2461			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2462			Nf = len(G)
2463# 			print(f'len(G) = {Nf}')
2464			Nf -= len([s for s in mysamples if s in self.unknowns])
2465# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2466			for session in sessions:
2467				Np = len([
2468					_ for _ in self.standardization.params
2469					if (
2470						self.standardization.params[_].expr is not None
2471						and (
2472							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2473							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2474							)
2475						)
2476					])
2477# 				print(f'session {session}: {Np} parameters to consider')
2478				Na = len({
2479					r['Sample'] for r in self.sessions[session]['data']
2480					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2481					})
2482# 				print(f'session {session}: {Na} different anchors in that session')
2483				Nf -= min(Np, Na)
2484# 			print(f'Nf = {Nf}')
2485
2486# 			for sample in mysamples :
2487# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2488# 				if len(X) > 1 :
2489# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2490# 					if sample in self.unknowns:
2491# 						Nf += len(X) - 1
2492# 					else:
2493# 						Nf += len(X)
2494# 			if samples in ['anchors', 'all samples']:
2495# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2496			r = (chisq / Nf)**.5 if Nf > 0 else 0
2497
2498		else: # if key not in ['D47', 'D48']
2499			chisq, Nf = 0, 0
2500			for sample in mysamples :
2501				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2502				if len(X) > 1 :
2503					Nf += len(X) - 1
2504					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2505			r = (chisq / Nf)**.5 if Nf > 0 else 0
2506
2507		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2508		return r
2509
2510	def sample_average(self, samples, weights = 'equal', normalize = True):
2511		'''
2512		Weighted average Δ4x value of a group of samples, accounting for covariance.
2513
2514		Returns the weighted average Δ4x value and associated SE
2515		of a group of samples. Weights are equal by default. If `normalize` is
2516		true, `weights` will be rescaled so that their sum equals 1.
2517
2518		**Examples**
2519
2520		```python
2521		self.sample_average(['X','Y'], [1, 2])
2522		```
2523
2524		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2525		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2526		values of samples X and Y, respectively.
2527
2528		```python
2529		self.sample_average(['X','Y'], [1, -1], normalize = False)
2530		```
2531
2532		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2533		'''
2534		if weights == 'equal':
2535			weights = [1/len(samples)] * len(samples)
2536
2537		if normalize:
2538			s = sum(weights)
2539			if s:
2540				weights = [w/s for w in weights]
2541
2542		try:
2543# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2544# 			C = self.standardization.covar[indices,:][:,indices]
2545			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2546			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2547			return correlated_sum(X, C, weights)
2548		except ValueError:
2549			return (0., 0.)
2550
2551
2552	def sample_D4x_covar(self, sample1, sample2 = None):
2553		'''
2554		Covariance between Δ4x values of samples
2555
2556		Returns the error covariance between the average Δ4x values of two
2557		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2558		returns the Δ4x variance for that sample.
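
		**Example** (a minimal sketch; the sample names below are hypothetical):

		```python
		var_X  = self.sample_D4x_covar('MYSAMPLE-1')                # Δ4x variance of one sample
		cov_XY = self.sample_D4x_covar('MYSAMPLE-1', 'MYSAMPLE-2')  # covariance of two samples
		# SE of the difference Δ4x(X) - Δ4x(Y), propagating the covariance:
		SE_diff = (var_X + self.sample_D4x_covar('MYSAMPLE-2') - 2 * cov_XY)**0.5
		```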
2559		'''
2560		if sample2 is None:
2561			sample2 = sample1
2562		if self.standardization_method == 'pooled':
2563			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2564			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2565			return self.standardization.covar[i, j]
2566		elif self.standardization_method == 'indep_sessions':
2567			if sample1 == sample2:
2568				return self.samples[sample1][f'SE_D{self._4x}']**2
2569			else:
2570				c = 0
2571				for session in self.sessions:
2572					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2573					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2574					if sdata1 and sdata2:
2575						a = self.sessions[session]['a']
2576						# !! TODO: CM below does not account for temporal changes in standardization parameters
2577						CM = self.sessions[session]['CM'][:3,:3]
2578						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2579						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2580						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2581						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2582						c += (
2583							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2584							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2585							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2586							@ CM
2587							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2588							) / a**2
2589				return float(c)
2590
2591	def sample_D4x_correl(self, sample1, sample2 = None):
2592		'''
2593		Correlation between Δ4x errors of samples
2594
2595		Returns the error correlation between the average Δ4x values of two samples.
2596		'''
2597		if sample2 is None or sample2 == sample1:
2598			return 1.
2599		return (
2600			self.sample_D4x_covar(sample1, sample2)
2601			/ self.unknowns[sample1][f'SE_D{self._4x}']
2602			/ self.unknowns[sample2][f'SE_D{self._4x}']
2603			)
2604
2605	def plot_single_session(self,
2606		session,
2607		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2608		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2609		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2610		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2611		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2612		xylimits = 'free', # | 'constant'
2613		x_label = None,
2614		y_label = None,
2615		error_contour_interval = 'auto',
2616		fig = 'new',
2617		):
2618		'''
2619		Generate plot for a single session
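
		**Example** (a minimal sketch; the session name below is hypothetical,
		and `ppl` stands for `matplotlib.pyplot` as elsewhere in this module):

		```python
		sp = self.plot_single_session('Session01', xylimits = 'constant')
		ppl.savefig('Session01.pdf')
		ppl.close(sp.fig)
		```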
2620		'''
2621		if x_label is None:
2622			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2623		if y_label is None:
2624			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2625
2626		out = _SessionPlot()
2627		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2628		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2629		anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2630		anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2631		unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2632		unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2633		anchor_avg = (np.array([ np.array([
2634				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2635				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2636				]) for sample in anchors]).T,
2637			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
2638		unknown_avg = (np.array([ np.array([
2639				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2640				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2641				]) for sample in unknowns]).T,
2642			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)
2643		
2644		
2645		if fig == 'new':
2646			out.fig = ppl.figure(figsize = (6,6))
2647			ppl.subplots_adjust(.1,.1,.9,.9)
2648
2649		out.anchor_analyses, = ppl.plot(
2650			anchors_d,
2651			anchors_D,
2652			**kw_plot_anchors)
2653		out.unknown_analyses, = ppl.plot(
2654			unknowns_d,
2655			unknowns_D,
2656			**kw_plot_unknowns)
2657		out.anchor_avg = ppl.plot(
2658			*anchor_avg,
2659			**kw_plot_anchor_avg)
2660		out.unknown_avg = ppl.plot(
2661			*unknown_avg,
2662			**kw_plot_unknown_avg)
2663		if xylimits == 'constant':
2664			x = [r[f'd{self._4x}'] for r in self]
2665			y = [r[f'D{self._4x}'] for r in self]
2666			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2667			w, h = x2-x1, y2-y1
2668			x1 -= w/20
2669			x2 += w/20
2670			y1 -= h/20
2671			y2 += h/20
2672			ppl.axis([x1, x2, y1, y2])
2673		elif xylimits == 'free':
2674			x1, x2, y1, y2 = ppl.axis()
2675		else:
2676			x1, x2, y1, y2 = ppl.axis(xylimits)
2677				
2678		if error_contour_interval != 'none':
2679			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2680			XI,YI = np.meshgrid(xi, yi)
2681			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2682			if error_contour_interval == 'auto':
2683				rng = np.max(SI) - np.min(SI)
2684				if rng <= 0.01:
2685					cinterval = 0.001
2686				elif rng <= 0.03:
2687					cinterval = 0.004
2688				elif rng <= 0.1:
2689					cinterval = 0.01
2690				elif rng <= 0.3:
2691					cinterval = 0.03
2692				elif rng <= 1.:
2693					cinterval = 0.1
2694				else:
2695					cinterval = 0.5
2696			else:
2697				cinterval = error_contour_interval
2698
2699			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2700			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2701			out.clabel = ppl.clabel(out.contour)
2702			contour = (XI, YI, SI, cval, cinterval)
2703
2704		if fig is None:
2705			return {
2706			'anchors':anchors,
2707			'unknowns':unknowns,
2708			'anchors_d':anchors_d,
2709			'anchors_D':anchors_D,
2710			'unknowns_d':unknowns_d,
2711			'unknowns_D':unknowns_D,
2712			'anchor_avg':anchor_avg,
2713			'unknown_avg':unknown_avg,
2714			'contour': contour if error_contour_interval != 'none' else None,
2715			}
2716
2717		ppl.xlabel(x_label)
2718		ppl.ylabel(y_label)
2719		ppl.title(session, weight = 'bold')
2720		ppl.grid(alpha = .2)
2721		out.ax = ppl.gca()		
2722
2723		return out
2724
2725	def plot_residuals(
2726		self,
2727		kde = False,
2728		hist = False,
2729		binwidth = 2/3,
2730		dir = 'output',
2731		filename = None,
2732		highlight = [],
2733		colors = None,
2734		figsize = None,
2735		dpi = 100,
2736		yspan = None,
2737		):
2738		'''
2739		Plot residuals of each analysis as a function of time (actually, as a function of
2740		the order of analyses in the `D4xdata` object)
2741
2742		+ `kde`: whether to add a kernel density estimate of residuals
2743		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2744		+ `binwidth`: the width of the histogram bins, in units of Δ4x repeatability
2745		+ `dir`: the directory in which to save the plot
		+ `filename`: the file name to save to; if `None` (default), return the figure without saving it
2746		+ `highlight`: a list of samples to highlight
2747		+ `colors`: a dict of `{<sample>: (r, g, b)}` for all samples
2748		+ `figsize`: (width, height) of figure
2749		+ `dpi`: resolution for PNG output
2750		+ `yspan`: factor controlling the range of y values shown in plot
2751		  (by default: `yspan = 1.5 if kde else 1.0`)
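
		**Example** (a minimal sketch):

		```python
		# save a residual plot with a kernel density estimate,
		# using the default file name (e.g., D47_residuals.pdf):
		self.plot_residuals(kde = True, filename = '')
		```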
2752		'''
2753		
2754		from matplotlib import ticker
2755
2756		if yspan is None:
2757			if kde:
2758				yspan = 1.5
2759			else:
2760				yspan = 1.0
2761		
2762		# Layout
2763		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2764		if hist or kde:
2765			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2766			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2767		else:
2768			ppl.subplots_adjust(.08,.05,.78,.8)
2769			ax1 = ppl.subplot(111)
2770		
2771		# Colors
2772		N = len(self.anchors)
2773		if colors is None:
2774			if len(highlight) > 0:
2775				Nh = len(highlight)
2776				if Nh == 1:
2777					colors = {highlight[0]: (0,0,0)}
2778				elif Nh == 3:
2779					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2780				elif Nh == 4:
2781					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2782				else:
2783					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2784			else:
2785				if N == 3:
2786					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2787				elif N == 4:
2788					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2789				else:
2790					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2791
2792		ppl.sca(ax1)
2793		
2794		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2795
2796		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2797
2798		session = self[0]['Session']
2799		x1 = 0
2800# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2801		x_sessions = {}
2802		one_or_more_singlets = False
2803		one_or_more_multiplets = False
2804		multiplets = set()
2805		for k,r in enumerate(self):
2806			if r['Session'] != session:
2807				x2 = k-1
2808				x_sessions[session] = (x1+x2)/2
2809				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2810				session = r['Session']
2811				x1 = k
2812			singlet = len(self.samples[r['Sample']]['data']) == 1
2813			if not singlet:
2814				multiplets.add(r['Sample'])
2815			if r['Sample'] in self.unknowns:
2816				if singlet:
2817					one_or_more_singlets = True
2818				else:
2819					one_or_more_multiplets = True
2820			kw = dict(
2821				marker = 'x' if singlet else '+',
2822				ms = 4 if singlet else 5,
2823				ls = 'None',
2824				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2825				mew = 1,
2826				alpha = 0.2 if singlet else 1,
2827				)
2828			if highlight and r['Sample'] not in highlight:
2829				kw['alpha'] = 0.2
2830			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2831		x2 = k
2832		x_sessions[session] = (x1+x2)/2
2833
2834		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2835		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2836		if not (hist or kde):
2837			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2838			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2839
2840		xmin, xmax, ymin, ymax = ppl.axis()
2841		if yspan != 1:
2842			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2843		for s in x_sessions:
2844			ppl.text(
2845				x_sessions[s],
2846				ymax +1,
2847				s,
2848				va = 'bottom',
2849				**(
2850					dict(ha = 'center')
2851					if len(self.sessions[s]['data']) > (0.15 * len(self))
2852					else dict(ha = 'left', rotation = 45)
2853					)
2854				)
2855
2856		if hist or kde:
2857			ppl.sca(ax2)
2858
2859		for s in colors:
2860			kw['marker'] = '+'
2861			kw['ms'] = 5
2862			kw['mec'] = colors[s]
2863			kw['label'] = s
2864			kw['alpha'] = 1
2865			ppl.plot([], [], **kw)
2866
2867		kw['mec'] = (0,0,0)
2868
2869		if one_or_more_singlets:
2870			kw['marker'] = 'x'
2871			kw['ms'] = 4
2872			kw['alpha'] = .2
2873			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2874			ppl.plot([], [], **kw)
2875
2876		if one_or_more_multiplets:
2877			kw['marker'] = '+'
2878			kw['ms'] = 4
2879			kw['alpha'] = 1
2880			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2881			ppl.plot([], [], **kw)
2882
2883		if hist or kde:
2884			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2885		else:
2886			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2887		leg.set_zorder(-1000)
2888
2889		ppl.sca(ax1)
2890
2891		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2892		ppl.xticks([])
2893		ppl.axis([-1, len(self), None, None])
2894
2895		if hist or kde:
2896			ppl.sca(ax2)
2897			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2898
2899			if kde:
2900				from scipy.stats import gaussian_kde
2901				yi = np.linspace(ymin, ymax, 201)
2902				xi = gaussian_kde(X).evaluate(yi)
2903				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2904# 				ppl.plot(xi, yi, 'k-', lw = 1)
2905			elif hist:
2906				ppl.hist(
2907					X,
2908					orientation = 'horizontal',
2909					histtype = 'stepfilled',
2910					ec = [.4]*3,
2911					fc = [.25]*3,
2912					alpha = .25,
2913					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2914					)
2915			ppl.text(0, 0,
2916				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2917				size = 7.5,
2918				alpha = 1,
2919				va = 'center',
2920				ha = 'left',
2921				)
2922
2923			ppl.axis([0, None, ymin, ymax])
2924			ppl.xticks([])
2925			ppl.yticks([])
2926# 			ax2.spines['left'].set_visible(False)
2927			ax2.spines['right'].set_visible(False)
2928			ax2.spines['top'].set_visible(False)
2929			ax2.spines['bottom'].set_visible(False)
2930
2931		ax1.axis([None, None, ymin, ymax])
2932
2933		if not os.path.exists(dir):
2934			os.makedirs(dir)
2935		if filename is None:
2936			return fig
2937		elif filename == '':
2938			filename = f'D{self._4x}_residuals.pdf'
2939		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2940		ppl.close(fig)
2941				
2942
2943	def simulate(self, *args, **kwargs):
2944		'''
2945		Legacy function: raises a `DeprecationWarning` pointing to `virtual_data()`.
2946		'''
2947		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
2948
2949	def plot_anchor_residuals(
2950		self,
2951		dir = 'output',
2952		filename = '',
2953		figsize = None,
2954		subplots_adjust = (0.05, 0.1, 0.95, 0.98, .25, .25),
2955		dpi = 100,
2956		colors = None,
2957		):
2958		'''
2959		Plot a summary of the residuals for all anchors, intended to help detect systematic bias.
2960		
2961		**Parameters**
2962
2963		+ `dir`: the directory in which to save the plot
2964		+ `filename`: the file name to save to (by default, `D4x_anchor_residuals.pdf`); if `None`, return the figure without saving it
2965		+ `figsize`: (width, height) of figure
2966		+ `subplots_adjust`: passed to the figure
2967		+ `dpi`: resolution for PNG output
2969		+ `colors`: a dict of `{<sample>: (r, g, b)}` for all samples
2970		'''
2971
2972		# Colors
2973		N = len(self.anchors)
2974		if colors is None:
2975			if N == 3:
2976				colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2977			elif N == 4:
2978				colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2979			else:
2980				colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2981
2982		if figsize is None:
2983			figsize = (4, 1.5*N+1)
2984		fig = ppl.figure(figsize = figsize)
2985		ppl.subplots_adjust(*subplots_adjust)
2986		axs = {}
2987		X = np.array([r[f'D{self._4x}_residual'] for a in self.anchors for r in self.anchors[a]['data']])*1000
2988		sigma = self.repeatability[f'r_D{self._4x}a'] * 1000 # use the mass-specific key rather than hardcoded 'r_D47a'
2989		D = max(np.abs(X))
2990
2991		for k,a in enumerate(self.anchors):
2992			color = colors[a]
2993			axs[a] = ppl.subplot(N, 1, 1+k)
2994			axs[a].text(
2995				0.02, 1-0.05, a,
2996				va = 'top',
2997				ha = 'left',
2998				weight = 'bold',
2999				size = 9,
3000				color = [_*0.75 for _ in color],
3001				transform = axs[a].transAxes,
3002			)
3003			X = np.array([r[f'D{self._4x}_residual'] for r in self.anchors[a]['data']])*1000
3004			axs[a].axvline(0, lw = 0.5, color = color)
3005			axs[a].plot(X, X*0, 'o', mew = 0.7, mec = (*color,.5), mfc = (*color, 0), ms = 7, clip_on = False)
3006
3007			xi = np.linspace(-3*D, 3*D, 601)
3008			yi = np.array([np.exp(-0.5 * ((xi - x)/sigma)**2) for x in X]).sum(0)
3009			ppl.fill_between(xi, yi, yi*0, fc = (*color, .15), lw = 1, ec = color)
3010			
3011			axs[a].errorbar(
3012				X.mean(), yi.max()*.2, None, 1.96*sigma/len(X)**0.5,
3013				ecolor = color,
3014				marker = 's',
3015				ls = 'None',
3016				mec = color,
3017				mew = 1,
3018				mfc = 'w',
3019				ms = 8,
3020				elinewidth = 1,
3021				capsize = 4,
3022				capthick = 1,
3023			)
3024			
3025			axs[a].axis([xi[0], xi[-1], 0, yi.max()*1.05])
3026			ppl.yticks([])
3027
3028		ppl.xlabel(f'$Δ_{{{self._4x}}}$ residuals (ppm)')		
3029
3030		if not os.path.exists(dir):
3031			os.makedirs(dir)
3032		if filename is None:
3033			return fig
3034		elif filename == '':
3035			filename = f'D{self._4x}_anchor_residuals.pdf'
3036		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
3037		ppl.close(fig)
3038		
3039
3040	def plot_distribution_of_analyses(
3041		self,
3042		dir = 'output',
3043		filename = None,
3044		vs_time = False,
3045		figsize = (6,4),
3046		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
3047		output = None,
3048		dpi = 100,
3049		):
3050		'''
3051		Plot temporal distribution of all analyses in the data set.
3052		
3053		**Parameters**
3054
3055		+ `dir`: the directory in which to save the plot
3056		+ `filename`: the file name to save to (by default, `D4x_distribution_of_analyses.pdf`)
3057		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
3058		+ `figsize`: (width, height) of figure
3059		+ `subplots_adjust`: passed to `subplots_adjust()`
		+ `output`: if `None` (default), save the plot to disk; if `'ax'`, return the axes; if `'fig'`, return the figure
		+ `dpi`: resolution for PNG output
3060		'''
3061
3062		asamples = [s for s in self.anchors]
3063		usamples = [s for s in self.unknowns]
3064		if output is None or output == 'fig':
3065			fig = ppl.figure(figsize = figsize)
3066			ppl.subplots_adjust(*subplots_adjust)
3067		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
3068		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
3069		Xmax += (Xmax-Xmin)/40
3070		Xmin -= (Xmax-Xmin)/41
3071		for k, s in enumerate(asamples + usamples):
3072			if vs_time:
3073				X = [r['TimeTag'] for r in self if r['Sample'] == s]
3074			else:
3075				X = [x for x,r in enumerate(self) if r['Sample'] == s]
3076			Y = [-k for x in X]
3077			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
3078			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
3079			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
3080		ppl.axis([Xmin, Xmax, -k-1, 1])
3081		ppl.xlabel('\ntime')
3082		ppl.gca().annotate('',
3083			xy = (0.6, -0.02),
3084			xycoords = 'axes fraction',
3085			xytext = (.4, -0.02), 
3086            arrowprops = dict(arrowstyle = "->", color = 'k'),
3087            )
3088			
3089
3090		x2 = -1
3091		for session in self.sessions:
3092			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
3093			if vs_time:
3094				ppl.axvline(x1, color = 'k', lw = .75)
3095			if x2 > -1:
3096				if not vs_time:
3097					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
3098			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
3099# 			from xlrd import xldate_as_datetime
3100# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
3101			if vs_time:
3102				ppl.axvline(x2, color = 'k', lw = .75)
3103				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
3104			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
3105
3106		ppl.xticks([])
3107		ppl.yticks([])
3108
3109		if output is None:
3110			if not os.path.exists(dir):
3111				os.makedirs(dir)
3112			if filename is None:
3113				filename = f'D{self._4x}_distribution_of_analyses.pdf'
3114			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
3115			ppl.close(fig)
3116		elif output == 'ax':
3117			return ppl.gca()
3118		elif output == 'fig':
3119			return fig
3120
3121
3122	def plot_bulk_compositions(
3123		self,
3124		samples = None,
3125		dir = 'output/bulk_compositions',
3126		figsize = (6,6),
3127		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
3128		show = False,
3129		sample_color = (0,.5,1),
3130		analysis_color = (.7,.7,.7),
3131		labeldist = 0.3,
3132		radius = 0.05,
3133		):
3134		'''
3135		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
3136		
3137		By default, creates a directory `./output/bulk_compositions` where plots for
3138		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
3139		
3140		
3141		**Parameters**
3142
3143		+ `samples`: Only these samples are processed (by default: all samples).
3144		+ `dir`: where to save the plots
3145		+ `figsize`: (width, height) of figure
3146		+ `subplots_adjust`: passed to `subplots_adjust()`
3147		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
3148		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
3149		+ `sample_color`: color used for sample markers/labels
3150		+ `analysis_color`: color used for replicate markers/labels
3151		+ `labeldist`: distance (in inches) from replicate markers to replicate labels
3152		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
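
		**Example** (a minimal sketch; the sample names below are hypothetical):

		```python
		self.plot_bulk_compositions(
			samples = ['MYSAMPLE-1', 'MYSAMPLE-2'],
			radius = 0.1, # draw a scale circle of ± 100 ppm
			)
		```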
3153		'''
3154
3155		from matplotlib.patches import Ellipse
3156
3157		if samples is None:
3158			samples = [_ for _ in self.samples]
3159
3160		saved = {}
3161
3162		for s in samples:
3163
3164			fig = ppl.figure(figsize = figsize)
3165			fig.subplots_adjust(*subplots_adjust)
3166			ax = ppl.subplot(111)
3167			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3168			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3169			ppl.title(s)
3170
3171
3172			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
3173			UID = [_['UID'] for _ in self.samples[s]['data']]
3174			XY0 = XY.mean(0)
3175
3176			for xy in XY:
3177				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
3178				
3179			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
3180			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
3181			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3182			saved[s] = [XY, XY0]
3183			
3184			x1, x2, y1, y2 = ppl.axis()
3185			x0, dx = (x1+x2)/2, (x2-x1)/2
3186			y0, dy = (y1+y2)/2, (y2-y1)/2
3187			dx, dy = [max(max(dx, dy), radius)]*2
3188
3189			ppl.axis([
3190				x0 - 1.2*dx,
3191				x0 + 1.2*dx,
3192				y0 - 1.2*dy,
3193				y0 + 1.2*dy,
3194				])			
3195
3196			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3197
3198			for xy, uid in zip(XY, UID):
3199
3200				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3201				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3202
3203				if (vector_in_display_space**2).sum() > 0:
3204
3205					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3206					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3207					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3208					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3209
3210					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3211
3212				else:
3213
3214					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3215
3216			if radius:
3217				ax.add_artist(Ellipse(
3218					xy = XY0,
3219					width = radius*2,
3220					height = radius*2,
3221					ls = (0, (2,2)),
3222					lw = .7,
3223					ec = analysis_color,
3224					fc = 'None',
3225					))
3226				ppl.text(
3227					XY0[0],
3228					XY0[1]-radius,
3229					f'\n± {radius*1e3:.0f} ppm',
3230					color = analysis_color,
3231					va = 'top',
3232					ha = 'center',
3233					linespacing = 0.4,
3234					size = 8,
3235					)
3236
3237			if not os.path.exists(dir):
3238				os.makedirs(dir)
3239			fig.savefig(f'{dir}/{s}.pdf')
3240			ppl.close(fig)
3241
3242		fig = ppl.figure(figsize = figsize)
3243		fig.subplots_adjust(*subplots_adjust)
3244		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3245		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3246
3247		for s in saved:
3248			for xy in saved[s][0]:
3249				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3250			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3251			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3252			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3253
3254		x1, x2, y1, y2 = ppl.axis()
3255		ppl.axis([
3256			x1 - (x2-x1)/10,
3257			x2 + (x2-x1)/10,
3258			y1 - (y2-y1)/10,
3259			y2 + (y2-y1)/10,
3260			])			
3261
3262
3263		if not os.path.exists(dir):
3264			os.makedirs(dir)
3265		fig.savefig(f'{dir}/__all__.pdf')
3266		if show:
3267			ppl.show()
3268		ppl.close(fig)
3269		
3270
3271	def _save_D4x_correl(
3272		self,
3273		samples = None,
3274		dir = 'output',
3275		filename = None,
3276		D4x_precision = 4,
3277		correl_precision = 4,
3278		save_to_file = True,
3279		):
3280		'''
3281		Save D4x values along with their SE and correlation matrix.
3282
3283		**Parameters**
3284
3285		+ `samples`: Only these samples are output (by default: all unknowns).
3286		+ `dir`: the directory in which to save the file (by default: `output`)
3287		+ `filename`: the name of the csv file to write to (by default: `D4x_correl.csv`)
3288		+ `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4)
3289		+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
3290		+ `save_to_file`: whether to write the output to a file (by default: `True`); if `False`,
3291		returns the output as a string.
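
		**Example** (a minimal sketch, using the public `save_D47_correl()` wrapper of `D47data`):

		```python
		# write Δ47 values, their SE, and the full error correlation matrix
		# of all unknowns to ./output/D47_correl.csv:
		self.save_D47_correl()
		```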
3292		'''
3293		if samples is None:
3294			samples = sorted([s for s in self.unknowns])
3295		
3296		out = [['Sample']] + [[s] for s in samples]
3297		out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl']
3298		for k,s in enumerate(samples):
3299			out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.{D4x_precision}f}', f'{self.samples[s][f"SE_D{self._4x}"]:.{D4x_precision}f}']
3300			for s2 in samples:
3301				out[k+1] += [f'{self.sample_D4x_correl(s,s2):.{correl_precision}f}']
3302		
3303		if save_to_file:
3304			if not os.path.exists(dir):
3305				os.makedirs(dir)
3306			if filename is None:
3307				filename = f'D{self._4x}_correl.csv'
3308			with open(f'{dir}/{filename}', 'w') as fid:
3309				fid.write(make_csv(out))
3310		else:
3311			return make_csv(out)
3312		
3313
3314class D47data(D4xdata):
3315	'''
3316	Store and process data for a large set of Δ47 analyses,
3317	usually comprising more than one analytical session.
3318	'''
3319
3320	Nominal_D4x = {
3321		'ETH-1':   0.2052,
3322		'ETH-2':   0.2085,
3323		'ETH-3':   0.6132,
3324		'ETH-4':   0.4511,
3325		'IAEA-C1': 0.3018,
3326		'IAEA-C2': 0.6409,
3327		'MERCK':   0.5135,
3328		} # I-CDES (Bernasconi et al., 2021)
3329	'''
3330	Nominal Δ47 values assigned to the Δ47 anchor samples, used by
3331	`D47data.standardize()` to normalize unknown samples to an absolute Δ47
3332	reference frame.
3333
3334	By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
3335	```py
3336	{
3337		'ETH-1'   : 0.2052,
3338		'ETH-2'   : 0.2085,
3339		'ETH-3'   : 0.6132,
3340		'ETH-4'   : 0.4511,
3341		'IAEA-C1' : 0.3018,
3342		'IAEA-C2' : 0.6409,
3343		'MERCK'   : 0.5135,
3344	}
3345	```
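
	These anchors may be customized by overwriting `Nominal_D47`,
	e.g. (a minimal sketch):

	```py
	mydata = D47data()
	# only standardize using the three ETH carbonate anchors:
	mydata.Nominal_D47 = {
		k: mydata.Nominal_D47[k]
		for k in ['ETH-1', 'ETH-2', 'ETH-3']
		}
	```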
3346	'''
3347
3348
3349	@property
3350	def Nominal_D47(self):
3351		return self.Nominal_D4x
3352	
3353
3354	@Nominal_D47.setter
3355	def Nominal_D47(self, new):
3356		self.Nominal_D4x = dict(**new)
3357		self.refresh()
3358
3359
3360	def __init__(self, l = [], **kwargs):
3361		'''
3362		**Parameters:** same as `D4xdata.__init__()`
3363		'''
3364		D4xdata.__init__(self, l = l, mass = '47', **kwargs)
3365
3366
3367	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3368		'''
3369		Find all samples for which `Teq` is specified, compute equilibrium Δ47
3370		value for that temperature, and treat these samples as additional anchors.
3371
3372		**Parameters**
3373
3374		+ `fCo2eqD47`: Which CO2 equilibrium law to use
3375		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3376		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3377		+ `priority`: if `replace`: forget old anchors and only use the new ones;
3378		if `new`: keep pre-existing anchors but update them in case of conflict
3379		between old and new Δ47 values;
3380		if `old`: keep pre-existing anchors but preserve their original Δ47
3381		values in case of conflict.
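
		**Example** (a minimal sketch; the sample name and temperature below are hypothetical):

		```python
		for r in self:
			if r['Sample'] == 'MY-EQ-GAS':
				r['Teq'] = 25. # equilibration temperature, in degrees C
		self.D47fromTeq(priority = 'new')
		```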
3382		'''
3383		f = {
3384			'petersen': fCO2eqD47_Petersen,
3385			'wang': fCO2eqD47_Wang,
3386			}[fCo2eqD47]
3387		foo = {}
3388		for r in self:
3389			if 'Teq' in r:
3390				if r['Sample'] in foo:
3391					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3392				else:
3393					foo[r['Sample']] = f(r['Teq'])
3394			else:
3395				assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3396
3397		if priority == 'replace':
3398			self.Nominal_D47 = {}
3399		for s in foo:
3400			if priority != 'old' or s not in self.Nominal_D47:
3401				self.Nominal_D47[s] = foo[s]
3402	
3403	def save_D47_correl(self, *args, **kwargs):
3404		return self._save_D4x_correl(*args, **kwargs)
3405
3406	save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47')
3407
3408
3409class D48data(D4xdata):
3410	'''
3411	Store and process data for a large set of Δ48 analyses,
3412	usually comprising more than one analytical session.
3413	'''
3414
3415	Nominal_D4x = {
3416		'ETH-1':  0.138,
3417		'ETH-2':  0.138,
3418		'ETH-3':  0.270,
3419		'ETH-4':  0.223,
3420		'GU-1':  -0.419,
3421		} # (Fiebig et al., 2019, 2021)
3422	'''
3423	Nominal Δ48 values assigned to the Δ48 anchor samples, used by
3424	`D48data.standardize()` to normalize unknown samples to an absolute Δ48
3425	reference frame.
3426
3427	By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
3428	[Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)):
3429
3430	```py
3431	{
3432		'ETH-1' :  0.138,
3433		'ETH-2' :  0.138,
3434		'ETH-3' :  0.270,
3435		'ETH-4' :  0.223,
3436		'GU-1'  : -0.419,
3437	}
3438	```
3439	'''
3440
3441
3442	@property
3443	def Nominal_D48(self):
3444		return self.Nominal_D4x
3445
3446	
3447	@Nominal_D48.setter
3448	def Nominal_D48(self, new):
3449		self.Nominal_D4x = dict(**new)
3450		self.refresh()
3451
3452
3453	def __init__(self, l = [], **kwargs):
3454		'''
3455		**Parameters:** same as `D4xdata.__init__()`
3456		'''
3457		D4xdata.__init__(self, l = l, mass = '48', **kwargs)
3458
3459	def save_D48_correl(self, *args, **kwargs):
3460		return self._save_D4x_correl(*args, **kwargs)
3461
3462	save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48')
3463
3464
3465class D49data(D4xdata):
3466	'''
3467	Store and process data for a large set of Δ49 analyses,
3468	usually comprising more than one analytical session.
3469	'''
3470	
3471	Nominal_D4x = {"1000C": 0.0, "25C": 2.228}  # Wang 2004
3472	'''
3473	Nominal Δ49 values assigned to the Δ49 anchor samples, used by
3474	`D49data.standardize()` to normalize unknown samples to an absolute Δ49
3475	reference frame.
3476
3477	By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)):
3478
3479	```py
3480	{
3481		"1000C": 0.0,
3482		"25C": 2.228
3483	}
3484	```
3485	'''
3486	
3487	@property
3488	def Nominal_D49(self):
3489		return self.Nominal_D4x
3490	
3491	@Nominal_D49.setter
3492	def Nominal_D49(self, new):
3493		self.Nominal_D4x = dict(**new)
3494		self.refresh()
3495	
3496	def __init__(self, l=[], **kwargs):
3497		'''
3498		**Parameters:** same as `D4xdata.__init__()`
3499		'''
3500		D4xdata.__init__(self, l=l, mass='49', **kwargs)
3501	
3502	def save_D49_correl(self, *args, **kwargs):
3503		return self._save_D4x_correl(*args, **kwargs)
3504	
3505	save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49')
3506
3507class _SessionPlot():
3508	'''
3509	Simple placeholder class
3510	'''
3511	def __init__(self):
3512		pass
3513
3514_app = typer.Typer(
3515	add_completion = False,
3516	context_settings={'help_option_names': ['-h', '--help']},
3517	rich_markup_mode = 'rich',
3518	)
3519
3520@_app.command()
3521def _cli(
3522	rawdata: Annotated[str, typer.Argument(help = "Specify the path of a rawdata input file")],
3523	exclude: Annotated[str, typer.Option('--exclude', '-e', help = 'The path of a file specifying UIDs and/or Samples to exclude')] = 'none',
3524	anchors: Annotated[str, typer.Option('--anchors', '-a', help = 'The path of a file specifying custom anchors')] = 'none',
3525	output_dir: Annotated[str, typer.Option('--output-dir', '-o', help = 'Specify the output directory')] = 'output',
3526	run_D48: Annotated[bool, typer.Option('--D48', help = 'Also standardize D48')] = False,
3527	):
3528	"""
3529	Process raw D47 data and return standardized results.
3530	
3531	See [b]https://mdaeron.github.io/D47crunch/#3-command-line-interface-cli[/b] for more details.
3532	
3533	Reads raw data from an input file, optionally excluding some samples and/or analyses, then standardizes
3534	the data based either on the default [b]d13C_VPDB[/b], [b]d18O_VPDB[/b], [b]D47[/b], and [b]D48[/b] anchors or on different
3535	user-specified anchors. A new directory (named `output` by default) is created to store the results and
3536	the following sequence is applied:
3537	
3538	* [b]D47data.wg()[/b]
3539	* [b]D47data.crunch()[/b]
3540	* [b]D47data.standardize()[/b]
3541	* [b]D47data.summary()[/b]
3542	* [b]D47data.table_of_samples()[/b]
3543	* [b]D47data.table_of_sessions()[/b]
3544	* [b]D47data.plot_sessions()[/b]
3545	* [b]D47data.plot_residuals()[/b]
3546	* [b]D47data.table_of_analyses()[/b]
3547	* [b]D47data.plot_distribution_of_analyses()[/b]
3548	* [b]D47data.plot_bulk_compositions()[/b]
3549	* [b]D47data.save_D47_correl()[/b]
3550	
3551	Optionally, also apply similar methods for [b]D48[/b].
3552	
3553	[b]Example CSV file for --anchors option:[/b]	
3554	[i]
3555	Sample,  d13C_VPDB,  d18O_VPDB,     D47,    D48
3556	ETH-1,        2.02,      -2.19,  0.2052,  0.138
3557	ETH-2,      -10.17,     -18.69,  0.2085,  0.138
3558	ETH-3,        1.71,      -1.78,  0.6132,  0.270
3559	ETH-4,            ,           ,  0.4511,  0.223
3560	[/i]
3561	Except for [i]Sample[/i], none of the columns above are mandatory.
3562
3563	[b]Example CSV file for --exclude option:[/b]	
3564	[i]
3565	Sample,  UID
3566	 FOO-1,
3567	 BAR-2,
3568	      ,  A04
3569	      ,  A17
3570	      ,  A88
3571	[/i]
3572	This will exclude all analyses of samples [i]FOO-1[/i] and [i]BAR-2[/i],
3573	and the analyses with UIDs [i]A04[/i], [i]A17[/i], and [i]A88[/i].
3574	Neither column is mandatory.
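
	[b]Example invocation[/b] (a minimal sketch, assuming the [b]D47crunch[/b] entry point is installed and on your PATH):
	[i]
	D47crunch rawdata.csv --exclude exclude.csv --anchors anchors.csv -o results --D48
	[/i]
	This reads [i]rawdata.csv[/i], excludes the analyses/samples listed in [i]exclude.csv[/i],
	standardizes using the anchors defined in [i]anchors.csv[/i], writes all outputs to [i]results/[/i],
	and also standardizes Δ48.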
3575	"""
3576
3577	data = D47data()
3578	data.read(rawdata)
3579
3580	if exclude != 'none':
3581		exclude = read_csv(exclude)
3582		exclude_uid = {r['UID'] for r in exclude if 'UID' in r}
3583		exclude_sample = {r['Sample'] for r in exclude if 'Sample' in r}
3584	else:
3585		exclude_uid = []
3586		exclude_sample = []
3587	
3588	data = D47data([r for r in data if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample])
3589
3590	if anchors != 'none':
3591		anchors = read_csv(anchors)
3592		if len([_ for _ in anchors if 'd13C_VPDB' in _]):
3593			data.Nominal_d13C_VPDB = {
3594				_['Sample']: _['d13C_VPDB']
3595				for _ in anchors
3596				if 'd13C_VPDB' in _
3597				}
3598		if len([_ for _ in anchors if 'd18O_VPDB' in _]):
3599			data.Nominal_d18O_VPDB = {
3600				_['Sample']: _['d18O_VPDB']
3601				for _ in anchors
3602				if 'd18O_VPDB' in _
3603				}
3604		if len([_ for _ in anchors if 'D47' in _]):
3605			data.Nominal_D4x = {
3606				_['Sample']: _['D47']
3607				for _ in anchors
3608				if 'D47' in _
3609				}
3610
3611	data.refresh()
3612	data.wg()
3613	data.crunch()
3614	data.standardize()
3615	data.summary(dir = output_dir)
3616	data.plot_residuals(dir = output_dir, filename = 'D47_residuals.pdf', kde = True)
3617	data.plot_bulk_compositions(dir = output_dir + '/bulk_compositions')
3618	data.plot_sessions(dir = output_dir)
3619	data.save_D47_correl(dir = output_dir)
3620	
3621	if not run_D48:
3622		data.table_of_samples(dir = output_dir)
3623		data.table_of_analyses(dir = output_dir)
3624		data.table_of_sessions(dir = output_dir)
3625
3626
3627	if run_D48:
3628		data2 = D48data()
3630		data2.read(rawdata)
3631
3632		data2 = D48data([r for r in data2 if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample])
3633
3634		if anchors != 'none':
3635			if len([_ for _ in anchors if 'd13C_VPDB' in _]):
3636				data2.Nominal_d13C_VPDB = {
3637					_['Sample']: _['d13C_VPDB']
3638					for _ in anchors
3639					if 'd13C_VPDB' in _
3640					}
3641			if len([_ for _ in anchors if 'd18O_VPDB' in _]):
3642				data2.Nominal_d18O_VPDB = {
3643					_['Sample']: _['d18O_VPDB']
3644					for _ in anchors
3645					if 'd18O_VPDB' in _
3646					}
3647			if len([_ for _ in anchors if 'D48' in _]):
3648				data2.Nominal_D4x = {
3649					_['Sample']: _['D48']
3650					for _ in anchors
3651					if 'D48' in _
3652					}
3653
3654		data2.refresh()
3655		data2.wg()
3656		data2.crunch()
3657		data2.standardize()
3658		data2.summary(dir = output_dir)
3659		data2.plot_sessions(dir = output_dir)
3660		data2.plot_residuals(dir = output_dir, filename = 'D48_residuals.pdf', kde = True)
3661		data2.plot_distribution_of_analyses(dir = output_dir)
3662		data2.save_D48_correl(dir = output_dir)
3663
3664		table_of_analyses(data, data2, dir = output_dir)
3665		table_of_samples(data, data2, dir = output_dir)
3666		table_of_sessions(data, data2, dir = output_dir)
3667		
3668def __cli():
3669	_app()
Petersen_etal_CO2eqD47 = array([[-1.20000000e+01, 1.14711357e+00], [-1.10000000e+01, 1.13996122e+00], [-1.00000000e+01, 1.13287286e+00], [-9.00000000e+00, 1.12584768e+00], [-8.00000000e+00, 1.11888489e+00], [-7.00000000e+00, 1.11198371e+00], [-6.00000000e+00, 1.10514337e+00], [-5.00000000e+00, 1.09836311e+00], [-4.00000000e+00, 1.09164218e+00], [-3.00000000e+00, 1.08497986e+00], [-2.00000000e+00, 1.07837542e+00], [-1.00000000e+00, 1.07182816e+00], [ 0.00000000e+00, 1.06533736e+00], [ 1.00000000e+00, 1.05890235e+00], [ 2.00000000e+00, 1.05252244e+00], [ 3.00000000e+00, 1.04619698e+00], [ 4.00000000e+00, 1.03992529e+00], [ 5.00000000e+00, 1.03370674e+00], [ 6.00000000e+00, 1.02754069e+00], [ 7.00000000e+00, 1.02142651e+00], [ 8.00000000e+00, 1.01536359e+00], [ 9.00000000e+00, 1.00935131e+00], [ 1.00000000e+01, 1.00338908e+00], [ 1.10000000e+01, 9.97476303e-01], [ 1.20000000e+01, 9.91612409e-01], [ 1.30000000e+01, 9.85796821e-01], [ 1.40000000e+01, 9.80028975e-01], [ 1.50000000e+01, 9.74308318e-01], [ 1.60000000e+01, 9.68634304e-01], [ 1.70000000e+01, 9.63006392e-01], [ 1.80000000e+01, 9.57424055e-01], [ 1.90000000e+01, 9.51886769e-01], [ 2.00000000e+01, 9.46394020e-01], [ 2.10000000e+01, 9.40945302e-01], [ 2.20000000e+01, 9.35540114e-01], [ 2.30000000e+01, 9.30177964e-01], [ 2.40000000e+01, 9.24858369e-01], [ 2.50000000e+01, 9.19580851e-01], [ 2.60000000e+01, 9.14344938e-01], [ 2.70000000e+01, 9.09150167e-01], [ 2.80000000e+01, 9.03996080e-01], [ 2.90000000e+01, 8.98882228e-01], [ 3.00000000e+01, 8.93808167e-01], [ 3.10000000e+01, 8.88773459e-01], [ 3.20000000e+01, 8.83777672e-01], [ 3.30000000e+01, 8.78820382e-01], [ 3.40000000e+01, 8.73901170e-01], [ 3.50000000e+01, 8.69019623e-01], [ 3.60000000e+01, 8.64175334e-01], [ 3.70000000e+01, 8.59367901e-01], [ 3.80000000e+01, 8.54596929e-01], [ 3.90000000e+01, 8.49862028e-01], [ 4.00000000e+01, 8.45162813e-01], [ 4.10000000e+01, 8.40498905e-01], [ 4.20000000e+01, 8.35869931e-01], [ 4.30000000e+01, 8.31275522e-01], [ 4.40000000e+01, 8.26715314e-01], [ 4.50000000e+01, 8.22188950e-01], [ 4.60000000e+01, 8.17696075e-01], [ 4.70000000e+01, 8.13236341e-01], [ 4.80000000e+01, 8.08809404e-01], [ 4.90000000e+01, 8.04414926e-01], [ 5.00000000e+01, 8.00052572e-01], [ 5.10000000e+01, 7.95722012e-01], [ 5.20000000e+01, 7.91422922e-01], [ 5.30000000e+01, 7.87154979e-01], [ 5.40000000e+01, 7.82917869e-01], [ 5.50000000e+01, 7.78711277e-01], [ 5.60000000e+01, 7.74534898e-01], [ 5.70000000e+01, 7.70388426e-01], [ 5.80000000e+01, 7.66271562e-01], [ 5.90000000e+01, 7.62184010e-01], [ 6.00000000e+01, 7.58125479e-01], [ 6.10000000e+01, 7.54095680e-01], [ 6.20000000e+01, 7.50094329e-01], [ 6.30000000e+01, 7.46121147e-01], [ 6.40000000e+01, 7.42175856e-01], [ 6.50000000e+01, 7.38258184e-01], [ 6.60000000e+01, 7.34367860e-01], [ 6.70000000e+01, 7.30504620e-01], [ 6.80000000e+01, 7.26668201e-01], [ 6.90000000e+01, 7.22858343e-01], [ 7.00000000e+01, 7.19074792e-01], [ 7.10000000e+01, 7.15317295e-01], [ 7.20000000e+01, 7.11585602e-01], [ 7.30000000e+01, 7.07879469e-01], [ 7.40000000e+01, 7.04198652e-01], [ 7.50000000e+01, 7.00542912e-01], [ 7.60000000e+01, 6.96912012e-01], [ 7.70000000e+01, 6.93305719e-01], [ 7.80000000e+01, 6.89723802e-01], [ 7.90000000e+01, 6.86166034e-01], [ 8.00000000e+01, 6.82632189e-01], [ 8.10000000e+01, 6.79122047e-01], [ 8.20000000e+01, 6.75635387e-01], [ 8.30000000e+01, 6.72171994e-01], [ 8.40000000e+01, 6.68731654e-01], [ 8.50000000e+01, 6.65314156e-01], [ 8.60000000e+01, 6.61919291e-01], [ 8.70000000e+01, 6.58546854e-01], [ 8.80000000e+01, 
6.55196641e-01], [ 8.90000000e+01, 6.51868451e-01], [ 9.00000000e+01, 6.48562087e-01], [ 9.10000000e+01, 6.45277352e-01], [ 9.20000000e+01, 6.42014054e-01], [ 9.30000000e+01, 6.38771999e-01], [ 9.40000000e+01, 6.35551001e-01], [ 9.50000000e+01, 6.32350872e-01], [ 9.60000000e+01, 6.29171428e-01], [ 9.70000000e+01, 6.26012487e-01], [ 9.80000000e+01, 6.22873870e-01], [ 9.90000000e+01, 6.19755397e-01], [ 1.00000000e+02, 6.16656895e-01], [ 1.02000000e+02, 6.10519107e-01], [ 1.04000000e+02, 6.04459143e-01], [ 1.06000000e+02, 5.98475670e-01], [ 1.08000000e+02, 5.92567388e-01], [ 1.10000000e+02, 5.86733026e-01], [ 1.12000000e+02, 5.80971342e-01], [ 1.14000000e+02, 5.75281125e-01], [ 1.16000000e+02, 5.69661187e-01], [ 1.18000000e+02, 5.64110371e-01], [ 1.20000000e+02, 5.58627545e-01], [ 1.22000000e+02, 5.53211600e-01], [ 1.24000000e+02, 5.47861454e-01], [ 1.26000000e+02, 5.42576048e-01], [ 1.28000000e+02, 5.37354347e-01], [ 1.30000000e+02, 5.32195337e-01], [ 1.32000000e+02, 5.27098028e-01], [ 1.34000000e+02, 5.22061450e-01], [ 1.36000000e+02, 5.17084654e-01], [ 1.38000000e+02, 5.12166711e-01], [ 1.40000000e+02, 5.07306712e-01], [ 1.42000000e+02, 5.02503768e-01], [ 1.44000000e+02, 4.97757006e-01], [ 1.46000000e+02, 4.93065573e-01], [ 1.48000000e+02, 4.88428634e-01], [ 1.50000000e+02, 4.83845370e-01], [ 1.52000000e+02, 4.79314980e-01], [ 1.54000000e+02, 4.74836677e-01], [ 1.56000000e+02, 4.70409692e-01], [ 1.58000000e+02, 4.66033271e-01], [ 1.60000000e+02, 4.61706674e-01], [ 1.62000000e+02, 4.57429176e-01], [ 1.64000000e+02, 4.53200067e-01], [ 1.66000000e+02, 4.49018650e-01], [ 1.68000000e+02, 4.44884242e-01], [ 1.70000000e+02, 4.40796174e-01], [ 1.72000000e+02, 4.36753787e-01], [ 1.74000000e+02, 4.32756438e-01], [ 1.76000000e+02, 4.28803494e-01], [ 1.78000000e+02, 4.24894334e-01], [ 1.80000000e+02, 4.21028350e-01], [ 1.82000000e+02, 4.17204944e-01], [ 1.84000000e+02, 4.13423530e-01], [ 1.86000000e+02, 4.09683531e-01], [ 1.88000000e+02, 4.05984383e-01], [ 1.90000000e+02, 4.02325531e-01], [ 1.92000000e+02, 3.98706429e-01], [ 1.94000000e+02, 3.95126543e-01], [ 1.96000000e+02, 3.91585347e-01], [ 1.98000000e+02, 3.88082324e-01], [ 2.00000000e+02, 3.84616967e-01], [ 2.02000000e+02, 3.81188778e-01], [ 2.04000000e+02, 3.77797268e-01], [ 2.06000000e+02, 3.74441954e-01], [ 2.08000000e+02, 3.71122364e-01], [ 2.10000000e+02, 3.67838033e-01], [ 2.12000000e+02, 3.64588505e-01], [ 2.14000000e+02, 3.61373329e-01], [ 2.16000000e+02, 3.58192065e-01], [ 2.18000000e+02, 3.55044277e-01], [ 2.20000000e+02, 3.51929540e-01], [ 2.22000000e+02, 3.48847432e-01], [ 2.24000000e+02, 3.45797540e-01], [ 2.26000000e+02, 3.42779460e-01], [ 2.28000000e+02, 3.39792789e-01], [ 2.30000000e+02, 3.36837136e-01], [ 2.32000000e+02, 3.33912113e-01], [ 2.34000000e+02, 3.31017339e-01], [ 2.36000000e+02, 3.28152439e-01], [ 2.38000000e+02, 3.25317046e-01], [ 2.40000000e+02, 3.22510795e-01], [ 2.42000000e+02, 3.19733329e-01], [ 2.44000000e+02, 3.16984297e-01], [ 2.46000000e+02, 3.14263352e-01], [ 2.48000000e+02, 3.11570153e-01], [ 2.50000000e+02, 3.08904364e-01], [ 2.52000000e+02, 3.06265654e-01], [ 2.54000000e+02, 3.03653699e-01], [ 2.56000000e+02, 3.01068176e-01], [ 2.58000000e+02, 2.98508771e-01], [ 2.60000000e+02, 2.95975171e-01], [ 2.62000000e+02, 2.93467070e-01], [ 2.64000000e+02, 2.90984167e-01], [ 2.66000000e+02, 2.88526163e-01], [ 2.68000000e+02, 2.86092765e-01], [ 2.70000000e+02, 2.83683684e-01], [ 2.72000000e+02, 2.81298636e-01], [ 2.74000000e+02, 2.78937339e-01], [ 2.76000000e+02, 2.76599517e-01], [ 2.78000000e+02, 2.74284898e-01], [ 
2.80000000e+02, 2.71993211e-01], [ 2.82000000e+02, 2.69724193e-01], [ 2.84000000e+02, 2.67477582e-01], [ 2.86000000e+02, 2.65253121e-01], [ 2.88000000e+02, 2.63050554e-01], [ 2.90000000e+02, 2.60869633e-01], [ 2.92000000e+02, 2.58710110e-01], [ 2.94000000e+02, 2.56571741e-01], [ 2.96000000e+02, 2.54454286e-01], [ 2.98000000e+02, 2.52357508e-01], [ 3.00000000e+02, 2.50281174e-01], [ 3.02000000e+02, 2.48225053e-01], [ 3.04000000e+02, 2.46188917e-01], [ 3.06000000e+02, 2.44172542e-01], [ 3.08000000e+02, 2.42175707e-01], [ 3.10000000e+02, 2.40198194e-01], [ 3.12000000e+02, 2.38239786e-01], [ 3.14000000e+02, 2.36300272e-01], [ 3.16000000e+02, 2.34379441e-01], [ 3.18000000e+02, 2.32477087e-01], [ 3.20000000e+02, 2.30593005e-01], [ 3.22000000e+02, 2.28726993e-01], [ 3.24000000e+02, 2.26878853e-01], [ 3.26000000e+02, 2.25048388e-01], [ 3.28000000e+02, 2.23235405e-01], [ 3.30000000e+02, 2.21439711e-01], [ 3.32000000e+02, 2.19661118e-01], [ 3.34000000e+02, 2.17899439e-01], [ 3.36000000e+02, 2.16154491e-01], [ 3.38000000e+02, 2.14426091e-01], [ 3.40000000e+02, 2.12714060e-01], [ 3.42000000e+02, 2.11018220e-01], [ 3.44000000e+02, 2.09338398e-01], [ 3.46000000e+02, 2.07674420e-01], [ 3.48000000e+02, 2.06026115e-01], [ 3.50000000e+02, 2.04393315e-01], [ 3.55000000e+02, 2.00378063e-01], [ 3.60000000e+02, 1.96456139e-01], [ 3.65000000e+02, 1.92625077e-01], [ 3.70000000e+02, 1.88882487e-01], [ 3.75000000e+02, 1.85226048e-01], [ 3.80000000e+02, 1.81653511e-01], [ 3.85000000e+02, 1.78162694e-01], [ 3.90000000e+02, 1.74751478e-01], [ 3.95000000e+02, 1.71417807e-01], [ 4.00000000e+02, 1.68159686e-01], [ 4.05000000e+02, 1.64975177e-01], [ 4.10000000e+02, 1.61862398e-01], [ 4.15000000e+02, 1.58819521e-01], [ 4.20000000e+02, 1.55844772e-01], [ 4.25000000e+02, 1.52936426e-01], [ 4.30000000e+02, 1.50092806e-01], [ 4.35000000e+02, 1.47312286e-01], [ 4.40000000e+02, 1.44593281e-01], [ 4.45000000e+02, 1.41934254e-01], [ 4.50000000e+02, 1.39333710e-01], [ 4.55000000e+02, 1.36790195e-01], [ 4.60000000e+02, 1.34302294e-01], [ 4.65000000e+02, 1.31868634e-01], [ 4.70000000e+02, 1.29487876e-01], [ 4.75000000e+02, 1.27158722e-01], [ 4.80000000e+02, 1.24879906e-01], [ 4.85000000e+02, 1.22650197e-01], [ 4.90000000e+02, 1.20468398e-01], [ 4.95000000e+02, 1.18333345e-01], [ 5.00000000e+02, 1.16243903e-01], [ 5.05000000e+02, 1.14198970e-01], [ 5.10000000e+02, 1.12197471e-01], [ 5.15000000e+02, 1.10238362e-01], [ 5.20000000e+02, 1.08320625e-01], [ 5.25000000e+02, 1.06443271e-01], [ 5.30000000e+02, 1.04605335e-01], [ 5.35000000e+02, 1.02805877e-01], [ 5.40000000e+02, 1.01043985e-01], [ 5.45000000e+02, 9.93187680e-02], [ 5.50000000e+02, 9.76293590e-02], [ 5.55000000e+02, 9.59749150e-02], [ 5.60000000e+02, 9.43546120e-02], [ 5.65000000e+02, 9.27676500e-02], [ 5.70000000e+02, 9.12132480e-02], [ 5.75000000e+02, 8.96906480e-02], [ 5.80000000e+02, 8.81991080e-02], [ 5.85000000e+02, 8.67379060e-02], [ 5.90000000e+02, 8.53063410e-02], [ 5.95000000e+02, 8.39037260e-02], [ 6.00000000e+02, 8.25293950e-02], [ 6.05000000e+02, 8.11826970e-02], [ 6.10000000e+02, 7.98629980e-02], [ 6.15000000e+02, 7.85696800e-02], [ 6.20000000e+02, 7.73021410e-02], [ 6.25000000e+02, 7.60597940e-02], [ 6.30000000e+02, 7.48420660e-02], [ 6.35000000e+02, 7.36484000e-02], [ 6.40000000e+02, 7.24782510e-02], [ 6.45000000e+02, 7.13310900e-02], [ 6.50000000e+02, 7.02063990e-02], [ 6.55000000e+02, 6.91036740e-02], [ 6.60000000e+02, 6.80224240e-02], [ 6.65000000e+02, 6.69621680e-02], [ 6.70000000e+02, 6.59224390e-02], [ 6.75000000e+02, 6.49027800e-02], [ 6.80000000e+02, 
6.39027480e-02], [ 6.85000000e+02, 6.29219090e-02], [ 6.90000000e+02, 6.19598370e-02], [ 6.95000000e+02, 6.10161220e-02], [ 7.00000000e+02, 6.00903600e-02], [ 7.05000000e+02, 5.91821570e-02], [ 7.10000000e+02, 5.82911310e-02], [ 7.15000000e+02, 5.74169070e-02], [ 7.20000000e+02, 5.65591200e-02], [ 7.25000000e+02, 5.57174140e-02], [ 7.30000000e+02, 5.48914400e-02], [ 7.35000000e+02, 5.40808600e-02], [ 7.40000000e+02, 5.32853430e-02], [ 7.45000000e+02, 5.25045650e-02], [ 7.50000000e+02, 5.17382100e-02], [ 7.55000000e+02, 5.09859710e-02], [ 7.60000000e+02, 5.02475460e-02], [ 7.65000000e+02, 4.95226430e-02], [ 7.70000000e+02, 4.88109740e-02], [ 7.75000000e+02, 4.81122600e-02], [ 7.80000000e+02, 4.74262270e-02], [ 7.85000000e+02, 4.67526090e-02], [ 7.90000000e+02, 4.60911450e-02], [ 7.95000000e+02, 4.54415810e-02], [ 8.00000000e+02, 4.48036680e-02], [ 8.05000000e+02, 4.41771640e-02], [ 8.10000000e+02, 4.35618310e-02], [ 8.15000000e+02, 4.29574380e-02], [ 8.20000000e+02, 4.23637590e-02], [ 8.25000000e+02, 4.17805730e-02], [ 8.30000000e+02, 4.12076640e-02], [ 8.35000000e+02, 4.06448220e-02], [ 8.40000000e+02, 4.00918390e-02], [ 8.45000000e+02, 3.95485160e-02], [ 8.50000000e+02, 3.90146540e-02], [ 8.55000000e+02, 3.84900630e-02], [ 8.60000000e+02, 3.79745540e-02], [ 8.65000000e+02, 3.74679440e-02], [ 8.70000000e+02, 3.69700540e-02], [ 8.75000000e+02, 3.64807070e-02], [ 8.80000000e+02, 3.59997340e-02], [ 8.85000000e+02, 3.55269650e-02], [ 8.90000000e+02, 3.50622380e-02], [ 8.95000000e+02, 3.46053930e-02], [ 9.00000000e+02, 3.41562720e-02], [ 9.05000000e+02, 3.37147240e-02], [ 9.10000000e+02, 3.32805980e-02], [ 9.15000000e+02, 3.28537490e-02], [ 9.20000000e+02, 3.24340320e-02], [ 9.25000000e+02, 3.20213090e-02], [ 9.30000000e+02, 3.16154430e-02], [ 9.35000000e+02, 3.12163000e-02], [ 9.40000000e+02, 3.08237490e-02], [ 9.45000000e+02, 3.04376630e-02], [ 9.50000000e+02, 3.00579150e-02], [ 9.55000000e+02, 2.96843850e-02], [ 9.60000000e+02, 2.93169510e-02], [ 9.65000000e+02, 2.89554980e-02], [ 9.70000000e+02, 2.85999100e-02], [ 9.75000000e+02, 2.82500750e-02], [ 9.80000000e+02, 2.79058840e-02], [ 9.85000000e+02, 2.75672290e-02], [ 9.90000000e+02, 2.72340060e-02], [ 9.95000000e+02, 2.69061120e-02], [ 1.00000000e+03, 2.65834450e-02], [ 1.00500000e+03, 2.62659080e-02], [ 1.01000000e+03, 2.59534050e-02], [ 1.01500000e+03, 2.56458410e-02], [ 1.02000000e+03, 2.53431240e-02], [ 1.02500000e+03, 2.50451630e-02], [ 1.03000000e+03, 2.47518710e-02], [ 1.03500000e+03, 2.44631600e-02], [ 1.04000000e+03, 2.41789470e-02], [ 1.04500000e+03, 2.38991470e-02], [ 1.05000000e+03, 2.36236800e-02], [ 1.05500000e+03, 2.33524670e-02], [ 1.06000000e+03, 2.30854290e-02], [ 1.06500000e+03, 2.28224910e-02], [ 1.07000000e+03, 2.25635770e-02], [ 1.07500000e+03, 2.23086150e-02], [ 1.08000000e+03, 2.20575330e-02], [ 1.08500000e+03, 2.18102600e-02], [ 1.09000000e+03, 2.15667290e-02], [ 1.09500000e+03, 2.13268720e-02], [ 1.10000000e+03, 2.10906220e-02]])

def fCO2eqD47_Petersen(T):
    '''
    CO2 equilibrium Δ47 value as a function of T (in degrees C)
    according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
    '''
    return float(_fCO2eqD47_Petersen(T))

CO2 equilibrium Δ47 value as a function of T (in degrees C) according to Petersen et al. (2019).
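
For instance (a minimal sketch; since T = 25 falls exactly on the interpolation grid, the result is simply the value read off the Petersen_etal_CO2eqD47 table above):

from D47crunch import fCO2eqD47_Petersen

print(fCO2eqD47_Petersen(25))   # ≈ 0.9196, the tabulated value at 25 °C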

Wang_etal_CO2eqD47 = array([[-8.3000e+01, 1.8954e+00], [-7.3000e+01, 1.7530e+00], [-6.3000e+01, 1.6261e+00], [-5.3000e+01, 1.5126e+00], [-4.3000e+01, 1.4104e+00], [-3.3000e+01, 1.3182e+00], [-2.3000e+01, 1.2345e+00], [-1.3000e+01, 1.1584e+00], [-3.0000e+00, 1.0888e+00], [ 7.0000e+00, 1.0251e+00], [ 1.7000e+01, 9.6650e-01], [ 2.7000e+01, 9.1250e-01], [ 3.7000e+01, 8.6260e-01], [ 4.7000e+01, 8.1640e-01], [ 5.7000e+01, 7.7340e-01], [ 6.7000e+01, 7.3340e-01], [ 8.7000e+01, 6.6120e-01], [ 9.7000e+01, 6.2860e-01], [ 1.0700e+02, 5.9800e-01], [ 1.1700e+02, 5.6930e-01], [ 1.2700e+02, 5.4230e-01], [ 1.3700e+02, 5.1690e-01], [ 1.4700e+02, 4.9300e-01], [ 1.5700e+02, 4.7040e-01], [ 1.6700e+02, 4.4910e-01], [ 1.7700e+02, 4.2890e-01], [ 1.8700e+02, 4.0980e-01], [ 1.9700e+02, 3.9180e-01], [ 2.0700e+02, 3.7470e-01], [ 2.1700e+02, 3.5850e-01], [ 2.2700e+02, 3.4310e-01], [ 2.3700e+02, 3.2850e-01], [ 2.4700e+02, 3.1470e-01], [ 2.5700e+02, 3.0150e-01], [ 2.6700e+02, 2.8900e-01], [ 2.7700e+02, 2.7710e-01], [ 2.8700e+02, 2.6570e-01], [ 2.9700e+02, 2.5500e-01], [ 3.0700e+02, 2.4470e-01], [ 3.1700e+02, 2.3490e-01], [ 3.2700e+02, 2.2560e-01], [ 3.3700e+02, 2.1670e-01], [ 3.4700e+02, 2.0830e-01], [ 3.5700e+02, 2.0020e-01], [ 3.6700e+02, 1.9250e-01], [ 3.7700e+02, 1.8510e-01], [ 3.8700e+02, 1.7810e-01], [ 3.9700e+02, 1.7140e-01], [ 4.0700e+02, 1.6500e-01], [ 4.1700e+02, 1.5890e-01], [ 4.2700e+02, 1.5300e-01], [ 4.3700e+02, 1.4740e-01], [ 4.4700e+02, 1.4210e-01], [ 4.5700e+02, 1.3700e-01], [ 4.6700e+02, 1.3210e-01], [ 4.7700e+02, 1.2740e-01], [ 4.8700e+02, 1.2290e-01], [ 4.9700e+02, 1.1860e-01], [ 5.0700e+02, 1.1450e-01], [ 5.1700e+02, 1.1050e-01], [ 5.2700e+02, 1.0680e-01], [ 5.3700e+02, 1.0310e-01], [ 5.4700e+02, 9.9700e-02], [ 5.5700e+02, 9.6300e-02], [ 5.6700e+02, 9.3100e-02], [ 5.7700e+02, 9.0100e-02], [ 5.8700e+02, 8.7100e-02], [ 5.9700e+02, 8.4300e-02], [ 6.0700e+02, 8.1600e-02], [ 6.1700e+02, 7.9000e-02], [ 6.2700e+02, 7.6500e-02], [ 6.3700e+02, 7.4100e-02], [ 6.4700e+02, 7.1800e-02], [ 6.5700e+02, 6.9500e-02], [ 6.6700e+02, 6.7400e-02], [ 6.7700e+02, 6.5400e-02], [ 6.8700e+02, 6.3400e-02], [ 6.9700e+02, 6.1500e-02], [ 7.0700e+02, 5.9700e-02], [ 7.1700e+02, 5.7900e-02], [ 7.2700e+02, 5.6200e-02], [ 7.3700e+02, 5.4600e-02], [ 7.4700e+02, 5.3000e-02], [ 7.5700e+02, 5.1500e-02], [ 7.6700e+02, 5.0000e-02], [ 7.7700e+02, 4.8600e-02], [ 7.8700e+02, 4.7200e-02], [ 7.9700e+02, 4.5900e-02], [ 8.0700e+02, 4.4700e-02], [ 8.1700e+02, 4.3500e-02], [ 8.2700e+02, 4.2300e-02], [ 8.3700e+02, 4.1100e-02], [ 8.4700e+02, 4.0000e-02], [ 8.5700e+02, 3.9000e-02], [ 8.6700e+02, 3.8000e-02], [ 8.7700e+02, 3.7000e-02], [ 8.8700e+02, 3.6000e-02], [ 8.9700e+02, 3.5100e-02], [ 9.0700e+02, 3.4200e-02], [ 9.1700e+02, 3.3300e-02], [ 9.2700e+02, 3.2500e-02], [ 9.3700e+02, 3.1700e-02], [ 9.4700e+02, 3.0900e-02], [ 9.5700e+02, 3.0200e-02], [ 9.6700e+02, 2.9400e-02], [ 9.7700e+02, 2.8700e-02], [ 9.8700e+02, 2.8100e-02], [ 9.9700e+02, 2.7400e-02], [ 1.0070e+03, 2.6800e-02], [ 1.0170e+03, 2.6100e-02], [ 1.0270e+03, 2.5500e-02], [ 1.0370e+03, 2.4900e-02], [ 1.0470e+03, 2.4400e-02], [ 1.0570e+03, 2.3800e-02], [ 1.0670e+03, 2.3300e-02], [ 1.0770e+03, 2.2800e-02], [ 1.0870e+03, 2.2300e-02], [ 1.0970e+03, 2.1800e-02]])

def fCO2eqD47_Wang(T):
    '''
    CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
    according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
    (supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
    '''
    return float(_fCO2eqD47_Wang(T))

CO2 equilibrium Δ47 value as a function of T (in degrees C) according to Wang et al. (2004) (supplementary data of Dennis et al., 2011).
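
Likewise (sketch; T = 27 is a grid point of the Wang_etal_CO2eqD47 table above):

from D47crunch import fCO2eqD47_Wang

print(fCO2eqD47_Wang(27))   # ≈ 0.9125, the tabulated value at 27 °C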

def correlated_sum(X, C, w = None):
    '''
    Compute covariance-aware linear combinations

    **Parameters**

    + `X`: list or 1-D array of values to sum
    + `C`: covariance matrix for the elements of `X`
    + `w`: list or 1-D array of weights to apply to the elements of `X`
           (all equal to 1 by default)

    Return the sum (and its SE) of the elements of `X`, with optional weights equal
    to the elements of `w`, accounting for covariances between the elements of `X`.
    '''
    if w is None:
        w = [1 for x in X]
    return np.dot(w, X), (np.dot(w, np.dot(C, w)))**.5

Compute covariance-aware linear combinations

Parameters

  • X: list or 1-D array of values to sum
  • C: covariance matrix for the elements of X
  • w: list or 1-D array of weights to apply to the elements of X (all equal to 1 by default)

Return the sum (and its SE) of the elements of X, with optional weights equal to the elements of w, accounting for covariances between the elements of X.
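
A short worked sketch with made-up numbers (the covariance matrix C below is hypothetical):

import numpy as np
from D47crunch import correlated_sum

X = [0.5, 0.5]                  # values to sum
C = np.array([
    [1e-4, 5e-5],
    [5e-5, 1e-4],
    ])                          # covariance matrix of X

s, se = correlated_sum(X, C)
print(s, se)   # 1.0 0.0173...: the SE includes the off-diagonal covariance terms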

def make_csv(x, hsep = ',', vsep = '\n'):
    '''
    Formats a list of lists of strings as a CSV

    **Parameters**

    + `x`: the list of lists of strings to format
    + `hsep`: the field separator (`,` by default)
    + `vsep`: the line-ending convention to use (`\\n` by default)

    **Example**

    ```py
    print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
    ```

    outputs:

    ```py
    a,b,c
    d,e,f
    ```
    '''
    return vsep.join([hsep.join(l) for l in x])

Formats a list of lists of strings as a CSV

Parameters

  • x: the list of lists of strings to format
  • hsep: the field separator (, by default)
  • vsep: the line-ending convention to use (\n by default)

Example

print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))

outputs:

a,b,c
d,e,f

def pf(txt):
    '''
    Modify string `txt` to follow `lmfit.Parameter()` naming rules.
    '''
    return txt.replace('-','_').replace('.','_').replace(' ','_')

Modify string txt to follow lmfit.Parameter() naming rules.
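
For example (sample names chosen purely for illustration):

from D47crunch import pf

print(pf('ETH-1'))      # ETH_1
print(pf('MERCK 1.5'))  # MERCK_1_5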

def smart_type(x):
    '''
    Tries to convert string `x` to a float if it includes a decimal point, or
    to an integer if it does not. If the conversion fails, return the original
    string unchanged.
    '''
    try:
        y = float(x)
    except ValueError:
        return x
    if '.' not in x:
        return int(y)
    return y

Tries to convert string x to a float if it includes a decimal point, or to an integer if it does not. If the conversion fails, return the original string unchanged.
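
For example:

from D47crunch import smart_type

print(smart_type('5'))      # 5 (int)
print(smart_type('5.0'))    # 5.0 (float)
print(smart_type('ETH-1'))  # ETH-1 (returned as a string, unchanged)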

D47crunch_defaults = <D47crunch._Defaults object>

def pretty_table(x, header = 1, hsep = '  ', vsep = None, align = '<'):
    '''
    Reads a list of lists of strings and outputs an ascii table

    **Parameters**

    + `x`: a list of lists of strings
    + `header`: the number of lines to treat as header lines
    + `hsep`: the horizontal separator between columns
    + `vsep`: the character to use as vertical separator
    + `align`: string of left (`<`) or right (`>`) alignment characters.

    **Example**

    ```py
    print(pretty_table([
        ['A', 'B', 'C'],
        ['1', '1.9999', 'foo'],
        ['10', 'x', 'bar'],
    ]))
    ```
    yields:
    ```
    ——  ——————  ———
    A        B    C
    ——  ——————  ———
    1   1.9999  foo
    10       x  bar
    ——  ——————  ———
    ```

    To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`:

    ```py
    D47crunch_defaults.PRETTY_TABLE_VSEP = '='
    print(pretty_table([
        ['A', 'B', 'C'],
        ['1', '1.9999', 'foo'],
        ['10', 'x', 'bar'],
    ]))
    ```
    yields:
    ```
    ==  ======  ===
    A        B    C
    ==  ======  ===
    1   1.9999  foo
    10       x  bar
    ==  ======  ===
    ```
    '''

    if vsep is None:
        vsep = D47crunch_defaults.PRETTY_TABLE_VSEP

    txt = []
    widths = [np.max([len(e) for e in c]) for c in zip(*x)]

    if len(widths) > len(align):
        align += '>' * (len(widths) - len(align))
    sepline = hsep.join([vsep * w for w in widths])
    txt += [sepline]
    for k, l in enumerate(x):
        if k and k == header:
            txt += [sepline]
        txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
    txt += [sepline]
    txt += ['']
    return '\n'.join(txt)

Reads a list of lists of strings and outputs an ascii table

Parameters

  • x: a list of lists of strings
  • header: the number of lines to treat as header lines
  • hsep: the horizontal separator between columns
  • vsep: the character to use as vertical separator
  • align: string of left (<) or right (>) alignment characters.

Example

print(pretty_table([
        ['A', 'B', 'C'],
        ['1', '1.9999', 'foo'],
        ['10', 'x', 'bar'],
]))

yields:

——  ——————  ———
A        B    C
——  ——————  ———
1   1.9999  foo
10       x  bar
——  ——————  ———

To change the default vsep globally, redefine D47crunch_defaults.PRETTY_TABLE_VSEP:

D47crunch_defaults.PRETTY_TABLE_VSEP = '='
print(pretty_table([
        ['A', 'B', 'C'],
        ['1', '1.9999', 'foo'],
        ['10', 'x', 'bar'],
]))

yields:

==  ======  ===
A        B    C
==  ======  ===
1   1.9999  foo
10       x  bar
==  ======  ===

def transpose_table(x):
    '''
    Transpose a list of lists

    **Parameters**

    + `x`: a list of lists

    **Example**

    ```py
    x = [[1, 2], [3, 4]]
    print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
    ```
    '''
    return [[e for e in c] for c in zip(*x)]

Transpose a list of lists

Parameters

  • x: a list of lists

Example

x = [[1, 2], [3, 4]]
print(transpose_table(x)) # yields: [[1, 3], [2, 4]]

def w_avg(X, sX):
    '''
    Compute variance-weighted average

    Returns the value and SE of the weighted average of the elements of `X`,
    with relative weights equal to their inverse variances (`1/sX**2`).

    **Parameters**

    + `X`: array-like of elements to average
    + `sX`: array-like of the corresponding SE values

    **Tip**

    If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
    they may be rearranged using `zip()`:

    ```python
    foo = [(0, 1), (1, 0.5), (2, 0.5)]
    print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
    ```
    '''
    X = [x for x in X]
    sX = [sx for sx in sX]
    W = [sx**-2 for sx in sX]
    W = [w/sum(W) for w in W]
    Xavg = sum([w*x for w, x in zip(W, X)])
    sXavg = sum([w**2 * sx**2 for w, sx in zip(W, sX)])**.5
    return Xavg, sXavg

Compute variance-weighted average

Returns the value and SE of the weighted average of the elements of X, with relative weights equal to their inverse variances (1/sX**2).

Parameters

  • X: array-like of elements to average
  • sX: array-like of the corresponding SE values

Tip

If X and sX are initially arranged as a list of (x, sx) doublets, they may be rearranged using zip():

foo = [(0, 1), (1, 0.5), (2, 0.5)]
print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)

def read_csv(filename, sep = ''):
    '''
    Read contents of `filename` in csv format and return a list of dictionaries.

    In the csv string, spaces before and after field separators (`','` by default)
    are optional.

    **Parameters**

    + `filename`: the csv file to read
    + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
    whichever appears most often in the contents of `filename`.
    '''
    with open(filename) as fid:
        txt = fid.read()

    if sep == '':
        sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
    txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
    return [{k: smart_type(v) for k, v in zip(txt[0], l) if v} for l in txt[1:]]

Read contents of filename in csv format and return a list of dictionaries.

In the csv string, spaces before and after field separators (',' by default) are optional.

Parameters

  • filename: the csv file to read
  • sep: csv separator delimiting the fields. By default, use ,, ;, or \t, whichever appears most often in the contents of filename.
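
A minimal sketch (the file name and contents below are hypothetical):

from D47crunch import read_csv

with open('mysamples.csv', 'w') as fid:
    fid.write('Sample, d13C_VPDB\nETH-1, 2.02\nETH-2, -10.17')

print(read_csv('mysamples.csv'))
# [{'Sample': 'ETH-1', 'd13C_VPDB': 2.02}, {'Sample': 'ETH-2', 'd13C_VPDB': -10.17}]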

def simulate_single_analysis(
    sample = 'MYSAMPLE',
    d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
    d13C_VPDB = None, d18O_VPDB = None,
    D47 = None, D48 = None, D49 = 0., D17O = 0.,
    a47 = 1., b47 = 0., c47 = -0.9,
    a48 = 1., b48 = 0., c48 = -0.45,
    Nominal_D47 = None,
    Nominal_D48 = None,
    Nominal_d13C_VPDB = None,
    Nominal_d18O_VPDB = None,
    ALPHA_18O_ACID_REACTION = None,
    R13_VPDB = None,
    R17_VSMOW = None,
    R18_VSMOW = None,
    LAMBDA_17 = None,
    R18_VPDB = None,
    ):
    '''
    Compute working-gas delta values for a single analysis, assuming a stochastic working
    gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

    **Parameters**

    + `sample`: sample name
    + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
        (respectively –4 and +26 ‰ by default)
    + `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
    + `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
        of the carbonate sample
    + `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and
        Δ48 values if `D47` or `D48` are not specified
    + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
        δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
    + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
    + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
        correction parameters (by default equal to the `D4xdata` default values)

    Returns a dictionary with fields
    `['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
    '''

    if Nominal_d13C_VPDB is None:
        Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB

    if Nominal_d18O_VPDB is None:
        Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB

    if ALPHA_18O_ACID_REACTION is None:
        ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION

    if R13_VPDB is None:
        R13_VPDB = D4xdata().R13_VPDB

    if R17_VSMOW is None:
        R17_VSMOW = D4xdata().R17_VSMOW

    if R18_VSMOW is None:
        R18_VSMOW = D4xdata().R18_VSMOW

    if LAMBDA_17 is None:
        LAMBDA_17 = D4xdata().LAMBDA_17

    if R18_VPDB is None:
        R18_VPDB = D4xdata().R18_VPDB

    R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17

    if Nominal_D47 is None:
        Nominal_D47 = D47data().Nominal_D47

    if Nominal_D48 is None:
        Nominal_D48 = D48data().Nominal_D48

    if d13C_VPDB is None:
        if sample in Nominal_d13C_VPDB:
            d13C_VPDB = Nominal_d13C_VPDB[sample]
        else:
            raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")

    if d18O_VPDB is None:
        if sample in Nominal_d18O_VPDB:
            d18O_VPDB = Nominal_d18O_VPDB[sample]
        else:
            raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")

    if D47 is None:
        if sample in Nominal_D47:
            D47 = Nominal_D47[sample]
        else:
            raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")

    if D48 is None:
        if sample in Nominal_D48:
            D48 = Nominal_D48[sample]
        else:
            raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")

    X = D4xdata()
    X.R13_VPDB = R13_VPDB
    X.R17_VSMOW = R17_VSMOW
    X.R18_VSMOW = R18_VSMOW
    X.LAMBDA_17 = LAMBDA_17
    X.R18_VPDB = R18_VPDB
    X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17

    R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
        R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000),
        R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
        )
    R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
        R13 = R13_VPDB * (1 + d13C_VPDB/1000),
        R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
        D17O=D17O, D47=D47, D48=D48, D49=D49,
        )
    R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
        R13 = R13_VPDB * (1 + d13C_VPDB/1000),
        R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
        D17O=D17O,
        )

    d45 = 1000 * (R45/R45wg - 1)
    d46 = 1000 * (R46/R46wg - 1)
    d47 = 1000 * (R47/R47wg - 1)
    d48 = 1000 * (R48/R48wg - 1)
    d49 = 1000 * (R49/R49wg - 1)

    for k in range(3): # dumb iteration to adjust for small changes in d47
        R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
        R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch
        d47 = 1000 * (R47raw/R47wg - 1)
        d48 = 1000 * (R48raw/R48wg - 1)

    return dict(
        Sample = sample,
        D17O = D17O,
        d13Cwg_VPDB = d13Cwg_VPDB,
        d18Owg_VSMOW = d18Owg_VSMOW,
        d45 = d45,
        d46 = d46,
        d47 = d47,
        d48 = d48,
        d49 = d49,
        )

Compute working-gas delta values for a single analysis, assuming a stochastic working gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

Parameters

  • sample: sample name
  • d13Cwg_VPDB, d18Owg_VSMOW: bulk composition of the working gas (respectively –4 and +26 ‰ by default)
  • d13C_VPDB, d18O_VPDB: bulk composition of the carbonate sample
  • D47, D48, D49, D17O: clumped-isotope and oxygen-17 anomalies of the carbonate sample
  • Nominal_D47, Nominal_D48: where to look up Δ47 and Δ48 values if D47 or D48 are not specified
  • Nominal_d13C_VPDB, Nominal_d18O_VPDB: where to look up δ13C and δ18O values if d13C_VPDB or d18O_VPDB are not specified
  • ALPHA_18O_ACID_REACTION: 18O/16O acid fractionation factor
  • R13_VPDB, R17_VSMOW, R18_VSMOW, LAMBDA_17, R18_VPDB: oxygen-17 correction parameters (by default equal to the D4xdata default values)

Returns a dictionary with fields ['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49'].
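
A quick sketch (assuming, as the virtual_data() example below implies, that 'ETH-1' is defined in the default Nominal_d13C_VPDB, Nominal_d18O_VPDB, Nominal_D47 and Nominal_D48 dictionaries):

from D47crunch import simulate_single_analysis

a = simulate_single_analysis(sample = 'ETH-1')
print(a['Sample'], round(a['d45'], 3), round(a['d47'], 3))
# working-gas deltas of one noise-free 'ETH-1' analysis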

def virtual_data(
    samples = [],
    a47 = 1., b47 = 0., c47 = -0.9,
    a48 = 1., b48 = 0., c48 = -0.45,
    rd45 = 0.020, rd46 = 0.060,
    rD47 = 0.015, rD48 = 0.045,
    d13Cwg_VPDB = None, d18Owg_VSMOW = None,
    session = None,
    Nominal_D47 = None, Nominal_D48 = None,
    Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
    ALPHA_18O_ACID_REACTION = None,
    R13_VPDB = None,
    R17_VSMOW = None,
    R18_VSMOW = None,
    LAMBDA_17 = None,
    R18_VPDB = None,
    seed = 0,
    shuffle = True,
    ):
    '''
    Return a list of simulated analyses from a single session.

    **Parameters**

    + `samples`: a list of entries; each entry is a dictionary with the following fields:
        * `Sample`: the name of the sample
        * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
        * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
        * `N`: how many analyses to generate for this sample
    + `a47`: scrambling factor for Δ47
    + `b47`: compositional nonlinearity for Δ47
    + `c47`: working gas offset for Δ47
    + `a48`: scrambling factor for Δ48
    + `b48`: compositional nonlinearity for Δ48
    + `c48`: working gas offset for Δ48
    + `rd45`: analytical repeatability of δ45
    + `rd46`: analytical repeatability of δ46
    + `rD47`: analytical repeatability of Δ47
    + `rD48`: analytical repeatability of Δ48
    + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
        (by default equal to the `simulate_single_analysis` default values)
    + `session`: name of the session (no name by default)
    + `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and Δ48 values
        if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
    + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
        δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
        (by default equal to the `simulate_single_analysis` defaults)
    + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
        (by default equal to the `simulate_single_analysis` defaults)
    + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
        correction parameters (by default equal to the `simulate_single_analysis` defaults)
    + `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
    + `shuffle`: randomly reorder the sequence of analyses

    Here is an example of using this function to generate an arbitrary combination of
    anchors and unknowns for a bunch of sessions:

    ```py
    .. include:: ../../code_examples/virtual_data/example.py
    ```

    This should output something like:

    ```
    .. include:: ../../code_examples/virtual_data/output.txt
    ```
    '''

    kwargs = locals().copy()

    from numpy import random as nprandom
    if seed:
        nprandom.seed(seed)
        rng = nprandom.default_rng(seed)
    else:
        rng = nprandom.default_rng()

    N = sum([s['N'] for s in samples])
    errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
    errors45 *= rd45 / stdev(errors45) # scale errors to rd45
    errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
    errors46 *= rd46 / stdev(errors46) # scale errors to rd46
    errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
    errors47 *= rD47 / stdev(errors47) # scale errors to rD47
    errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
    errors48 *= rD48 / stdev(errors48) # scale errors to rD48

    k = 0
    out = []
    for s in samples:
        kw = {}
        kw['sample'] = s['Sample']
        kw = {
            **kw,
            **{var: kwargs[var]
                for var in [
                    'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
                    'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
                    'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
                    'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
                    ]
                if kwargs[var] is not None},
            **{var: s[var]
                for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
                if var in s},
            }

        sN = s['N']
        while sN:
            out.append(simulate_single_analysis(**kw))
            out[-1]['d45'] += errors45[k]
            out[-1]['d46'] += errors46[k]
            out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
            out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
            sN -= 1
            k += 1

        if session is not None:
            for r in out:
                r['Session'] = session

        if shuffle:
            nprandom.shuffle(out)

    return out

Return a list of simulated analyses from a single session.

Parameters

  • samples: a list of entries; each entry is a dictionary with the following fields:
    • Sample: the name of the sample
    • d13C_VPDB, d18O_VPDB: bulk composition of the carbonate sample
    • D47, D48, D49, D17O (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
    • N: how many analyses to generate for this sample
  • a47: scrambling factor for Δ47
  • b47: compositional nonlinearity for Δ47
  • c47: working gas offset for Δ47
  • a48: scrambling factor for Δ48
  • b48: compositional nonlinearity for Δ48
  • c48: working gas offset for Δ48
  • rd45: analytical repeatability of δ45
  • rd46: analytical repeatability of δ46
  • rD47: analytical repeatability of Δ47
  • rD48: analytical repeatability of Δ48
  • d13Cwg_VPDB, d18Owg_VSMOW: bulk composition of the working gas (by default equal to the simulate_single_analysis default values)
  • session: name of the session (no name by default)
  • Nominal_D47, Nominal_D48: where to look up Δ47 and Δ48 values if D47 or D48 are not specified (by default equal to the simulate_single_analysis defaults)
  • Nominal_d13C_VPDB, Nominal_d18O_VPDB: where to look up δ13C and δ18O values if d13C_VPDB or d18O_VPDB are not specified (by default equal to the simulate_single_analysis defaults)
  • ALPHA_18O_ACID_REACTION: 18O/16O acid fractionation factor (by default equal to the simulate_single_analysis defaults)
  • R13_VPDB, R17_VSMOW, R18_VSMOW, LAMBDA_17, R18_VPDB: oxygen-17 correction parameters (by default equal to the simulate_single_analysis default)
  • seed: explicitly set to a non-zero value to achieve random but repeatable simulations
  • shuffle: randomly reorder the sequence of analyses

Here is an example of using this function to generate an arbitrary combination of anchors and unknowns for a bunch of sessions:

from D47crunch import virtual_data, D47data

args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 3,
            d13C_VPDB = -15., d18O_VPDB = -2.,
            D47 = 0.6, D48 = 0.2),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)

D = D47data(session1 + session2 + session3 + session4)

D.crunch()
D.standardize()

D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)

This should output something like:

[table_of_sessions] 
——————————  ——  ——  ———————————  ————————————  ——————  ——————  ——————  —————————————  —————————————  ——————————————
Session     Na  Nu  d13Cwg_VPDB  d18Owg_VSMOW  r_d13C  r_d18O   r_D47         a ± SE   1e3 x b ± SE          c ± SE
——————————  ——  ——  ———————————  ————————————  ——————  ——————  ——————  —————————————  —————————————  ——————————————
Session_01   9   6       -4.000        26.000  0.0205  0.0633  0.0075  1.015 ± 0.015  0.427 ± 0.232  -0.909 ± 0.006
Session_02   9   6       -4.000        26.000  0.0210  0.0882  0.0082  0.990 ± 0.015  0.484 ± 0.232  -0.905 ± 0.006
Session_03   9   6       -4.000        26.000  0.0186  0.0505  0.0091  0.997 ± 0.015  0.167 ± 0.233  -0.901 ± 0.006
Session_04   9   6       -4.000        26.000  0.0192  0.0467  0.0070  1.017 ± 0.015  0.229 ± 0.232  -0.910 ± 0.006
——————————  ——  ——  ———————————  ————————————  ——————  ——————  ——————  —————————————  —————————————  ——————————————

[table_of_samples] 
——————  ——  —————————  ——————————  ——————  ——————  ————————  ——————  ————————
Sample   N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene
——————  ——  —————————  ——————————  ——————  ——————  ————————  ——————  ————————
ETH-1   12       2.02       37.01  0.2052                    0.0083          
ETH-2   12     -10.17       19.88  0.2085                    0.0090          
ETH-3   12       1.71       37.46  0.6132                    0.0083          
BAR     12     -15.02       37.22  0.6057  0.0042  ± 0.0085  0.0088     0.753
FOO     12      -5.00       28.89  0.3024  0.0031  ± 0.0062  0.0070     0.497
——————  ——  —————————  ——————————  ——————  ——————  ————————  ——————  ————————

[table_of_analyses] 
———  ——————————  ——————  ———————————  ————————————  —————————  —————————  ——————————  ——————————  ——————————  ——————————  ——————————  —————————  —————————  —————————  ————————
UID     Session  Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48         d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw     D49raw       D47
———  ——————————  ——————  ———————————  ————————————  —————————  —————————  ——————————  ——————————  ——————————  ——————————  ——————————  —————————  —————————  —————————  ————————
1    Session_01   ETH-1       -4.000        26.000   5.995601  10.755323   16.116087   21.285428   27.780042    1.998631   36.986704  -0.696924  -0.333640   0.008600  0.201787
2    Session_01     FOO       -4.000        26.000  -0.838118   2.819853    1.310384    5.326005    4.665655   -5.004629   28.895933  -0.593755  -0.319861   0.014956  0.309692
3    Session_01   ETH-3       -4.000        26.000   5.727341  11.211663   16.713472   22.364770   28.306614    1.695479   37.453503  -0.278056  -0.180158  -0.082015  0.614365
4    Session_01     BAR       -4.000        26.000  -9.959983  10.926995    0.053806   21.724901   10.707292  -15.041279   37.199026  -0.300066  -0.243252  -0.029371  0.599675
5    Session_01   ETH-1       -4.000        26.000   6.010276  10.840276   16.207960   21.475150   27.780042    2.011176   37.073454  -0.704188  -0.315986  -0.172089  0.194589
6    Session_01   ETH-1       -4.000        26.000   6.049381  10.706856   16.135579   21.196941   27.780042    2.057827   36.937067  -0.685751  -0.324384   0.045870  0.212791
7    Session_01   ETH-2       -4.000        26.000  -5.974124  -5.955517  -12.668784  -12.208184  -18.023381  -10.163274   19.943159  -0.694902  -0.336672  -0.063946  0.215880
8    Session_01   ETH-3       -4.000        26.000   5.755174  11.255104   16.792797   22.451660   28.306614    1.723596   37.497816  -0.270825  -0.181089  -0.195908  0.621458
9    Session_01     FOO       -4.000        26.000  -0.848028   2.874679    1.346196    5.439150    4.665655   -5.017230   28.951964  -0.601502  -0.316664  -0.081898  0.302042
10   Session_01     BAR       -4.000        26.000  -9.915975  10.968470    0.153453   21.749385   10.707292  -14.995822   37.241294  -0.286638  -0.301325  -0.157376  0.612868
11   Session_01     BAR       -4.000        26.000  -9.920507  10.903408    0.065076   21.704075   10.707292  -14.998270   37.174839  -0.307018  -0.216978  -0.026076  0.592818
12   Session_01     FOO       -4.000        26.000  -0.876454   2.906764    1.341194    5.490264    4.665655   -5.048760   28.984806  -0.608593  -0.329808  -0.114437  0.295055
13   Session_01   ETH-2       -4.000        26.000  -5.982229  -6.110437  -12.827036  -12.492272  -18.023381  -10.166188   19.784916  -0.693555  -0.312598   0.251040  0.217274
14   Session_01   ETH-2       -4.000        26.000  -5.991278  -5.995054  -12.741562  -12.184075  -18.023381  -10.180122   19.902809  -0.711697  -0.232746   0.032602  0.199357
15   Session_01   ETH-3       -4.000        26.000   5.734896  11.229855   16.740410   22.402091   28.306614    1.702875   37.472070  -0.276998  -0.179635  -0.125368  0.615396
16   Session_02   ETH-3       -4.000        26.000   5.716356  11.091821   16.582487   22.123857   28.306614    1.692901   37.370126  -0.279100  -0.178789   0.162540  0.624067
17   Session_02   ETH-2       -4.000        26.000  -5.950370  -5.959974  -12.650784  -12.197864  -18.023381  -10.143809   19.897777  -0.696916  -0.317263  -0.080604  0.216441
18   Session_02     BAR       -4.000        26.000  -9.957566  10.903888    0.031785   21.739434   10.707292  -15.048386   37.213724  -0.302139  -0.183327   0.012926  0.608897
19   Session_02   ETH-1       -4.000        26.000   6.030532  10.851030   16.245571   21.457100   27.780042    2.037466   37.122284  -0.698413  -0.354920  -0.214443  0.200795
20   Session_02     FOO       -4.000        26.000  -0.819742   2.826793    1.317044    5.330616    4.665655   -4.986618   28.903335  -0.612871  -0.329113  -0.018244  0.294481
21   Session_02     BAR       -4.000        26.000  -9.936020  10.862339    0.024660   21.563307   10.707292  -15.023836   37.171034  -0.291333  -0.273498   0.070452  0.619812
22   Session_02   ETH-3       -4.000        26.000   5.719281  11.207303   16.681693   22.370886   28.306614    1.691780   37.488633  -0.296801  -0.165556  -0.065004  0.606143
23   Session_02   ETH-1       -4.000        26.000   5.993918  10.617469   15.991900   21.070358   27.780042    2.006934   36.882679  -0.683329  -0.271476   0.278458  0.216152
24   Session_02   ETH-2       -4.000        26.000  -5.982371  -6.036210  -12.762399  -12.309944  -18.023381  -10.175178   19.819614  -0.701348  -0.277354   0.104418  0.212021
25   Session_02   ETH-1       -4.000        26.000   6.019963  10.773112   16.163825   21.331060   27.780042    2.029040   37.042346  -0.692234  -0.324161  -0.051788  0.207075
26   Session_02     BAR       -4.000        26.000  -9.963888  10.865863   -0.023549   21.615868   10.707292  -15.053743   37.174715  -0.313906  -0.229031   0.093637  0.597041
27   Session_02     FOO       -4.000        26.000  -0.835046   2.870518    1.355370    5.487896    4.665655   -5.004585   28.948243  -0.601666  -0.259900  -0.087592  0.305777
28   Session_02     FOO       -4.000        26.000  -0.848415   2.849823    1.308081    5.427767    4.665655   -5.018107   28.927036  -0.614791  -0.278426  -0.032784  0.292547
29   Session_02   ETH-3       -4.000        26.000   5.757137  11.232751   16.744567   22.398244   28.306614    1.731295   37.514660  -0.298533  -0.189123  -0.154557  0.604363
30   Session_02   ETH-2       -4.000        26.000  -5.993476  -5.944866  -12.696865  -12.149754  -18.023381  -10.190430   19.913381  -0.713779  -0.298963  -0.064251  0.199436
31   Session_03   ETH-3       -4.000        26.000   5.718991  11.146227   16.640814   22.243185   28.306614    1.689442   37.449023  -0.277332  -0.169668   0.053997  0.623187
32   Session_03   ETH-2       -4.000        26.000  -5.997147  -5.905858  -12.655382  -12.081612  -18.023381  -10.165400   19.891551  -0.706536  -0.308464  -0.137414  0.197550
33   Session_03   ETH-1       -4.000        26.000   6.040566  10.786620   16.205283   21.374963   27.780042    2.045244   37.077432  -0.685706  -0.307909  -0.099869  0.213609
34   Session_03   ETH-1       -4.000        26.000   5.994622  10.743980   16.116098   21.243734   27.780042    1.997857   37.033567  -0.684883  -0.352014   0.031692  0.214449
35   Session_03   ETH-3       -4.000        26.000   5.748546  11.079879   16.580826   22.120063   28.306614    1.723364   37.380534  -0.302133  -0.158882   0.151641  0.598318
36   Session_03   ETH-2       -4.000        26.000  -6.000290  -5.947172  -12.697463  -12.164602  -18.023381  -10.167221   19.848953  -0.705037  -0.309350  -0.052386  0.199061
37   Session_03     FOO       -4.000        26.000  -0.800284   2.851299    1.376828    5.379547    4.665655   -4.951581   28.910199  -0.597293  -0.329315  -0.087015  0.304784
38   Session_03     FOO       -4.000        26.000  -0.873798   2.820799    1.272165    5.370745    4.665655   -5.028782   28.878917  -0.596008  -0.277258   0.051165  0.306090
39   Session_03   ETH-2       -4.000        26.000  -6.008525  -5.909707  -12.647727  -12.075913  -18.023381  -10.177379   19.887608  -0.683183  -0.294956  -0.117608  0.220975
40   Session_03     BAR       -4.000        26.000  -9.928709  10.989665    0.148059   21.852677   10.707292  -14.976237   37.324152  -0.299358  -0.242185  -0.184835  0.603855
41   Session_03   ETH-1       -4.000        26.000   6.004078  10.683951   16.045192   21.214355   27.780042    2.010134   36.971642  -0.705956  -0.262026   0.138399  0.193323
42   Session_03     BAR       -4.000        26.000  -9.957114  10.898997    0.044946   21.602296   10.707292  -15.003175   37.230716  -0.284699  -0.307849   0.021944  0.618578
43   Session_03     BAR       -4.000        26.000  -9.952115  11.034508    0.169809   21.885915   10.707292  -15.002819   37.370451  -0.296804  -0.298351  -0.246731  0.606414
44   Session_03     FOO       -4.000        26.000  -0.823857   2.761300    1.258060    5.239992    4.665655   -4.973383   28.817444  -0.603327  -0.288652   0.114488  0.298751
45   Session_03   ETH-3       -4.000        26.000   5.753467  11.206589   16.719131   22.373244   28.306614    1.723960   37.511190  -0.294350  -0.161838  -0.099835  0.606103
46   Session_04     FOO       -4.000        26.000  -0.791191   2.708220    1.256167    5.145784    4.665655   -4.960004   28.750896  -0.586913  -0.276505   0.183674  0.317065
47   Session_04   ETH-1       -4.000        26.000   6.017312  10.735930   16.123043   21.270597   27.780042    2.005824   36.995214  -0.693479  -0.309795   0.023309  0.208980
48   Session_04   ETH-2       -4.000        26.000  -5.986501  -5.915157  -12.656583  -12.060382  -18.023381  -10.182247   19.889836  -0.709603  -0.268277  -0.130450  0.199604
49   Session_04     BAR       -4.000        26.000  -9.951025  10.951923    0.089386   21.738926   10.707292  -15.031949   37.254709  -0.298065  -0.278834  -0.087463  0.601230
50   Session_04   ETH-2       -4.000        26.000  -5.966627  -5.893789  -12.597717  -12.120719  -18.023381  -10.161842   19.911776  -0.691757  -0.372308  -0.193986  0.217132
51   Session_04   ETH-1       -4.000        26.000   6.029937  10.766997   16.151273   21.345479   27.780042    2.018148   37.027152  -0.708855  -0.297953  -0.050465  0.193862
52   Session_04     FOO       -4.000        26.000  -0.853969   2.805035    1.267571    5.353907    4.665655   -5.030523   28.850660  -0.605611  -0.262571   0.060903  0.298685
53   Session_04   ETH-3       -4.000        26.000   5.798016  11.254135   16.832228   22.432473   28.306614    1.752928   37.528936  -0.275047  -0.197935  -0.239408  0.620088
54   Session_04   ETH-1       -4.000        26.000   6.023822  10.730714   16.121184   21.235757   27.780042    2.012958   36.989833  -0.696908  -0.333582   0.026555  0.205610
55   Session_04   ETH-2       -4.000        26.000  -5.973623  -5.975018  -12.694278  -12.194472  -18.023381  -10.166297   19.828211  -0.701951  -0.283570  -0.025935  0.207135
56   Session_04   ETH-3       -4.000        26.000   5.739420  11.128582   16.641344   22.166106   28.306614    1.695046   37.399884  -0.280608  -0.210162   0.066645  0.614665
57   Session_04     BAR       -4.000        26.000  -9.931741  10.819830   -0.023748   21.529372   10.707292  -15.006533   37.118743  -0.302866  -0.222623   0.148462  0.596536
58   Session_04     FOO       -4.000        26.000  -0.848192   2.777763    1.251297    5.280272    4.665655   -5.023358   28.822585  -0.601094  -0.281419   0.108186  0.303128
59   Session_04   ETH-3       -4.000        26.000   5.751908  11.207110   16.726741   22.380392   28.306614    1.705481   37.480657  -0.285776  -0.155878  -0.099197  0.609567
60   Session_04     BAR       -4.000        26.000  -9.926078  10.884823    0.060864   21.650722   10.707292  -15.002880   37.185606  -0.287358  -0.232425   0.016044  0.611760
———  ——————————  ——————  ———————————  ————————————  —————————  —————————  ——————————  ——————————  ——————————  ——————————  ——————————  —————————  —————————  —————————  ————————


def table_of_samples(
    data47 = None,
    data48 = None,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a combined table of samples
    for a pair of `D47data` and `D48data` objects.

    **Parameters**

    + `data47`: `D47data` instance
    + `data48`: `D48data` instance
    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
        if set to `'raw'`: return a list of lists of strings
        (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    if data47 is None:
        if data48 is None:
            raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
        else:
            return data48.table_of_samples(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
    else:
        if data48 is None:
            return data47.table_of_samples(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
        else:
            samples = (
                sorted([a for a in data47.anchors if a in data48.anchors])
                + sorted([a for a in data47.anchors if a not in data48.anchors])
                + sorted([a for a in data48.anchors if a not in data47.anchors])
                + sorted([a for a in data47.unknowns if a in data48.unknowns])
            )

            out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
            out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')

            out47 = {l[0]: l for l in out47}
            out48 = {l[0]: l for l in out48}

            out = [out47['Sample'] + out48['Sample'][4:]]
            for s in samples:
                out.append(out47[s] + out48[s][4:])

            if save_to_file:
                if not os.path.exists(dir):
                    os.makedirs(dir)
                if filename is None:
                    filename = 'D47D48_samples.csv'
                with open(f'{dir}/{filename}', 'w') as fid:
                    fid.write(make_csv(out))
            if print_out:
                print('\n' + pretty_table(out))
            if output == 'raw':
                return out
            elif output == 'pretty':
                return pretty_table(out)

Print out, save to disk and/or return a combined table of samples for a pair of D47data and D48data objects.

Parameters

  • data47: D47data instance
  • data48: D48data instance
  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of lists of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
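
A minimal sketch, adapted from the virtual_data() example above and assuming the ETH anchors carry default Δ48 values as that example suggests (the same calling convention applies to table_of_sessions() and table_of_analyses() below):

from D47crunch import virtual_data, D47data, D48data, table_of_samples

data = virtual_data(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        ],
    session = 'Session_01', seed = 123)

data47 = D47data(data)     # standardize the same analyses once for Δ47...
data47.crunch()
data47.standardize()

data48 = D48data(data)     # ...and once for Δ48
data48.crunch()
data48.standardize()

# combined sample table, since both objects share the same analyses:
table_of_samples(data47, data48, save_to_file = False)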

def table_of_sessions(
    data47 = None,
    data48 = None,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a combined table of sessions
    for a pair of `D47data` and `D48data` objects.
    ***Only applicable if the sessions in `data47` and those in `data48`
    consist of the exact same sets of analyses.***

    **Parameters**

    + `data47`: `D47data` instance
    + `data48`: `D48data` instance
    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
        if set to `'raw'`: return a list of lists of strings
        (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    if data47 is None:
        if data48 is None:
            raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
        else:
            return data48.table_of_sessions(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
    else:
        if data48 is None:
            return data47.table_of_sessions(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
        else:
            out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
            out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
            for k, x in enumerate(out47[0]):
                if k > 7:
                    out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
                    out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
            out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])

            if save_to_file:
                if not os.path.exists(dir):
                    os.makedirs(dir)
                if filename is None:
                    filename = 'D47D48_sessions.csv'
                with open(f'{dir}/{filename}', 'w') as fid:
                    fid.write(make_csv(out))
            if print_out:
                print('\n' + pretty_table(out))
            if output == 'raw':
                return out
            elif output == 'pretty':
                return pretty_table(out)

Print out, save to disk and/or return a combined table of sessions for a pair of D47data and D48data objects. Only applicable if the sessions in data47 and those in data48 consist of the exact same sets of analyses.

Parameters

  • data47: D47data instance
  • data48: D48data instance
  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of lists of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
def table_of_analyses( data47=None, data48=None, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
726def table_of_analyses(
727	data47 = None,
728	data48 = None,
729	dir = 'output',
730	filename = None,
731	save_to_file = True,
732	print_out = True,
733	output = None,
734	):
735	'''
736	Print out, save to disk and/or return a combined table of analyses
737	for a pair of `D47data` and `D48data` objects.
738
739	If the sessions in `data47` and those in `data48` do not consist of
740	the exact same sets of analyses, the table will have two columns
741	`Session_47` and `Session_48` instead of a single `Session` column.
742
743	**Parameters**
744
745	+ `data47`: `D47data` instance
746	+ `data48`: `D48data` instance
747	+ `dir`: the directory in which to save the table
748	+ `filename`: the name of the csv file to write to
749	+ `save_to_file`: whether to save the table to disk
750	+ `print_out`: whether to print out the table
751	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
752		if set to `'raw'`: return a list of lists of strings
753		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
754	'''
755	if data47 is None:
756		if data48 is None:
757			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
758		else:
759			return data48.table_of_analyses(
760				dir = dir,
761				filename = filename,
762				save_to_file = save_to_file,
763				print_out = print_out,
764				output = output
765				)
766	else:
767		if data48 is None:
768			return data47.table_of_analyses(
769				dir = dir,
770				filename = filename,
771				save_to_file = save_to_file,
772				print_out = print_out,
773				output = output
774				)
775		else:
776			out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
777			out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
778			
779			if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
780				out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
781			else:
782				out47[0][1] = 'Session_47'
783				out48[0][1] = 'Session_48'
784				out47 = transpose_table(out47)
785				out48 = transpose_table(out48)
786				out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])
787
788			if save_to_file:
789				if not os.path.exists(dir):
790					os.makedirs(dir)
791				if filename is None:
792					filename = f'D47D48_analyses.csv'
793				with open(f'{dir}/{filename}', 'w') as fid:
794					fid.write(make_csv(out))
795			if print_out:
796				print('\n'+pretty_table(out))
797			if output == 'raw':
798				return out
799			elif output == 'pretty':
800				return pretty_table(out)

Print out, save to disk and/or return a combined table of analyses for a pair of D47data and D48data objects.

If the sessions in data47 and those in data48 do not consist of the exact same sets of analyses, the table will have two columns Session_47 and Session_48 instead of a single Session column.

Parameters

  • data47: D47data instance
  • data48: D48data instance
  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of lists of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
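
A corresponding sketch for the combined analysis table, reusing the data47 and data48 objects from the example above:

D47crunch.table_of_analyses(data47, data48, save_to_file = False)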
class D4xdata(builtins.list):
 848class D4xdata(list):
 849	'''
 850	Store and process data for a large set of Δ47 and/or Δ48
 851	analyses, usually comprising more than one analytical session.
 852	'''
 853
 854	### 17O CORRECTION PARAMETERS
 855	R13_VPDB = 0.01118  # (Chang & Li, 1990)
 856	'''
 857	Absolute (13C/12C) ratio of VPDB.
 858	By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
 859	'''
 860
 861	R18_VSMOW = 0.0020052  # (Baertschi, 1976)
 862	'''
 863	Absolute (18O/16O) ratio of VSMOW.
 864	By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
 865	'''
 866
 867	LAMBDA_17 = 0.528  # (Barkan & Luz, 2005)
 868	'''
 869	Mass-dependent exponent for triple oxygen isotopes.
 870	By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
 871	'''
 872
 873	R17_VSMOW = 0.00038475  # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
 874	'''
 875	Absolute (17O/16O) ratio of VSMOW.
 876	By default equal to 0.00038475
 877	([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
 878	rescaled to `R13_VPDB`)
 879	'''
 880
 881	R18_VPDB = R18_VSMOW * 1.03092
 882	'''
 883	Absolute (18O/16O) ratio of VPDB.
 884	By definition equal to `R18_VSMOW * 1.03092`.
 885	'''
 886
 887	R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
 888	'''
 889	Absolute (17O/16O) ratio of VPDB.
 890	By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
 891	'''
 892
 893	LEVENE_REF_SAMPLE = 'ETH-3'
 894	'''
 895	After the Δ4x standardization step, each sample is tested to
 896	assess whether the Δ4x variance within all analyses for that
 897	sample differs significantly from that observed for a given reference
 898	sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
 899	which yields a p-value corresponding to the null hypothesis that the
 900	underlying variances are equal).
 901
 902	`LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
 903	sample should be used as a reference for this test.
 904	'''
 905
 906	ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6)  # (Kim et al., 2007, calcite)
 907	'''
 908	Specifies the 18O/16O fractionation factor generally applicable
 909	to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
 910	`D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`.
 911
 912	By default equal to 1.008129 (calcite reacted at 90 °C,
 913	[Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
 914	'''
 915
 916	Nominal_d13C_VPDB = {
 917		'ETH-1': 2.02,
 918		'ETH-2': -10.17,
 919		'ETH-3': 1.71,
 920		}	# (Bernasconi et al., 2018)
 921	'''
 922	Nominal δ13C_VPDB values assigned to carbonate standards, used by
 923	`D4xdata.standardize_d13C()`.
 924
 925	By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
 926	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 927	'''
 928
 929	Nominal_d18O_VPDB = {
 930		'ETH-1': -2.19,
 931		'ETH-2': -18.69,
 932		'ETH-3': -1.78,
 933		}	# (Bernasconi et al., 2018)
 934	'''
 935	Nominal δ18O_VPDB values assigned to carbonate standards, used by
 936	`D4xdata.standardize_d18O()`.
 937
 938	By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
 939	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 940	'''
 941
 942	d13C_STANDARDIZATION_METHOD = '2pt'
 943	'''
 944	Method by which to standardize δ13C values:
 945	
 946	+ `'none'`: do not apply any δ13C standardization.
 947	+ `'1pt'`: within each session, offset all initial δ13C values so as to
 948	minimize the difference between final δ13C_VPDB values and
 949	`Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
 950	+ `'2pt'`: within each session, apply an affine transformation to all δ13C
 951	values so as to minimize the difference between final δ13C_VPDB
 952	values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
 953	is defined).
 954	'''
 955
 956	d18O_STANDARDIZATION_METHOD = '2pt'
 957	'''
 958	Method by which to standardize δ18O values:
 959	
 960	+ `'none'`: do not apply any δ18O standardization.
 961	+ `'1pt'`: within each session, offset all initial δ18O values so as to
 962	minimize the difference between final δ18O_VPDB values and
 963	`Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
 964	+ `'2pt'`: within each session, apply an affine transformation to all δ18O
 965	values so as to minimize the difference between final δ18O_VPDB
 966	values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
 967	is defined).
 968	'''
 969
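	# Note: the class-level parameters above act as defaults for every instance and
	# may be overridden globally (on the class) or locally (on a single object)
	# before processing. Illustrative sketch, with placeholder values:
	#
	#     D47crunch.D4xdata.LEVENE_REF_SAMPLE = 'ETH-1'    # class-wide override
	#     mydata = D47crunch.D47data()
	#     mydata.d13C_STANDARDIZATION_METHOD = '1pt'       # per-instance override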
 970	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
 971		'''
 972		**Parameters**
 973
 974		+ `l`: a list of dictionaries, with each dictionary including at least the keys
 975		`Sample`, `d45`, `d46`, and `d47` or `d48`.
 976		+ `mass`: `'47'` or `'48'`
 977		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
 978		+ `session`: define session name for analyses without a `Session` key
 979		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
 980
 981		Returns a `D4xdata` object derived from `list`.
 982		'''
 983		self._4x = mass
 984		self.verbose = verbose
 985		self.prefix = 'D4xdata'
 986		self.logfile = logfile
 987		list.__init__(self, l)
 988		self.Nf = None
 989		self.repeatability = {}
 990		self.refresh(session = session)
 991
 992
 993	def make_verbal(oldfun):
 994		'''
 995		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
 996		'''
 997		@wraps(oldfun)
 998		def newfun(*args, verbose = '', **kwargs):
 999			myself = args[0]
1000			oldprefix = myself.prefix
1001			myself.prefix = oldfun.__name__
1002			if verbose != '':
1003				oldverbose = myself.verbose
1004				myself.verbose = verbose
1005			out = oldfun(*args, **kwargs)
1006			myself.prefix = oldprefix
1007			if verbose != '':
1008				myself.verbose = oldverbose
1009			return out
1010		return newfun
1011
1012
1013	def msg(self, txt):
1014		'''
1015		Log a message to `self.logfile`, and print it out if `verbose = True`
1016		'''
1017		self.log(txt)
1018		if self.verbose:
1019			print(f'{f"[{self.prefix}]":<16} {txt}')
1020
1021
1022	def vmsg(self, txt):
1023		'''
1024		Log a message to `self.logfile` and print it out
1025		'''
1026		self.log(txt)
1027		print(txt)
1028
1029
1030	def log(self, *txts):
1031		'''
1032		Log a message to `self.logfile`
1033		'''
1034		if self.logfile:
1035			with open(self.logfile, 'a') as fid:
1036				for txt in txts:
1037					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
1038
1039
1040	def refresh(self, session = 'mySession'):
1041		'''
1042		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
1043		'''
1044		self.fill_in_missing_info(session = session)
1045		self.refresh_sessions()
1046		self.refresh_samples()
1047
1048
1049	def refresh_sessions(self):
1050		'''
1051		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1052		to `False` for all sessions.
1053		'''
1054		self.sessions = {
1055			s: {'data': [r for r in self if r['Session'] == s]}
1056			for s in sorted({r['Session'] for r in self})
1057			}
1058		for s in self.sessions:
1059			self.sessions[s]['scrambling_drift'] = False
1060			self.sessions[s]['slope_drift'] = False
1061			self.sessions[s]['wg_drift'] = False
1062			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1063			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
1064
1065
1066	def refresh_samples(self):
1067		'''
1068		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1069		'''
1070		self.samples = {
1071			s: {'data': [r for r in self if r['Sample'] == s]}
1072			for s in sorted({r['Sample'] for r in self})
1073			}
1074		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1075		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
1076
1077
1078	def read(self, filename, sep = '', session = ''):
1079		'''
1080		Read file in csv format to load data into a `D47data` object.
1081
1082		In the csv file, spaces before and after field separators (`','` by default)
1083		are optional. Each line corresponds to a single analysis.
1084
1085		The required fields are:
1086
1087		+ `UID`: a unique identifier
1088		+ `Session`: an identifier for the analytical session
1089		+ `Sample`: a sample identifier
1090		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1091
1092		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1093		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
1094		and `d49` are optional, and set to NaN by default.
1095
1096		**Parameters**
1097
1098		+ `filename`: the path of the file to read
1099		+ `sep`: csv separator delimiting the fields
1100		+ `session`: set `Session` field to this string for all analyses
1101		'''
1102		with open(filename) as fid:
1103			self.input(fid.read(), sep = sep, session = session)
1104
1105
1106	def input(self, txt, sep = '', session = ''):
1107		'''
1108		Read `txt` string in csv format to load analysis data into a `D47data` object.
1109
1110		In the csv string, spaces before and after field separators (`','` by default)
1111		are optional. Each line corresponds to a single analysis.
1112
1113		The required fields are:
1114
1115		+ `UID`: a unique identifier
1116		+ `Session`: an identifier for the analytical session
1117		+ `Sample`: a sample identifier
1118		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1119
1120		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1121		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
1122		and `d49` are optional, and set to NaN by default.
1123
1124		**Parameters**
1125
1126		+ `txt`: the csv string to read
1127		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1128		whichever appears most often in `txt`.
1129		+ `session`: set `Session` field to this string for all analyses
1130		'''
1131		if sep == '':
1132			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1133		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1134		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1135
1136		if session != '':
1137			for r in data:
1138				r['Session'] = session
1139
1140		self += data
1141		self.refresh()
1142
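	# Illustrative sketch: `input()` accepts a csv string directly, e.g.:
	#
	#     mydata = D47data()
	#     mydata.input('UID,Session,Sample,d45,d46,d47\nA01,S01,ETH-1,5.79,11.63,16.89')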
1143
1144	@make_verbal
1145	def wg(self,
1146		samples = None,
1147		session_groups = None,
1148	):
1149		'''
1150		Compute bulk composition of the working gas for each session based (by default)
1151		on the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1152		`self.Nominal_d18O_VPDB`.
1153
1154		**Parameters**
1155
1156		+ `samples`: A list of samples specifying the subset of samples (defined in both
1157		`self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`) which will be considered
1158		when computing the working gas. By default, use all samples defined both in
1159		`self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.
1160		+ `session_groups`: a list of lists of sessions
1161		(e.g., `[['session1', 'session2'], ['session3', 'session4', 'session5']]`)
1162		specifying which session groups, if any, share the exact same WG composition.
1163		If set to `'all'`, force all sessions to have the same WG composition (use with
1164		caution, and only over short time scales, since the WG may drift slowly over long time scales).
1165		'''
1166
1167		self.msg('Computing WG composition:')
1168
1169		a18_acid = self.ALPHA_18O_ACID_REACTION
1170		
1171		if samples is None:
1172			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1173		if session_groups is None:
1174			session_groups = [[s] for s in self.sessions]
1175		elif session_groups == 'all':
1176			session_groups = [[s for s in self.sessions]]
1177
1178		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1179		R45R46_standards = {}
1180		for sample in samples:
1181			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1182			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1183			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1184			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1185			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1186
1187			C12_s = 1 / (1 + R13_s)
1188			C13_s = R13_s / (1 + R13_s)
1189			C16_s = 1 / (1 + R17_s + R18_s)
1190			C17_s = R17_s / (1 + R17_s + R18_s)
1191			C18_s = R18_s / (1 + R17_s + R18_s)
1192
1193			C626_s = C12_s * C16_s ** 2
1194			C627_s = 2 * C12_s * C16_s * C17_s
1195			C628_s = 2 * C12_s * C16_s * C18_s
1196			C636_s = C13_s * C16_s ** 2
1197			C637_s = 2 * C13_s * C16_s * C17_s
1198			C727_s = C12_s * C17_s ** 2
1199
1200			R45_s = (C627_s + C636_s) / C626_s
1201			R46_s = (C628_s + C637_s + C727_s) / C626_s
1202			R45R46_standards[sample] = (R45_s, R46_s)
1203		
1204		for sg in session_groups:
1205			db = [r for s in sg for r in self.sessions[s]['data'] if r['Sample'] in samples]
1206			assert db, f'No sample from {samples} found in session group {sg}.'
1207
1208			X = [r['d45'] for r in db]
1209			Y = [R45R46_standards[r['Sample']][0] for r in db]
1210			x1, x2 = np.min(X), np.max(X)
1211
1212			if x1 < x2:
1213				wgcoord = x1/(x1-x2)
1214			else:
1215				wgcoord = 999
1216
1217			if wgcoord < -.5 or wgcoord > 1.5:
1218				# unreasonable to extrapolate to d45 = 0
1219				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1220			else :
1221				# d45 = 0 is reasonably well bracketed
1222				R45_wg = np.polyfit(X, Y, 1)[1]
1223
1224			X = [r['d46'] for r in db]
1225			Y = [R45R46_standards[r['Sample']][1] for r in db]
1226			x1, x2 = np.min(X), np.max(X)
1227
1228			if x1 < x2:
1229				wgcoord = x1/(x1-x2)
1230			else:
1231				wgcoord = 999
1232
1233			if wgcoord < -.5 or wgcoord > 1.5:
1234				# unreasonable to extrapolate to d46 = 0
1235				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1236			else :
1237				# d46 = 0 is reasonably well bracketed
1238				R46_wg = np.polyfit(X, Y, 1)[1]
1239
1240			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1241
1242			for s in sg:
1243				self.msg(f'Sessions {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1244	
1245				self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1246				self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1247				for r in self.sessions[s]['data']:
1248					r['d13Cwg_VPDB'] = d13Cwg_VPDB
1249					r['d18Owg_VSMOW'] = d18Owg_VSMOW
1250
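	# Usage sketch (illustrative): by default each session gets its own WG
	# composition, while `session_groups = 'all'` forces a single composition:
	#
	#     mydata.wg()
	#     # or:
	#     mydata.wg(session_groups = 'all')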
1251
1252	def compute_bulk_delta(self, R45, R46, D17O = 0):
1253		'''
1254		Compute δ13C_VPDB and δ18O_VSMOW,
1255		by solving the generalized form of equation (17) from
1256		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1257		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
1258		solving the corresponding second-order Taylor polynomial.
1259		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
1260		'''
1261
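		# Writing x = d18O_VSMOW / 1000 and K = R17 / R18**LAMBDA_17, the equation
		# to solve is A*(1+x)**(2*LAMBDA_17) + B*(1+x)**LAMBDA_17 + C*(1+x) + D = 0;
		# aa, bb, cc below are the coefficients of its second-order Taylor
		# expansion in x, which is then solved with the quadratic formula.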
1262		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1263
1264		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1265		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1266		C = 2 * self.R18_VSMOW
1267		D = -R46
1268
1269		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1270		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1271		cc = A + B + C + D
1272
1273		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1274
1275		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1276		R17 = K * R18 ** self.LAMBDA_17
1277		R13 = R45 - 2 * R17
1278
1279		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1280
1281		return d13C_VPDB, d18O_VSMOW
1282
1283
1284	@make_verbal
1285	def crunch(self, verbose = ''):
1286		'''
1287		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1288		'''
1289		for r in self:
1290			self.compute_bulk_and_clumping_deltas(r)
1291		self.standardize_d13C()
1292		self.standardize_d18O()
1293		self.msg(f"Crunched {len(self)} analyses.")
1294
1295
1296	def fill_in_missing_info(self, session = 'mySession'):
1297		'''
1298		Fill in optional fields with default values
1299		'''
1300		for i,r in enumerate(self):
1301			if 'D17O' not in r:
1302				r['D17O'] = 0.
1303			if 'UID' not in r:
1304				r['UID'] = f'{i+1}'
1305			if 'Session' not in r:
1306				r['Session'] = session
1307			for k in ['d47', 'd48', 'd49']:
1308				if k not in r:
1309					r[k] = np.nan
1310
1311
1312	def standardize_d13C(self):
1313		'''
1314		Perform δ13C standardization within each session `s` according to
1315		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1316		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1317		may be redefined arbitrarily at a later stage.
1318		'''
1319		for s in self.sessions:
1320			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1321				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1322				X,Y = zip(*XY)
1323				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1324					offset = np.mean(Y) - np.mean(X)
1325					for r in self.sessions[s]['data']:
1326						r['d13C_VPDB'] += offset				
1327				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1328					a,b = np.polyfit(X,Y,1)
1329					for r in self.sessions[s]['data']:
1330						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b
1331
1332	def standardize_d18O(self):
1333		'''
1334		Perform δ18O standardization within each session `s` according to
1335		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1336		which is defined by default by `D47data.refresh_sessions()` as equal to
1337		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1338		'''
1339		for s in self.sessions:
1340			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1341				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1342				X,Y = zip(*XY)
1343				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1344				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1345					offset = np.mean(Y) - np.mean(X)
1346					for r in self.sessions[s]['data']:
1347						r['d18O_VSMOW'] += offset				
1348				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1349					a,b = np.polyfit(X,Y,1)
1350					for r in self.sessions[s]['data']:
1351						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
1352	
1353
1354	def compute_bulk_and_clumping_deltas(self, r):
1355		'''
1356		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1357		'''
1358
1359		# Compute working gas R13, R18, and isobar ratios
1360		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1361		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1362		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1363
1364		# Compute analyte isobar ratios
1365		R45 = (1 + r['d45'] / 1000) * R45_wg
1366		R46 = (1 + r['d46'] / 1000) * R46_wg
1367		R47 = (1 + r['d47'] / 1000) * R47_wg
1368		R48 = (1 + r['d48'] / 1000) * R48_wg
1369		R49 = (1 + r['d49'] / 1000) * R49_wg
1370
1371		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1372		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1373		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1374
1375		# Compute stochastic isobar ratios of the analyte
1376		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1377			R13, R18, D17O = r['D17O']
1378		)
1379
1380		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1381		# and raise a warning if the corresponding anomalies exceed 0.05 ppm (i.e., 5e-8).
1382		if (R45 / R45stoch - 1) > 5e-8:
1383			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1384		if (R46 / R46stoch - 1) > 5e-8:
1385			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1386
1387		# Compute raw clumped isotope anomalies
1388		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1389		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1390		r['D49raw'] = 1000 * (R49 / R49stoch - 1)
1391
1392
1393	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1394		'''
1395		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1396		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1397		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1398		'''
1399
1400		# Compute R17
1401		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1402
1403		# Compute isotope concentrations
1404		C12 = (1 + R13) ** -1
1405		C13 = C12 * R13
1406		C16 = (1 + R17 + R18) ** -1
1407		C17 = C16 * R17
1408		C18 = C16 * R18
1409
1410		# Compute stochastic isotopologue concentrations
1411		C626 = C16 * C12 * C16
1412		C627 = C16 * C12 * C17 * 2
1413		C628 = C16 * C12 * C18 * 2
1414		C636 = C16 * C13 * C16
1415		C637 = C16 * C13 * C17 * 2
1416		C638 = C16 * C13 * C18 * 2
1417		C727 = C17 * C12 * C17
1418		C728 = C17 * C12 * C18 * 2
1419		C737 = C17 * C13 * C17
1420		C738 = C17 * C13 * C18 * 2
1421		C828 = C18 * C12 * C18
1422		C838 = C18 * C13 * C18
1423
1424		# Compute stochastic isobar ratios
1425		R45 = (C636 + C627) / C626
1426		R46 = (C628 + C637 + C727) / C626
1427		R47 = (C638 + C728 + C737) / C626
1428		R48 = (C738 + C828) / C626
1429		R49 = C838 / C626
1430
1431		# Account for stochastic anomalies
1432		R47 *= 1 + D47 / 1000
1433		R48 *= 1 + D48 / 1000
1434		R49 *= 1 + D49 / 1000
1435
1436		# Return isobar ratios
1437		return R45, R46, R47, R48, R49
1438
1439
1440	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1441		'''
1442		Split unknown samples by UID (treat all analyses as different samples)
1443		or by session (treat analyses of a given sample in different sessions as
1444		different samples).
1445
1446		**Parameters**
1447
1448		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
1449		+ `grouping`: `by_uid` | `by_session`
1450		'''
1451		if samples_to_split == 'all':
1452			samples_to_split = [s for s in self.unknowns]
1453		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1454		self.grouping = grouping.lower()
1455		if self.grouping in gkeys:
1456			gkey = gkeys[self.grouping]
1457		for r in self:
1458			if r['Sample'] in samples_to_split:
1459				r['Sample_original'] = r['Sample']
1460				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1461			elif r['Sample'] in self.unknowns:
1462				r['Sample_original'] = r['Sample']
1463		self.refresh_samples()
1464
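	# Illustrative sequence (cf. `unsplit_samples()` below):
	#
	#     mydata.split_samples(grouping = 'by_session')
	#     mydata.standardize(method = 'pooled')
	#     mydata.unsplit_samples()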
1465
1466	def unsplit_samples(self, tables = False):
1467		'''
1468		Reverse the effects of `D47data.split_samples()`.
1469		
1470		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1471		
1472		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1473		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1474		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1475		effects of `D47data.split_samples()` with `grouping='by_session'` (because in
1476		that case session-averaged Δ4x values are statistically independent).
1477		'''
1478		unknowns_old = sorted({s for s in self.unknowns})
1479		CM_old = self.standardization.covar[:,:]
1480		VD_old = self.standardization.params.valuesdict().copy()
1481		vars_old = self.standardization.var_names
1482
1483		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1484
1485		Ns = len(vars_old) - len(unknowns_old)
1486		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1487		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1488
1489		W = np.zeros((len(vars_new), len(vars_old)))
1490		W[:Ns,:Ns] = np.eye(Ns)
1491		for u in unknowns_new:
1492			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1493			if self.grouping == 'by_session':
1494				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1495			elif self.grouping == 'by_uid':
1496				weights = [1 for s in splits]
1497			sw = sum(weights)
1498			weights = [w/sw for w in weights]
1499			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1500
1501		CM_new = W @ CM_old @ W.T
1502		V = W @ np.array([[VD_old[k]] for k in vars_old])
1503		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1504
1505		self.standardization.covar = CM_new
1506		self.standardization.params.valuesdict = lambda : VD_new
1507		self.standardization.var_names = vars_new
1508
1509		for r in self:
1510			if r['Sample'] in self.unknowns:
1511				r['Sample_split'] = r['Sample']
1512				r['Sample'] = r['Sample_original']
1513
1514		self.refresh_samples()
1515		self.consolidate_samples()
1516		self.repeatabilities()
1517
1518		if tables:
1519			self.table_of_analyses()
1520			self.table_of_samples()
1521
1522	def assign_timestamps(self):
1523		'''
1524		Assign a time field `t` of type `float` to each analysis.
1525
1526		If `TimeTag` is one of the data fields, `t` is equal within a given session
1527		to `TimeTag` minus the mean value of `TimeTag` for that session.
1528		Otherwise, `TimeTag` is by default equal to the index of each analysis
1529		in the dataset and `t` is defined as above.
1530		'''
1531		for session in self.sessions:
1532			sdata = self.sessions[session]['data']
1533			try:
1534				t0 = np.mean([r['TimeTag'] for r in sdata])
1535				for r in sdata:
1536					r['t'] = r['TimeTag'] - t0
1537			except KeyError:
1538				t0 = (len(sdata)-1)/2
1539				for t,r in enumerate(sdata):
1540					r['t'] = t - t0
1541
1542
1543	def report(self):
1544		'''
1545		Prints a report on the standardization fit.
1546		Only applicable after `D4xdata.standardize(method='pooled')`.
1547		'''
1548		report_fit(self.standardization)
1549
1550
1551	def combine_samples(self, sample_groups):
1552		'''
1553		Combine analyses of different samples to compute weighted average Δ4x
1554		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1555		dictionary.
1556		
1557		Caution: samples are weighted by number of replicate analyses, which is a
1558		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1559		correlated analytical errors for one or more samples).
1560		
1561		Returns a tuple of:
1562		
1563		+ the list of group names
1564		+ an array of the corresponding Δ4x values
1565		+ the corresponding (co)variance matrix
1566		
1567		**Parameters**
1568
1569		+ `sample_groups`: a dictionary of the form:
1570		```py
1571		{'group1': ['sample_1', 'sample_2'],
1572		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1573		```
1574		'''
1575		
1576		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1577		groups = sorted(sample_groups.keys())
1578		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1579		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1580		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1581		W = np.array([
1582			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1583			for j in groups])
1584		D4x_new = W @ D4x_old
1585		CM_new = W @ CM_old @ W.T
1586
1587		return groups, D4x_new[:,0], CM_new
1588		
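	# Illustrative call, using the dictionary form documented above:
	#
	#     groups, D4x_avg, CM = mydata.combine_samples(
	#         {'group1': ['sample_1', 'sample_2'],
	#          'group2': ['sample_3', 'sample_4', 'sample_5']})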
1589
1590	@make_verbal
1591	def standardize(self,
1592		method = 'pooled',
1593		weighted_sessions = [],
1594		consolidate = True,
1595		consolidate_tables = False,
1596		consolidate_plots = False,
1597		constraints = {},
1598		):
1599		'''
1600		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1601		If `method` argument is set to `'pooled'`, the standardization processes all sessions
1602		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1603		i.e. that their true Δ4x value does not change between sessions,
1604		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
1605		`'indep_sessions'`, the standardization processes each session independently, based only
1606		on anchors analyses.
1607		'''
1608
1609		self.standardization_method = method
1610		self.assign_timestamps()
1611
1612		if method == 'pooled':
1613			if weighted_sessions:
1614				for session_group in weighted_sessions:
1615					if self._4x == '47':
1616						X = D47data([r for r in self if r['Session'] in session_group])
1617					elif self._4x == '48':
1618						X = D48data([r for r in self if r['Session'] in session_group])
1619					X.Nominal_D4x = self.Nominal_D4x.copy()
1620					X.refresh()
1621					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1622					w = np.sqrt(result.redchi)
1623					self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
1624					for r in X:
1625						r[f'wD{self._4x}raw'] *= w
1626			else:
1627				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1628				for r in self:
1629					r[f'wD{self._4x}raw'] = 1.
1630
1631			params = Parameters()
1632			for k,session in enumerate(self.sessions):
1633				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1634				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1635				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1636				s = pf(session)
1637				params.add(f'a_{s}', value = 0.9)
1638				params.add(f'b_{s}', value = 0.)
1639				params.add(f'c_{s}', value = -0.9)
1640				params.add(f'a2_{s}', value = 0.,
1641# 					vary = self.sessions[session]['scrambling_drift'],
1642					)
1643				params.add(f'b2_{s}', value = 0.,
1644# 					vary = self.sessions[session]['slope_drift'],
1645					)
1646				params.add(f'c2_{s}', value = 0.,
1647# 					vary = self.sessions[session]['wg_drift'],
1648					)
1649				if not self.sessions[session]['scrambling_drift']:
1650					params[f'a2_{s}'].expr = '0'
1651				if not self.sessions[session]['slope_drift']:
1652					params[f'b2_{s}'].expr = '0'
1653				if not self.sessions[session]['wg_drift']:
1654					params[f'c2_{s}'].expr = '0'
1655
1656			for sample in self.unknowns:
1657				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1658
1659			for k in constraints:
1660				params[k].expr = constraints[k]
1661
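			# Pooled model (Daëron, 2021): each analysis is predicted as
			#     D4x_raw = a*D4x + b*d4x + c + t*(a2*D4x + b2*d4x + c2)
			# with D4x fixed to Nominal_D4x for anchors and fitted as a free
			# parameter for unknowns; residuals() returns the weighted misfits.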
1662			def residuals(p):
1663				R = []
1664				for r in self:
1665					session = pf(r['Session'])
1666					sample = pf(r['Sample'])
1667					if r['Sample'] in self.Nominal_D4x:
1668						R += [ (
1669							r[f'D{self._4x}raw'] - (
1670								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1671								+ p[f'b_{session}'] * r[f'd{self._4x}']
1672								+	p[f'c_{session}']
1673								+ r['t'] * (
1674									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1675									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1676									+	p[f'c2_{session}']
1677									)
1678								)
1679							) / r[f'wD{self._4x}raw'] ]
1680					else:
1681						R += [ (
1682							r[f'D{self._4x}raw'] - (
1683								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1684								+ p[f'b_{session}'] * r[f'd{self._4x}']
1685								+	p[f'c_{session}']
1686								+ r['t'] * (
1687									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1688									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1689									+	p[f'c2_{session}']
1690									)
1691								)
1692							) / r[f'wD{self._4x}raw'] ]
1693				return R
1694
1695			M = Minimizer(residuals, params)
1696			result = M.least_squares()
1697			self.Nf = result.nfree
1698			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1699			new_names, new_covar, new_se = _fullcovar(result)[:3]
1700			result.var_names = new_names
1701			result.covar = new_covar
1702
1703			for r in self:
1704				s = pf(r["Session"])
1705				a = result.params.valuesdict()[f'a_{s}']
1706				b = result.params.valuesdict()[f'b_{s}']
1707				c = result.params.valuesdict()[f'c_{s}']
1708				a2 = result.params.valuesdict()[f'a2_{s}']
1709				b2 = result.params.valuesdict()[f'b2_{s}']
1710				c2 = result.params.valuesdict()[f'c2_{s}']
1711				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1712				
1713
1714			self.standardization = result
1715
1716			for session in self.sessions:
1717				self.sessions[session]['Np'] = 3
1718				for k in ['scrambling', 'slope', 'wg']:
1719					if self.sessions[session][f'{k}_drift']:
1720						self.sessions[session]['Np'] += 1
1721
1722			if consolidate:
1723				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1724			return result
1725
1726
1727		elif method == 'indep_sessions':
1728
1729			if weighted_sessions:
1730				for session_group in weighted_sessions:
1731					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1732					X.Nominal_D4x = self.Nominal_D4x.copy()
1733					X.refresh()
1734					# This is only done to assign r['wD47raw'] for r in X:
1735					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1736					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1737			else:
1738				self.msg('All weights set to 1 ‰')
1739				for r in self:
1740					r[f'wD{self._4x}raw'] = 1
1741
1742			for session in self.sessions:
1743				s = self.sessions[session]
1744				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1745				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1746				s['Np'] = sum(p_active)
1747				sdata = s['data']
1748
1749				A = np.array([
1750					[
1751						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1752						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1753						1 / r[f'wD{self._4x}raw'],
1754						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1755						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1756						r['t'] / r[f'wD{self._4x}raw']
1757						]
1758					for r in sdata if r['Sample'] in self.anchors
1759					])[:,p_active] # only keep columns for the active parameters
1760				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1761				s['Na'] = Y.size
1762				CM = linalg.inv(A.T @ A)
1763				bf = (CM @ A.T @ Y).T[0,:]
1764				k = 0
1765				for n,a in zip(p_names, p_active):
1766					if a:
1767						s[n] = bf[k]
1768# 						self.msg(f'{n} = {bf[k]}')
1769						k += 1
1770					else:
1771						s[n] = 0.
1772# 						self.msg(f'{n} = 0.0')
1773
1774				for r in sdata :
1775					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1776					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1777					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1778
1779				s['CM'] = np.zeros((6,6))
1780				i = 0
1781				k_active = [j for j,a in enumerate(p_active) if a]
1782				for j,a in enumerate(p_active):
1783					if a:
1784						s['CM'][j,k_active] = CM[i,:]
1785						i += 1
1786
1787			if not weighted_sessions:
1788				w = self.rmswd()['rmswd']
1789				for r in self:
1790						r[f'wD{self._4x}'] *= w
1791						r[f'wD{self._4x}raw'] *= w
1792				for session in self.sessions:
1793					self.sessions[session]['CM'] *= w**2
1794
1795			for session in self.sessions:
1796				s = self.sessions[session]
1797				s['SE_a'] = s['CM'][0,0]**.5
1798				s['SE_b'] = s['CM'][1,1]**.5
1799				s['SE_c'] = s['CM'][2,2]**.5
1800				s['SE_a2'] = s['CM'][3,3]**.5
1801				s['SE_b2'] = s['CM'][4,4]**.5
1802				s['SE_c2'] = s['CM'][5,5]**.5
1803
1804			if not weighted_sessions:
1805				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1806			else:
1807				self.Nf = 0
1808				for sg in weighted_sessions:
1809					self.Nf += self.rmswd(sessions = sg)['Nf']
1810
1811			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1812
1813			avgD4x = {
1814				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1815				for sample in self.samples
1816				}
1817			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1818			rD4x = (chi2/self.Nf)**.5
1819			self.repeatability[f'sigma_{self._4x}'] = rD4x
1820
1821			if consolidate:
1822				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1823
1824
1825	def standardization_error(self, session, d4x, D4x, t = 0):
1826		'''
1827		Compute standardization error for a given session and
1828		(δ47, Δ47) composition.
1829		'''
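		# First-order error propagation: V below holds the partial derivatives of
		# the standardized value x with respect to (a, b, c, a2, b2, c2), and the
		# returned standard error is sx = sqrt(V @ CM @ V.T).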
1830		a = self.sessions[session]['a']
1831		b = self.sessions[session]['b']
1832		c = self.sessions[session]['c']
1833		a2 = self.sessions[session]['a2']
1834		b2 = self.sessions[session]['b2']
1835		c2 = self.sessions[session]['c2']
1836		CM = self.sessions[session]['CM']
1837
1838		x, y = D4x, d4x
1839		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1840# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1841		dxdy = -(b+b2*t) / (a+a2*t)
1842		dxdz = 1. / (a+a2*t)
1843		dxda = -x / (a+a2*t)
1844		dxdb = -y / (a+a2*t)
1845		dxdc = -1. / (a+a2*t)
1846		dxda2 = -x * t / (a+a2*t)
1847		dxdb2 = -y * t / (a+a2*t)
1848		dxdc2 = -t / (a+a2*t)
1849		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1850		sx = (V @ CM @ V.T) ** .5
1851		return sx
1852
1853
1854	@make_verbal
1855	def summary(self,
1856		dir = 'output',
1857		filename = None,
1858		save_to_file = True,
1859		print_out = True,
1860		):
1861		'''
1862		Print out and/or save to disk a summary of the standardization results.
1863
1864		**Parameters**
1865
1866		+ `dir`: the directory in which to save the table
1867		+ `filename`: the name of the csv file to write to
1868		+ `save_to_file`: whether to save the table to disk
1869		+ `print_out`: whether to print out the table
1870		'''
1871
1872		out = []
1873		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1874		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1875		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1876		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1877		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1878		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1879		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1880		out += [['Model degrees of freedom', f"{self.Nf}"]]
1881		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1882		out += [['Standardization method', self.standardization_method]]
1883
1884		if save_to_file:
1885			if not os.path.exists(dir):
1886				os.makedirs(dir)
1887			if filename is None:
1888				filename = f'D{self._4x}_summary.csv'
1889			with open(f'{dir}/{filename}', 'w') as fid:
1890				fid.write(make_csv(out))
1891		if print_out:
1892			self.msg('\n' + pretty_table(out, header = 0))
1893
1894
1895	@make_verbal
1896	def table_of_sessions(self,
1897		dir = 'output',
1898		filename = None,
1899		save_to_file = True,
1900		print_out = True,
1901		output = None,
1902		):
1903		'''
1904		Print out and/or save to disk a table of sessions.
1905
1906		**Parameters**
1907
1908		+ `dir`: the directory in which to save the table
1909		+ `filename`: the name of the csv file to write to
1910		+ `save_to_file`: whether to save the table to disk
1911		+ `print_out`: whether to print out the table
1912		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1913		    if set to `'raw'`: return a list of lists of strings
1914		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1915		'''
1916		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1917		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1918		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1919
1920		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1921		if include_a2:
1922			out[-1] += ['a2 ± SE']
1923		if include_b2:
1924			out[-1] += ['b2 ± SE']
1925		if include_c2:
1926			out[-1] += ['c2 ± SE']
1927		for session in self.sessions:
1928			out += [[
1929				session,
1930				f"{self.sessions[session]['Na']}",
1931				f"{self.sessions[session]['Nu']}",
1932				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1933				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1934				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1935				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1936				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1937				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1938				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1939				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1940				]]
1941			if include_a2:
1942				if self.sessions[session]['scrambling_drift']:
1943					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1944				else:
1945					out[-1] += ['']
1946			if include_b2:
1947				if self.sessions[session]['slope_drift']:
1948					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1949				else:
1950					out[-1] += ['']
1951			if include_c2:
1952				if self.sessions[session]['wg_drift']:
1953					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1954				else:
1955					out[-1] += ['']
1956
1957		if save_to_file:
1958			if not os.path.exists(dir):
1959				os.makedirs(dir)
1960			if filename is None:
1961				filename = f'D{self._4x}_sessions.csv'
1962			with open(f'{dir}/{filename}', 'w') as fid:
1963				fid.write(make_csv(out))
1964		if print_out:
1965			self.msg('\n' + pretty_table(out))
1966		if output == 'raw':
1967			return out
1968		elif output == 'pretty':
1969			return pretty_table(out)
1970
1971
1972	@make_verbal
1973	def table_of_analyses(
1974		self,
1975		dir = 'output',
1976		filename = None,
1977		save_to_file = True,
1978		print_out = True,
1979		output = None,
1980		):
1981		'''
1982		Print out and/or save to disk a table of analyses.
1983
1984		**Parameters**
1985
1986		+ `dir`: the directory in which to save the table
1987		+ `filename`: the name of the csv file to write to
1988		+ `save_to_file`: whether to save the table to disk
1989		+ `print_out`: whether to print out the table
1990		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1991		    if set to `'raw'`: return a list of lists of strings
1992		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1993		'''
1994
1995		out = [['UID','Session','Sample']]
1996		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1997		for f in extra_fields:
1998			out[-1] += [f[0]]
1999		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
2000		for r in self:
2001			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
2002			for f in extra_fields:
2003				out[-1] += [f"{r[f[0]]:{f[1]}}"]
2004			out[-1] += [
2005				f"{r['d13Cwg_VPDB']:.3f}",
2006				f"{r['d18Owg_VSMOW']:.3f}",
2007				f"{r['d45']:.6f}",
2008				f"{r['d46']:.6f}",
2009				f"{r['d47']:.6f}",
2010				f"{r['d48']:.6f}",
2011				f"{r['d49']:.6f}",
2012				f"{r['d13C_VPDB']:.6f}",
2013				f"{r['d18O_VSMOW']:.6f}",
2014				f"{r['D47raw']:.6f}",
2015				f"{r['D48raw']:.6f}",
2016				f"{r['D49raw']:.6f}",
2017				f"{r[f'D{self._4x}']:.6f}"
2018				]
2019		if save_to_file:
2020			if not os.path.exists(dir):
2021				os.makedirs(dir)
2022			if filename is None:
2023				filename = f'D{self._4x}_analyses.csv'
2024			with open(f'{dir}/{filename}', 'w') as fid:
2025				fid.write(make_csv(out))
2026		if print_out:
2027			self.msg('\n' + pretty_table(out))
2028		return out
2029
2030	@make_verbal
2031	def covar_table(
2032		self,
2033		correl = False,
2034		dir = 'output',
2035		filename = None,
2036		save_to_file = True,
2037		print_out = True,
2038		output = None,
2039		):
2040		'''
2041		Print out, save to disk and/or return the variance-covariance matrix of D4x
2042		for all unknown samples.
2043
2044		**Parameters**
2045
2046		+ `dir`: the directory in which to save the csv
2047		+ `filename`: the name of the csv file to write to
2048		+ `save_to_file`: whether to save the csv
2049		+ `print_out`: whether to print out the matrix
2050		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
2051		    if set to `'raw'`: return a list of lists of strings
2052		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2053		'''
2054		samples = sorted([u for u in self.unknowns])
2055		out = [[''] + samples]
2056		for s1 in samples:
2057			out.append([s1])
2058			for s2 in samples:
2059				if correl:
2060					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
2061				else:
2062					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
2063
2064		if save_to_file:
2065			if not os.path.exists(dir):
2066				os.makedirs(dir)
2067			if filename is None:
2068				if correl:
2069					filename = f'D{self._4x}_correl.csv'
2070				else:
2071					filename = f'D{self._4x}_covar.csv'
2072			with open(f'{dir}/{filename}', 'w') as fid:
2073				fid.write(make_csv(out))
2074		if print_out:
2075			self.msg('\n'+pretty_table(out))
2076		if output == 'raw':
2077			return out
2078		elif output == 'pretty':
2079			return pretty_table(out)
2080
2081	@make_verbal
2082	def table_of_samples(
2083		self,
2084		dir = 'output',
2085		filename = None,
2086		save_to_file = True,
2087		print_out = True,
2088		output = None,
2089		):
2090		'''
2091		Print out, save to disk and/or return a table of samples.
2092
2093		**Parameters**
2094
2095		+ `dir`: the directory in which to save the csv
2096		+ `filename`: the name of the csv file to write to
2097		+ `save_to_file`: whether to save the csv
2098		+ `print_out`: whether to print out the table
2099		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2100		    if set to `'raw'`: return a list of lists of strings
2101		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2102		'''
2103
2104		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2105		for sample in self.anchors:
2106			out += [[
2107				f"{sample}",
2108				f"{self.samples[sample]['N']}",
2109				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2110				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2111				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2112				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2113				]]
2114		for sample in self.unknowns:
2115			out += [[
2116				f"{sample}",
2117				f"{self.samples[sample]['N']}",
2118				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2119				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2120				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2121				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2122				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2123				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2124				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2125				]]
2126		if save_to_file:
2127			if not os.path.exists(dir):
2128				os.makedirs(dir)
2129			if filename is None:
2130				filename = f'D{self._4x}_samples.csv'
2131			with open(f'{dir}/{filename}', 'w') as fid:
2132				fid.write(make_csv(out))
2133		if print_out:
2134			self.msg('\n'+pretty_table(out))
2135		if output == 'raw':
2136			return out
2137		elif output == 'pretty':
2138			return pretty_table(out)
2139
2140
2141	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2142		'''
2143		Generate session plots and save them to disk.
2144
2145		**Parameters**
2146
2147		+ `dir`: the directory in which to save the plots
2148		+ `figsize`: the width and height (in inches) of each plot
2149		+ `filetype`: 'pdf' or 'png'
2150		+ `dpi`: resolution for PNG output
2151		'''
2152		if not os.path.exists(dir):
2153			os.makedirs(dir)
2154
2155		for session in self.sessions:
2156			sp = self.plot_single_session(session, xylimits = 'constant')
2157			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2158			ppl.close(sp.fig)
2159			
2160
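	# e.g. (illustrative): mydata.plot_sessions(filetype = 'png', dpi = 200)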
2161
2162	@make_verbal
2163	def consolidate_samples(self):
2164		'''
2165		Compile various statistics for each sample.
2166
2167		For each anchor sample:
2168
2169		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2170		+ `SE_D47` or `SE_D48`: set to zero by definition
2171
2172		For each unknown sample:
2173
2174		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2175		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2176
2177		For each anchor and unknown:
2178
2179		+ `N`: the total number of analyses of this sample
2180		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2181		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2182		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2183		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2184		variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2185		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
2186		'''
2187		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2188		for sample in self.samples:
2189			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2190			if self.samples[sample]['N'] > 1:
2191				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2192
2193			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2194			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2195
2196			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2197			if len(D4x_pop) > 2:
2198				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2199			
2200		if self.standardization_method == 'pooled':
2201			for sample in self.anchors:
2202				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2203				self.samples[sample][f'SE_D{self._4x}'] = 0.
2204			for sample in self.unknowns:
2205				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2206				try:
2207					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2208				except ValueError:
2209					# when `sample` is constrained by self.standardize(constraints = {...}),
2210					# it is no longer listed in self.standardization.var_names.
2211					# Temporary fix: define SE as zero for now
2212					self.samples[sample][f'SE_D{self._4x}'] = 0.
2213
2214		elif self.standardization_method == 'indep_sessions':
2215			for sample in self.anchors:
2216				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2217				self.samples[sample][f'SE_D{self._4x}'] = 0.
2218			for sample in self.unknowns:
2219				self.msg(f'Consolidating sample {sample}')
2220				self.unknowns[sample][f'session_D{self._4x}'] = {}
2221				session_avg = []
2222				for session in self.sessions:
2223					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2224					if sdata:
2225						self.msg(f'{sample} found in session {session}')
2226						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2227						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2228						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2229						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2230						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2231						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2232						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2233				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2234				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2235				wsum = sum([weights[s] for s in weights])
2236				for s in weights:
2237					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2238
2239		for r in self:
2240			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
2241
2242
2243
2244	def consolidate_sessions(self):
2245		'''
2246		Compute various statistics for each session.
2247
2248		+ `Na`: Number of anchor analyses in the session
2249		+ `Nu`: Number of unknown analyses in the session
2250		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2251		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2252		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2253		+ `a`: scrambling factor
2254		+ `b`: compositional slope
2255		+ `c`: WG offset
2256		+ `SE_a`: Model standard error of `a`
2257		+ `SE_b`: Model standard error of `b`
2258		+ `SE_c`: Model standard error of `c`
2259		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2260		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2261		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2262		+ `a2`: scrambling factor drift
2263		+ `b2`: compositional slope drift
2264		+ `c2`: WG offset drift
2265		+ `Np`: Number of standardization parameters to fit
2266		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2267		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2268		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
2269		'''
2270		for session in self.sessions:
2271			if 'd13Cwg_VPDB' not in self.sessions[session]:
2272				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2273			if 'd18Owg_VSMOW' not in self.sessions[session]:
2274				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2275			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2276			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2277
2278			self.msg(f'Computing repeatabilities for session {session}')
2279			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2280			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2281			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2282
2283		if self.standardization_method == 'pooled':
2284			for session in self.sessions:
2285
2286				# different (better?) computation of D4x repeatability for each session:
2287				sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
2288				self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5
2289
2290				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2291				i = self.standardization.var_names.index(f'a_{pf(session)}')
2292				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2293
2294				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2295				i = self.standardization.var_names.index(f'b_{pf(session)}')
2296				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2297
2298				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2299				i = self.standardization.var_names.index(f'c_{pf(session)}')
2300				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2301
2302				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2303				if self.sessions[session]['scrambling_drift']:
2304					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2305					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2306				else:
2307					self.sessions[session]['SE_a2'] = 0.
2308
2309				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2310				if self.sessions[session]['slope_drift']:
2311					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2312					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2313				else:
2314					self.sessions[session]['SE_b2'] = 0.
2315
2316				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2317				if self.sessions[session]['wg_drift']:
2318					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2319					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2320				else:
2321					self.sessions[session]['SE_c2'] = 0.
2322
2323				i = self.standardization.var_names.index(f'a_{pf(session)}')
2324				j = self.standardization.var_names.index(f'b_{pf(session)}')
2325				k = self.standardization.var_names.index(f'c_{pf(session)}')
2326				CM = np.zeros((6,6))
2327				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2328				try:
2329					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2330					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2331					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2332					try:
2333						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2334						CM[3,4] = self.standardization.covar[i2,j2]
2335						CM[4,3] = self.standardization.covar[j2,i2]
2336					except ValueError:
2337						pass
2338					try:
2339						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2340						CM[3,5] = self.standardization.covar[i2,k2]
2341						CM[5,3] = self.standardization.covar[k2,i2]
2342					except ValueError:
2343						pass
2344				except ValueError:
2345					pass
2346				try:
2347					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2348					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2349					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2350					try:
2351						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2352						CM[4,5] = self.standardization.covar[j2,k2]
2353						CM[5,4] = self.standardization.covar[k2,j2]
2354					except ValueError:
2355						pass
2356				except ValueError:
2357					pass
2358				try:
2359					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2360					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2361					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2362				except ValueError:
2363					pass
2364
2365				self.sessions[session]['CM'] = CM
2366
2367		elif self.standardization_method == 'indep_sessions':
2368			pass # Not implemented yet
2369
2370
2371	@make_verbal
2372	def repeatabilities(self):
2373		'''
2374		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2375		(for all samples, for anchors, and for unknowns).
2376		'''
2377		self.msg('Computing repeatabilities for all sessions')
2378
2379		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2380		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2381		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2382		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2383		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
2384
2385
2386	@make_verbal
2387	def consolidate(self, tables = True, plots = True):
2388		'''
2389		Collect information about samples, sessions and repeatabilities.
2390		'''
2391		self.consolidate_samples()
2392		self.consolidate_sessions()
2393		self.repeatabilities()
2394
2395		if tables:
2396			self.summary()
2397			self.table_of_sessions()
2398			self.table_of_analyses()
2399			self.table_of_samples()
2400
2401		if plots:
2402			self.plot_sessions()
2403
2404
2405	@make_verbal
2406	def rmswd(self,
2407		samples = 'all samples',
2408		sessions = 'all sessions',
2409		):
2410		'''
2411		Compute the χ2, the root mean squared weighted deviation
2412		(i.e. the square root of the reduced χ2), and the corresponding degrees of freedom of the
2413		Δ4x values for samples in `samples` and sessions in `sessions`.
2414		
2415		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
2416		'''
2417		if samples == 'all samples':
2418			mysamples = [k for k in self.samples]
2419		elif samples == 'anchors':
2420			mysamples = [k for k in self.anchors]
2421		elif samples == 'unknowns':
2422			mysamples = [k for k in self.unknowns]
2423		else:
2424			mysamples = samples
2425
2426		if sessions == 'all sessions':
2427			sessions = [k for k in self.sessions]
2428
2429		chisq, Nf = 0, 0
2430		for sample in mysamples :
2431			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2432			if len(G) > 1 :
2433				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2434				Nf += (len(G) - 1)
2435				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2436		r = (chisq / Nf)**.5 if Nf > 0 else 0
2437		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2438		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
2439
2440	
2441	@make_verbal
2442	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2443		'''
2444		Compute the repeatability of `[r[key] for r in self]`
2445		'''
2446
2447		if samples == 'all samples':
2448			mysamples = [k for k in self.samples]
2449		elif samples == 'anchors':
2450			mysamples = [k for k in self.anchors]
2451		elif samples == 'unknowns':
2452			mysamples = [k for k in self.unknowns]
2453		else:
2454			mysamples = samples
2455
2456		if sessions == 'all sessions':
2457			sessions = [k for k in self.sessions]
2458
2459		if key in ['D47', 'D48']:
2460			# Full disclosure: the definition of Nf is tricky/debatable
2461			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2462			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2463			Nf = len(G)
2464# 			print(f'len(G) = {Nf}')
2465			Nf -= len([s for s in mysamples if s in self.unknowns])
2466# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2467			for session in sessions:
2468				Np = len([
2469					_ for _ in self.standardization.params
2470					if (
2471						self.standardization.params[_].expr is not None
2472						and (
2473							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2474							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2475							)
2476						)
2477					])
2478# 				print(f'session {session}: {Np} parameters to consider')
2479				Na = len({
2480					r['Sample'] for r in self.sessions[session]['data']
2481					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2482					})
2483# 				print(f'session {session}: {Na} different anchors in that session')
2484				Nf -= min(Np, Na)
2485# 			print(f'Nf = {Nf}')
2486
2487# 			for sample in mysamples :
2488# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2489# 				if len(X) > 1 :
2490# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2491# 					if sample in self.unknowns:
2492# 						Nf += len(X) - 1
2493# 					else:
2494# 						Nf += len(X)
2495# 			if samples in ['anchors', 'all samples']:
2496# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2497			r = (chisq / Nf)**.5 if Nf > 0 else 0
2498
2499		else: # if key not in ['D47', 'D48']
2500			chisq, Nf = 0, 0
2501			for sample in mysamples :
2502				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2503				if len(X) > 1 :
2504					Nf += len(X) - 1
2505					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2506			r = (chisq / Nf)**.5 if Nf > 0 else 0
2507
2508		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2509		return r
2510
2511	def sample_average(self, samples, weights = 'equal', normalize = True):
2512		'''
2513		Weighted average Δ4x value of a group of samples, accounting for covariance.
2514
2515		Returns the weighted average Δ4x value and associated SE
2516		of a group of samples. Weights are equal by default. If `normalize` is
2517		true, `weights` will be rescaled so that their sum equals 1.
2518
2519		**Examples**
2520
2521		```python
2522		self.sample_average(['X','Y'], [1, 2])
2523		```
2524
2525		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2526		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2527		values of samples X and Y, respectively.
2528
2529		```python
2530		self.sample_average(['X','Y'], [1, -1], normalize = False)
2531		```
2532
2533		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2534		'''
2535		if weights == 'equal':
2536			weights = [1/len(samples)] * len(samples)
2537
2538		if normalize:
2539			s = sum(weights)
2540			if s:
2541				weights = [w/s for w in weights]
2542
2543		try:
2544# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2545# 			C = self.standardization.covar[indices,:][:,indices]
2546			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2547			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2548			return correlated_sum(X, C, weights)
2549		except ValueError:
2550			return (0., 0.)
2551
2552
2553	def sample_D4x_covar(self, sample1, sample2 = None):
2554		'''
2555		Covariance between Δ4x values of samples
2556
2557		Returns the error covariance between the average Δ4x values of two
2558		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2559		returns the Δ4x variance for that sample.
2560		'''
2561		if sample2 is None:
2562			sample2 = sample1
2563		if self.standardization_method == 'pooled':
2564			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2565			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2566			return self.standardization.covar[i, j]
2567		elif self.standardization_method == 'indep_sessions':
2568			if sample1 == sample2:
2569				return self.samples[sample1][f'SE_D{self._4x}']**2
2570			else:
2571				c = 0
2572				for session in self.sessions:
2573					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2574					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2575					if sdata1 and sdata2:
2576						a = self.sessions[session]['a']
2577						# !! TODO: CM below does not account for temporal changes in standardization parameters
2578						CM = self.sessions[session]['CM'][:3,:3]
2579						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2580						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2581						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2582						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2583						c += (
2584							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2585							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2586							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2587							@ CM
2588							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2589							) / a**2
2590				return float(c)
2591
2592	def sample_D4x_correl(self, sample1, sample2 = None):
2593		'''
2594		Correlation between Δ4x errors of samples
2595
2596		Returns the error correlation between the average Δ4x values of two samples.
2597		'''
2598		if sample2 is None or sample2 == sample1:
2599			return 1.
2600		return (
2601			self.sample_D4x_covar(sample1, sample2)
2602			/ self.unknowns[sample1][f'SE_D{self._4x}']
2603			/ self.unknowns[sample2][f'SE_D{self._4x}']
2604			)
2605
2606	def plot_single_session(self,
2607		session,
2608		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2609		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2610		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2611		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2612		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2613		xylimits = 'free', # | 'constant'
2614		x_label = None,
2615		y_label = None,
2616		error_contour_interval = 'auto',
2617		fig = 'new',
2618		):
2619		'''
2620		Generate plot for a single session
2621		'''
2622		if x_label is None:
2623			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2624		if y_label is None:
2625			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2626
2627		out = _SessionPlot()
2628		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2629		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2630		anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2631		anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2632		unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2633		unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2634		anchor_avg = (np.array([ np.array([
2635				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2636				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2637				]) for sample in anchors]).T,
2638			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
2639		unknown_avg = (np.array([ np.array([
2640				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2641				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2642				]) for sample in unknowns]).T,
2643			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)
2644		
2645		
2646		if fig == 'new':
2647			out.fig = ppl.figure(figsize = (6,6))
2648			ppl.subplots_adjust(.1,.1,.9,.9)
2649
2650		out.anchor_analyses, = ppl.plot(
2651			anchors_d,
2652			anchors_D,
2653			**kw_plot_anchors)
2654		out.unknown_analyses, = ppl.plot(
2655			unknowns_d,
2656			unknowns_D,
2657			**kw_plot_unknowns)
2658		out.anchor_avg = ppl.plot(
2659			*anchor_avg,
2660			**kw_plot_anchor_avg)
2661		out.unknown_avg = ppl.plot(
2662			*unknown_avg,
2663			**kw_plot_unknown_avg)
2664		if xylimits == 'constant':
2665			x = [r[f'd{self._4x}'] for r in self]
2666			y = [r[f'D{self._4x}'] for r in self]
2667			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2668			w, h = x2-x1, y2-y1
2669			x1 -= w/20
2670			x2 += w/20
2671			y1 -= h/20
2672			y2 += h/20
2673			ppl.axis([x1, x2, y1, y2])
2674		elif xylimits == 'free':
2675			x1, x2, y1, y2 = ppl.axis()
2676		else:
2677			x1, x2, y1, y2 = ppl.axis(xylimits)
2678		contour = None
2679		if error_contour_interval != 'none':
2680			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2681			XI,YI = np.meshgrid(xi, yi)
2682			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2683			if error_contour_interval == 'auto':
2684				rng = np.max(SI) - np.min(SI)
2685				if rng <= 0.01:
2686					cinterval = 0.001
2687				elif rng <= 0.03:
2688					cinterval = 0.004
2689				elif rng <= 0.1:
2690					cinterval = 0.01
2691				elif rng <= 0.3:
2692					cinterval = 0.03
2693				elif rng <= 1.:
2694					cinterval = 0.1
2695				else:
2696					cinterval = 0.5
2697			else:
2698				cinterval = error_contour_interval
2699
2700			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2701			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2702			out.clabel = ppl.clabel(out.contour)
2703			contour = (XI, YI, SI, cval, cinterval)
2704
2705		if fig is None:
2706			return {
2707			'anchors':anchors,
2708			'unknowns':unknowns,
2709			'anchors_d':anchors_d,
2710			'anchors_D':anchors_D,
2711			'unknowns_d':unknowns_d,
2712			'unknowns_D':unknowns_D,
2713			'anchor_avg':anchor_avg,
2714			'unknown_avg':unknown_avg,
2715			'contour':contour,
2716			}
2717
2718		ppl.xlabel(x_label)
2719		ppl.ylabel(y_label)
2720		ppl.title(session, weight = 'bold')
2721		ppl.grid(alpha = .2)
2722		out.ax = ppl.gca()		
2723
2724		return out
2725
2726	def plot_residuals(
2727		self,
2728		kde = False,
2729		hist = False,
2730		binwidth = 2/3,
2731		dir = 'output',
2732		filename = None,
2733		highlight = [],
2734		colors = None,
2735		figsize = None,
2736		dpi = 100,
2737		yspan = None,
2738		):
2739		'''
2740		Plot residuals of each analysis as a function of time (actually, as a function of
2741		the order of analyses in the `D4xdata` object)
2742
2743		+ `kde`: whether to add a kernel density estimate of residuals
2744		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2745		+ `binwidth`: width of histogram bins, expressed as a fraction of the Δ4x repeatability (SD)
2746		+ `dir`: the directory in which to save the plot
2747		+ `highlight`: a list of samples to highlight
2748		+ `colors`: a dict of `{<sample>: (r, g, b)}` for all samples
2749		+ `figsize`: (width, height) of figure
2750		+ `dpi`: resolution for PNG output
2751		+ `yspan`: factor controlling the range of y values shown in plot
2752		  (by default: `yspan = 1.5 if kde else 1.0`)
2753		'''
2754		
2755		from matplotlib import ticker
2756
2757		if yspan is None:
2758			if kde:
2759				yspan = 1.5
2760			else:
2761				yspan = 1.0
2762		
2763		# Layout
2764		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2765		if hist or kde:
2766			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2767			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2768		else:
2769			ppl.subplots_adjust(.08,.05,.78,.8)
2770			ax1 = ppl.subplot(111)
2771		
2772		# Colors
2773		N = len(self.anchors)
2774		if colors is None:
2775			if len(highlight) > 0:
2776				Nh = len(highlight)
2777				if Nh == 1:
2778					colors = {highlight[0]: (0,0,0)}
2779				elif Nh == 3:
2780					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2781				elif Nh == 4:
2782					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2783				else:
2784					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2785			else:
2786				if N == 3:
2787					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2788				elif N == 4:
2789					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2790				else:
2791					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2792
2793		ppl.sca(ax1)
2794		
2795		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2796
2797		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2798
2799		session = self[0]['Session']
2800		x1 = 0
2801# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2802		x_sessions = {}
2803		one_or_more_singlets = False
2804		one_or_more_multiplets = False
2805		multiplets = set()
2806		for k,r in enumerate(self):
2807			if r['Session'] != session:
2808				x2 = k-1
2809				x_sessions[session] = (x1+x2)/2
2810				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2811				session = r['Session']
2812				x1 = k
2813			singlet = len(self.samples[r['Sample']]['data']) == 1
2814			if not singlet:
2815				multiplets.add(r['Sample'])
2816			if r['Sample'] in self.unknowns:
2817				if singlet:
2818					one_or_more_singlets = True
2819				else:
2820					one_or_more_multiplets = True
2821			kw = dict(
2822				marker = 'x' if singlet else '+',
2823				ms = 4 if singlet else 5,
2824				ls = 'None',
2825				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2826				mew = 1,
2827				alpha = 0.2 if singlet else 1,
2828				)
2829			if highlight and r['Sample'] not in highlight:
2830				kw['alpha'] = 0.2
2831			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2832		x2 = k
2833		x_sessions[session] = (x1+x2)/2
2834
2835		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2836		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2837		if not (hist or kde):
2838			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2839			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2840
2841		xmin, xmax, ymin, ymax = ppl.axis()
2842		if yspan != 1:
2843			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2844		for s in x_sessions:
2845			ppl.text(
2846				x_sessions[s],
2847				ymax +1,
2848				s,
2849				va = 'bottom',
2850				**(
2851					dict(ha = 'center')
2852					if len(self.sessions[s]['data']) > (0.15 * len(self))
2853					else dict(ha = 'left', rotation = 45)
2854					)
2855				)
2856
2857		if hist or kde:
2858			ppl.sca(ax2)
2859
2860		for s in colors:
2861			kw['marker'] = '+'
2862			kw['ms'] = 5
2863			kw['mec'] = colors[s]
2864			kw['label'] = s
2865			kw['alpha'] = 1
2866			ppl.plot([], [], **kw)
2867
2868		kw['mec'] = (0,0,0)
2869
2870		if one_or_more_singlets:
2871			kw['marker'] = 'x'
2872			kw['ms'] = 4
2873			kw['alpha'] = .2
2874			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2875			ppl.plot([], [], **kw)
2876
2877		if one_or_more_multiplets:
2878			kw['marker'] = '+'
2879			kw['ms'] = 4
2880			kw['alpha'] = 1
2881			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2882			ppl.plot([], [], **kw)
2883
2884		if hist or kde:
2885			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2886		else:
2887			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2888		leg.set_zorder(-1000)
2889
2890		ppl.sca(ax1)
2891
2892		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2893		ppl.xticks([])
2894		ppl.axis([-1, len(self), None, None])
2895
2896		if hist or kde:
2897			ppl.sca(ax2)
2898			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2899
2900			if kde:
2901				from scipy.stats import gaussian_kde
2902				yi = np.linspace(ymin, ymax, 201)
2903				xi = gaussian_kde(X).evaluate(yi)
2904				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2905# 				ppl.plot(xi, yi, 'k-', lw = 1)
2906			elif hist:
2907				ppl.hist(
2908					X,
2909					orientation = 'horizontal',
2910					histtype = 'stepfilled',
2911					ec = [.4]*3,
2912					fc = [.25]*3,
2913					alpha = .25,
2914					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2915					)
2916			ppl.text(0, 0,
2917				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2918				size = 7.5,
2919				alpha = 1,
2920				va = 'center',
2921				ha = 'left',
2922				)
2923
2924			ppl.axis([0, None, ymin, ymax])
2925			ppl.xticks([])
2926			ppl.yticks([])
2927# 			ax2.spines['left'].set_visible(False)
2928			ax2.spines['right'].set_visible(False)
2929			ax2.spines['top'].set_visible(False)
2930			ax2.spines['bottom'].set_visible(False)
2931
2932		ax1.axis([None, None, ymin, ymax])
2933
2934		if not os.path.exists(dir):
2935			os.makedirs(dir)
2936		if filename is None:
2937			return fig
2938		elif filename == '':
2939			filename = f'D{self._4x}_residuals.pdf'
2940		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2941		ppl.close(fig)
2942				
2943
2944	def simulate(self, *args, **kwargs):
2945		'''
2946		Legacy function with warning message pointing to `virtual_data()`
2947		'''
2948		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
2949
2950	def plot_anchor_residuals(
2951		self,
2952		dir = 'output',
2953		filename = '',
2954		figsize = None,
2955		subplots_adjust = (0.05, 0.1, 0.95, 0.98, .25, .25),
2956		dpi = 100,
2957		colors = None,
2958		):
2959		'''
2960		Plot a summary of the residuals for all anchors, intended to help detect systematic bias.
2961		
2962		**Parameters**
2963
2964		+ `dir`: the directory in which to save the plot
2965		+ `filename`: the file name to save to.
2966		+ `figsize`: (width, height) of figure
2967		+ `subplots_adjust`: passed to `subplots_adjust()`
2968		+ `dpi`: resolution for PNG output
2969		+ `colors`: a dict of `{<sample>: (r, g, b)}` for all samples
2970		If `filename` is `None`, the figure is returned instead of being saved.
2971		'''
2972
2973		# Colors
2974		N = len(self.anchors)
2975		if colors is None:
2976			if N == 3:
2977				colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2978			elif N == 4:
2979				colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2980			else:
2981				colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2982
2983		if figsize is None:
2984			figsize = (4, 1.5*N+1)
2985		fig = ppl.figure(figsize = figsize)
2986		ppl.subplots_adjust(*subplots_adjust)
2987		axs = {}
2988		X = np.array([r[f'D{self._4x}_residual'] for a in self.anchors for r in self.anchors[a]['data']])*1000
2989		sigma = self.repeatability[f'r_D{self._4x}a'] * 1000
2990		D = max(np.abs(X))
2991
2992		for k,a in enumerate(self.anchors):
2993			color = colors[a]
2994			axs[a] = ppl.subplot(N, 1, 1+k)
2995			axs[a].text(
2996				0.02, 1-0.05, a,
2997				va = 'top',
2998				ha = 'left',
2999				weight = 'bold',
3000				size = 9,
3001				color = [_*0.75 for _ in color],
3002				transform = axs[a].transAxes,
3003			)
3004			X = np.array([r[f'D{self._4x}_residual'] for r in self.anchors[a]['data']])*1000
3005			axs[a].axvline(0, lw = 0.5, color = color)
3006			axs[a].plot(X, X*0, 'o', mew = 0.7, mec = (*color,.5), mfc = (*color, 0), ms = 7, clip_on = False)
3007
3008			xi = np.linspace(-3*D, 3*D, 601)
3009			yi = np.array([np.exp(-0.5 * ((xi - x)/sigma)**2) for x in X]).sum(0)
3010			ppl.fill_between(xi, yi, yi*0, fc = (*color, .15), lw = 1, ec = color)
3011			
3012			axs[a].errorbar(
3013				X.mean(), yi.max()*.2, None, 1.96*sigma/len(X)**0.5,
3014				ecolor = color,
3015				marker = 's',
3016				ls = 'None',
3017				mec = color,
3018				mew = 1,
3019				mfc = 'w',
3020				ms = 8,
3021				elinewidth = 1,
3022				capsize = 4,
3023				capthick = 1,
3024			)
3025			
3026			axs[a].axis([xi[0], xi[-1], 0, yi.max()*1.05])
3027			ppl.yticks([])
3028
3029		ppl.xlabel(f'$Δ_{{{self._4x}}}$ residuals (ppm)')		
3030
3031		if not os.path.exists(dir):
3032			os.makedirs(dir)
3033		if filename is None:
3034			return fig
3035		elif filename == '':
3036			filename = f'D{self._4x}_anchor_residuals.pdf'
3037		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
3038		ppl.close(fig)
3039		
3040
3041	def plot_distribution_of_analyses(
3042		self,
3043		dir = 'output',
3044		filename = None,
3045		vs_time = False,
3046		figsize = (6,4),
3047		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
3048		output = None,
3049		dpi = 100,
3050		):
3051		'''
3052		Plot temporal distribution of all analyses in the data set.
3053		
3054		**Parameters**
3055
3056		+ `dir`: the directory in which to save the plot
3057		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
3058		+ `figsize`: (width, height) of figure
3059		+ `dpi`: resolution for PNG output
3060		+ `output`: if set to `'fig'` or `'ax'`, return the figure or the axes instead of saving to file
3061		'''
3062
3063		asamples = [s for s in self.anchors]
3064		usamples = [s for s in self.unknowns]
3065		if output is None or output == 'fig':
3066			fig = ppl.figure(figsize = figsize)
3067			ppl.subplots_adjust(*subplots_adjust)
3068		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
3069		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
3070		Xmax += (Xmax-Xmin)/40
3071		Xmin -= (Xmax-Xmin)/41
3072		for k, s in enumerate(asamples + usamples):
3073			if vs_time:
3074				X = [r['TimeTag'] for r in self if r['Sample'] == s]
3075			else:
3076				X = [x for x,r in enumerate(self) if r['Sample'] == s]
3077			Y = [-k for x in X]
3078			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
3079			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
3080			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
3081		ppl.axis([Xmin, Xmax, -k-1, 1])
3082		ppl.xlabel('\ntime')
3083		ppl.gca().annotate('',
3084			xy = (0.6, -0.02),
3085			xycoords = 'axes fraction',
3086			xytext = (.4, -0.02), 
3087            arrowprops = dict(arrowstyle = "->", color = 'k'),
3088            )
3089			
3090
3091		x2 = -1
3092		for session in self.sessions:
3093			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
3094			if vs_time:
3095				ppl.axvline(x1, color = 'k', lw = .75)
3096			if x2 > -1:
3097				if not vs_time:
3098					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
3099			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
3100# 			from xlrd import xldate_as_datetime
3101# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
3102			if vs_time:
3103				ppl.axvline(x2, color = 'k', lw = .75)
3104				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
3105			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
3106
3107		ppl.xticks([])
3108		ppl.yticks([])
3109
3110		if output is None:
3111			if not os.path.exists(dir):
3112				os.makedirs(dir)
3113			if filename is None:
3114				filename = f'D{self._4x}_distribution_of_analyses.pdf'
3115			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
3116			ppl.close(fig)
3117		elif output == 'ax':
3118			return ppl.gca()
3119		elif output == 'fig':
3120			return fig
3121
3122
3123	def plot_bulk_compositions(
3124		self,
3125		samples = None,
3126		dir = 'output/bulk_compositions',
3127		figsize = (6,6),
3128		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
3129		show = False,
3130		sample_color = (0,.5,1),
3131		analysis_color = (.7,.7,.7),
3132		labeldist = 0.3,
3133		radius = 0.05,
3134		):
3135		'''
3136		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
3137		
3138		By default, creates a directory `./output/bulk_compositions` where plots for
3139		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
3140		
3141		
3142		**Parameters**
3143
3144		+ `samples`: Only these samples are processed (by default: all samples).
3145		+ `dir`: where to save the plots
3146		+ `figsize`: (width, height) of figure
3147		+ `subplots_adjust`: passed to `subplots_adjust()`
3148		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
3149		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
3150		+ `sample_color`: color used for sample markers/labels
3151		+ `analysis_color`: color used for replicate (analysis) markers/labels
3152		+ `labeldist`: distance (in inches) from replicate markers to replicate labels
3153		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
3154		'''
3155
3156		from matplotlib.patches import Ellipse
3157
3158		if samples is None:
3159			samples = [_ for _ in self.samples]
3160
3161		saved = {}
3162
3163		for s in samples:
3164
3165			fig = ppl.figure(figsize = figsize)
3166			fig.subplots_adjust(*subplots_adjust)
3167			ax = ppl.subplot(111)
3168			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3169			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3170			ppl.title(s)
3171
3172
3173			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
3174			UID = [_['UID'] for _ in self.samples[s]['data']]
3175			XY0 = XY.mean(0)
3176
3177			for xy in XY:
3178				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
3179				
3180			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
3181			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
3182			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3183			saved[s] = [XY, XY0]
3184			
3185			x1, x2, y1, y2 = ppl.axis()
3186			x0, dx = (x1+x2)/2, (x2-x1)/2
3187			y0, dy = (y1+y2)/2, (y2-y1)/2
3188			dx, dy = [max(max(dx, dy), radius)]*2
3189
3190			ppl.axis([
3191				x0 - 1.2*dx,
3192				x0 + 1.2*dx,
3193				y0 - 1.2*dy,
3194				y0 + 1.2*dy,
3195				])			
3196
3197			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3198
3199			for xy, uid in zip(XY, UID):
3200
3201				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3202				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3203
3204				if (vector_in_display_space**2).sum() > 0:
3205
3206					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3207					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3208					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3209					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3210
3211					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3212
3213				else:
3214
3215					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3216
3217			if radius:
3218				ax.add_artist(Ellipse(
3219					xy = XY0,
3220					width = radius*2,
3221					height = radius*2,
3222					ls = (0, (2,2)),
3223					lw = .7,
3224					ec = analysis_color,
3225					fc = 'None',
3226					))
3227				ppl.text(
3228					XY0[0],
3229					XY0[1]-radius,
3230					f'\n± {radius*1e3:.0f} ppm',
3231					color = analysis_color,
3232					va = 'top',
3233					ha = 'center',
3234					linespacing = 0.4,
3235					size = 8,
3236					)
3237
3238			if not os.path.exists(dir):
3239				os.makedirs(dir)
3240			fig.savefig(f'{dir}/{s}.pdf')
3241			ppl.close(fig)
3242
3243		fig = ppl.figure(figsize = figsize)
3244		fig.subplots_adjust(*subplots_adjust)
3245		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3246		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3247
3248		for s in saved:
3249			for xy in saved[s][0]:
3250				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3251			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3252			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3253			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3254
3255		x1, x2, y1, y2 = ppl.axis()
3256		ppl.axis([
3257			x1 - (x2-x1)/10,
3258			x2 + (x2-x1)/10,
3259			y1 - (y2-y1)/10,
3260			y2 + (y2-y1)/10,
3261			])			
3262
3263
3264		if not os.path.exists(dir):
3265			os.makedirs(dir)
3266		fig.savefig(f'{dir}/__all__.pdf')
3267		if show:
3268			ppl.show()
3269		ppl.close(fig)
3270		
3271
3272	def _save_D4x_correl(
3273		self,
3274		samples = None,
3275		dir = 'output',
3276		filename = None,
3277		D4x_precision = 4,
3278		correl_precision = 4,
3279		save_to_file = True,
3280		):
3281		'''
3282		Save D4x values along with their SE and correlation matrix.
3283
3284		**Parameters**
3285
3286		+ `samples`: Only these samples are output (by default: all samples).
3287		+ `dir`: the directory in which to save the file (by default: `output`)
3288		+ `filename`: the name of the csv file to write to (by default: `D4x_correl.csv`)
3289		+ `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4)
3290		+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
3291		+ `save_to_file`: whether to write the output to a file (by default: `True`). If `False`,
3292		returns the output as a string.
3293		'''
3294		if samples is None:
3295			samples = sorted([s for s in self.unknowns])
3296		
3297		out = [['Sample']] + [[s] for s in samples]
3298		out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl']
3299		for k,s in enumerate(samples):
3300			out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.{D4x_precision}f}', f'{self.samples[s][f"SE_D{self._4x}"]:.{D4x_precision}f}']
3301			for s2 in samples:
3302				out[k+1] += [f'{self.sample_D4x_correl(s,s2):.{correl_precision}f}']
3303		
3304		if save_to_file:
3305			if not os.path.exists(dir):
3306				os.makedirs(dir)
3307			if filename is None:
3308				filename = f'D{self._4x}_correl.csv'
3309			with open(f'{dir}/{filename}', 'w') as fid:
3310				fid.write(make_csv(out))
3311		else:
3312			return make_csv(out)

Store and process data for a large set of Δ47 and/or Δ48 analyses, usually comprising more than one analytical session.

D4xdata(l=[], mass='47', logfile='', session='mySession', verbose=False)
970	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
971		'''
972		**Parameters**
973
974		+ `l`: a list of dictionaries, with each dictionary including at least the keys
975		`Sample`, `d45`, `d46`, and `d47` or `d48`.
976		+ `mass`: `'47'` or `'48'`
977		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
978		+ `session`: define session name for analyses without a `Session` key
979		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
980
981		Returns a `D4xdata` object derived from `list`.
982		'''
983		self._4x = mass
984		self.verbose = verbose
985		self.prefix = 'D4xdata'
986		self.logfile = logfile
987		list.__init__(self, l)
988		self.Nf = None
989		self.repeatability = {}
990		self.refresh(session = session)

Parameters

  • l: a list of dictionaries, with each dictionary including at least the keys Sample, d45, d46, and d47 or d48.
  • mass: '47' or '48'
  • logfile: if specified, write detailed logs to this file path when calling D4xdata methods.
  • session: define session name for analyses without a Session key
  • verbose: if True, print out detailed logs when calling D4xdata methods.

Returns a D4xdata object derived from list.
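
For instance, a minimal sketch of building a D4xdata object directly from a list of dictionaries rather than from a csv file (UIDs, session name and delta values below are purely illustrative):

import D47crunch

# hypothetical analyses; all values are illustrative only
analyses = [
    {'UID': 'X01', 'Session': 'S1', 'Sample': 'ETH-1', 'd45': 5.80, 'd46': 11.63, 'd47': 16.89},
    {'UID': 'X02', 'Session': 'S1', 'Sample': 'ETH-2', 'd45': -6.06, 'd46': -4.82, 'd47': -11.64},
    ]
mydata = D47crunch.D4xdata(analyses, mass = '47', verbose = True)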

R13_VPDB = 0.01118

Absolute (13C/12C) ratio of VPDB. By default equal to 0.01118 (Chang & Li, 1990)

R18_VSMOW = 0.0020052

Absolute (18O/16O) ratio of VSMOW. By default equal to 0.0020052 (Baertschi, 1976)

LAMBDA_17 = 0.528

Mass-dependent exponent for triple oxygen isotopes. By default equal to 0.528 (Barkan & Luz, 2005)

R17_VSMOW = 0.00038475

Absolute (17O/16O) ratio of VSMOW. By default equal to 0.00038475 (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)

R18_VPDB = 0.0020672007840000003

Absolute (18O/16O) ratio of VPDB. By definition equal to R18_VSMOW * 1.03092.

R17_VPDB = 0.0003909861828790272

Absolute (17O/16O) ratio of VPDB. By definition equal to R17_VSMOW * 1.03092 ** LAMBDA_17.

LEVENE_REF_SAMPLE = 'ETH-3'

After the Δ4x standardization step, each sample is tested to assess whether the Δ4x variance within all analyses for that sample differs significantly from that observed for a given reference sample (using Levene's test, which yields a p-value corresponding to the null hypothesis that the underlying variances are equal).

LEVENE_REF_SAMPLE (by default equal to 'ETH-3') specifies which sample should be used as a reference for this test.
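
To use a different anchor as the reference, reassign this attribute before standardizing. A minimal sketch (assuming ETH-1 analyses are present in the data set):

mydata = D47crunch.D47data()
mydata.LEVENE_REF_SAMPLE = 'ETH-1'  # replace the default reference sample 'ETH-3'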

ALPHA_18O_ACID_REACTION = np.float64(1.008129)

Specifies the 18O/16O fractionation factor generally applicable to acid reactions in the dataset. Currently used by D4xdata.wg(), D4xdata.standardize_d13C(), and D4xdata.standardize_d18O().

By default equal to 1.008129 (calcite reacted at 90 °C, Kim et al., 2007).
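
If the samples were reacted under other conditions, this attribute may be overridden before calling the methods above. A minimal sketch; the value below is a made-up placeholder, not a recommended factor:

mydata = D47crunch.D47data()
mydata.ALPHA_18O_ACID_REACTION = 1.00850  # hypothetical value; use the factor appropriate to your acid temperature and mineralogy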

Nominal_d13C_VPDB = {'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}

Nominal δ13CVPDB values assigned to carbonate standards, used by D4xdata.standardize_d13C().

By default equal to {'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71} after Bernasconi et al. (2018).

Nominal_d18O_VPDB = {'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}

Nominal δ18OVPDB values assigned to carbonate standards, used by D4xdata.standardize_d18O().

By default equal to {'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78} after Bernasconi et al. (2018).
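
Both dictionaries may be edited to add or redefine anchors, e.g. for an in-house standard (the name and values below are made up):

mydata = D47crunch.D47data()
mydata.Nominal_d13C_VPDB['INHOUSE-1'] = 1.23   # hypothetical anchor value
mydata.Nominal_d18O_VPDB['INHOUSE-1'] = -4.56  # hypothetical anchor value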

d13C_STANDARDIZATION_METHOD = '2pt'

Method by which to standardize δ13C values:

  • 'none': do not apply any δ13C standardization.
  • '1pt': within each session, offset all initial δ13C values so as to minimize the difference between final δ13CVPDB values and Nominal_d13C_VPDB (averaged over all analyses for which Nominal_d13C_VPDB is defined).
  • '2pt': within each session, apply an affine transformation to all δ13C values so as to minimize the difference between final δ13CVPDB values and Nominal_d13C_VPDB (averaged over all analyses for which Nominal_d13C_VPDB is defined).

d18O_STANDARDIZATION_METHOD = '2pt'

Method by which to standardize δ18O values:

  • 'none': do not apply any δ18O standardization.
  • '1pt': within each session, offset all initial δ18O values so as to minimize the difference between final δ18OVPDB values and Nominal_d18O_VPDB (averaged over all analyses for which Nominal_d18O_VPDB is defined).
  • '2pt': within each session, apply an affine transformation to all δ18O values so as to minimize the difference between final δ18OVPDB values and Nominal_d18O_VPDB (averaged over all analyses for which Nominal_d18O_VPDB is defined).
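
Either method may be changed globally before reading data, or per session afterwards. A minimal sketch (assuming, as in the tutorial file, that analyses carry no Session field and thus default to the 'mySession' session):

mydata = D47crunch.D47data()
mydata.d13C_STANDARDIZATION_METHOD = '1pt'  # applies to all sessions created from here on
mydata.read('rawdata.csv')
mydata.sessions['mySession']['d18O_standardization_method'] = 'none'  # per-session override
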
verbose
prefix
logfile
Nf
repeatability
def make_verbal(oldfun):
 993	def make_verbal(oldfun):
 994		'''
 995		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
 996		'''
 997		@wraps(oldfun)
 998		def newfun(*args, verbose = '', **kwargs):
 999			myself = args[0]
1000			oldprefix = myself.prefix
1001			myself.prefix = oldfun.__name__
1002			if verbose != '':
1003				oldverbose = myself.verbose
1004				myself.verbose = verbose
1005			out = oldfun(*args, **kwargs)
1006			myself.prefix = oldprefix
1007			if verbose != '':
1008				myself.verbose = oldverbose
1009			return out
1010		return newfun

Decorator: allow temporarily changing self.prefix and overriding self.verbose.

def msg(self, txt):
1013	def msg(self, txt):
1014		'''
1015		Log a message to `self.logfile`, and print it out if `verbose = True`
1016		'''
1017		self.log(txt)
1018		if self.verbose:
1019			print(f'{f"[{self.prefix}]":<16} {txt}')

Log a message to self.logfile, and print it out if verbose = True

def vmsg(self, txt):
1022	def vmsg(self, txt):
1023		'''
1024		Log a message to `self.logfile` and print it out
1025		'''
1026		self.log(txt)
1027		print(txt)

Log a message to self.logfile and print it out

def log(self, *txts):
1030	def log(self, *txts):
1031		'''
1032		Log a message to `self.logfile`
1033		'''
1034		if self.logfile:
1035			with open(self.logfile, 'a') as fid:
1036				for txt in txts:
1037					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')

Log a message to self.logfile

def refresh(self, session='mySession'):
1040	def refresh(self, session = 'mySession'):
1041		'''
1042		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
1043		'''
1044		self.fill_in_missing_info(session = session)
1045		self.refresh_sessions()
1046		self.refresh_samples()

Update self.sessions, self.samples, self.anchors, and self.unknowns.

def refresh_sessions(self):
1049	def refresh_sessions(self):
1050		'''
1051		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1052		to `False` for all sessions.
1053		'''
1054		self.sessions = {
1055			s: {'data': [r for r in self if r['Session'] == s]}
1056			for s in sorted({r['Session'] for r in self})
1057			}
1058		for s in self.sessions:
1059			self.sessions[s]['scrambling_drift'] = False
1060			self.sessions[s]['slope_drift'] = False
1061			self.sessions[s]['wg_drift'] = False
1062			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1063			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD

Update self.sessions and set scrambling_drift, slope_drift, and wg_drift to False for all sessions.

def refresh_samples(self):
1066	def refresh_samples(self):
1067		'''
1068		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1069		'''
1070		self.samples = {
1071			s: {'data': [r for r in self if r['Sample'] == s]}
1072			for s in sorted({r['Sample'] for r in self})
1073			}
1074		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1075		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}

Define self.samples, self.anchors, and self.unknowns.

def read(self, filename, sep='', session=''):
1078	def read(self, filename, sep = '', session = ''):
1079		'''
1080		Read file in csv format to load data into a `D47data` object.
1081
1082		In the csv file, spaces before and after field separators (`','` by default)
1083		are optional. Each line corresponds to a single analysis.
1084
1085		The required fields are:
1086
1087		+ `UID`: a unique identifier
1088		+ `Session`: an identifier for the analytical session
1089		+ `Sample`: a sample identifier
1090		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1091
1092		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1093		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
1094		and `d49` are optional, and set to NaN by default.
1095
1096		**Parameters**
1097
1098		+ `filename`: the path of the file to read
1099		+ `sep`: csv separator delimiting the fields
1100		+ `session`: set `Session` field to this string for all analyses
1101		'''
1102		with open(filename) as fid:
1103			self.input(fid.read(), sep = sep, session = session)

Read file in csv format to load data into a D47data object.

In the csv file, spaces before and after field separators (',' by default) are optional. Each line corresponds to a single analysis.

The required fields are:

  • UID: a unique identifier
  • Session: an identifier for the analytical session
  • Sample: a sample identifier
  • d45, d46, and at least one of d47 or d48: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as D17O (in ‰ relative to VSMOW, λ = self.LAMBDA_17), and are otherwise assumed to be zero. Working-gas deltas d47, d48 and d49 are optional, and set to NaN by default.

Parameters

  • filename: the path of the file to read
  • sep: csv separator delimiting the fields
  • session: set Session field to this string for all analyses
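
For example, to read a semicolon-delimited file and assign all of its analyses to a single session (file and session names below are hypothetical):

mydata = D47crunch.D47data()
mydata.read('otherdata.csv', sep = ';', session = 'Session_01')  # hypothetical file/session names
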
def input(self, txt, sep='', session=''):
1106	def input(self, txt, sep = '', session = ''):
1107		'''
1108		Read `txt` string in csv format to load analysis data into a `D47data` object.
1109
1110		In the csv string, spaces before and after field separators (`','` by default)
1111		are optional. Each line corresponds to a single analysis.
1112
1113		The required fields are:
1114
1115		+ `UID`: a unique identifier
1116		+ `Session`: an identifier for the analytical session
1117		+ `Sample`: a sample identifier
1118		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1119
1120		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1121		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
1122		and `d49` are optional, and set to NaN by default.
1123
1124		**Parameters**
1125
1126		+ `txt`: the csv string to read
1127		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1128		whichever appears most often in `txt`.
1129		+ `session`: set `Session` field to this string for all analyses
1130		'''
1131		if sep == '':
1132			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1133		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1134		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1135
1136		if session != '':
1137			for r in data:
1138				r['Session'] = session
1139
1140		self += data
1141		self.refresh()

Read txt string in csv format to load analysis data into a D47data object.

In the csv string, spaces before and after field separators (',' by default) are optional. Each line corresponds to a single analysis.

The required fields are:

  • UID: a unique identifier
  • Session: an identifier for the analytical session
  • Sample: a sample identifier
  • d45, d46, and at least one of d47 or d48: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as D17O (in ‰ relative to VSMOW, λ = self.LAMBDA_17), and are otherwise assumed to be zero. Working-gas deltas d47, d48 and d49 are optional, and set to NaN by default.

Parameters

  • txt: the csv string to read
  • sep: csv separator delimiting the fields. By default, use ',', ';', or tab, whichever appears most often in txt.
  • session: set Session field to this string for all analyses
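
For example, loading two analyses directly from a csv string (all values are illustrative only):

mydata = D47crunch.D47data()
mydata.input('''UID,Session,Sample,d45,d46,d47
X01,S1,ETH-1,5.80,11.63,16.89
X02,S1,ETH-2,-6.06,-4.82,-11.64''')  # illustrative values
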
@make_verbal
def wg(self, samples=None, session_groups=None):
1144	@make_verbal
1145	def wg(self,
1146		samples = None,
1147		session_groups = None,
1148	):
1149		'''
1150		Compute bulk composition of the working gas for each session based (by default)
1151		on the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1152		`self.Nominal_d18O_VPDB`.
1153
1154		**Parameters**
1155
1156		+ `samples`: A list of samples specifying the subset of samples (defined in both
1157		`self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`) which will be considered
1158		when computing the working gas. By default, use all samples defined both in
1159		`self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.
1160		+ `session_groups`: a list of lists of sessions
1161		(e.g., `[['session1', 'session2'], ['session3', 'session4', 'session5']]`)
1162		specifying which session groups, if any, have the exact same WG composition.
1163		If set to `'all'`, force all sessions to have the same WG composition (use with
1164		caution and on short time scales, since the WG may drift slowly over long time scales).
1165		'''
1166
1167		self.msg('Computing WG composition:')
1168
1169		a18_acid = self.ALPHA_18O_ACID_REACTION
1170		
1171		if samples is None:
1172			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1173		if session_groups is None:
1174			session_groups = [[s] for s in self.sessions]
1175		elif session_groups == 'all':
1176			session_groups = [[s for s in self.sessions]]
1177
1178		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1179		R45R46_standards = {}
1180		for sample in samples:
1181			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1182			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1183			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1184			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1185			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1186
1187			C12_s = 1 / (1 + R13_s)
1188			C13_s = R13_s / (1 + R13_s)
1189			C16_s = 1 / (1 + R17_s + R18_s)
1190			C17_s = R17_s / (1 + R17_s + R18_s)
1191			C18_s = R18_s / (1 + R17_s + R18_s)
1192
1193			C626_s = C12_s * C16_s ** 2
1194			C627_s = 2 * C12_s * C16_s * C17_s
1195			C628_s = 2 * C12_s * C16_s * C18_s
1196			C636_s = C13_s * C16_s ** 2
1197			C637_s = 2 * C13_s * C16_s * C17_s
1198			C727_s = C12_s * C17_s ** 2
1199
1200			R45_s = (C627_s + C636_s) / C626_s
1201			R46_s = (C628_s + C637_s + C727_s) / C626_s
1202			R45R46_standards[sample] = (R45_s, R46_s)
1203		
1204		for sg in session_groups:
1205			db = [r for s in sg for r in self.sessions[s]['data'] if r['Sample'] in samples]
1206			assert db, f'No sample from {samples} found in session group {sg}.'
1207
1208			X = [r['d45'] for r in db]
1209			Y = [R45R46_standards[r['Sample']][0] for r in db]
1210			x1, x2 = np.min(X), np.max(X)
1211
1212			if x1 < x2:
1213				wgcoord = x1/(x1-x2)
1214			else:
1215				wgcoord = 999
1216
1217			if wgcoord < -.5 or wgcoord > 1.5:
1218				# unreasonable to extrapolate to d45 = 0
1219				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1220			else :
1221				# d45 = 0 is reasonably well bracketed
1222				R45_wg = np.polyfit(X, Y, 1)[1]
1223
1224			X = [r['d46'] for r in db]
1225			Y = [R45R46_standards[r['Sample']][1] for r in db]
1226			x1, x2 = np.min(X), np.max(X)
1227
1228			if x1 < x2:
1229				wgcoord = x1/(x1-x2)
1230			else:
1231				wgcoord = 999
1232
1233			if wgcoord < -.5 or wgcoord > 1.5:
1234				# unreasonable to extrapolate to d46 = 0
1235				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1236			else :
1237				# d46 = 0 is reasonably well bracketed
1238				R46_wg = np.polyfit(X, Y, 1)[1]
1239
1240			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1241
1242			for s in sg:
1243				self.msg(f'Sessions {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1244	
1245				self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1246				self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1247				for r in self.sessions[s]['data']:
1248					r['d13Cwg_VPDB'] = d13Cwg_VPDB
1249					r['d18Owg_VSMOW'] = d18Owg_VSMOW

Compute bulk composition of the working gas for each session based (by default) on the carbonate standards defined in both self.Nominal_d13C_VPDB and self.Nominal_d18O_VPDB.

Parameters

  • samples: A list of samples specifying the subset of samples (defined in both self.Nominal_d13C_VPDB and self.Nominal_d18O_VPDB) which will be considered when computing the working gas. By default, use all samples defined both in self.Nominal_d13C_VPDB and self.Nominal_d18O_VPDB.
  • session_groups: a list of lists of sessions (e.g., [['session1', 'session2'], ['session3', 'session4', 'session5']]) specifying which session groups, if any, have the exact same WG composition. If set to 'all', force all sessions to have the same WG composition (use with caution and on short time scales, since the WG may drift slowly over long time scales).
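
For instance, a minimal sketch, assuming a dataset comprising exactly two sessions (file and session names below are hypothetical), which constrains the WG computation to two anchors while forcing both sessions to share the same WG composition:

import D47crunch

mydata = D47crunch.D47data()
mydata.read('rawdata.csv')   # hypothetical raw data file

# Only use ETH-1 and ETH-2 to constrain the WG composition,
# and force the two sessions to share the exact same WG:
mydata.wg(
    samples = ['ETH-1', 'ETH-2'],
    session_groups = [['Session01', 'Session02']],
)
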
def compute_bulk_delta(self, R45, R46, D17O=0):
def compute_bulk_delta(self, R45, R46, D17O = 0):
    '''
    Compute δ13C_VPDB and δ18O_VSMOW,
    by solving the generalized form of equation (17) from
    [Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
    assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
    solving the corresponding second-order Taylor polynomial.
    (Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
    '''

    K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17

    A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
    B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
    C = 2 * self.R18_VSMOW
    D = -R46

    aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
    bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
    cc = A + B + C + D

    d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)

    R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
    R17 = K * R18 ** self.LAMBDA_17
    R13 = R45 - 2 * R17

    d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)

    return d13C_VPDB, d18O_VSMOW

Compute δ13C_VPDB and δ18O_VSMOW, by solving the generalized form of equation (17) from Brand et al. (2010), assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and solving the corresponding second-order Taylor polynomial. (Appendix A of Daëron et al., 2016)

@make_verbal
def crunch(self, verbose=''):
@make_verbal
def crunch(self, verbose = ''):
    '''
    Compute bulk composition and raw clumped isotope anomalies for all analyses.
    '''
    for r in self:
        self.compute_bulk_and_clumping_deltas(r)
    self.standardize_d13C()
    self.standardize_d18O()
    self.msg(f"Crunched {len(self)} analyses.")

Compute bulk composition and raw clumped isotope anomalies for all analyses.

def fill_in_missing_info(self, session='mySession'):
def fill_in_missing_info(self, session = 'mySession'):
    '''
    Fill in optional fields with default values
    '''
    for i,r in enumerate(self):
        if 'D17O' not in r:
            r['D17O'] = 0.
        if 'UID' not in r:
            r['UID'] = f'{i+1}'
        if 'Session' not in r:
            r['Session'] = session
        for k in ['d47', 'd48', 'd49']:
            if k not in r:
                r[k] = np.nan

Fill in optional fields with default values

def standardize_d13C(self):
def standardize_d13C(self):
    '''
    Perform δ13C standardization within each session `s` according to
    `self.sessions[s]['d13C_standardization_method']`, which is defined by default
    by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
    may be redefined arbitrarily at a later stage.
    '''
    for s in self.sessions:
        if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
            XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
            X,Y = zip(*XY)
            if self.sessions[s]['d13C_standardization_method'] == '1pt':
                offset = np.mean(Y) - np.mean(X)
                for r in self.sessions[s]['data']:
                    r['d13C_VPDB'] += offset
            elif self.sessions[s]['d13C_standardization_method'] == '2pt':
                a,b = np.polyfit(X,Y,1)
                for r in self.sessions[s]['data']:
                    r['d13C_VPDB'] = a * r['d13C_VPDB'] + b

Perform δ13C standardization within each session s according to self.sessions[s]['d13C_standardization_method'], which is defined by default by D47data.refresh_sessions() as equal to self.d13C_STANDARDIZATION_METHOD, but may be redefined arbitrarily at a later stage.

def standardize_d18O(self):
def standardize_d18O(self):
    '''
    Perform δ18O standardization within each session `s` according to
    `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
    which is defined by default by `D47data.refresh_sessions()` as equal to
    `self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
    '''
    for s in self.sessions:
        if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
            XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
            X,Y = zip(*XY)
            Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
            if self.sessions[s]['d18O_standardization_method'] == '1pt':
                offset = np.mean(Y) - np.mean(X)
                for r in self.sessions[s]['data']:
                    r['d18O_VSMOW'] += offset
            elif self.sessions[s]['d18O_standardization_method'] == '2pt':
                a,b = np.polyfit(X,Y,1)
                for r in self.sessions[s]['data']:
                    r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b

Perform δ18O standardization within each session s according to self.ALPHA_18O_ACID_REACTION and self.sessions[s]['d18O_standardization_method'], which is defined by default by D47data.refresh_sessions() as equal to self.d18O_STANDARDIZATION_METHOD, but may be redefined arbitrarily at a later stage.
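
Both δ13C and δ18O standardization are applied session by session as part of D4xdata.crunch(). As a sketch (session name hypothetical), one could switch a single session to two-point standardization before crunching:

mydata.sessions['Session01']['d13C_standardization_method'] = '2pt'
mydata.sessions['Session01']['d18O_standardization_method'] = '2pt'
mydata.crunch()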

def compute_bulk_and_clumping_deltas(self, r):
def compute_bulk_and_clumping_deltas(self, r):
    '''
    Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
    '''

    # Compute working gas R13, R18, and isobar ratios
    R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
    R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
    R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)

    # Compute analyte isobar ratios
    R45 = (1 + r['d45'] / 1000) * R45_wg
    R46 = (1 + r['d46'] / 1000) * R46_wg
    R47 = (1 + r['d47'] / 1000) * R47_wg
    R48 = (1 + r['d48'] / 1000) * R48_wg
    R49 = (1 + r['d49'] / 1000) * R49_wg

    r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
    R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
    R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW

    # Compute stochastic isobar ratios of the analyte
    R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
        R13, R18, D17O = r['D17O']
    )

    # Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
    # and raise a warning if the corresponding anomalies exceed 0.05 ppm (5e-8).
    if (R45 / R45stoch - 1) > 5e-8:
        self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
    if (R46 / R46stoch - 1) > 5e-8:
        self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')

    # Compute raw clumped isotope anomalies
    r['D47raw'] = 1000 * (R47 / R47stoch - 1)
    r['D48raw'] = 1000 * (R48 / R48stoch - 1)
    r['D49raw'] = 1000 * (R49 / R49stoch - 1)

Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis r.

def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
    '''
    Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
    optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
    anomalies (`D47`, `D48`, `D49`), all expressed in permil.
    '''

    # Compute R17
    R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17

    # Compute isotope concentrations
    C12 = (1 + R13) ** -1
    C13 = C12 * R13
    C16 = (1 + R17 + R18) ** -1
    C17 = C16 * R17
    C18 = C16 * R18

    # Compute stochastic isotopologue concentrations
    C626 = C16 * C12 * C16
    C627 = C16 * C12 * C17 * 2
    C628 = C16 * C12 * C18 * 2
    C636 = C16 * C13 * C16
    C637 = C16 * C13 * C17 * 2
    C638 = C16 * C13 * C18 * 2
    C727 = C17 * C12 * C17
    C728 = C17 * C12 * C18 * 2
    C737 = C17 * C13 * C17
    C738 = C17 * C13 * C18 * 2
    C828 = C18 * C12 * C18
    C838 = C18 * C13 * C18

    # Compute stochastic isobar ratios
    R45 = (C636 + C627) / C626
    R46 = (C628 + C637 + C727) / C626
    R47 = (C638 + C728 + C737) / C626
    R48 = (C738 + C828) / C626
    R49 = C838 / C626

    # Account for clumped isotope anomalies (departures from the stochastic ratios)
    R47 *= 1 + D47 / 1000
    R48 *= 1 + D48 / 1000
    R49 *= 1 + D49 / 1000

    # Return isobar ratios
    return R45, R46, R47, R48, R49

Compute isobar ratios for a sample with isotopic ratios R13 and R18, optionally accounting for non-zero values of Δ17O (D17O) and clumped isotope anomalies (D47, D48, D49), all expressed in permil.
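
As a quick consistency check, compute_isobar_ratios() and compute_bulk_delta() may be chained: build stochastic isobar ratios for a known bulk composition, then recover that composition. The numerical values below are arbitrary, and the agreement is only as good as the second-order Taylor approximation:

# assuming `mydata` is a D47data instance
R13 = mydata.R13_VPDB * (1 + 2.0 / 1000)    # δ13C_VPDB = +2.0 ‰
R18 = mydata.R18_VSMOW * (1 + 37.0 / 1000)  # δ18O_VSMOW = +37.0 ‰
R45, R46, R47, R48, R49 = mydata.compute_isobar_ratios(R13, R18)
d13C, d18O = mydata.compute_bulk_delta(R45, R46)
# d13C ≈ 2.0 and d18O ≈ 37.0, within the approximation error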

def split_samples(self, samples_to_split='all', grouping='by_session'):
def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
    '''
    Split unknown samples by UID (treat all analyses as different samples)
    or by session (treat analyses of a given sample in different sessions as
    different samples).

    **Parameters**

    + `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
    + `grouping`: `by_uid` | `by_session`
    '''
    if samples_to_split == 'all':
        samples_to_split = [s for s in self.unknowns]
    gkeys = {'by_uid':'UID', 'by_session':'Session'}
    self.grouping = grouping.lower()
    if self.grouping in gkeys:
        gkey = gkeys[self.grouping]
    for r in self:
        if r['Sample'] in samples_to_split:
            r['Sample_original'] = r['Sample']
            r['Sample'] = f"{r['Sample']}__{r[gkey]}"
        elif r['Sample'] in self.unknowns:
            r['Sample_original'] = r['Sample']
    self.refresh_samples()

Split unknown samples by UID (treat all analyses as different samples) or by session (treat analyses of a given sample in different sessions as different samples).

Parameters

  • samples_to_split: a list of samples to split, e.g., ['IAEA-C1', 'IAEA-C2']
  • grouping: by_uid | by_session
def unsplit_samples(self, tables=False):
def unsplit_samples(self, tables = False):
    '''
    Reverse the effects of `D47data.split_samples()`.

    This should only be used after `D4xdata.standardize()` with `method='pooled'`.

    After `D4xdata.standardize()` with `method='indep_sessions'`, one should
    probably use `D4xdata.combine_samples()` instead to reverse the effects of
    `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
    effects of `D47data.split_samples()` with `grouping='by_session'` (because in
    that case session-averaged Δ4x values are statistically independent).
    '''
    unknowns_old = sorted({s for s in self.unknowns})
    CM_old = self.standardization.covar[:,:]
    VD_old = self.standardization.params.valuesdict().copy()
    vars_old = self.standardization.var_names

    unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})

    Ns = len(vars_old) - len(unknowns_old)
    vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
    VD_new = {k: VD_old[k] for k in vars_old[:Ns]}

    W = np.zeros((len(vars_new), len(vars_old)))
    W[:Ns,:Ns] = np.eye(Ns)
    for u in unknowns_new:
        splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
        if self.grouping == 'by_session':
            weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
        elif self.grouping == 'by_uid':
            weights = [1 for s in splits]
        sw = sum(weights)
        weights = [w/sw for w in weights]
        W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]

    CM_new = W @ CM_old @ W.T
    V = W @ np.array([[VD_old[k]] for k in vars_old])
    VD_new = {k:v[0] for k,v in zip(vars_new, V)}

    self.standardization.covar = CM_new
    self.standardization.params.valuesdict = lambda : VD_new
    self.standardization.var_names = vars_new

    for r in self:
        if r['Sample'] in self.unknowns:
            r['Sample_split'] = r['Sample']
            r['Sample'] = r['Sample_original']

    self.refresh_samples()
    self.consolidate_samples()
    self.repeatabilities()

    if tables:
        self.table_of_analyses()
        self.table_of_samples()

Reverse the effects of D47data.split_samples().

This should only be used after D4xdata.standardize() with method='pooled'.

After D4xdata.standardize() with method='indep_sessions', one should probably use D4xdata.combine_samples() instead to reverse the effects of D47data.split_samples() with grouping='by_uid', or w_avg() to reverse the effects of D47data.split_samples() with grouping='by_session' (because in that case session-averaged Δ4x values are statistically independent).
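
A typical pooled workflow (sample name hypothetical) thus reads:

mydata.split_samples(samples_to_split = ['MYSAMPLE-1'], grouping = 'by_session')
mydata.standardize(method = 'pooled')
mydata.unsplit_samples()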

def assign_timestamps(self):
def assign_timestamps(self):
    '''
    Assign a time field `t` of type `float` to each analysis.

    If `TimeTag` is one of the data fields, `t` is equal within a given session
    to `TimeTag` minus the mean value of `TimeTag` for that session.
    Otherwise, `TimeTag` defaults to the index of each analysis within its
    session and `t` is defined as above.
    '''
    for session in self.sessions:
        sdata = self.sessions[session]['data']
        try:
            t0 = np.mean([r['TimeTag'] for r in sdata])
            for r in sdata:
                r['t'] = r['TimeTag'] - t0
        except KeyError:
            t0 = (len(sdata)-1)/2
            for t,r in enumerate(sdata):
                r['t'] = t - t0

Assign a time field t of type float to each analysis.

If TimeTag is one of the data fields, t is equal within a given session to TimeTag minus the mean value of TimeTag for that session. Otherwise, TimeTag defaults to the index of each analysis within its session and t is defined as above.

def report(self):
def report(self):
    '''
    Prints a report on the standardization fit.
    Only applicable after `D4xdata.standardize(method='pooled')`.
    '''
    report_fit(self.standardization)

Prints a report on the standardization fit. Only applicable after D4xdata.standardize(method='pooled').

def combine_samples(self, sample_groups):
def combine_samples(self, sample_groups):
    '''
    Combine analyses of different samples to compute weighted average Δ4x
    and new error (co)variances corresponding to the groups defined by the `sample_groups`
    dictionary.

    Caution: samples are weighted by number of replicate analyses, which is a
    reasonable default behavior but is not always optimal (e.g., in the case of strongly
    correlated analytical errors for one or more samples).

    Returns a tuple of:

    + the list of group names
    + an array of the corresponding Δ4x values
    + the corresponding (co)variance matrix

    **Parameters**

    + `sample_groups`: a dictionary of the form:
    ```py
    {'group1': ['sample_1', 'sample_2'],
     'group2': ['sample_3', 'sample_4', 'sample_5']}
    ```
    '''

    samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
    groups = sorted(sample_groups.keys())
    group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
    D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
    CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
    W = np.array([
        [self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
        for j in groups])
    D4x_new = W @ D4x_old
    CM_new = W @ CM_old @ W.T

    return groups, D4x_new[:,0], CM_new

Combine analyses of different samples to compute weighted average Δ4x and new error (co)variances corresponding to the groups defined by the sample_groups dictionary.

Caution: samples are weighted by number of replicate analyses, which is a reasonable default behavior but is not always optimal (e.g., in the case of strongly correlated analytical errors for one or more samples).

Returns a tuple of:

  • the list of group names
  • an array of the corresponding Δ4x values
  • the corresponding (co)variance matrix

Parameters

  • sample_groups: a dictionary of the form:
{'group1': ['sample_1', 'sample_2'],
 'group2': ['sample_3', 'sample_4', 'sample_5']}
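
For example, a sketch (group and sample names hypothetical) combining three unknowns into two groups:

groups, D47_avg, CM = mydata.combine_samples({
    'groupA': ['MYSAMPLE-1', 'MYSAMPLE-2'],
    'groupB': ['MYSAMPLE-3'],
})
for g, D47, var in zip(groups, D47_avg, CM.diagonal()):
    print(f'{g}: Δ47 = {D47:.4f} ± {var**.5:.4f} (1SE)')
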
@make_verbal
def standardize(self, method='pooled', weighted_sessions=[], consolidate=True, consolidate_tables=False, consolidate_plots=False, constraints={}):
@make_verbal
def standardize(self,
    method = 'pooled',
    weighted_sessions = [],
    consolidate = True,
    consolidate_tables = False,
    consolidate_plots = False,
    constraints = {},
    ):
    '''
    Compute absolute Δ4x values for all replicate analyses and for sample averages.
    If the `method` argument is set to `'pooled'`, the standardization processes all sessions
    in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
    i.e. that their true Δ4x value does not change between sessions
    ([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If the `method` argument is set to
    `'indep_sessions'`, the standardization processes each session independently, based only
    on anchor analyses.
    '''

    self.standardization_method = method
    self.assign_timestamps()

    if method == 'pooled':
        if weighted_sessions:
            for session_group in weighted_sessions:
                if self._4x == '47':
                    X = D47data([r for r in self if r['Session'] in session_group])
                elif self._4x == '48':
                    X = D48data([r for r in self if r['Session'] in session_group])
                X.Nominal_D4x = self.Nominal_D4x.copy()
                X.refresh()
                result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
                w = np.sqrt(result.redchi)
                self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
                for r in X:
                    r[f'wD{self._4x}raw'] *= w
        else:
            self.msg(f'All D{self._4x}raw weights set to 1 ‰')
            for r in self:
                r[f'wD{self._4x}raw'] = 1.

        params = Parameters()
        for k,session in enumerate(self.sessions):
            self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
            self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
            self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
            s = pf(session)
            params.add(f'a_{s}', value = 0.9)
            params.add(f'b_{s}', value = 0.)
            params.add(f'c_{s}', value = -0.9)
            params.add(f'a2_{s}', value = 0.,
                # vary = self.sessions[session]['scrambling_drift'],
                )
            params.add(f'b2_{s}', value = 0.,
                # vary = self.sessions[session]['slope_drift'],
                )
            params.add(f'c2_{s}', value = 0.,
                # vary = self.sessions[session]['wg_drift'],
                )
            if not self.sessions[session]['scrambling_drift']:
                params[f'a2_{s}'].expr = '0'
            if not self.sessions[session]['slope_drift']:
                params[f'b2_{s}'].expr = '0'
            if not self.sessions[session]['wg_drift']:
                params[f'c2_{s}'].expr = '0'

        for sample in self.unknowns:
            params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)

        for k in constraints:
            params[k].expr = constraints[k]

        def residuals(p):
            R = []
            for r in self:
                session = pf(r['Session'])
                sample = pf(r['Sample'])
                if r['Sample'] in self.Nominal_D4x:
                    R += [ (
                        r[f'D{self._4x}raw'] - (
                            p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
                            + p[f'b_{session}'] * r[f'd{self._4x}']
                            + p[f'c_{session}']
                            + r['t'] * (
                                p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
                                + p[f'b2_{session}'] * r[f'd{self._4x}']
                                + p[f'c2_{session}']
                                )
                            )
                        ) / r[f'wD{self._4x}raw'] ]
                else:
                    R += [ (
                        r[f'D{self._4x}raw'] - (
                            p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
                            + p[f'b_{session}'] * r[f'd{self._4x}']
                            + p[f'c_{session}']
                            + r['t'] * (
                                p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
                                + p[f'b2_{session}'] * r[f'd{self._4x}']
                                + p[f'c2_{session}']
                                )
                            )
                        ) / r[f'wD{self._4x}raw'] ]
            return R

        M = Minimizer(residuals, params)
        result = M.least_squares()
        self.Nf = result.nfree
        self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
        new_names, new_covar, new_se = _fullcovar(result)[:3]
        result.var_names = new_names
        result.covar = new_covar

        for r in self:
            s = pf(r["Session"])
            a = result.params.valuesdict()[f'a_{s}']
            b = result.params.valuesdict()[f'b_{s}']
            c = result.params.valuesdict()[f'c_{s}']
            a2 = result.params.valuesdict()[f'a2_{s}']
            b2 = result.params.valuesdict()[f'b2_{s}']
            c2 = result.params.valuesdict()[f'c2_{s}']
            r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])

        self.standardization = result

        for session in self.sessions:
            self.sessions[session]['Np'] = 3
            for k in ['scrambling', 'slope', 'wg']:
                if self.sessions[session][f'{k}_drift']:
                    self.sessions[session]['Np'] += 1

        if consolidate:
            self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
        return result


    elif method == 'indep_sessions':

        if weighted_sessions:
            for session_group in weighted_sessions:
                X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
                X.Nominal_D4x = self.Nominal_D4x.copy()
                X.refresh()
                # This is only done to assign r['wD47raw'] for r in X:
                X.standardize(method = method, weighted_sessions = [], consolidate = False)
                self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
        else:
            self.msg('All weights set to 1 ‰')
            for r in self:
                r[f'wD{self._4x}raw'] = 1

        for session in self.sessions:
            s = self.sessions[session]
            p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
            p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
            s['Np'] = sum(p_active)
            sdata = s['data']

            A = np.array([
                [
                    self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
                    r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
                    1 / r[f'wD{self._4x}raw'],
                    self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
                    r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
                    r['t'] / r[f'wD{self._4x}raw']
                    ]
                for r in sdata if r['Sample'] in self.anchors
                ])[:,p_active] # only keep columns for the active parameters
            Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
            s['Na'] = Y.size
            CM = linalg.inv(A.T @ A)
            bf = (CM @ A.T @ Y).T[0,:]
            k = 0
            for n,a in zip(p_names, p_active):
                if a:
                    s[n] = bf[k]
                    # self.msg(f'{n} = {bf[k]}')
                    k += 1
                else:
                    s[n] = 0.
                    # self.msg(f'{n} = 0.0')

            for r in sdata:
                a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
                r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
                r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])

            s['CM'] = np.zeros((6,6))
            i = 0
            k_active = [j for j,a in enumerate(p_active) if a]
            for j,a in enumerate(p_active):
                if a:
                    s['CM'][j,k_active] = CM[i,:]
                    i += 1

        if not weighted_sessions:
            w = self.rmswd()['rmswd']
            for r in self:
                r[f'wD{self._4x}'] *= w
                r[f'wD{self._4x}raw'] *= w
            for session in self.sessions:
                self.sessions[session]['CM'] *= w**2

        for session in self.sessions:
            s = self.sessions[session]
            s['SE_a'] = s['CM'][0,0]**.5
            s['SE_b'] = s['CM'][1,1]**.5
            s['SE_c'] = s['CM'][2,2]**.5
            s['SE_a2'] = s['CM'][3,3]**.5
            s['SE_b2'] = s['CM'][4,4]**.5
            s['SE_c2'] = s['CM'][5,5]**.5

        if not weighted_sessions:
            self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
        else:
            self.Nf = 0
            for sg in weighted_sessions:
                self.Nf += self.rmswd(sessions = sg)['Nf']

        self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)

        avgD4x = {
            sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
            for sample in self.samples
            }
        chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
        rD4x = (chi2/self.Nf)**.5
        self.repeatability[f'sigma_{self._4x}'] = rD4x

        if consolidate:
            self.consolidate(tables = consolidate_tables, plots = consolidate_plots)

Compute absolute Δ4x values for all replicate analyses and for sample averages. If the method argument is set to 'pooled', the standardization processes all sessions in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous, i.e. that their true Δ4x value does not change between sessions (Daëron, 2021). If the method argument is set to 'indep_sessions', the standardization processes each session independently, based only on anchor analyses.
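
For instance, to allow a drifting WG offset in one session (name hypothetical) before a pooled standardization:

mydata.sessions['Session01']['wg_drift'] = True
mydata.standardize(method = 'pooled')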

def standardization_error(self, session, d4x, D4x, t=0):
def standardization_error(self, session, d4x, D4x, t = 0):
    '''
    Compute standardization error for a given session and
    (δ47, Δ47) composition.
    '''
    a = self.sessions[session]['a']
    b = self.sessions[session]['b']
    c = self.sessions[session]['c']
    a2 = self.sessions[session]['a2']
    b2 = self.sessions[session]['b2']
    c2 = self.sessions[session]['c2']
    CM = self.sessions[session]['CM']

    x, y = D4x, d4x
    z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
    # x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
    dxdy = -(b+b2*t) / (a+a2*t)
    dxdz = 1. / (a+a2*t)
    dxda = -x / (a+a2*t)
    dxdb = -y / (a+a2*t)
    dxdc = -1. / (a+a2*t)
    dxda2 = -x * t / (a+a2*t)  # ∂x/∂a2 carries a factor t, by analogy with ∂x/∂b2 and ∂x/∂c2
    dxdb2 = -y * t / (a+a2*t)
    dxdc2 = -t / (a+a2*t)
    V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
    sx = (V @ CM @ V.T) ** .5
    return sx

Compute standardization error for a given session and (δ47, Δ47) composition.
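
As a sketch, assuming that standardization and consolidation have already been performed and that a session named 'Session01' exists, the standardization error for a composition of (δ47 = 20 ‰, Δ47 = 0.6 ‰) would read:

sx = mydata.standardization_error('Session01', 20.0, 0.6)
print(f'standardization error: {1000 * sx:.1f} ppm')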

@make_verbal
def summary(self, dir='output', filename=None, save_to_file=True, print_out=True):
@make_verbal
def summary(self,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    ):
    '''
    Print out and/or save to disk a summary of the standardization results.

    **Parameters**

    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    '''

    out = []
    out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
    out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
    out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
    out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
    out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
    out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
    out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
    out += [['Model degrees of freedom', f"{self.Nf}"]]
    out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
    out += [['Standardization method', self.standardization_method]]

    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_summary.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n' + pretty_table(out, header = 0))

Print out and/or save to disk a summary of the standardization results.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
@make_verbal
def table_of_sessions(self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
@make_verbal
def table_of_sessions(self,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out and/or save to disk a table of sessions.

    **Parameters**

    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
        if set to `'raw'`: return a list of lists of strings
        (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
    include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
    include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])

    out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
    if include_a2:
        out[-1] += ['a2 ± SE']
    if include_b2:
        out[-1] += ['b2 ± SE']
    if include_c2:
        out[-1] += ['c2 ± SE']
    for session in self.sessions:
        out += [[
            session,
            f"{self.sessions[session]['Na']}",
            f"{self.sessions[session]['Nu']}",
            f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
            f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
            f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
            f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
            f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
            f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
            f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
            f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
            ]]
        if include_a2:
            if self.sessions[session]['scrambling_drift']:
                out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
            else:
                out[-1] += ['']
        if include_b2:
            if self.sessions[session]['slope_drift']:
                out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
            else:
                out[-1] += ['']
        if include_c2:
            if self.sessions[session]['wg_drift']:
                out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
            else:
                out[-1] += ['']

    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_sessions.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n' + pretty_table(out))
    if output == 'raw':
        return out
    elif output == 'pretty':
        return pretty_table(out)

Print out and/or save to disk a table of sessions.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of lists of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
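
For instance, to obtain the table as a list of lists without writing or printing anything:

rows = mydata.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
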
@make_verbal
def table_of_analyses(self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
@make_verbal
def table_of_analyses(
    self,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out and/or save to disk a table of analyses.

    **Parameters**

    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
        if set to `'raw'`: return a list of lists of strings
        (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''

    out = [['UID','Session','Sample']]
    extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
    for f in extra_fields:
        out[-1] += [f[0]]
    out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
    for r in self:
        out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
        for f in extra_fields:
            out[-1] += [f"{r[f[0]]:{f[1]}}"]
        out[-1] += [
            f"{r['d13Cwg_VPDB']:.3f}",
            f"{r['d18Owg_VSMOW']:.3f}",
            f"{r['d45']:.6f}",
            f"{r['d46']:.6f}",
            f"{r['d47']:.6f}",
            f"{r['d48']:.6f}",
            f"{r['d49']:.6f}",
            f"{r['d13C_VPDB']:.6f}",
            f"{r['d18O_VSMOW']:.6f}",
            f"{r['D47raw']:.6f}",
            f"{r['D48raw']:.6f}",
            f"{r['D49raw']:.6f}",
            f"{r[f'D{self._4x}']:.6f}"
            ]
    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_analyses.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n' + pretty_table(out))
    return out

Print out and/or save to disk a table of analyses.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of lists of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
@make_verbal
def covar_table(self, correl=False, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
@make_verbal
def covar_table(
    self,
    correl = False,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return the variance-covariance matrix of D4x
    for all unknown samples.

    **Parameters**

    + `dir`: the directory in which to save the csv
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the csv
    + `print_out`: whether to print out the matrix
    + `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
        if set to `'raw'`: return a list of lists of strings
        (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    samples = sorted([u for u in self.unknowns])
    out = [[''] + samples]
    for s1 in samples:
        out.append([s1])
        for s2 in samples:
            if correl:
                out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
            else:
                out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')

    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            if correl:
                filename = f'D{self._4x}_correl.csv'
            else:
                filename = f'D{self._4x}_covar.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n'+pretty_table(out))
    if output == 'raw':
        return out
    elif output == 'pretty':
        return pretty_table(out)

Print out, save to disk and/or return the variance-covariance matrix of D4x for all unknown samples.

Parameters

  • dir: the directory in which to save the csv
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the csv
  • print_out: whether to print out the matrix
  • output: if set to 'pretty': return a pretty text matrix (see pretty_table()); if set to 'raw': return a list of lists of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
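
For example, to print out the correlation (rather than covariance) matrix without saving it:

mydata.covar_table(correl = True, save_to_file = False)
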
@make_verbal
def table_of_samples(self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
@make_verbal
def table_of_samples(
    self,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a table of samples.

    **Parameters**

    + `dir`: the directory in which to save the csv
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the csv
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
        if set to `'raw'`: return a list of lists of strings
        (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''

    out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
    for sample in self.anchors:
        out += [[
            f"{sample}",
            f"{self.samples[sample]['N']}",
            f"{self.samples[sample]['d13C_VPDB']:.2f}",
            f"{self.samples[sample]['d18O_VSMOW']:.2f}",
            f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
            f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
            ]]
    for sample in self.unknowns:
        out += [[
            f"{sample}",
            f"{self.samples[sample]['N']}",
            f"{self.samples[sample]['d13C_VPDB']:.2f}",
            f"{self.samples[sample]['d18O_VSMOW']:.2f}",
            f"{self.samples[sample][f'D{self._4x}']:.4f}",
            f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
            f"± {self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
            f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
            f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
            ]]
    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_samples.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n'+pretty_table(out))
    if output == 'raw':
        return out
    elif output == 'pretty':
        return pretty_table(out)

Print out, save to disk and/or return a table of samples.

Parameters

  • dir: the directory in which to save the csv
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the csv
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of lists of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
def plot_sessions(self, dir='output', figsize=(8, 8), filetype='pdf', dpi=100):
def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
    '''
    Generate session plots and save them to disk.

    **Parameters**

    + `dir`: the directory in which to save the plots
    + `figsize`: the width and height (in inches) of each plot
    + `filetype`: 'pdf' or 'png'
    + `dpi`: resolution for PNG output
    '''
    if not os.path.exists(dir):
        os.makedirs(dir)

    for session in self.sessions:
        sp = self.plot_single_session(session, xylimits = 'constant')
        ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
        ppl.close(sp.fig)

Generate session plots and save them to disk.

Parameters

  • dir: the directory in which to save the plots
  • figsize: the width and height (in inches) of each plot
  • filetype: 'pdf' or 'png'
  • dpi: resolution for PNG output
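
For instance, to save PNG plots at a higher resolution (directory name hypothetical):

mydata.plot_sessions(dir = 'myplots', filetype = 'png', dpi = 200)
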
@make_verbal
def consolidate_samples(self):
@make_verbal
def consolidate_samples(self):
    '''
    Compile various statistics for each sample.

    For each anchor sample:

    + `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
    + `SE_D47` or `SE_D48`: set to zero by definition

    For each unknown sample:

    + `D47` or `D48`: the standardized Δ4x value for this unknown
    + `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown

    For each anchor and unknown:

    + `N`: the total number of analyses of this sample
    + `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
    + `d13C_VPDB`: the average δ13C_VPDB value for this sample
    + `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
    + `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
    variance, indicating whether the Δ4x repeatability of this sample differs significantly from
    that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
    '''
    D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
    for sample in self.samples:
        self.samples[sample]['N'] = len(self.samples[sample]['data'])
        if self.samples[sample]['N'] > 1:
            self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])

        self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
        self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])

        D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
        if len(D4x_pop) > 2:
            self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]

    if self.standardization_method == 'pooled':
        for sample in self.anchors:
            self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
            self.samples[sample][f'SE_D{self._4x}'] = 0.
        for sample in self.unknowns:
            self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
            try:
                self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
            except ValueError:
                # when `sample` is constrained by self.standardize(constraints = {...}),
                # it is no longer listed in self.standardization.var_names.
                # Temporary fix: define SE as zero for now
                self.samples[sample][f'SE_D{self._4x}'] = 0.

    elif self.standardization_method == 'indep_sessions':
        for sample in self.anchors:
            self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
            self.samples[sample][f'SE_D{self._4x}'] = 0.
        for sample in self.unknowns:
            self.msg(f'Consolidating sample {sample}')
            self.unknowns[sample][f'session_D{self._4x}'] = {}
            session_avg = []
            for session in self.sessions:
                sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
                if sdata:
                    self.msg(f'{sample} found in session {session}')
                    avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
                    avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
                    # !! TODO: sigma_s below does not account for temporal changes in standardization error
                    sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
                    sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
                    session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
                    self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
            self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
            weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
            wsum = sum([weights[s] for s in weights])
            for s in weights:
                self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]

    for r in self:
        r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']

Compile various statistics for each sample.

For each anchor sample:

  • D47 or D48: the nominal Δ4x value for this anchor, specified by self.Nominal_D4x
  • SE_D47 or SE_D48: set to zero by definition

For each unknown sample:

  • D47 or D48: the standardized Δ4x value for this unknown
  • SE_D47 or SE_D48: the standard error of Δ4x for this unknown

For each anchor and unknown:

  • N: the total number of analyses of this sample
  • SD_D47 or SD_D48: the “sample” (in the statistical sense) standard deviation for this sample
  • d13C_VPDB: the average δ13C_VPDB value for this sample
  • d18O_VSMOW: the average δ18O_VSMOW value for this sample (as CO2)
  • p_Levene: the p-value from a Levene test of equal variance, indicating whether the Δ4x repeatability of this sample differs significantly from that observed for the reference sample specified by self.LEVENE_REF_SAMPLE.
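
Once compiled (e.g., by D4xdata.standardize() with consolidate=True), these statistics can be read directly from self.samples; a sketch with a hypothetical unknown:

s = mydata.samples['MYSAMPLE-1']
print(s['N'], s['D47'], s['SE_D47'], s.get('p_Levene'))
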
def consolidate_sessions(self):
def consolidate_sessions(self):
    '''
    Compute various statistics for each session.

    + `Na`: Number of anchor analyses in the session
    + `Nu`: Number of unknown analyses in the session
    + `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
    + `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
    + `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
    + `a`: scrambling factor
    + `b`: compositional slope
    + `c`: WG offset
    + `SE_a`: Model standard error of `a`
    + `SE_b`: Model standard error of `b`
    + `SE_c`: Model standard error of `c`
    + `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
    + `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
    + `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
    + `a2`: scrambling factor drift
    + `b2`: compositional slope drift
    + `c2`: WG offset drift
    + `Np`: Number of standardization parameters to fit
    + `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
    + `d13Cwg_VPDB`: δ13C_VPDB of WG
    + `d18Owg_VSMOW`: δ18O_VSMOW of WG
    '''
    for session in self.sessions:
        if 'd13Cwg_VPDB' not in self.sessions[session]:
            self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
        if 'd18Owg_VSMOW' not in self.sessions[session]:
            self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
        self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
        self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])

        self.msg(f'Computing repeatabilities for session {session}')
        self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
        self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
        self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])

    if self.standardization_method == 'pooled':
        for session in self.sessions:

            # different (better?) computation of D4x repeatability for each session:
            sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
            self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5

            self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
            i = self.standardization.var_names.index(f'a_{pf(session)}')
            self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5

            self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
            i = self.standardization.var_names.index(f'b_{pf(session)}')
            self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5

            self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
            i = self.standardization.var_names.index(f'c_{pf(session)}')
            self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5

            self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
            if self.sessions[session]['scrambling_drift']:
                i = self.standardization.var_names.index(f'a2_{pf(session)}')
                self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
            else:
                self.sessions[session]['SE_a2'] = 0.

            self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
            if self.sessions[session]['slope_drift']:
                i = self.standardization.var_names.index(f'b2_{pf(session)}')
                self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
            else:
                self.sessions[session]['SE_b2'] = 0.

            self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
            if self.sessions[session]['wg_drift']:
                i = self.standardization.var_names.index(f'c2_{pf(session)}')
                self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
            else:
                self.sessions[session]['SE_c2'] = 0.

            i = self.standardization.var_names.index(f'a_{pf(session)}')
            j = self.standardization.var_names.index(f'b_{pf(session)}')
            k = self.standardization.var_names.index(f'c_{pf(session)}')
            CM = np.zeros((6,6))
            CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
            try:
                i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
                CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
                CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
                try:
                    j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
                    CM[3,4] = self.standardization.covar[i2,j2]
                    CM[4,3] = self.standardization.covar[j2,i2]
                except ValueError:
                    pass
                try:
                    k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                    CM[3,5] = self.standardization.covar[i2,k2]
                    CM[5,3] = self.standardization.covar[k2,i2]
                except ValueError:
                    pass
            except ValueError:
                pass
            try:
                j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
                CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
                CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
                try:
                    k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                    CM[4,5] = self.standardization.covar[j2,k2]
                    CM[5,4] = self.standardization.covar[k2,j2]
                except ValueError:
                    pass
            except ValueError:
                pass
            try:
                k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
                CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
            except ValueError:
                pass

            self.sessions[session]['CM'] = CM

    elif self.standardization_method == 'indep_sessions':
        pass # Not implemented yet

Compute various statistics for each session.

  • Na: Number of anchor analyses in the session
  • Nu: Number of unknown analyses in the session
  • r_d13C_VPDB: δ13C_VPDB repeatability of analyses within the session
  • r_d18O_VSMOW: δ18O_VSMOW repeatability of analyses within the session
  • r_D47 or r_D48: Δ4x repeatability of analyses within the session
  • a: scrambling factor
  • b: compositional slope
  • c: WG offset
  • SE_a: Model standard error of a
  • SE_b: Model standard error of b
  • SE_c: Model standard error of c
  • scrambling_drift (boolean): whether to allow a temporal drift in the scrambling factor (a)
  • slope_drift (boolean): whether to allow a temporal drift in the compositional slope (b)
  • wg_drift (boolean): whether to allow a temporal drift in the WG offset (c)
  • a2: scrambling factor drift
  • b2: compositional slope drift
  • c2: WG offset drift
  • Np: Number of standardization parameters to fit
  • CM: model covariance matrix for (a, b, c, a2, b2, c2)
  • d13Cwg_VPDB: δ13C_VPDB of WG
  • d18Owg_VSMOW: δ18O_VSMOW of WG
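
Similarly, the per-session statistics listed above can be read from self.sessions (session name hypothetical):

sess = mydata.sessions['Session01']
print(sess['Na'], sess['Nu'], sess['a'], sess['SE_a'])
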
@make_verbal
def repeatabilities(self):
@make_verbal
def repeatabilities(self):
    '''
    Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
    (for all samples, for anchors, and for unknowns).
    '''
    self.msg('Computing repeatabilities for all sessions')

    self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
    self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
    self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
    self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
    self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')

Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, and Δ4x (for all samples, for anchors, and for unknowns).
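For instance, with a fully standardized `D47data` object named `mydata`:

```python
mydata.repeatabilities()
print(mydata.repeatability['r_D47'])   # Δ47 repeatability (in ‰), all samples
print(mydata.repeatability['r_D47a'])  # anchors only
print(mydata.repeatability['r_D47u'])  # unknowns only
```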

@make_verbal
def consolidate(self, tables=True, plots=True):
2386	@make_verbal
2387	def consolidate(self, tables = True, plots = True):
2388		'''
2389		Collect information about samples, sessions and repeatabilities.
2390		'''
2391		self.consolidate_samples()
2392		self.consolidate_sessions()
2393		self.repeatabilities()
2394
2395		if tables:
2396			self.summary()
2397			self.table_of_sessions()
2398			self.table_of_analyses()
2399			self.table_of_samples()
2400
2401		if plots:
2402			self.plot_sessions()

Collect information about samples, sessions and repeatabilities.
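`consolidate()` is normally invoked at the end of `standardize()`, so calling it directly is mainly useful to regenerate the output tables and/or plots, e.g.:

```python
# regenerate the summary and tables without redrawing the session plots:
mydata.consolidate(tables = True, plots = False)
```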

@make_verbal
def rmswd(self, samples='all samples', sessions='all sessions'):
2405	@make_verbal
2406	def rmswd(self,
2407		samples = 'all samples',
2408		sessions = 'all sessions',
2409		):
2410		'''
2411		Compute the χ2, root mean squared weighted deviation
2412		(i.e. reduced χ2), and corresponding degrees of freedom of the
2413		Δ4x values for samples in `samples` and sessions in `sessions`.
2414		
2415		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
2416		'''
2417		if samples == 'all samples':
2418			mysamples = [k for k in self.samples]
2419		elif samples == 'anchors':
2420			mysamples = [k for k in self.anchors]
2421		elif samples == 'unknowns':
2422			mysamples = [k for k in self.unknowns]
2423		else:
2424			mysamples = samples
2425
2426		if sessions == 'all sessions':
2427			sessions = [k for k in self.sessions]
2428
2429		chisq, Nf = 0, 0
2430		for sample in mysamples :
2431			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2432			if len(G) > 1 :
2433				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2434				Nf += (len(G) - 1)
2435				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2436		r = (chisq / Nf)**.5 if Nf > 0 else 0
2437		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2438		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}

Compute the χ2, root mean squared weighted deviation (i.e. reduced χ2), and corresponding degrees of freedom of the Δ4x values for samples in samples and sessions in sessions.

Only used in D4xdata.standardize() with method='indep_sessions'.
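A sketch of direct use (only meaningful after standardizing with `method='indep_sessions'`, which computes the analysis weights this method relies on):

```python
out = mydata.rmswd(samples = 'anchors')
print(f"RMSWD = {out['rmswd']:.3f} (χ² = {out['chisq']:.1f}, Nf = {out['Nf']})")
```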

@make_verbal
def compute_r(self, key, samples='all samples', sessions='all sessions'):
2441	@make_verbal
2442	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2443		'''
2444		Compute the repeatability of `[r[key] for r in self]`
2445		'''
2446
2447		if samples == 'all samples':
2448			mysamples = [k for k in self.samples]
2449		elif samples == 'anchors':
2450			mysamples = [k for k in self.anchors]
2451		elif samples == 'unknowns':
2452			mysamples = [k for k in self.unknowns]
2453		else:
2454			mysamples = samples
2455
2456		if sessions == 'all sessions':
2457			sessions = [k for k in self.sessions]
2458
2459		if key in ['D47', 'D48']:
2460			# Full disclosure: the definition of Nf is tricky/debatable
2461			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2462			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2463			Nf = len(G)
2464# 			print(f'len(G) = {Nf}')
2465			Nf -= len([s for s in mysamples if s in self.unknowns])
2466# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2467			for session in sessions:
2468				Np = len([
2469					_ for _ in self.standardization.params
2470					if (
2471						self.standardization.params[_].expr is not None
2472						and (
2473							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2474							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2475							)
2476						)
2477					])
2478# 				print(f'session {session}: {Np} parameters to consider')
2479				Na = len({
2480					r['Sample'] for r in self.sessions[session]['data']
2481					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2482					})
2483# 				print(f'session {session}: {Na} different anchors in that session')
2484				Nf -= min(Np, Na)
2485# 			print(f'Nf = {Nf}')
2486
2487# 			for sample in mysamples :
2488# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2489# 				if len(X) > 1 :
2490# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2491# 					if sample in self.unknowns:
2492# 						Nf += len(X) - 1
2493# 					else:
2494# 						Nf += len(X)
2495# 			if samples in ['anchors', 'all samples']:
2496# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2497			r = (chisq / Nf)**.5 if Nf > 0 else 0
2498
2499		else: # if key not in ['D47', 'D48']
2500			chisq, Nf = 0, 0
2501			for sample in mysamples :
2502				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2503				if len(X) > 1 :
2504					Nf += len(X) - 1
2505					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2506			r = (chisq / Nf)**.5 if Nf > 0 else 0
2507
2508		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2509		return r

Compute the repeatability of [r[key] for r in self]
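For instance, to reproduce the repeatability figures reported by `summary()`:

```python
# δ13C repeatability of anchor analyses, converted to ppm:
r = mydata.compute_r('d13C_VPDB', samples = 'anchors')
print(f'{1000 * r:.1f} ppm')
```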

def sample_average(self, samples, weights='equal', normalize=True):
2511	def sample_average(self, samples, weights = 'equal', normalize = True):
2512		'''
2513		Weighted average Δ4x value of a group of samples, accounting for covariance.
2514
2515		Returns the weighted average Δ4x value and associated SE
2516		of a group of samples. Weights are equal by default. If `normalize` is
2517		true, `weights` will be rescaled so that their sum equals 1.
2518
2519		**Examples**
2520
2521		```python
2522		self.sample_average(['X','Y'], [1, 2])
2523		```
2524
2525		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2526		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2527		values of samples X and Y, respectively.
2528
2529		```python
2530		self.sample_average(['X','Y'], [1, -1], normalize = False)
2531		```
2532
2533		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2534		'''
2535		if weights == 'equal':
2536			weights = [1/len(samples)] * len(samples)
2537
2538		if normalize:
2539			s = sum(weights)
2540			if s:
2541				weights = [w/s for w in weights]
2542
2543		try:
2544# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2545# 			C = self.standardization.covar[indices,:][:,indices]
2546			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2547			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2548			return correlated_sum(X, C, weights)
2549		except ValueError:
2550			return (0., 0.)

Weighted average Δ4x value of a group of samples, accounting for covariance.

Returns the weighted average Δ4x value and associated SE of a group of samples. Weights are equal by default. If normalize is true, weights will be rescaled so that their sum equals 1.

Examples

self.sample_average(['X','Y'], [1, 2])

returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3, where Δ4x(X) and Δ4x(Y) are the average Δ4x values of samples X and Y, respectively.

self.sample_average(['X','Y'], [1, -1], normalize = False)

returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
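A typical application is testing whether two unknowns differ significantly; a sketch using the hypothetical sample names from the tutorial:

```python
# Δ47 difference between two unknowns, with fully propagated SE:
diff, SE = mydata.sample_average(['MYSAMPLE-1', 'MYSAMPLE-2'], [1, -1], normalize = False)
print(f'difference = {diff:.4f} ± {SE:.4f} ‰ (1SE)')
```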

def sample_D4x_covar(self, sample1, sample2=None):
2553	def sample_D4x_covar(self, sample1, sample2 = None):
2554		'''
2555		Covariance between Δ4x values of samples
2556
2557		Returns the error covariance between the average Δ4x values of two
2558		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2559		returns the Δ4x variance for that sample.
2560		'''
2561		if sample2 is None:
2562			sample2 = sample1
2563		if self.standardization_method == 'pooled':
2564			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2565			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2566			return self.standardization.covar[i, j]
2567		elif self.standardization_method == 'indep_sessions':
2568			if sample1 == sample2:
2569				return self.samples[sample1][f'SE_D{self._4x}']**2
2570			else:
2571				c = 0
2572				for session in self.sessions:
2573					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2574					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2575					if sdata1 and sdata2:
2576						a = self.sessions[session]['a']
2577						# !! TODO: CM below does not account for temporal changes in standardization parameters
2578						CM = self.sessions[session]['CM'][:3,:3]
2579						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2580						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2581						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2582						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2583						c += (
2584							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2585							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2586							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2587							@ CM
2588							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2589							) / a**2
2590				return float(c)

Covariance between Δ4x values of samples

Returns the error covariance between the average Δ4x values of two samples. If only sample1 is specified, or if sample1 == sample2, returns the Δ4x variance for that sample.
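For example, to assemble the full covariance matrix between several unknowns (sample names hypothetical):

```python
import numpy as np

samples = ['MYSAMPLE-1', 'MYSAMPLE-2']
C = np.array([[mydata.sample_D4x_covar(s1, s2) for s2 in samples] for s1 in samples])
```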

def sample_D4x_correl(self, sample1, sample2=None):
2592	def sample_D4x_correl(self, sample1, sample2 = None):
2593		'''
2594		Correlation between Δ4x errors of samples
2595
2596		Returns the error correlation between the average Δ4x values of two samples.
2597		'''
2598		if sample2 is None or sample2 == sample1:
2599			return 1.
2600		return (
2601			self.sample_D4x_covar(sample1, sample2)
2602			/ self.unknowns[sample1][f'SE_D{self._4x}']
2603			/ self.unknowns[sample2][f'SE_D{self._4x}']
2604			)

Correlation between Δ4x errors of samples

Returns the error correlation between the average Δ4x values of two samples.
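For instance (note that the implementation reads SE values from `unknowns`, so both samples should be unknowns):

```python
rho = mydata.sample_D4x_correl('MYSAMPLE-1', 'MYSAMPLE-2')
print(f'error correlation: {rho:.3f}')
```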

def plot_single_session( self, session, kw_plot_anchors={'ls': 'None', 'marker': 'x', 'mec': (0.75, 0, 0), 'mew': 0.75, 'ms': 4}, kw_plot_unknowns={'ls': 'None', 'marker': 'x', 'mec': (0, 0, 0.75), 'mew': 0.75, 'ms': 4}, kw_plot_anchor_avg={'ls': '-', 'marker': 'None', 'color': (0.75, 0, 0), 'lw': 0.75}, kw_plot_unknown_avg={'ls': '-', 'marker': 'None', 'color': (0, 0, 0.75), 'lw': 0.75}, kw_contour_error={'colors': [[0, 0, 0]], 'alpha': 0.5, 'linewidths': 0.75}, xylimits='free', x_label=None, y_label=None, error_contour_interval='auto', fig='new'):
2606	def plot_single_session(self,
2607		session,
2608		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2609		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2610		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2611		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2612		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2613		xylimits = 'free', # | 'constant'
2614		x_label = None,
2615		y_label = None,
2616		error_contour_interval = 'auto',
2617		fig = 'new',
2618		):
2619		'''
2620		Generate plot for a single session
2621		'''
2622		if x_label is None:
2623			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2624		if y_label is None:
2625			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2626
2627		out = _SessionPlot()
2628		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2629		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2630		anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2631		anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2632		unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2633		unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2634		anchor_avg = (np.array([ np.array([
2635				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2636				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2637				]) for sample in anchors]).T,
2638			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
2639		unknown_avg = (np.array([ np.array([
2640				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2641				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2642				]) for sample in unknowns]).T,
2643			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)
2644		
2645		
2646		if fig == 'new':
2647			out.fig = ppl.figure(figsize = (6,6))
2648			ppl.subplots_adjust(.1,.1,.9,.9)
2649
2650		out.anchor_analyses, = ppl.plot(
2651			anchors_d,
2652			anchors_D,
2653			**kw_plot_anchors)
2654		out.unknown_analyses, = ppl.plot(
2655			unknowns_d,
2656			unknowns_D,
2657			**kw_plot_unknowns)
2658		out.anchor_avg = ppl.plot(
2659			*anchor_avg,
2660			**kw_plot_anchor_avg)
2661		out.unknown_avg = ppl.plot(
2662			*unknown_avg,
2663			**kw_plot_unknown_avg)
2664		if xylimits == 'constant':
2665			x = [r[f'd{self._4x}'] for r in self]
2666			y = [r[f'D{self._4x}'] for r in self]
2667			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2668			w, h = x2-x1, y2-y1
2669			x1 -= w/20
2670			x2 += w/20
2671			y1 -= h/20
2672			y2 += h/20
2673			ppl.axis([x1, x2, y1, y2])
2674		elif xylimits == 'free':
2675			x1, x2, y1, y2 = ppl.axis()
2676		else:
2677			x1, x2, y1, y2 = ppl.axis(xylimits)
2678				
2679		if error_contour_interval != 'none':
2680			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2681			XI,YI = np.meshgrid(xi, yi)
2682			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2683			if error_contour_interval == 'auto':
2684				rng = np.max(SI) - np.min(SI)
2685				if rng <= 0.01:
2686					cinterval = 0.001
2687				elif rng <= 0.03:
2688					cinterval = 0.004
2689				elif rng <= 0.1:
2690					cinterval = 0.01
2691				elif rng <= 0.3:
2692					cinterval = 0.03
2693				elif rng <= 1.:
2694					cinterval = 0.1
2695				else:
2696					cinterval = 0.5
2697			else:
2698				cinterval = error_contour_interval
2699
2700			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2701			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2702			out.clabel = ppl.clabel(out.contour)
2703			contour = (XI, YI, SI, cval, cinterval)
2704
2705		if fig is None:
2706			return {
2707			'anchors':anchors,
2708			'unknowns':unknowns,
2709			'anchors_d':anchors_d,
2710			'anchors_D':anchors_D,
2711			'unknowns_d':unknowns_d,
2712			'unknowns_D':unknowns_D,
2713			'anchor_avg':anchor_avg,
2714			'unknown_avg':unknown_avg,
2715			'contour':contour,
2716			}
2717
2718		ppl.xlabel(x_label)
2719		ppl.ylabel(y_label)
2720		ppl.title(session, weight = 'bold')
2721		ppl.grid(alpha = .2)
2722		out.ax = ppl.gca()		
2723
2724		return out

Generate plot for a single session
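A sketch (session name hypothetical); by default a new figure is created, which may then be saved manually:

```python
out = mydata.plot_single_session('Session01')
out.fig.savefig('Session01.pdf')
```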

def plot_residuals( self, kde=False, hist=False, binwidth=2/3, dir='output', filename=None, highlight=[], colors=None, figsize=None, dpi=100, yspan=None):
2726	def plot_residuals(
2727		self,
2728		kde = False,
2729		hist = False,
2730		binwidth = 2/3,
2731		dir = 'output',
2732		filename = None,
2733		highlight = [],
2734		colors = None,
2735		figsize = None,
2736		dpi = 100,
2737		yspan = None,
2738		):
2739		'''
2740		Plot residuals of each analysis as a function of time (actually, as a function of
2741		the order of analyses in the `D4xdata` object)
2742
2743		+ `kde`: whether to add a kernel density estimate of residuals
2744		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2745		+ `binwidth`: histogram bin width, in units of the Δ4x repeatability (SD)
2746		+ `dir`: the directory in which to save the plot
2747		+ `highlight`: a list of samples to highlight
2748		+ `colors`: a dict of `{<sample>: (r, g, b)}` for all samples
2749		+ `figsize`: (width, height) of figure
2750		+ `dpi`: resolution for PNG output
2751		+ `yspan`: factor controlling the range of y values shown in plot
2752		  (by default: `yspan = 1.5 if kde else 1.0`)
2753		'''
2754		
2755		from matplotlib import ticker
2756
2757		if yspan is None:
2758			if kde:
2759				yspan = 1.5
2760			else:
2761				yspan = 1.0
2762		
2763		# Layout
2764		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2765		if hist or kde:
2766			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2767			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2768		else:
2769			ppl.subplots_adjust(.08,.05,.78,.8)
2770			ax1 = ppl.subplot(111)
2771		
2772		# Colors
2773		N = len(self.anchors)
2774		if colors is None:
2775			if len(highlight) > 0:
2776				Nh = len(highlight)
2777				if Nh == 1:
2778					colors = {highlight[0]: (0,0,0)}
2779				elif Nh == 3:
2780					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2781				elif Nh == 4:
2782					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2783				else:
2784					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2785			else:
2786				if N == 3:
2787					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2788				elif N == 4:
2789					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2790				else:
2791					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2792
2793		ppl.sca(ax1)
2794		
2795		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2796
2797		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2798
2799		session = self[0]['Session']
2800		x1 = 0
2801# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2802		x_sessions = {}
2803		one_or_more_singlets = False
2804		one_or_more_multiplets = False
2805		multiplets = set()
2806		for k,r in enumerate(self):
2807			if r['Session'] != session:
2808				x2 = k-1
2809				x_sessions[session] = (x1+x2)/2
2810				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2811				session = r['Session']
2812				x1 = k
2813			singlet = len(self.samples[r['Sample']]['data']) == 1
2814			if not singlet:
2815				multiplets.add(r['Sample'])
2816			if r['Sample'] in self.unknowns:
2817				if singlet:
2818					one_or_more_singlets = True
2819				else:
2820					one_or_more_multiplets = True
2821			kw = dict(
2822				marker = 'x' if singlet else '+',
2823				ms = 4 if singlet else 5,
2824				ls = 'None',
2825				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2826				mew = 1,
2827				alpha = 0.2 if singlet else 1,
2828				)
2829			if highlight and r['Sample'] not in highlight:
2830				kw['alpha'] = 0.2
2831			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2832		x2 = k
2833		x_sessions[session] = (x1+x2)/2
2834
2835		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2836		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2837		if not (hist or kde):
2838			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2839			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2840
2841		xmin, xmax, ymin, ymax = ppl.axis()
2842		if yspan != 1:
2843			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2844		for s in x_sessions:
2845			ppl.text(
2846				x_sessions[s],
2847				ymax +1,
2848				s,
2849				va = 'bottom',
2850				**(
2851					dict(ha = 'center')
2852					if len(self.sessions[s]['data']) > (0.15 * len(self))
2853					else dict(ha = 'left', rotation = 45)
2854					)
2855				)
2856
2857		if hist or kde:
2858			ppl.sca(ax2)
2859
2860		for s in colors:
2861			kw['marker'] = '+'
2862			kw['ms'] = 5
2863			kw['mec'] = colors[s]
2864			kw['label'] = s
2865			kw['alpha'] = 1
2866			ppl.plot([], [], **kw)
2867
2868		kw['mec'] = (0,0,0)
2869
2870		if one_or_more_singlets:
2871			kw['marker'] = 'x'
2872			kw['ms'] = 4
2873			kw['alpha'] = .2
2874			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2875			ppl.plot([], [], **kw)
2876
2877		if one_or_more_multiplets:
2878			kw['marker'] = '+'
2879			kw['ms'] = 4
2880			kw['alpha'] = 1
2881			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2882			ppl.plot([], [], **kw)
2883
2884		if hist or kde:
2885			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2886		else:
2887			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2888		leg.set_zorder(-1000)
2889
2890		ppl.sca(ax1)
2891
2892		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2893		ppl.xticks([])
2894		ppl.axis([-1, len(self), None, None])
2895
2896		if hist or kde:
2897			ppl.sca(ax2)
2898			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2899
2900			if kde:
2901				from scipy.stats import gaussian_kde
2902				yi = np.linspace(ymin, ymax, 201)
2903				xi = gaussian_kde(X).evaluate(yi)
2904				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2905# 				ppl.plot(xi, yi, 'k-', lw = 1)
2906			elif hist:
2907				ppl.hist(
2908					X,
2909					orientation = 'horizontal',
2910					histtype = 'stepfilled',
2911					ec = [.4]*3,
2912					fc = [.25]*3,
2913					alpha = .25,
2914					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2915					)
2916			ppl.text(0, 0,
2917				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2918				size = 7.5,
2919				alpha = 1,
2920				va = 'center',
2921				ha = 'left',
2922				)
2923
2924			ppl.axis([0, None, ymin, ymax])
2925			ppl.xticks([])
2926			ppl.yticks([])
2927# 			ax2.spines['left'].set_visible(False)
2928			ax2.spines['right'].set_visible(False)
2929			ax2.spines['top'].set_visible(False)
2930			ax2.spines['bottom'].set_visible(False)
2931
2932		ax1.axis([None, None, ymin, ymax])
2933
2934		if not os.path.exists(dir):
2935			os.makedirs(dir)
2936		if filename is None:
2937			return fig
2938		elif filename == '':
2939			filename = f'D{self._4x}_residuals.pdf'
2940		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2941		ppl.close(fig)

Plot residuals of each analysis as a function of time (actually, as a function of the order of analyses in the D4xdata object)

  • kde: whether to add a kernel density estimate of residuals
  • hist: whether to add a histogram of residuals (incompatible with kde)
  • binwidth: histogram bin width, in units of the Δ4x repeatability (SD)
  • dir: the directory in which to save the plot
  • highlight: a list of samples to highlight
  • colors: a dict of {<sample>: (r, g, b)} for all samples
  • figsize: (width, height) of figure
  • dpi: resolution for PNG output
  • yspan: factor controlling the range of y values shown in plot (by default: yspan = 1.5 if kde else 1.0)
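For example:

```python
# save 'output/D47_residuals.pdf' with a kernel density estimate of the residuals:
mydata.plot_residuals(kde = True, filename = '')

# or return the figure object instead of saving it:
fig = mydata.plot_residuals()
```
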
def simulate(self, *args, **kwargs):
2944	def simulate(self, *args, **kwargs):
2945		'''
2946		Legacy function with warning message pointing to `virtual_data()`
2947		'''
2948		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')

Legacy function with warning message pointing to virtual_data()

def plot_anchor_residuals( self, dir='output', filename='', figsize=None, subplots_adjust=(0.05, 0.1, 0.95, 0.98, 0.25, 0.25), dpi=100, colors=None):
2950	def plot_anchor_residuals(
2951		self,
2952		dir = 'output',
2953		filename = '',
2954		figsize = None,
2955		subplots_adjust = (0.05, 0.1, 0.95, 0.98, .25, .25),
2956		dpi = 100,
2957		colors = None,
2958		):
2959		'''
2960		Plot a summary of the residuals for all anchors, intended to help detect systematic bias.
2961		
2962		**Parameters**
2963
2964		+ `dir`: the directory in which to save the plot
2965		+ `filename`: the file name to save to.
2966		+ `figsize`: (width, height) of figure
2967		+ `subplots_adjust`: passed to `matplotlib.pyplot.subplots_adjust()`
2968		+ `dpi`: resolution for PNG output
2969		+ `colors`: a dict of `{<sample>: (r, g, b)}` for all samples
2970		(by default, a distinct color is assigned to each anchor)
2971		'''
2972
2973		# Colors
2974		N = len(self.anchors)
2975		if colors is None:
2976			if N == 3:
2977				colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2978			elif N == 4:
2979				colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2980			else:
2981				colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2982
2983		if figsize is None:
2984			figsize = (4, 1.5*N+1)
2985		fig = ppl.figure(figsize = figsize)
2986		ppl.subplots_adjust(*subplots_adjust)
2987		axs = {}
2988		X = np.array([r[f'D{self._4x}_residual'] for a in self.anchors for r in self.anchors[a]['data']])*1000
2989		sigma = self.repeatability[f'r_D{self._4x}a'] * 1000  # anchor repeatability, in ppm
2990		D = max(np.abs(X))
2991
2992		for k,a in enumerate(self.anchors):
2993			color = colors[a]
2994			axs[a] = ppl.subplot(N, 1, 1+k)
2995			axs[a].text(
2996				0.02, 1-0.05, a,
2997				va = 'top',
2998				ha = 'left',
2999				weight = 'bold',
3000				size = 9,
3001				color = [_*0.75 for _ in color],
3002				transform = axs[a].transAxes,
3003			)
3004			X = np.array([r[f'D{self._4x}_residual'] for r in self.anchors[a]['data']])*1000
3005			axs[a].axvline(0, lw = 0.5, color = color)
3006			axs[a].plot(X, X*0, 'o', mew = 0.7, mec = (*color,.5), mfc = (*color, 0), ms = 7, clip_on = False)
3007
3008			xi = np.linspace(-3*D, 3*D, 601)
3009			yi = np.array([np.exp(-0.5 * ((xi - x)/sigma)**2) for x in X]).sum(0)
3010			ppl.fill_between(xi, yi, yi*0, fc = (*color, .15), lw = 1, ec = color)
3011			
3012			axs[a].errorbar(
3013				X.mean(), yi.max()*.2, None, 1.96*sigma/len(X)**0.5,
3014				ecolor = color,
3015				marker = 's',
3016				ls = 'None',
3017				mec = color,
3018				mew = 1,
3019				mfc = 'w',
3020				ms = 8,
3021				elinewidth = 1,
3022				capsize = 4,
3023				capthick = 1,
3024			)
3025			
3026			axs[a].axis([xi[0], xi[-1], 0, yi.max()*1.05])
3027			ppl.yticks([])
3028
3029		ppl.xlabel(f'$Δ_{{{self._4x}}}$ residuals (ppm)')		
3030
3031		if not os.path.exists(dir):
3032			os.makedirs(dir)
3033		if filename is None:
3034			return fig
3035		elif filename == '':
3036			filename = f'D{self._4x}_anchor_residuals.pdf'
3037		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
3038		ppl.close(fig)

Plot a summary of the residuals for all anchors, intended to help detect systematic bias.

Parameters

  • dir: the directory in which to save the plot
  • filename: the file name to save to.
  • figsize: (width, height) of figure
  • subplots_adjust: passed to matplotlib.pyplot.subplots_adjust()
  • dpi: resolution for PNG output
  • colors: a dict of {<sample>: (r, g, b)} for all samples
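For example:

```python
# with the default arguments, saves the plot as 'output/D47_anchor_residuals.pdf':
mydata.plot_anchor_residuals()
```
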
def plot_distribution_of_analyses( self, dir='output', filename=None, vs_time=False, figsize=(6, 4), subplots_adjust=(0.02, 0.13, 0.85, 0.8), output=None, dpi=100):
3041	def plot_distribution_of_analyses(
3042		self,
3043		dir = 'output',
3044		filename = None,
3045		vs_time = False,
3046		figsize = (6,4),
3047		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
3048		output = None,
3049		dpi = 100,
3050		):
3051		'''
3052		Plot temporal distribution of all analyses in the data set.
3053		
3054		**Parameters**
3055
3056		+ `dir`: the directory in which to save the plot
3057		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
3058		+ `figsize`: (width, height) of figure
3059		+ `output`: if `'ax'`, return the plot axes; if `'fig'`, return the figure; if `None` (default), save the plot to `dir`
3060		+ `dpi`: resolution for PNG output
3061		'''
3062
3063		asamples = [s for s in self.anchors]
3064		usamples = [s for s in self.unknowns]
3065		if output is None or output == 'fig':
3066			fig = ppl.figure(figsize = figsize)
3067			ppl.subplots_adjust(*subplots_adjust)
3068		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
3069		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
3070		Xmax += (Xmax-Xmin)/40
3071		Xmin -= (Xmax-Xmin)/41
3072		for k, s in enumerate(asamples + usamples):
3073			if vs_time:
3074				X = [r['TimeTag'] for r in self if r['Sample'] == s]
3075			else:
3076				X = [x for x,r in enumerate(self) if r['Sample'] == s]
3077			Y = [-k for x in X]
3078			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
3079			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
3080			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
3081		ppl.axis([Xmin, Xmax, -k-1, 1])
3082		ppl.xlabel('\ntime')
3083		ppl.gca().annotate('',
3084			xy = (0.6, -0.02),
3085			xycoords = 'axes fraction',
3086			xytext = (.4, -0.02), 
3087			arrowprops = dict(arrowstyle = "->", color = 'k'),
3088			)
3089			
3090
3091		x2 = -1
3092		for session in self.sessions:
3093			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
3094			if vs_time:
3095				ppl.axvline(x1, color = 'k', lw = .75)
3096			if x2 > -1:
3097				if not vs_time:
3098					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
3099			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
3100# 			from xlrd import xldate_as_datetime
3101# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
3102			if vs_time:
3103				ppl.axvline(x2, color = 'k', lw = .75)
3104				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
3105			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
3106
3107		ppl.xticks([])
3108		ppl.yticks([])
3109
3110		if output is None:
3111			if not os.path.exists(dir):
3112				os.makedirs(dir)
3113			if filename is None:
3114				filename = f'D{self._4x}_distribution_of_analyses.pdf'
3115			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
3116			ppl.close(fig)
3117		elif output == 'ax':
3118			return ppl.gca()
3119		elif output == 'fig':
3120			return fig

Plot temporal distribution of all analyses in the data set.

Parameters

  • dir: the directory in which to save the plot
  • vs_time: if True, plot as a function of TimeTag rather than sequentially.
  • figsize: (width, height) of figure
  • output: if 'ax', return the plot axes; if 'fig', return the figure; if None (default), save the plot to dir
  • dpi: resolution for PNG output
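For example:

```python
# sequential plot, saved as 'output/D47_distribution_of_analyses.pdf':
mydata.plot_distribution_of_analyses()

# as a function of time, provided that each analysis carries a 'TimeTag' field:
mydata.plot_distribution_of_analyses(vs_time = True)
```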
def plot_bulk_compositions( self, samples=None, dir='output/bulk_compositions', figsize=(6, 6), subplots_adjust=(0.15, 0.12, 0.95, 0.92), show=False, sample_color=(0, 0.5, 1), analysis_color=(0.7, 0.7, 0.7), labeldist=0.3, radius=0.05):
3123	def plot_bulk_compositions(
3124		self,
3125		samples = None,
3126		dir = 'output/bulk_compositions',
3127		figsize = (6,6),
3128		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
3129		show = False,
3130		sample_color = (0,.5,1),
3131		analysis_color = (.7,.7,.7),
3132		labeldist = 0.3,
3133		radius = 0.05,
3134		):
3135		'''
3136		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
3137		
3138		By default, creates a directory `./output/bulk_compositions` where plots for
3139		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
3140		
3141		
3142		**Parameters**
3143
3144		+ `samples`: Only these samples are processed (by default: all samples).
3145		+ `dir`: where to save the plots
3146		+ `figsize`: (width, height) of figure
3147		+ `subplots_adjust`: passed to `subplots_adjust()`
3148		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
3149		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
3150		+ `sample_color`: color used for sample average markers/labels
3151		+ `analysis_color`: color used for individual analysis markers/labels
3152		+ `labeldist`: distance (in inches) from replicate markers to replicate labels
3153		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
3154		'''
3155
3156		from matplotlib.patches import Ellipse
3157
3158		if samples is None:
3159			samples = [_ for _ in self.samples]
3160
3161		saved = {}
3162
3163		for s in samples:
3164
3165			fig = ppl.figure(figsize = figsize)
3166			fig.subplots_adjust(*subplots_adjust)
3167			ax = ppl.subplot(111)
3168			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3169			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3170			ppl.title(s)
3171
3172
3173			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
3174			UID = [_['UID'] for _ in self.samples[s]['data']]
3175			XY0 = XY.mean(0)
3176
3177			for xy in XY:
3178				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
3179				
3180			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
3181			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
3182			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3183			saved[s] = [XY, XY0]
3184			
3185			x1, x2, y1, y2 = ppl.axis()
3186			x0, dx = (x1+x2)/2, (x2-x1)/2
3187			y0, dy = (y1+y2)/2, (y2-y1)/2
3188			dx, dy = [max(max(dx, dy), radius)]*2
3189
3190			ppl.axis([
3191				x0 - 1.2*dx,
3192				x0 + 1.2*dx,
3193				y0 - 1.2*dy,
3194				y0 + 1.2*dy,
3195				])			
3196
3197			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3198
3199			for xy, uid in zip(XY, UID):
3200
3201				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3202				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3203
3204				if (vector_in_display_space**2).sum() > 0:
3205
3206					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3207					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3208					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3209					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3210
3211					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3212
3213				else:
3214
3215					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3216
3217			if radius:
3218				ax.add_artist(Ellipse(
3219					xy = XY0,
3220					width = radius*2,
3221					height = radius*2,
3222					ls = (0, (2,2)),
3223					lw = .7,
3224					ec = analysis_color,
3225					fc = 'None',
3226					))
3227				ppl.text(
3228					XY0[0],
3229					XY0[1]-radius,
3230					f'\n± {radius*1e3:.0f} ppm',
3231					color = analysis_color,
3232					va = 'top',
3233					ha = 'center',
3234					linespacing = 0.4,
3235					size = 8,
3236					)
3237
3238			if not os.path.exists(dir):
3239				os.makedirs(dir)
3240			fig.savefig(f'{dir}/{s}.pdf')
3241			ppl.close(fig)
3242
3243		fig = ppl.figure(figsize = figsize)
3244		fig.subplots_adjust(*subplots_adjust)
3245		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3246		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3247
3248		for s in saved:
3249			for xy in saved[s][0]:
3250				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3251			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3252			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3253			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3254
3255		x1, x2, y1, y2 = ppl.axis()
3256		ppl.axis([
3257			x1 - (x2-x1)/10,
3258			x2 + (x2-x1)/10,
3259			y1 - (y2-y1)/10,
3260			y2 + (y2-y1)/10,
3261			])			
3262
3263
3264		if not os.path.exists(dir):
3265			os.makedirs(dir)
3266		fig.savefig(f'{dir}/__all__.pdf')
3267		if show:
3268			ppl.show()
3269		ppl.close(fig)

Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

By default, creates a directory ./output/bulk_compositions where plots for each sample are saved. Another plot named __all__.pdf shows all analyses together.

Parameters

  • samples: Only these samples are processed (by default: all samples).
  • dir: where to save the plots
  • figsize: (width, height) of figure
  • subplots_adjust: passed to subplots_adjust()
  • show: whether to call matplotlib.pyplot.show() on the plot with all samples, allowing for interactive visualization/exploration in (δ13C, δ18O) space.
  • sample_color: color used for sample average markers/labels
  • analysis_color: color used for individual analysis markers/labels
  • labeldist: distance (in inches) from replicate markers to replicate labels
  • radius: radius of the dashed circle providing scale. No circle if radius = 0.
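For example, restricting the plots to the tutorial's hypothetical unknowns and displaying the summary plot interactively:

```python
mydata.plot_bulk_compositions(samples = ['MYSAMPLE-1', 'MYSAMPLE-2'], show = True)
```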
class D47data(D4xdata):
3315class D47data(D4xdata):
3316	'''
3317	Store and process data for a large set of Δ47 analyses,
3318	usually comprising more than one analytical session.
3319	'''
3320
3321	Nominal_D4x = {
3322		'ETH-1':   0.2052,
3323		'ETH-2':   0.2085,
3324		'ETH-3':   0.6132,
3325		'ETH-4':   0.4511,
3326		'IAEA-C1': 0.3018,
3327		'IAEA-C2': 0.6409,
3328		'MERCK':   0.5135,
3329		} # I-CDES (Bernasconi et al., 2021)
3330	'''
3331	Nominal Δ47 values assigned to the Δ47 anchor samples, used by
3332	`D47data.standardize()` to normalize unknown samples to an absolute Δ47
3333	reference frame.
3334
3335	By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
3336	```py
3337	{
3338		'ETH-1'   : 0.2052,
3339		'ETH-2'   : 0.2085,
3340		'ETH-3'   : 0.6132,
3341		'ETH-4'   : 0.4511,
3342		'IAEA-C1' : 0.3018,
3343		'IAEA-C2' : 0.6409,
3344		'MERCK'   : 0.5135,
3345	}
3346	```
3347	'''
3348
3349
3350	@property
3351	def Nominal_D47(self):
3352		return self.Nominal_D4x
3353	
3354
3355	@Nominal_D47.setter
3356	def Nominal_D47(self, new):
3357		self.Nominal_D4x = dict(**new)
3358		self.refresh()
3359
3360
3361	def __init__(self, l = [], **kwargs):
3362		'''
3363		**Parameters:** same as `D4xdata.__init__()`
3364		'''
3365		D4xdata.__init__(self, l = l, mass = '47', **kwargs)
3366
3367
3368	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3369		'''
3370		Find all samples for which `Teq` is specified, compute equilibrium Δ47
3371		value for that temperature, and add treat these samples as additional anchors.
3372
3373		**Parameters**
3374
3375		+ `fCo2eqD47`: Which CO2 equilibrium law to use
3376		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3377		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3378		+ `priority`: if `replace`: forget old anchors and only use the new ones;
3379		if `new`: keep pre-existing anchors but update them in case of conflict
3380		between old and new Δ47 values;
3381		if `old`: keep pre-existing anchors but preserve their original Δ47
3382		values in case of conflict.
3383		'''
3384		f = {
3385			'petersen': fCO2eqD47_Petersen,
3386			'wang': fCO2eqD47_Wang,
3387			}[fCo2eqD47]
3388		foo = {}
3389		for r in self:
3390			if 'Teq' in r:
3391				if r['Sample'] in foo:
3392					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3393				else:
3394					foo[r['Sample']] = f(r['Teq'])
3395			else:
3396				assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3397
3398		if priority == 'replace':
3399			self.Nominal_D47 = {}
3400		for s in foo:
3401			if priority != 'old' or s not in self.Nominal_D47:
3402				self.Nominal_D47[s] = foo[s]
3403	
3404	def save_D47_correl(self, *args, **kwargs):
3405		return self._save_D4x_correl(*args, **kwargs)
3406
3407	save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47')

Store and process data for a large set of Δ47 analyses, usually comprising more than one analytical session.

D47data(l=[], **kwargs)
3361	def __init__(self, l = [], **kwargs):
3362		'''
3363		**Parameters:** same as `D4xdata.__init__()`
3364		'''
3365		D4xdata.__init__(self, l = l, mass = '47', **kwargs)

Parameters: same as D4xdata.__init__()

Nominal_D4x = {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6132, 'ETH-4': 0.4511, 'IAEA-C1': 0.3018, 'IAEA-C2': 0.6409, 'MERCK': 0.5135}

Nominal Δ47 values assigned to the Δ47 anchor samples, used by D47data.standardize() to normalize unknown samples to an absolute Δ47 reference frame.

By default equal to (after Bernasconi et al. (2021)):

{
        'ETH-1'   : 0.2052,
        'ETH-2'   : 0.2085,
        'ETH-3'   : 0.6132,
        'ETH-4'   : 0.4511,
        'IAEA-C1' : 0.3018,
        'IAEA-C2' : 0.6409,
        'MERCK'   : 0.5135,
}
Nominal_D47
3350	@property
3351	def Nominal_D47(self):
3352		return self.Nominal_D4x
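Because `Nominal_D47` has a setter (see the class source above), the anchor values may be redefined before standardization, e.g. to restrict the standardization to a subset of anchors (the values below are the I-CDES defaults):

```python
mydata.Nominal_D47 = {
	'ETH-1': 0.2052,
	'ETH-2': 0.2085,
	'ETH-3': 0.6132,
	}  # standardize against the three ETH anchors only
mydata.standardize()
```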
def D47fromTeq(self, fCo2eqD47='petersen', priority='new'):
3368	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3369		'''
3370		Find all samples for which `Teq` is specified, compute equilibrium Δ47
3371		value for that temperature, and treat these samples as additional anchors.
3372
3373		**Parameters**
3374
3375		+ `fCo2eqD47`: Which CO2 equilibrium law to use
3376		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3377		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3378		+ `priority`: if `replace`: forget old anchors and only use the new ones;
3379		if `new`: keep pre-existing anchors but update them in case of conflict
3380		between old and new Δ47 values;
3381		if `old`: keep pre-existing anchors but preserve their original Δ47
3382		values in case of conflict.
3383		'''
3384		f = {
3385			'petersen': fCO2eqD47_Petersen,
3386			'wang': fCO2eqD47_Wang,
3387			}[fCo2eqD47]
3388		foo = {}
3389		for r in self:
3390			if 'Teq' in r:
3391				if r['Sample'] in foo:
3392					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3393				else:
3394					foo[r['Sample']] = f(r['Teq'])
3395			else:
3396				assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3397
3398		if priority == 'replace':
3399			self.Nominal_D47 = {}
3400		for s in foo:
3401			if priority != 'old' or s not in self.Nominal_D47:
3402				self.Nominal_D47[s] = foo[s]

Find all samples for which Teq is specified, compute the equilibrium Δ47 value for that temperature, and treat these samples as additional anchors.

Parameters

  • fCo2eqD47: Which CO2 equilibrium law to use (petersen: Petersen et al. (2019); wang: Wang et al. (2004)).
  • priority: if replace: forget old anchors and only use the new ones; if new: keep pre-existing anchors but update them in case of conflict between old and new Δ47 values; if old: keep pre-existing anchors but preserve their original Δ47 values in case of conflict.
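A sketch, assuming some analyses belong to a hypothetical equilibrated-gas sample 'EQ-GAS-25C' (`Teq` is assumed here to be in °C, matching the built-in equilibrium laws):

```python
for r in mydata:
	if r['Sample'] == 'EQ-GAS-25C':
		r['Teq'] = 25.  # assumed °C

# add the equilibrated gas to the anchors, then standardize:
mydata.D47fromTeq(fCo2eqD47 = 'petersen', priority = 'new')
mydata.standardize()
```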
def save_D47_correl(self, *args, **kwargs):
3404	def save_D47_correl(self, *args, **kwargs):
3405		return self._save_D4x_correl(*args, **kwargs)

Save D47 values along with their SE and correlation matrix.

Parameters

  • samples: Only these samples are output (by default: all samples).
  • dir: the directory in which to save the file (by default: output)
  • filename: the name of the csv file to write to (by default: D47_correl.csv)
  • D47_precision: the precision to use when writing D47 and D47_SE values (by default: 4)
  • correl_precision: the precision to use when writing correlation factor values (by default: 4)
  • save_to_file: whether to write the output to a file (by default: True). If False, return the output as a string.
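For example:

```python
# write Δ47 values, SE and correlation matrix to 'output/D47_correl.csv':
mydata.save_D47_correl()

# or obtain the same output as a string:
txt = mydata.save_D47_correl(save_to_file = False)
```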
class D48data(D4xdata):
3410class D48data(D4xdata):
3411	'''
3412	Store and process data for a large set of Δ48 analyses,
3413	usually comprising more than one analytical session.
3414	'''
3415
3416	Nominal_D4x = {
3417		'ETH-1':  0.138,
3418		'ETH-2':  0.138,
3419		'ETH-3':  0.270,
3420		'ETH-4':  0.223,
3421		'GU-1':  -0.419,
3422		} # (Fiebig et al., 2019, 2021)
3423	'''
3424	Nominal Δ48 values assigned to the Δ48 anchor samples, used by
3425	`D48data.standardize()` to normalize unknown samples to an absolute Δ48
3426	reference frame.
3427
3428	By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
3429	[Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)):
3430
3431	```py
3432	{
3433		'ETH-1' :  0.138,
3434		'ETH-2' :  0.138,
3435		'ETH-3' :  0.270,
3436		'ETH-4' :  0.223,
3437		'GU-1'  : -0.419,
3438	}
3439	```
3440	'''
3441
3442
3443	@property
3444	def Nominal_D48(self):
3445		return self.Nominal_D4x
3446
3447	
3448	@Nominal_D48.setter
3449	def Nominal_D48(self, new):
3450		self.Nominal_D4x = dict(**new)
3451		self.refresh()
3452
3453
3454	def __init__(self, l = [], **kwargs):
3455		'''
3456		**Parameters:** same as `D4xdata.__init__()`
3457		'''
3458		D4xdata.__init__(self, l = l, mass = '48', **kwargs)
3459
3460	def save_D48_correl(self, *args, **kwargs):
3461		return self._save_D4x_correl(*args, **kwargs)
3462
3463	save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48')

Store and process data for a large set of Δ48 analyses, usually comprising more than one analytical session.

D48data(l=[], **kwargs)
3454	def __init__(self, l = [], **kwargs):
3455		'''
3456		**Parameters:** same as `D4xdata.__init__()`
3457		'''
3458		D4xdata.__init__(self, l = l, mass = '48', **kwargs)

Parameters: same as D4xdata.__init__()

Nominal_D4x = {'ETH-1': 0.138, 'ETH-2': 0.138, 'ETH-3': 0.27, 'ETH-4': 0.223, 'GU-1': -0.419}

Nominal Δ48 values assigned to the Δ48 anchor samples, used by D48data.standardize() to normalize unknown samples to an absolute Δ48 reference frame.

By default equal to (after Fiebig et al. (2019), Fiebig et al. (2021)):

{
        'ETH-1' :  0.138,
        'ETH-2' :  0.138,
        'ETH-3' :  0.270,
        'ETH-4' :  0.223,
        'GU-1'  : -0.419,
}
Nominal_D48
3443	@property
3444	def Nominal_D48(self):
3445		return self.Nominal_D4x
def save_D48_correl(self, *args, **kwargs):
3460	def save_D48_correl(self, *args, **kwargs):
3461		return self._save_D4x_correl(*args, **kwargs)

Save D48 values along with their SE and correlation matrix.

Parameters

  • samples: Only these samples are output (by default: all samples).
  • dir: the directory in which to save the file (by default: output)
  • filename: the name of the csv file to write to (by default: D48_correl.csv)
  • D48_precision: the precision to use when writing D48 and D48_SE values (by default: 4)
  • correl_precision: the precision to use when writing correlation factor values (by default: 4)
  • save_to_file: whether to write the output to a file (by default: True). If False, return the output as a string.
class D49data(D4xdata):
3466class D49data(D4xdata):
3467	'''
3468	Store and process data for a large set of Δ49 analyses,
3469	usually comprising more than one analytical session.
3470	'''
3471	
3472	Nominal_D4x = {"1000C": 0.0, "25C": 2.228}  # Wang 2004
3473	'''
3474	Nominal Δ49 values assigned to the Δ49 anchor samples, used by
3475	`D49data.standardize()` to normalize unknown samples to an absolute Δ49
3476	reference frame.
3477
3478	By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)):
3479
3480	```py
3481	{
3482		"1000C": 0.0,
3483		"25C": 2.228
3484	}
3485	```
3486	'''
3487	
3488	@property
3489	def Nominal_D49(self):
3490		return self.Nominal_D4x
3491	
3492	@Nominal_D49.setter
3493	def Nominal_D49(self, new):
3494		self.Nominal_D4x = dict(**new)
3495		self.refresh()
3496	
3497	def __init__(self, l=[], **kwargs):
3498		'''
3499		**Parameters:** same as `D4xdata.__init__()`
3500		'''
3501		D4xdata.__init__(self, l=l, mass='49', **kwargs)
3502	
3503	def save_D49_correl(self, *args, **kwargs):
3504		return self._save_D4x_correl(*args, **kwargs)
3505	
3506	save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49')

Store and process data for a large set of Δ49 analyses, usually comprising more than one analytical session.

D49data(l=[], **kwargs)
3497	def __init__(self, l=[], **kwargs):
3498		'''
3499		**Parameters:** same as `D4xdata.__init__()`
3500		'''
3501		D4xdata.__init__(self, l=l, mass='49', **kwargs)

Parameters: same as D4xdata.__init__()

Nominal_D4x = {'1000C': 0.0, '25C': 2.228}

Nominal Δ49 values assigned to the Δ49 anchor samples, used by D49data.standardize() to normalize unknown samples to an absolute Δ49 reference frame.

By default equal to (after Wang et al. (2004)):

{
        "1000C": 0.0,
        "25C": 2.228
}
Nominal_D49
3488	@property
3489	def Nominal_D49(self):
3490		return self.Nominal_D4x
def save_D49_correl(self, *args, **kwargs):
3503	def save_D49_correl(self, *args, **kwargs):
3504		return self._save_D4x_correl(*args, **kwargs)

Save D49 values along with their SE and correlation matrix.

Parameters

  • samples: Only these samples are output (by default: all samples).
  • dir: the directory in which to save the file (by default: output)
  • filename: the name of the csv file to write to (by default: D49_correl.csv)
  • D49_precision: the precision to use when writing D49 and D49_SE values (by default: 4)
  • correl_precision: the precision to use when writing correlation factor values (by default: 4)
  • save_to_file: whether to write the output to a file (by default: True). If False, return the output as a string.