D47crunch

Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements

Process and standardize carbonate and/or CO2 clumped-isotope analyses, from low-level data out of a dual-inlet mass spectrometer to final, “absolute” Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates (Daëron, 2021).

The tutorial section takes you through a series of simple steps to import/process data and print out the results. The how-to section provides instructions applicable to various specific tasks.

1. Tutorial

1.1 Installation

The easy option is to use pip; open a shell terminal and simply type:

python -m pip install D47crunch
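
To upgrade an existing installation to the latest release later on, pip's standard upgrade flag works the same way:

python -m pip install --upgrade D47crunch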

To experiment with the bleeding-edge development version instead, follow these steps:

  1. Download the dev branch source code here and rename it to D47crunch.py.
  2. Do any of the following:
    • copy D47crunch.py to somewhere in your Python path
    • copy D47crunch.py to a working directory (import D47crunch will only work if called within that directory)
    • copy D47crunch.py to any other location (e.g., /foo/bar) and then use the following code snippet in your own code to import D47crunch:
import sys
sys.path.append('/foo/bar')
import D47crunch

Documentation for the development version can be downloaded here (save html file and open it locally).
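
Either way, a quick sanity check is to print the version of the module actually being imported (the __version__ attribute is defined by the package itself):

python -c "import D47crunch; print(D47crunch.__version__)"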

1.2 Usage

Start by creating a file named rawdata.csv with the following contents:

UID,  Sample,           d45,       d46,        d47,        d48,       d49
A01,  ETH-1,        5.79502,  11.62767,   16.89351,   24.56708,   0.79486
A02,  MYSAMPLE-1,   6.21907,  11.49107,   17.27749,   24.58270,   1.56318
A03,  ETH-2,       -6.05868,  -4.81718,  -11.63506,  -10.32578,   0.61352
A04,  MYSAMPLE-2,  -3.86184,   4.94184,    0.60612,   10.52732,   0.57118
A05,  ETH-3,        5.54365,  12.05228,   17.40555,   25.96919,   0.74608
A06,  ETH-2,       -6.06706,  -4.87710,  -11.69927,  -10.64421,   1.61234
A07,  ETH-1,        5.78821,  11.55910,   16.80191,   24.56423,   1.47963
A08,  MYSAMPLE-2,  -3.87692,   4.86889,    0.52185,   10.40390,   1.07032

Then instantiate a D47data object which will store and process this data:

import D47crunch
mydata = D47crunch.D47data()

For now, this object is empty:

>>> print(mydata)
[]

To load the analyses saved in rawdata.csv into our D47data object and process the data:

mydata.read('rawdata.csv')

# compute δ13C, δ18O of working gas:
mydata.wg()

# compute δ13C, δ18O, raw Δ47 values for each analysis:
mydata.crunch()

# compute absolute Δ47 values for each analysis
# as well as average Δ47 values for each sample:
mydata.standardize()

We can now print a summary of the data processing:

>>> mydata.summary(verbose = True, save_to_file = False)
[summary]        
–––––––––––––––––––––––––––––––  –––––––––
N samples (anchors + unknowns)   5 (3 + 2)
N analyses (anchors + unknowns)  8 (5 + 3)
Repeatability of δ13C_VPDB         4.2 ppm
Repeatability of δ18O_VSMOW       47.5 ppm
Repeatability of Δ47 (anchors)    13.4 ppm
Repeatability of Δ47 (unknowns)    2.5 ppm
Repeatability of Δ47 (all)         9.6 ppm
Model degrees of freedom                 3
Student's 95% t-factor                3.18
Standardization method              pooled
–––––––––––––––––––––––––––––––  –––––––––

This tells us that our data set contains 5 different samples: 3 anchors (ETH-1, ETH-2, ETH-3) and 2 unknowns (MYSAMPLE-1, MYSAMPLE-2). The total number of analyses is 8, with 5 anchor analyses and 3 unknown analyses. We get an estimate of the analytical repeatability (i.e. the overall, pooled standard deviation) for δ13C, δ18O and Δ47, as well as the number of degrees of freedom (here, 3) that these estimated standard deviations are based on, along with the corresponding Student's t-factor (here, 3.18) for 95 % confidence limits. Finally, the summary indicates that we used a “pooled” standardization approach (see [Daëron, 2021]).

To see the actual results:

>>> mydata.table_of_samples(verbose = True, save_to_file = False)
[table_of_samples] 
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
Sample      N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
ETH-1       2       2.01       37.01  0.2052                    0.0131          
ETH-2       2     -10.17       19.88  0.2085                    0.0026          
ETH-3       1       1.73       37.49  0.6132                                    
MYSAMPLE-1  1       2.48       36.90  0.2996  0.0091  ± 0.0291                  
MYSAMPLE-2  2      -8.17       30.05  0.6600  0.0115  ± 0.0366  0.0025          
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––

This table lists, for each sample, the number of analytical replicates, the average δ13C and δ18O values (for the analyte CO2, not for the carbonate itself), the average Δ47 value, and the SD of Δ47 across all replicates of that sample. For unknown samples, the SE and 95 % confidence limits of the mean Δ47 value are also listed. These 95 % CL take into account the number of degrees of freedom of the regression model, so that in large data sets the 95 % CL will tend to 1.96 times the SE, whereas in this small data set the applicable t-factor is much larger.
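
As a sanity check, these confidence limits can be recovered from the SE and the t-factor reported by summary(). The sketch below only assumes scipy, which D47crunch itself relies on:

from scipy.stats import t as tstudent

t95 = tstudent.ppf(1 - 0.05/2, 3) # 3 model degrees of freedom, as reported by summary()
print(t95)                        # 3.18..., matching the summary above
print(t95 * 0.0091)               # ~0.029, consistent with the ± 0.0291 CL listed for
                                  # MYSAMPLE-1 (the SE shown in the table is rounded)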

We can also generate a table of all analyses in the data set (again, note that d18O_VSMOW is the composition of the CO2 analyte):

>>> mydata.table_of_analyses(verbose = True, save_to_file = False)
[table_of_analyses] 
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
UID    Session      Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48       d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw      D49raw       D47
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
A01  mySession       ETH-1       -3.807        24.921   5.795020  11.627670   16.893510   24.567080  0.794860    2.014086   37.041843  -0.574686   1.149684  -27.690250  0.214454
A02  mySession  MYSAMPLE-1       -3.807        24.921   6.219070  11.491070   17.277490   24.582700  1.563180    2.476827   36.898281  -0.499264   1.435380  -27.122614  0.299589
A03  mySession       ETH-2       -3.807        24.921  -6.058680  -4.817180  -11.635060  -10.325780  0.613520  -10.166796   19.907706  -0.685979  -0.721617   16.716901  0.206693
A04  mySession  MYSAMPLE-2       -3.807        24.921  -3.861840   4.941840    0.606120   10.527320  0.571180   -8.159927   30.087230  -0.248531   0.613099   -4.979413  0.658270
A05  mySession       ETH-3       -3.807        24.921   5.543650  12.052280   17.405550   25.969190  0.746080    1.727029   37.485567  -0.226150   1.678699  -28.280301  0.613200
A06  mySession       ETH-2       -3.807        24.921  -6.067060  -4.877100  -11.699270  -10.644210  1.612340  -10.173599   19.845192  -0.683054  -0.922832   17.861363  0.210328
A07  mySession       ETH-1       -3.807        24.921   5.788210  11.559100   16.801910   24.564230  1.479630    2.009281   36.970298  -0.591129   1.282632  -26.888335  0.195926
A08  mySession  MYSAMPLE-2       -3.807        24.921  -3.876920   4.868890    0.521850   10.403900  1.070320   -8.173486   30.011134  -0.245768   0.636159   -4.324964  0.661803
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
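
Beyond printing, the same table methods can also write CSV files to disk; by default these go to a directory named output, as with the module-level table functions documented in the API section below. A minimal sketch, using only arguments already seen above:

mydata.table_of_samples(verbose = False, save_to_file = True)
mydata.table_of_analyses(verbose = False, save_to_file = True)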

2. How-to

2.1 Simulate a virtual data set to play with

It is sometimes convenient to quickly build a virtual data set of analyses, for instance to assess the final analytical precision achievable for a given combination of anchor and unknown analyses (see also Fig. 6 of Daëron, 2021).

This can be achieved with virtual_data(). The example below creates a dataset with four sessions, each of which comprises three analyses of anchor ETH-1, three of ETH-2, three of ETH-3, and three analyses each of two unknown samples named FOO and BAR with an arbitrarily defined isotopic composition. Analytical repeatabilities for Δ47 and Δ48 are also specified arbitrarily. See the virtual_data() documentation for additional configuration parameters.

from D47crunch import virtual_data, D47data

args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 3,
            d13C_VPDB = -15., d18O_VPDB = -2.,
            D47 = 0.6, D48 = 0.2),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)

D = D47data(session1 + session2 + session3 + session4)

D.crunch()
D.standardize()

D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)

2.2 Control data quality

D47crunch offers several tools to visualize processed data. The examples below use the same virtual data set, generated with:

from D47crunch import *
from random import shuffle

# generate virtual data:
args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 8),
        dict(Sample = 'ETH-2', N = 8),
        dict(Sample = 'ETH-3', N = 8),
        dict(Sample = 'FOO', N = 4,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 4,
            d13C_VPDB = -15., d18O_VPDB = -15.,
            D47 = 0.5, D48 = 0.2),
        ])

sessions = [
    virtual_data(session = f'Session_{k+1:02.0f}', seed = 123456+k, **args)
    for k in range(10)]

# randomize the order of analyses, then regroup them by session:
data = [r for s in sessions for r in s]
shuffle(data)
data = sorted(data, key = lambda r: r['Session'])

# create D47data instance:
data47 = D47data(data)

# process D47data instance:
data47.crunch()
data47.standardize()

2.2.1 Plotting the distribution of analyses through time

data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf')

time_distribution.png

The plot above shows the succession of analyses as if they were all distributed at regular time intervals. See D4xdata.plot_distribution_of_analyses() for how to plot analyses as a function of “true” time (based on the TimeTag for each analysis).
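
A minimal sketch of plotting against "true" time instead, assuming each analysis record carries a numeric TimeTag (the field name comes from the documentation quoted above; the vs_time switch below is an assumption to be checked against the D4xdata.plot_distribution_of_analyses() signature):

# assign hypothetical acquisition times, in arbitrary units:
for k, r in enumerate(data47):
    r['TimeTag'] = 1.5 * k

# vs_time is assumed here; see the method's documentation:
data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf', vs_time = True)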

2.2.2 Generating session plots

data47.plot_sessions()

Below is one of the resulting session plots. Each cross marker is an analysis. Anchors are shown in red and unknowns in blue. Short horizontal lines mark the nominal Δ47 value for each anchor (in red) and the average Δ47 value for each unknown (in blue, averaged over all sessions). Curved grey contours correspond to Δ47 standardization errors in this session.

D47_plot_Session_03.png

2.2.3 Plotting Δ47 or Δ48 residuals

data47.plot_residuals(filename = 'residuals.pdf', kde = True)

residuals.png

Again, note that this plot only shows the succession of analyses as if they were all distributed at regular time intervals.

2.2.4 Checking δ13C and δ18O dispersion

mydata = D47data(virtual_data(
    session = 'mysession',
    samples = [
        dict(Sample = 'ETH-1', N = 4),
        dict(Sample = 'ETH-2', N = 4),
        dict(Sample = 'ETH-3', N = 4),
        dict(Sample = 'MYSAMPLE', N = 8, D47 = 0.6, D48 = 0.1, d13C_VPDB = -4.0, d18O_VPDB = -12.0),
    ], seed = 123))

mydata.refresh()
mydata.wg()
mydata.crunch()
mydata.plot_bulk_compositions()

D4xdata.plot_bulk_compositions() produces a series of plots, one for each sample, and an additional plot with all samples together. For example, here is the plot for sample MYSAMPLE:

bulk_compositions.png

2.3 Use a different set of anchors, change anchor nominal values, and/or change oxygen-17 correction parameters

Nominal values for various carbonate standards are defined in four places:

  • D4xdata.Nominal_d13C_VPDB
  • D4xdata.Nominal_d18O_VPDB
  • D47data.Nominal_D47
  • D48data.Nominal_D48

17O correction parameters are defined by:

  • D4xdata.R13_VPDB
  • D4xdata.R18_VSMOW
  • D4xdata.R17_VSMOW
  • D4xdata.R18_VPDB
  • D4xdata.LAMBDA_17
  • D4xdata.R17_VPDB

When creating a new instance of D47data or D48data, the current values of these variables are copied as properties of the new object. Applying custom values for, e.g., R17_VSMOW and Nominal_D47 can thus be done in several ways:

Option 1: by redefining D4xdata.R17_VSMOW and D47data.Nominal_D47 _before_ creating a D47data object:

from D47crunch import D4xdata, D47data

# redefine R17_VSMOW:
D4xdata.R17_VSMOW = 0.00037 # new value

# redefine R17_VPDB for consistency:
D4xdata.R17_VPDB = D4xdata.R17_VSMOW * (D4xdata.R18_VPDB/D4xdata.R18_VSMOW) ** D4xdata.LAMBDA_17

# edit Nominal_D47 to only include ETH-1/2/3:
D47data.Nominal_D4x = {
    a: D47data.Nominal_D4x[a]
    for a in ['ETH-1', 'ETH-2', 'ETH-3']
    }
# redefine ETH-3:
D47data.Nominal_D4x['ETH-3'] = 0.600

# only now create D47data object:
mydata = D47data()

# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)
# NB: mydata.Nominal_D47 is just an alias for mydata.Nominal_D4x

# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}

Option 2: by redefining R17_VSMOW and Nominal_D47 _after_ creating a D47data object:

from D47crunch import D47data

# first create D47data object:
mydata = D47data()

# redefine R17_VSMOW:
mydata.R17_VSMOW = 0.00037 # new value

# redefine R17_VPDB for consistency:
mydata.R17_VPDB = mydata.R17_VSMOW * (mydata.R18_VPDB/mydata.R18_VSMOW) ** mydata.LAMBDA_17

# edit Nominal_D47 to only include ETH-1/2/3:
mydata.Nominal_D47 = {
    a: mydata.Nominal_D47[a]
    for a in ['ETH-1', 'ETH-2', 'ETH-3']
    }
# redefine ETH-3:
mydata.Nominal_D47['ETH-3'] = 0.600

# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)

# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}

The two options above are equivalent, but the latter provides a simple way to compare different data processing choices:

from D47crunch import D47data

# create two D47data objects:
foo = D47data()
bar = D47data()

# modify foo in various ways:
foo.LAMBDA_17 = 0.52
foo.R17_VSMOW = 0.00037 # new value
foo.R17_VPDB = foo.R17_VSMOW * (foo.R18_VPDB/foo.R18_VSMOW) ** foo.LAMBDA_17
foo.Nominal_D47 = {
    'ETH-1': foo.Nominal_D47['ETH-1'],
    'ETH-2': foo.Nominal_D47['ETH-2'],
    'IAEA-C2': foo.Nominal_D47['IAEA-C2'],
    'INLAB_REF_MATERIAL': 0.666,
    }

# now import the same raw data into foo and bar:
foo.read('rawdata.csv')
foo.wg()          # compute δ13C, δ18O of working gas
foo.crunch()      # compute all δ13C, δ18O and raw Δ47 values
foo.standardize() # compute absolute Δ47 values

bar.read('rawdata.csv')
bar.wg()          # compute δ13C, δ18O of working gas
bar.crunch()      # compute all δ13C, δ18O and raw Δ47 values
bar.standardize() # compute absolute Δ47 values

# and compare the final results:
foo.table_of_samples(verbose = True, save_to_file = False)
bar.table_of_samples(verbose = True, save_to_file = False)

2.4 Process paired Δ47 and Δ48 values

Purely in terms of data processing, it is not obvious why Δ47 and Δ48 data should not be handled separately. For now, D47crunch uses two independent classes — D47data and D48data — which crunch numbers and deal with standardization in very similar ways. The following example demonstrates how to print out combined outputs for D47data and D48data.

from D47crunch import *

# generate virtual data:
args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args)
session2 = virtual_data(session = 'Session_02', **args)

# create D47data instance:
data47 = D47data(session1 + session2)

# process D47data instance:
data47.crunch()
data47.standardize()

# create D48data instance:
data48 = D48data(data47) # alternatively: data48 = D48data(session1 + session2)

# process D48data instance:
data48.crunch()
data48.standardize()

# output combined results:
table_of_sessions(data47, data48)
table_of_samples(data47, data48)
table_of_analyses(data47, data48)

Expected output:

––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––
Session     Na  Nu  d13Cwg_VPDB  d18Owg_VSMOW  r_d13C  r_d18O   r_D47      a_47 ± SE  1e3 x b_47 ± SE       c_47 ± SE   r_D48      a_48 ± SE  1e3 x b_48 ± SE       c_48 ± SE
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––
Session_01   9   3       -4.000        26.000  0.0000  0.0000  0.0098  1.021 ± 0.019   -0.398 ± 0.260  -0.903 ± 0.006  0.0486  0.540 ± 0.151    1.235 ± 0.607  -0.390 ± 0.025
Session_02   9   3       -4.000        26.000  0.0000  0.0000  0.0090  1.015 ± 0.019    0.376 ± 0.260  -0.905 ± 0.006  0.0186  1.350 ± 0.156   -0.871 ± 0.608  -0.504 ± 0.027
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––


––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
Sample  N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene     D48      SE    95% CL      SD  p_Levene
––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
ETH-1   6       2.02       37.02  0.2052                    0.0078            0.1380                    0.0223          
ETH-2   6     -10.17       19.88  0.2085                    0.0036            0.1380                    0.0482          
ETH-3   6       1.71       37.45  0.6132                    0.0080            0.2700                    0.0176          
FOO     6      -5.00       28.91  0.3026  0.0044  ± 0.0093  0.0121     0.164  0.1397  0.0121  ± 0.0255  0.0267     0.127
––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––


–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––
UID     Session  Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48         d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw     D49raw       D47       D48
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––
1    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.120787   21.286237   27.780042    2.020000   37.024281  -0.708176  -0.316435  -0.000013  0.197297  0.087763
2    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.132240   21.307795   27.780042    2.020000   37.024281  -0.696913  -0.295333  -0.000013  0.208328  0.126791
3    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.132438   21.313884   27.780042    2.020000   37.024281  -0.696718  -0.289374  -0.000013  0.208519  0.137813
4    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.700300  -12.210735  -18.023381  -10.170000   19.875825  -0.683938  -0.297902  -0.000002  0.209785  0.198705
5    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.707421  -12.270781  -18.023381  -10.170000   19.875825  -0.691145  -0.358673  -0.000002  0.202726  0.086308
6    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.700061  -12.278310  -18.023381  -10.170000   19.875825  -0.683696  -0.366292  -0.000002  0.210022  0.072215
7    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.684379   22.225827   28.306614    1.710000   37.450394  -0.273094  -0.216392  -0.000014  0.623472  0.270873
8    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.660163   22.233729   28.306614    1.710000   37.450394  -0.296906  -0.208664  -0.000014  0.600150  0.285167
9    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.675191   22.215632   28.306614    1.710000   37.450394  -0.282128  -0.226363  -0.000014  0.614623  0.252432
10   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.328380    5.374933    4.665655   -5.000000   28.907344  -0.582131  -0.288924  -0.000006  0.314928  0.175105
11   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.302220    5.384454    4.665655   -5.000000   28.907344  -0.608241  -0.279457  -0.000006  0.289356  0.192614
12   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.322530    5.372841    4.665655   -5.000000   28.907344  -0.587970  -0.291004  -0.000006  0.309209  0.171257
13   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.140853   21.267202   27.780042    2.020000   37.024281  -0.688442  -0.335067  -0.000013  0.207730  0.138730
14   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.127087   21.256983   27.780042    2.020000   37.024281  -0.701980  -0.345071  -0.000013  0.194396  0.131311
15   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.148253   21.287779   27.780042    2.020000   37.024281  -0.681165  -0.314926  -0.000013  0.214898  0.153668
16   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.715859  -12.204791  -18.023381  -10.170000   19.875825  -0.699685  -0.291887  -0.000002  0.207349  0.149128
17   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.709763  -12.188685  -18.023381  -10.170000   19.875825  -0.693516  -0.275587  -0.000002  0.213426  0.161217
18   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.715427  -12.253049  -18.023381  -10.170000   19.875825  -0.699249  -0.340727  -0.000002  0.207780  0.112907
19   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.685994   22.249463   28.306614    1.710000   37.450394  -0.271506  -0.193275  -0.000014  0.618328  0.244431
20   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.681351   22.298166   28.306614    1.710000   37.450394  -0.276071  -0.145641  -0.000014  0.613831  0.279758
21   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.676169   22.306848   28.306614    1.710000   37.450394  -0.281167  -0.137150  -0.000014  0.608813  0.286056
22   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.324359    5.339497    4.665655   -5.000000   28.907344  -0.586144  -0.324160  -0.000006  0.314015  0.136535
23   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.297658    5.325854    4.665655   -5.000000   28.907344  -0.612794  -0.337727  -0.000006  0.287767  0.126473
24   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.310185    5.339898    4.665655   -5.000000   28.907344  -0.600291  -0.323761  -0.000006  0.300082  0.136830
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––
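
As their signatures in the API section show, these module-level functions also save CSV versions by default (save_to_file = True), writing to a directory named output with default file names such as D47D48_sessions.csv and D47D48_samples.csv. The dir and filename arguments override this (the names below are illustrative):

table_of_samples(data47, data48, dir = 'tables', filename = 'combined_samples.csv', print_out = False)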

3. Command-Line Interface (CLI)

Instead of writing Python code, you may use the CLI directly to process raw Δ47 and Δ48 data with reasonable defaults. In the simplest case, just call:

D47crunch rawdata.csv

This will create a directory named output and populate it with the processed results (the summary, the tables of sessions, samples, and analyses, and the corresponding plots), generated by the same methods described in the sections above.

You may specify a custom set of anchors instead of the default ones using the --anchors or -a option:

D47crunch -a anchors.csv rawdata.csv

In this case, the anchors.csv file (you may use any other file name) must have the following format:

Sample, d13C_VPDB, d18O_VPDB,    D47
 ETH-1,      2.02,     -2.19, 0.2052
 ETH-2,    -10.17,    -18.69, 0.2085
 ETH-3,      1.71,     -1.78, 0.6132
 ETH-4,          ,          , 0.4511

The samples with non-empty d13C_VPDB, d18O_VPDB, and D47 values are used to standardize δ13C, δ18O, and Δ47 values respectively.
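
In Python terms, using this anchors file amounts to redefining the nominal values described in section 2.3 before processing. A rough sketch of the (approximately) equivalent assignments, using the nominal-value attributes from section 2.3 and the API section; this illustrates the semantics rather than the CLI's actual internals:

from D47crunch import D47data

# bulk anchors (ETH-4 has no bulk values in the file above, so it is left out here):
D47data.Nominal_d13C_VPDB = {'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}
D47data.Nominal_d18O_VPDB = {'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}

# Δ47 anchors, including ETH-4:
D47data.Nominal_D4x = {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6132, 'ETH-4': 0.4511}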

You may also provide a list of analyses and/or samples to exclude from the input. This is done with the --exclude or -e option:

D47crunch -e badbatch.csv rawdata.csv

In this case, the badbatch.csv file (again, you may use a different file name) must have the following format:

UID, Sample
A03
A09
B06
   , MYBADSAMPLE-1
   , MYBADSAMPLE-2

This will exclude (ignore) analyses with the UIDs A03, A09, and B06, as well as all analyses of samples MYBADSAMPLE-1 and MYBADSAMPLE-2. It is possible to have an exclude file with only the UID column, or only the Sample column, or both, in any order.

The --output-dir or -o option may be used to specify a custom directory name for the output. For example, in unix-like shells the following command will create a time-stamped output directory:

D47crunch -o `date "+%Y-%m-%d-%Hh%M"` rawdata.csv

To process Δ48 as well as Δ47, just add the --D48 option.
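
These options can be combined; for instance, the following command (with the hypothetical file names used above) standardizes both Δ47 and Δ48 with custom anchors while excluding a bad batch:

D47crunch --D48 -a anchors.csv -e badbatch.csv rawdata.csv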

API Documentation

   1'''
   2Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements
   3
   4Process and standardize carbonate and/or CO2 clumped-isotope analyses,
   5from low-level data out of a dual-inlet mass spectrometer to final, “absolute”
   6Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates
   7([Daëron, 2021](https://doi.org/10.1029/2020GC009592)).
   8
   9The **tutorial** section takes you through a series of simple steps to import/process data and print out the results.
  10The **how-to** section provides instructions applicable to various specific tasks.
  11
  12.. include:: ../../docpages/tutorial.md
  13.. include:: ../../docpages/howto.md
  14.. include:: ../../docpages/cli.md
  15
  16<h1>API Documentation</h1>
  17'''
  18
  19__docformat__ = "restructuredtext"
  20__author__    = 'Mathieu Daëron'
  21__contact__   = 'daeron@lsce.ipsl.fr'
  22__copyright__ = 'Copyright (c) Mathieu Daëron'
  23__license__   = 'MIT License - https://opensource.org/licenses/MIT'
  24__date__      = '2025-12-14'
  25__version__   = '2.5.1'
  26
  27import os
  28import numpy as np
  29import typer
  30from typing_extensions import Annotated
  31from statistics import stdev
  32from scipy.stats import t as tstudent
  33from scipy.stats import levene
  34from scipy.interpolate import interp1d
  35from numpy import linalg
  36from lmfit import Minimizer, Parameters, report_fit
  37from matplotlib import pyplot as ppl
  38from datetime import datetime as dt
  39from functools import wraps
  40from colorsys import hls_to_rgb
  41from matplotlib import rcParams
  42from typer import rich_utils
  43
  44rich_utils.STYLE_HELPTEXT = ''
  45
  46rcParams['font.family'] = 'sans-serif'
  47rcParams['font.sans-serif'] = 'Helvetica'
  48rcParams['font.size'] = 10
  49rcParams['mathtext.fontset'] = 'custom'
  50rcParams['mathtext.rm'] = 'sans'
  51rcParams['mathtext.bf'] = 'sans:bold'
  52rcParams['mathtext.it'] = 'sans:italic'
  53rcParams['mathtext.cal'] = 'sans:italic'
  54rcParams['mathtext.default'] = 'rm'
  55rcParams['xtick.major.size'] = 4
  56rcParams['xtick.major.width'] = 1
  57rcParams['ytick.major.size'] = 4
  58rcParams['ytick.major.width'] = 1
  59rcParams['axes.grid'] = False
  60rcParams['axes.linewidth'] = 1
  61rcParams['grid.linewidth'] = .75
  62rcParams['grid.linestyle'] = '-'
  63rcParams['grid.alpha'] = .15
  64rcParams['savefig.dpi'] = 150
  65
  66Petersen_etal_CO2eqD47 = np.array([[-12, 1.147113572], [-11, 1.139961218], [-10, 1.132872856], [-9, 1.125847677], [-8, 1.118884889], [-7, 1.111983708], [-6, 1.105143366], [-5, 1.098363105], [-4, 1.091642182], [-3, 1.084979862], [-2, 1.078375423], [-1, 1.071828156], [0, 1.065337360], [1, 1.058902349], [2, 1.052522443], [3, 1.046196976], [4, 1.039925291], [5, 1.033706741], [6, 1.027540690], [7, 1.021426510], [8, 1.015363585], [9, 1.009351306], [10, 1.003389075], [11, 0.997476303], [12, 0.991612409], [13, 0.985796821], [14, 0.980028975], [15, 0.974308318], [16, 0.968634304], [17, 0.963006392], [18, 0.957424055], [19, 0.951886769], [20, 0.946394020], [21, 0.940945302], [22, 0.935540114], [23, 0.930177964], [24, 0.924858369], [25, 0.919580851], [26, 0.914344938], [27, 0.909150167], [28, 0.903996080], [29, 0.898882228], [30, 0.893808167], [31, 0.888773459], [32, 0.883777672], [33, 0.878820382], [34, 0.873901170], [35, 0.869019623], [36, 0.864175334], [37, 0.859367901], [38, 0.854596929], [39, 0.849862028], [40, 0.845162813], [41, 0.840498905], [42, 0.835869931], [43, 0.831275522], [44, 0.826715314], [45, 0.822188950], [46, 0.817696075], [47, 0.813236341], [48, 0.808809404], [49, 0.804414926], [50, 0.800052572], [51, 0.795722012], [52, 0.791422922], [53, 0.787154979], [54, 0.782917869], [55, 0.778711277], [56, 0.774534898], [57, 0.770388426], [58, 0.766271562], [59, 0.762184010], [60, 0.758125479], [61, 0.754095680], [62, 0.750094329], [63, 0.746121147], [64, 0.742175856], [65, 0.738258184], [66, 0.734367860], [67, 0.730504620], [68, 0.726668201], [69, 0.722858343], [70, 0.719074792], [71, 0.715317295], [72, 0.711585602], [73, 0.707879469], [74, 0.704198652], [75, 0.700542912], [76, 0.696912012], [77, 0.693305719], [78, 0.689723802], [79, 0.686166034], [80, 0.682632189], [81, 0.679122047], [82, 0.675635387], [83, 0.672171994], [84, 0.668731654], [85, 0.665314156], [86, 0.661919291], [87, 0.658546854], [88, 0.655196641], [89, 0.651868451], [90, 0.648562087], [91, 0.645277352], [92, 0.642014054], [93, 0.638771999], [94, 0.635551001], [95, 0.632350872], [96, 0.629171428], [97, 0.626012487], [98, 0.622873870], [99, 0.619755397], [100, 0.616656895], [102, 0.610519107], [104, 0.604459143], [106, 0.598475670], [108, 0.592567388], [110, 0.586733026], [112, 0.580971342], [114, 0.575281125], [116, 0.569661187], [118, 0.564110371], [120, 0.558627545], [122, 0.553211600], [124, 0.547861454], [126, 0.542576048], [128, 0.537354347], [130, 0.532195337], [132, 0.527098028], [134, 0.522061450], [136, 0.517084654], [138, 0.512166711], [140, 0.507306712], [142, 0.502503768], [144, 0.497757006], [146, 0.493065573], [148, 0.488428634], [150, 0.483845370], [152, 0.479314980], [154, 0.474836677], [156, 0.470409692], [158, 0.466033271], [160, 0.461706674], [162, 0.457429176], [164, 0.453200067], [166, 0.449018650], [168, 0.444884242], [170, 0.440796174], [172, 0.436753787], [174, 0.432756438], [176, 0.428803494], [178, 0.424894334], [180, 0.421028350], [182, 0.417204944], [184, 0.413423530], [186, 0.409683531], [188, 0.405984383], [190, 0.402325531], [192, 0.398706429], [194, 0.395126543], [196, 0.391585347], [198, 0.388082324], [200, 0.384616967], [202, 0.381188778], [204, 0.377797268], [206, 0.374441954], [208, 0.371122364], [210, 0.367838033], [212, 0.364588505], [214, 0.361373329], [216, 0.358192065], [218, 0.355044277], [220, 0.351929540], [222, 0.348847432], [224, 0.345797540], [226, 0.342779460], [228, 0.339792789], [230, 0.336837136], [232, 0.333912113], [234, 0.331017339], [236, 0.328152439], [238, 
0.325317046], [240, 0.322510795], [242, 0.319733329], [244, 0.316984297], [246, 0.314263352], [248, 0.311570153], [250, 0.308904364], [252, 0.306265654], [254, 0.303653699], [256, 0.301068176], [258, 0.298508771], [260, 0.295975171], [262, 0.293467070], [264, 0.290984167], [266, 0.288526163], [268, 0.286092765], [270, 0.283683684], [272, 0.281298636], [274, 0.278937339], [276, 0.276599517], [278, 0.274284898], [280, 0.271993211], [282, 0.269724193], [284, 0.267477582], [286, 0.265253121], [288, 0.263050554], [290, 0.260869633], [292, 0.258710110], [294, 0.256571741], [296, 0.254454286], [298, 0.252357508], [300, 0.250281174], [302, 0.248225053], [304, 0.246188917], [306, 0.244172542], [308, 0.242175707], [310, 0.240198194], [312, 0.238239786], [314, 0.236300272], [316, 0.234379441], [318, 0.232477087], [320, 0.230593005], [322, 0.228726993], [324, 0.226878853], [326, 0.225048388], [328, 0.223235405], [330, 0.221439711], [332, 0.219661118], [334, 0.217899439], [336, 0.216154491], [338, 0.214426091], [340, 0.212714060], [342, 0.211018220], [344, 0.209338398], [346, 0.207674420], [348, 0.206026115], [350, 0.204393315], [355, 0.200378063], [360, 0.196456139], [365, 0.192625077], [370, 0.188882487], [375, 0.185226048], [380, 0.181653511], [385, 0.178162694], [390, 0.174751478], [395, 0.171417807], [400, 0.168159686], [405, 0.164975177], [410, 0.161862398], [415, 0.158819521], [420, 0.155844772], [425, 0.152936426], [430, 0.150092806], [435, 0.147312286], [440, 0.144593281], [445, 0.141934254], [450, 0.139333710], [455, 0.136790195], [460, 0.134302294], [465, 0.131868634], [470, 0.129487876], [475, 0.127158722], [480, 0.124879906], [485, 0.122650197], [490, 0.120468398], [495, 0.118333345], [500, 0.116243903], [505, 0.114198970], [510, 0.112197471], [515, 0.110238362], [520, 0.108320625], [525, 0.106443271], [530, 0.104605335], [535, 0.102805877], [540, 0.101043985], [545, 0.099318768], [550, 0.097629359], [555, 0.095974915], [560, 0.094354612], [565, 0.092767650], [570, 0.091213248], [575, 0.089690648], [580, 0.088199108], [585, 0.086737906], [590, 0.085306341], [595, 0.083903726], [600, 0.082529395], [605, 0.081182697], [610, 0.079862998], [615, 0.078569680], [620, 0.077302141], [625, 0.076059794], [630, 0.074842066], [635, 0.073648400], [640, 0.072478251], [645, 0.071331090], [650, 0.070206399], [655, 0.069103674], [660, 0.068022424], [665, 0.066962168], [670, 0.065922439], [675, 0.064902780], [680, 0.063902748], [685, 0.062921909], [690, 0.061959837], [695, 0.061016122], [700, 0.060090360], [705, 0.059182157], [710, 0.058291131], [715, 0.057416907], [720, 0.056559120], [725, 0.055717414], [730, 0.054891440], [735, 0.054080860], [740, 0.053285343], [745, 0.052504565], [750, 0.051738210], [755, 0.050985971], [760, 0.050247546], [765, 0.049522643], [770, 0.048810974], [775, 0.048112260], [780, 0.047426227], [785, 0.046752609], [790, 0.046091145], [795, 0.045441581], [800, 0.044803668], [805, 0.044177164], [810, 0.043561831], [815, 0.042957438], [820, 0.042363759], [825, 0.041780573], [830, 0.041207664], [835, 0.040644822], [840, 0.040091839], [845, 0.039548516], [850, 0.039014654], [855, 0.038490063], [860, 0.037974554], [865, 0.037467944], [870, 0.036970054], [875, 0.036480707], [880, 0.035999734], [885, 0.035526965], [890, 0.035062238], [895, 0.034605393], [900, 0.034156272], [905, 0.033714724], [910, 0.033280598], [915, 0.032853749], [920, 0.032434032], [925, 0.032021309], [930, 0.031615443], [935, 0.031216300], [940, 0.030823749], [945, 0.030437663], [950, 0.030057915], [955, 0.029684385], 
[960, 0.029316951], [965, 0.028955498], [970, 0.028599910], [975, 0.028250075], [980, 0.027905884], [985, 0.027567229], [990, 0.027234006], [995, 0.026906112], [1000, 0.026583445], [1005, 0.026265908], [1010, 0.025953405], [1015, 0.025645841], [1020, 0.025343124], [1025, 0.025045163], [1030, 0.024751871], [1035, 0.024463160], [1040, 0.024178947], [1045, 0.023899147], [1050, 0.023623680], [1055, 0.023352467], [1060, 0.023085429], [1065, 0.022822491], [1070, 0.022563577], [1075, 0.022308615], [1080, 0.022057533], [1085, 0.021810260], [1090, 0.021566729], [1095, 0.021326872], [1100, 0.021090622]])
  67_fCO2eqD47_Petersen = interp1d(Petersen_etal_CO2eqD47[:,0], Petersen_etal_CO2eqD47[:,1])
  68def fCO2eqD47_Petersen(T):
  69	'''
  70	CO2 equilibrium Δ47 value as a function of T (in degrees C)
  71	according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
  72
  73	'''
  74	return float(_fCO2eqD47_Petersen(T))
  75
  76
  77Wang_etal_CO2eqD47 = np.array([[-83., 1.8954], [-73., 1.7530], [-63., 1.6261], [-53., 1.5126], [-43., 1.4104], [-33., 1.3182], [-23., 1.2345], [-13., 1.1584], [-3., 1.0888], [7., 1.0251], [17., 0.9665], [27., 0.9125], [37., 0.8626], [47., 0.8164], [57., 0.7734], [67., 0.7334], [87., 0.6612], [97., 0.6286], [107., 0.5980], [117., 0.5693], [127., 0.5423], [137., 0.5169], [147., 0.4930], [157., 0.4704], [167., 0.4491], [177., 0.4289], [187., 0.4098], [197., 0.3918], [207., 0.3747], [217., 0.3585], [227., 0.3431], [237., 0.3285], [247., 0.3147], [257., 0.3015], [267., 0.2890], [277., 0.2771], [287., 0.2657], [297., 0.2550], [307., 0.2447], [317., 0.2349], [327., 0.2256], [337., 0.2167], [347., 0.2083], [357., 0.2002], [367., 0.1925], [377., 0.1851], [387., 0.1781], [397., 0.1714], [407., 0.1650], [417., 0.1589], [427., 0.1530], [437., 0.1474], [447., 0.1421], [457., 0.1370], [467., 0.1321], [477., 0.1274], [487., 0.1229], [497., 0.1186], [507., 0.1145], [517., 0.1105], [527., 0.1068], [537., 0.1031], [547., 0.0997], [557., 0.0963], [567., 0.0931], [577., 0.0901], [587., 0.0871], [597., 0.0843], [607., 0.0816], [617., 0.0790], [627., 0.0765], [637., 0.0741], [647., 0.0718], [657., 0.0695], [667., 0.0674], [677., 0.0654], [687., 0.0634], [697., 0.0615], [707., 0.0597], [717., 0.0579], [727., 0.0562], [737., 0.0546], [747., 0.0530], [757., 0.0515], [767., 0.0500], [777., 0.0486], [787., 0.0472], [797., 0.0459], [807., 0.0447], [817., 0.0435], [827., 0.0423], [837., 0.0411], [847., 0.0400], [857., 0.0390], [867., 0.0380], [877., 0.0370], [887., 0.0360], [897., 0.0351], [907., 0.0342], [917., 0.0333], [927., 0.0325], [937., 0.0317], [947., 0.0309], [957., 0.0302], [967., 0.0294], [977., 0.0287], [987., 0.0281], [997., 0.0274], [1007., 0.0268], [1017., 0.0261], [1027., 0.0255], [1037., 0.0249], [1047., 0.0244], [1057., 0.0238], [1067., 0.0233], [1077., 0.0228], [1087., 0.0223], [1097., 0.0218]])
  78_fCO2eqD47_Wang = interp1d(Wang_etal_CO2eqD47[:,0] - 0.15, Wang_etal_CO2eqD47[:,1])
  79def fCO2eqD47_Wang(T):
  80	'''
  81	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
  82	according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
  83	(supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
  84	'''
  85	return float(_fCO2eqD47_Wang(T))
  86
  87
  88def correlated_sum(X, C, w = None):
  89	'''
  90	Compute covariance-aware linear combinations
  91
  92	**Parameters**
  93	
  94	+ `X`: list or 1-D array of values to sum
  95	+ `C`: covariance matrix for the elements of `X`
  96	+ `w`: list or 1-D array of weights to apply to the elements of `X`
  97	       (all equal to 1 by default)
  98
  99	Return the sum (and its SE) of the elements of `X`, with optional weights equal
 100	to the elements of `w`, accounting for covariances between the elements of `X`.
 101	'''
 102	if w is None:
 103		w = [1 for x in X]
 104	return np.dot(w,X), (np.dot(w,np.dot(C,w)))**.5
 105
 106
 107def make_csv(x, hsep = ',', vsep = '\n'):
 108	'''
 109	Formats a list of lists of strings as a CSV
 110
 111	**Parameters**
 112
 113	+ `x`: the list of lists of strings to format
 114	+ `hsep`: the field separator (`,` by default)
 115	+ `vsep`: the line-ending convention to use (`\\n` by default)
 116
 117	**Example**
 118
 119	```py
 120	print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
 121	```
 122
 123	outputs:
 124
 125	```py
 126	a,b,c
 127	d,e,f
 128	```
 129	'''
 130	return vsep.join([hsep.join(l) for l in x])
 131
 132
 133def pf(txt):
 134	'''
 135	Modify string `txt` to follow `lmfit.Parameter()` naming rules.
 136	'''
 137	return txt.replace('-','_').replace('.','_').replace(' ','_')
 138
 139
 140def smart_type(x):
 141	'''
 142	Tries to convert string `x` to a float if it includes a decimal point, or
 143	to an integer if it does not. If both attempts fail, return the original
 144	string unchanged.
 145	'''
 146	try:
 147		y = float(x)
 148	except ValueError:
 149		return x
 150	if '.' not in x:
 151		return int(y)
 152	return y
 153
 154class _Defaults():
 155	def __init__(self):
 156		pass
 157
 158D47crunch_defaults = _Defaults()
 159D47crunch_defaults.PRETTY_TABLE_VSEP = '—'
 160
 161def pretty_table(x, header = 1, hsep = '  ', vsep = None, align = '<'):
 162	'''
 163	Reads a list of lists of strings and outputs an ascii table
 164
 165	**Parameters**
 166
 167	+ `x`: a list of lists of strings
 168	+ `header`: the number of lines to treat as header lines
 169	+ `hsep`: the horizontal separator between columns
 170	+ `vsep`: the character to use as vertical separator
 171	+ `align`: string of left (`<`) or right (`>`) alignment characters.
 172
 173	**Example**
 174
 175	```py
 176	print(pretty_table([
 177		['A', 'B', 'C'],
 178		['1', '1.9999', 'foo'],
 179		['10', 'x', 'bar'],
 180	]))
 181	```
 182	yields:	
 183	```
 184	——  ——————  ———
 185	A        B    C
 186	——  ——————  ———
 187	1   1.9999  foo
 188	10       x  bar
 189	——  ——————  ———
 190	```
 191
 192	To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`:
 193	
 194	```py
 195	D47crunch_defaults.PRETTY_TABLE_VSEP = '='
 196	print(pretty_table([
 197		['A', 'B', 'C'],
 198		['1', '1.9999', 'foo'],
 199		['10', 'x', 'bar'],
 200	]))
 201	```
 202	yields:	
 203	```
 204	==  ======  ===
 205	A        B    C
 206	==  ======  ===
 207	1   1.9999  foo
 208	10       x  bar
 209	==  ======  ===
 210	```
 211	'''
 212	
 213	if vsep is None:
 214		vsep = D47crunch_defaults.PRETTY_TABLE_VSEP
 215	
 216	txt = []
 217	widths = [np.max([len(e) for e in c]) for c in zip(*x)]
 218
 219	if len(widths) > len(align):
 220		align += '>' * (len(widths)-len(align))
 221	sepline = hsep.join([vsep*w for w in widths])
 222	txt += [sepline]
 223	for k,l in enumerate(x):
 224		if k and k == header:
 225			txt += [sepline]
 226		txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
 227	txt += [sepline]
 228	txt += ['']
 229	return '\n'.join(txt)
 230
 231
 232def transpose_table(x):
 233	'''
 234	Transpose a list of lists
 235
 236	**Parameters**
 237
 238	+ `x`: a list of lists
 239
 240	**Example**
 241
 242	```py
 243	x = [[1, 2], [3, 4]]
 244	print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
 245	```
 246	'''
 247	return [[e for e in c] for c in zip(*x)]
 248
 249
 250def w_avg(X, sX) :
 251	'''
 252	Compute variance-weighted average
 253
 254	Returns the value and SE of the weighted average of the elements of `X`,
 255	with relative weights equal to their inverse variances (`1/sX**2`).
 256
 257	**Parameters**
 258
 259	+ `X`: array-like of elements to average
 260	+ `sX`: array-like of the corresponding SE values
 261
 262	**Tip**
 263
 264	If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
 265	they may be rearranged using `zip()`:
 266
 267	```python
 268	foo = [(0, 1), (1, 0.5), (2, 0.5)]
 269	print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
 270	```
 271	'''
 272	X = [ x for x in X ]
 273	sX = [ sx for sx in sX ]
 274	W = [ sx**-2 for sx in sX ]
 275	W = [ w/sum(W) for w in W ]
 276	Xavg = sum([ w*x for w,x in zip(W,X) ])
 277	sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5
 278	return Xavg, sXavg
 279
 280
 281def read_csv(filename, sep = ''):
 282	'''
 283	Read contents of `filename` in csv format and return a list of dictionaries.
 284
 285	In the csv string, spaces before and after field separators (`','` by default)
 286	are optional.
 287
 288	**Parameters**
 289
 290	+ `filename`: the csv file to read
 291	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
 292	whichever appears most often in the contents of `filename`.
 293	'''
 294	with open(filename) as fid:
 295		txt = fid.read()
 296
 297	if sep == '':
 298		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
 299	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
 300	return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]]
 301
 302
 303def simulate_single_analysis(
 304	sample = 'MYSAMPLE',
 305	d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
 306	d13C_VPDB = None, d18O_VPDB = None,
 307	D47 = None, D48 = None, D49 = 0., D17O = 0.,
 308	a47 = 1., b47 = 0., c47 = -0.9,
 309	a48 = 1., b48 = 0., c48 = -0.45,
 310	Nominal_D47 = None,
 311	Nominal_D48 = None,
 312	Nominal_d13C_VPDB = None,
 313	Nominal_d18O_VPDB = None,
 314	ALPHA_18O_ACID_REACTION = None,
 315	R13_VPDB = None,
 316	R17_VSMOW = None,
 317	R18_VSMOW = None,
 318	LAMBDA_17 = None,
 319	R18_VPDB = None,
 320	):
 321	'''
 322	Compute working-gas delta values for a single analysis, assuming a stochastic working
 323	gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).
 324	
 325	**Parameters**
 326
 327	+ `sample`: sample name
 328	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
 329		(respectively –4 and +26 ‰ by default)
 330	+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
 331	+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
 332		of the carbonate sample
 333	+ `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and
 334		Δ48 values if `D47` or `D48` are not specified
 335	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and
 336		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
 337	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
 338	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
 339		correction parameters (by default equal to the `D4xdata` default values)
 340	
 341	Returns a dictionary with fields
 342	`['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
 343	'''
 344
 345	if Nominal_d13C_VPDB is None:
 346		Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB
 347
 348	if Nominal_d18O_VPDB is None:
 349		Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB
 350
 351	if ALPHA_18O_ACID_REACTION is None:
 352		ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION
 353
 354	if R13_VPDB is None:
 355		R13_VPDB = D4xdata().R13_VPDB
 356
 357	if R17_VSMOW is None:
 358		R17_VSMOW = D4xdata().R17_VSMOW
 359
 360	if R18_VSMOW is None:
 361		R18_VSMOW = D4xdata().R18_VSMOW
 362
 363	if LAMBDA_17 is None:
 364		LAMBDA_17 = D4xdata().LAMBDA_17
 365
 366	if R18_VPDB is None:
 367		R18_VPDB = D4xdata().R18_VPDB
 368	
 369	R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17
 370	
 371	if Nominal_D47 is None:
 372		Nominal_D47 = D47data().Nominal_D47
 373
 374	if Nominal_D48 is None:
 375		Nominal_D48 = D48data().Nominal_D48
 376	
 377	if d13C_VPDB is None:
 378		if sample in Nominal_d13C_VPDB:
 379			d13C_VPDB = Nominal_d13C_VPDB[sample]
 380		else:
 381			raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")
 382
 383	if d18O_VPDB is None:
 384		if sample in Nominal_d18O_VPDB:
 385			d18O_VPDB = Nominal_d18O_VPDB[sample]
 386		else:
 387			raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")
 388
 389	if D47 is None:
 390		if sample in Nominal_D47:
 391			D47 = Nominal_D47[sample]
 392		else:
 393			raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")
 394
 395	if D48 is None:
 396		if sample in Nominal_D48:
 397			D48 = Nominal_D48[sample]
 398		else:
 399			raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")
 400
 401	X = D4xdata()
 402	X.R13_VPDB = R13_VPDB
 403	X.R17_VSMOW = R17_VSMOW
 404	X.R18_VSMOW = R18_VSMOW
 405	X.LAMBDA_17 = LAMBDA_17
 406	X.R18_VPDB = R18_VPDB
 407	X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17
 408
 409	R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
 410		R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000),
 411		R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
 412		)
 413	R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
 414		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
 415		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
 416		D17O=D17O, D47=D47, D48=D48, D49=D49,
 417		)
 418	R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
 419		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
 420		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
 421		D17O=D17O,
 422		)
 423	
 424	d45 = 1000 * (R45/R45wg - 1)
 425	d46 = 1000 * (R46/R46wg - 1)
 426	d47 = 1000 * (R47/R47wg - 1)
 427	d48 = 1000 * (R48/R48wg - 1)
 428	d49 = 1000 * (R49/R49wg - 1)
 429
 430	for k in range(3): # dumb iteration to adjust for small changes in d47
 431		R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
 432		R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch	
 433		d47 = 1000 * (R47raw/R47wg - 1)
 434		d48 = 1000 * (R48raw/R48wg - 1)
 435
 436	return dict(
 437		Sample = sample,
 438		D17O = D17O,
 439		d13Cwg_VPDB = d13Cwg_VPDB,
 440		d18Owg_VSMOW = d18Owg_VSMOW,
 441		d45 = d45,
 442		d46 = d46,
 443		d47 = d47,
 444		d48 = d48,
 445		d49 = d49,
 446		)
 447
 448
 449def virtual_data(
 450	samples = [],
 451	a47 = 1., b47 = 0., c47 = -0.9,
 452	a48 = 1., b48 = 0., c48 = -0.45,
 453	rd45 = 0.020, rd46 = 0.060,
 454	rD47 = 0.015, rD48 = 0.045,
 455	d13Cwg_VPDB = None, d18Owg_VSMOW = None,
 456	session = None,
 457	Nominal_D47 = None, Nominal_D48 = None,
 458	Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
 459	ALPHA_18O_ACID_REACTION = None,
 460	R13_VPDB = None,
 461	R17_VSMOW = None,
 462	R18_VSMOW = None,
 463	LAMBDA_17 = None,
 464	R18_VPDB = None,
 465	seed = 0,
 466	shuffle = True,
 467	):
 468	'''
 469	Return list with simulated analyses from a single session.
 470	
 471	**Parameters**
 472	
 473	+ `samples`: a list of entries; each entry is a dictionary with the following fields:
 474	    * `Sample`: the name of the sample
 475	    * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
 476	    * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
 477	    * `N`: how many analyses to generate for this sample
 478	+ `a47`: scrambling factor for Δ47
 479	+ `b47`: compositional nonlinearity for Δ47
 480	+ `c47`: working gas offset for Δ47
 481	+ `a48`: scrambling factor for Δ48
 482	+ `b48`: compositional nonlinearity for Δ48
 483	+ `c48`: working gas offset for Δ48
 484	+ `rd45`: analytical repeatability of δ45
 485	+ `rd46`: analytical repeatability of δ46
 486	+ `rD47`: analytical repeatability of Δ47
 487	+ `rD48`: analytical repeatability of Δ48
 488	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
 489		(by default equal to the `simulate_single_analysis` default values)
 490	+ `session`: name of the session (no name by default)
 491	+ `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and Δ48 values
 492		if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
 493	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and
 494		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified 
 495		(by default equal to the `simulate_single_analysis` defaults)
 496	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
 497		(by default equal to the `simulate_single_analysis` defaults)
 498	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
 499		correction parameters (by default equal to the `simulate_single_analysis` default)
 500	+ `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
 501	+ `shuffle`: randomly reorder the sequence of analyses
 502	
 503		
 504	Here is an example of using this method to generate an arbitrary combination of
 505	anchors and unknowns for a bunch of sessions:
 506
 507	```py
 508	.. include:: ../../code_examples/virtual_data/example.py
 509	```
 510	
 511	This should output something like:
 512	
 513	```
 514	.. include:: ../../code_examples/virtual_data/output.txt
 515	```
 516	'''
 517	
 518	kwargs = locals().copy()
 519
 520	from numpy import random as nprandom
 521	if seed:
 522		nprandom.seed(seed)
 523		rng = nprandom.default_rng(seed)
 524	else:
 525		rng = nprandom.default_rng()
 526	
 527	N = sum([s['N'] for s in samples])
 528	errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
 529	errors45 *= rd45 / stdev(errors45) # scale errors to rd45
 530	errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
 531	errors46 *= rd46 / stdev(errors46) # scale errors to rd46
 532	errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
 533	errors47 *= rD47 / stdev(errors47) # scale errors to rD47
 534	errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
 535	errors48 *= rD48 / stdev(errors48) # scale errors to rD48
 536	
 537	k = 0
 538	out = []
 539	for s in samples:
 540		kw = {}
 541		kw['sample'] = s['Sample']
 542		kw = {
 543			**kw,
 544			**{var: kwargs[var]
 545				for var in [
 546					'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
 547					'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
 548					'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
 549					'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
 550					]
 551				if kwargs[var] is not None},
 552			**{var: s[var]
 553				for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
 554				if var in s},
 555			}
 556
 557		sN = s['N']
 558		while sN:
 559			out.append(simulate_single_analysis(**kw))
 560			out[-1]['d45'] += errors45[k]
 561			out[-1]['d46'] += errors46[k]
 562			out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
 563			out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
 564			sN -= 1
 565			k += 1
 566
 567		if session is not None:
 568			for r in out:
 569				r['Session'] = session
 570
 571		if shuffle:
 572			nprandom.shuffle(out)
 573
 574	return out
 575
 576def table_of_samples(
 577	data47 = None,
 578	data48 = None,
 579	dir = 'output',
 580	filename = None,
 581	save_to_file = True,
 582	print_out = True,
 583	output = None,
 584	):
 585	'''
 586	Print out, save to disk and/or return a combined table of samples
 587	for a pair of `D47data` and `D48data` objects.
 588
 589	**Parameters**
 590
 591	+ `data47`: `D47data` instance
 592	+ `data48`: `D48data` instance
 593	+ `dir`: the directory in which to save the table
 594	+ `filename`: the name to the csv file to write to
 595	+ `save_to_file`: whether to save the table to disk
 596	+ `print_out`: whether to print out the table
 597	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
 598		if set to `'raw'`: return a list of list of strings
 599		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
 600	'''
 601	if data47 is None:
 602		if data48 is None:
 603			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
 604		else:
 605			return data48.table_of_samples(
 606				dir = dir,
 607				filename = filename,
 608				save_to_file = save_to_file,
 609				print_out = print_out,
 610				output = output
 611				)
 612	else:
 613		if data48 is None:
 614			return data47.table_of_samples(
 615				dir = dir,
 616				filename = filename,
 617				save_to_file = save_to_file,
 618				print_out = print_out,
 619				output = output
 620				)
 621		else:
 622			out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
 623			out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
 624			out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:])
 625
 626			if save_to_file:
 627				if not os.path.exists(dir):
 628					os.makedirs(dir)
 629				if filename is None:
 630					filename = f'D47D48_samples.csv'
 631				with open(f'{dir}/{filename}', 'w') as fid:
 632					fid.write(make_csv(out))
 633			if print_out:
 634				print('\n'+pretty_table(out))
 635			if output == 'raw':
 636				return out
 637			elif output == 'pretty':
 638				return pretty_table(out)
 639
 640
 641def table_of_sessions(
 642	data47 = None,
 643	data48 = None,
 644	dir = 'output',
 645	filename = None,
 646	save_to_file = True,
 647	print_out = True,
 648	output = None,
 649	):
 650	'''
 651	Print out, save to disk and/or return a combined table of sessions
 652	for a pair of `D47data` and `D48data` objects.
 653	***Only applicable if the sessions in `data47` and those in `data48`
 654	consist of the exact same sets of analyses.***
 655
 656	**Parameters**
 657
 658	+ `data47`: `D47data` instance
 659	+ `data48`: `D48data` instance
 660	+ `dir`: the directory in which to save the table
 661	+ `filename`: the name to the csv file to write to
 662	+ `save_to_file`: whether to save the table to disk
 663	+ `print_out`: whether to print out the table
 664	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
 665		if set to `'raw'`: return a list of lists of strings
 666		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
 667	'''
 668	if data47 is None:
 669		if data48 is None:
 670			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
 671		else:
 672			return data48.table_of_sessions(
 673				dir = dir,
 674				filename = filename,
 675				save_to_file = save_to_file,
 676				print_out = print_out,
 677				output = output
 678				)
 679	else:
 680		if data48 is None:
 681			return data47.table_of_sessions(
 682				dir = dir,
 683				filename = filename,
 684				save_to_file = save_to_file,
 685				print_out = print_out,
 686				output = output
 687				)
 688		else:
 689			out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
 690			out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
 691			for k,x in enumerate(out47[0]):
 692				if k>7:
 693					out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
 694					out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
 695			out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])
 696
 697			if save_to_file:
 698				if not os.path.exists(dir):
 699					os.makedirs(dir)
 700				if filename is None:
 701					filename = f'D47D48_sessions.csv'
 702				with open(f'{dir}/{filename}', 'w') as fid:
 703					fid.write(make_csv(out))
 704			if print_out:
 705				print('\n'+pretty_table(out))
 706			if output == 'raw':
 707				return out
 708			elif output == 'pretty':
 709				return pretty_table(out)
 710
 711
 712def table_of_analyses(
 713	data47 = None,
 714	data48 = None,
 715	dir = 'output',
 716	filename = None,
 717	save_to_file = True,
 718	print_out = True,
 719	output = None,
 720	):
 721	'''
 722	Print out, save to disk and/or return a combined table of analyses
 723	for a pair of `D47data` and `D48data` objects.
 724
 725	If the sessions in `data47` and those in `data48` do not consist of
 726	the exact same sets of analyses, the table will have two columns
 727	`Session_47` and `Session_48` instead of a single `Session` column.
 728
 729	**Parameters**
 730
 731	+ `data47`: `D47data` instance
 732	+ `data48`: `D48data` instance
 733	+ `dir`: the directory in which to save the table
 734	+ `filename`: the name of the csv file to write to
 735	+ `save_to_file`: whether to save the table to disk
 736	+ `print_out`: whether to print out the table
 737	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
 738		if set to `'raw'`: return a list of lists of strings
 739		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
 740	'''
 741	if data47 is None:
 742		if data48 is None:
 743			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
 744		else:
 745			return data48.table_of_analyses(
 746				dir = dir,
 747				filename = filename,
 748				save_to_file = save_to_file,
 749				print_out = print_out,
 750				output = output
 751				)
 752	else:
 753		if data48 is None:
 754			return data47.table_of_analyses(
 755				dir = dir,
 756				filename = filename,
 757				save_to_file = save_to_file,
 758				print_out = print_out,
 759				output = output
 760				)
 761		else:
 762			out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
 763			out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
 764			
 765			if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
 766				out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
 767			else:
 768				out47[0][1] = 'Session_47'
 769				out48[0][1] = 'Session_48'
 770				out47 = transpose_table(out47)
 771				out48 = transpose_table(out48)
 772				out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])
 773
 774			if save_to_file:
 775				if not os.path.exists(dir):
 776					os.makedirs(dir)
 777				if filename is None:
 778				filename = f'D47D48_analyses.csv'
 779				with open(f'{dir}/{filename}', 'w') as fid:
 780					fid.write(make_csv(out))
 781			if print_out:
 782				print('\n'+pretty_table(out))
 783			if output == 'raw':
 784				return out
 785			elif output == 'pretty':
 786				return pretty_table(out)
 787
 788
 789def _fullcovar(minresult, epsilon = 0.01, named = False):
 790	'''
 791	Construct full covariance matrix in the case of constrained parameters
 792	'''
 793	
 794	import asteval
 795	
 796	def f(values):
 797		interp = asteval.Interpreter()
 798		for n,v in zip(minresult.var_names, values):
 799			interp(f'{n} = {v}')
 800		for q in minresult.params:
 801			if minresult.params[q].expr:
 802				interp(f'{q} = {minresult.params[q].expr}')
 803		return np.array([interp.symtable[q] for q in minresult.params])
 804
 805	# construct Jacobian
 806	J = np.zeros((minresult.nvarys, len(minresult.params)))
 807	X = np.array([minresult.params[p].value for p in minresult.var_names])
 808	sX = np.array([minresult.params[p].stderr for p in minresult.var_names])
 809
 810	for j in range(minresult.nvarys):
 811		x1 = [_ for _ in X]
 812		x1[j] += epsilon * sX[j]
 813		x2 = [_ for _ in X]
 814		x2[j] -= epsilon * sX[j]
 815		J[j,:] = (f(x1) - f(x2)) / (2 * epsilon * sX[j])
 816
 817	_names = [q for q in minresult.params]
 818	_covar = J.T @ minresult.covar @ J
 819	_se = np.diag(_covar)**.5
 820	_correl = _covar.copy()
 821	for k,s in enumerate(_se):
 822		if s:
 823			_correl[k,:] /= s
 824			_correl[:,k] /= s
 825
 826	if named:
 827		_covar = {i: {j: _covar[ii,jj] for jj,j in enumerate(minresult.params)} for ii,i in enumerate(minresult.params)}
 828		_se = {i: _se[ii] for ii,i in enumerate(minresult.params)}
 829		_correl = {i: {j: _correl[ii,jj] for jj,j in enumerate(minresult.params)} for ii,i in enumerate(minresult.params)}
 830
 831	return _names, _covar, _se, _correl
 832
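# Editor's note — the propagation rule used by `_fullcovar()` above, shown in
# isolation (an illustrative sketch, not library code). For y = f(x) with a
# finite-difference Jacobian J (rows indexed by input, columns by output),
# the covariance of y is J.T @ covar(x) @ J:
#
#     import numpy as np
#
#     def f(x):  # any smooth vector function of the fitted parameters
#         return np.array([x[0] + x[1], x[0] * x[1]])
#
#     X = np.array([1., 2.])            # best-fit parameter values
#     CX = np.array([[.04, .01],        # their covariance matrix
#                    [.01, .09]])
#     eps = 1e-6
#     J = np.array([(f(X + eps*dx) - f(X - eps*dx)) / (2*eps) for dx in np.eye(2)])
#     CY = J.T @ CX @ J                 # covariance of f(X)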
 833
 834class D4xdata(list):
 835	'''
 836	Store and process data for a large set of Δ47 and/or Δ48
 837	analyses, usually comprising more than one analytical session.
 838	'''
 839
 840	### 17O CORRECTION PARAMETERS
 841	R13_VPDB = 0.01118  # (Chang & Li, 1990)
 842	'''
 843	Absolute (13C/12C) ratio of VPDB.
 844	By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
 845	'''
 846
 847	R18_VSMOW = 0.0020052  # (Baertschi, 1976)
 848	'''
 849	Absolute (18O/16O) ratio of VSMOW.
 850	By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
 851	'''
 852
 853	LAMBDA_17 = 0.528  # (Barkan & Luz, 2005)
 854	'''
 855	Mass-dependent exponent for triple oxygen isotopes.
 856	By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
 857	'''
 858
 859	R17_VSMOW = 0.00038475  # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
 860	'''
 861	Absolute (17O/16O) ratio of VSMOW.
 862	By default equal to 0.00038475
 863	([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
 864	rescaled to `R13_VPDB`)
 865	'''
 866
 867	R18_VPDB = R18_VSMOW * 1.03092
 868	'''
 869	Absolute (18O/16O) ratio of VPDB.
 870	By definition equal to `R18_VSMOW * 1.03092`.
 871	'''
 872
 873	R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
 874	'''
 875	Absolute (17O/16O) ratio of VPDB.
 876	By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
 877	'''
 878
 879	LEVENE_REF_SAMPLE = 'ETH-3'
 880	'''
 881	After the Δ4x standardization step, each sample is tested to
 882	assess whether the Δ4x variance within all analyses for that
 883	sample differs significantly from that observed for a given reference
 884	sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
 885	which yields a p-value corresponding to the null hypothesis that the
 886	underlying variances are equal).
 887
 888	`LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
 889	sample should be used as a reference for this test.
 890	'''
 891
 892	ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6)  # (Kim et al., 2007, calcite)
 893	'''
 894	Specifies the 18O/16O fractionation factor generally applicable
 895	to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
 896	`D4xdata.standardize_d13C()`, and `D4xdata.standardize_d18O()`.
 897
 898	By default equal to 1.008129 (calcite reacted at 90 °C,
 899	[Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
 900	'''
 901
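	# Editor's note — the default value above comes from the Kim et al. (2007)
	# calcite equation evaluated at 90 °C. A sketch for recomputing it at another
	# reaction temperature (assuming the same parameterization applies):
	#
	#     import numpy as np
	#     T_celsius = 70.
	#     alpha = np.exp(3.59 / (T_celsius + 273.15) - 1.79e-3)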
 902	Nominal_d13C_VPDB = {
 903		'ETH-1': 2.02,
 904		'ETH-2': -10.17,
 905		'ETH-3': 1.71,
 906		}	# (Bernasconi et al., 2018)
 907	'''
 908	Nominal δ13C_VPDB values assigned to carbonate standards, used by
 909	`D4xdata.standardize_d13C()`.
 910
 911	By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
 912	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 913	'''
 914
 915	Nominal_d18O_VPDB = {
 916		'ETH-1': -2.19,
 917		'ETH-2': -18.69,
 918		'ETH-3': -1.78,
 919		}	# (Bernasconi et al., 2018)
 920	'''
 921	Nominal δ18O_VPDB values assigned to carbonate standards, used by
 922	`D4xdata.standardize_d18O()`.
 923
 924	By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
 925	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 926	'''
 927
 928	d13C_STANDARDIZATION_METHOD = '2pt'
 929	'''
 930	Method by which to standardize δ13C values:
 931	
 932	+ `'none'`: do not apply any δ13C standardization.
 933	+ `'1pt'`: within each session, offset all initial δ13C values so as to
 934	minimize the difference between final δ13C_VPDB values and
 935	`Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
 936	+ `'2pt'`: within each session, apply an affine transformation to all δ13C
 937	values so as to minimize the difference between final δ13C_VPDB
 938	values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
 939	is defined).
 940	'''
 941
 942	d18O_STANDARDIZATION_METHOD = '2pt'
 943	'''
 944	Method by which to standardize δ18O values:
 945	
 946	+ `'none'`: do not apply any δ18O standardization.
 947	+ `'1pt'`: within each session, offset all initial δ18O values so as to
 948	minimize the difference between final δ18O_VPDB values and
 949	`Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
 950	+ `'2pt'`: within each session, apply an affine transformation to all δ18O
 951	values so as to minimize the difference between final δ18O_VPDB
 952	values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
 953	is defined).
 954	'''
 955
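	# Editor's note — an illustrative sketch of the `'1pt'` and `'2pt'` methods
	# above, using hypothetical anchor values (not library code):
	#
	#     import numpy as np
	#     X = [2.10, -10.05, 1.80]          # measured anchor δ13C values
	#     Y = [2.02, -10.17, 1.71]          # corresponding nominal values
	#
	#     offset = np.mean(Y) - np.mean(X)  # '1pt': corrected = measured + offset
	#     a, b = np.polyfit(X, Y, 1)        # '2pt': corrected = a * measured + b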
 956	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
 957		'''
 958		**Parameters**
 959
 960		+ `l`: a list of dictionaries, with each dictionary including at least the keys
 961		`Sample`, `d45`, `d46`, and `d47` or `d48`.
 962		+ `mass`: `'47'` or `'48'`
 963		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
 964		+ `session`: define session name for analyses without a `Session` key
 965		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
 966
 967		Returns a `D4xdata` object derived from `list`.
 968		'''
 969		self._4x = mass
 970		self.verbose = verbose
 971		self.prefix = 'D4xdata'
 972		self.logfile = logfile
 973		list.__init__(self, l)
 974		self.Nf = None
 975		self.repeatability = {}
 976		self.refresh(session = session)
 977
 978
 979	def make_verbal(oldfun):
 980		'''
 981		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
 982		'''
 983		@wraps(oldfun)
 984		def newfun(*args, verbose = '', **kwargs):
 985			myself = args[0]
 986			oldprefix = myself.prefix
 987			myself.prefix = oldfun.__name__
 988			if verbose != '':
 989				oldverbose = myself.verbose
 990				myself.verbose = verbose
 991			out = oldfun(*args, **kwargs)
 992			myself.prefix = oldprefix
 993			if verbose != '':
 994				myself.verbose = oldverbose
 995			return out
 996		return newfun
 997
 998
 999	def msg(self, txt):
1000		'''
1001		Log a message to `self.logfile`, and print it out if `verbose = True`
1002		'''
1003		self.log(txt)
1004		if self.verbose:
1005			print(f'{f"[{self.prefix}]":<16} {txt}')
1006
1007
1008	def vmsg(self, txt):
1009		'''
1010		Log a message to `self.logfile` and print it out
1011		'''
1012		self.log(txt)
1013		print(txt)
1014
1015
1016	def log(self, *txts):
1017		'''
1018		Log a message to `self.logfile`
1019		'''
1020		if self.logfile:
1021			with open(self.logfile, 'a') as fid:
1022				for txt in txts:
1023					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
1024
1025
1026	def refresh(self, session = 'mySession'):
1027		'''
1028		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
1029		'''
1030		self.fill_in_missing_info(session = session)
1031		self.refresh_sessions()
1032		self.refresh_samples()
1033
1034
1035	def refresh_sessions(self):
1036		'''
1037		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1038		to `False` for all sessions.
1039		'''
1040		self.sessions = {
1041			s: {'data': [r for r in self if r['Session'] == s]}
1042			for s in sorted({r['Session'] for r in self})
1043			}
1044		for s in self.sessions:
1045			self.sessions[s]['scrambling_drift'] = False
1046			self.sessions[s]['slope_drift'] = False
1047			self.sessions[s]['wg_drift'] = False
1048			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1049			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
1050
1051
1052	def refresh_samples(self):
1053		'''
1054		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1055		'''
1056		self.samples = {
1057			s: {'data': [r for r in self if r['Sample'] == s]}
1058			for s in sorted({r['Sample'] for r in self})
1059			}
1060		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1061		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
1062
1063
1064	def read(self, filename, sep = '', session = ''):
1065		'''
1066		Read file in csv format to load data into a `D47data` object.
1067
1068		In the csv file, spaces before and after field separators (`','` by default)
1069		are optional. Each line corresponds to a single analysis.
1070
1071		The required fields are:
1072
1073		+ `UID`: a unique identifier
1074		+ `Session`: an identifier for the analytical session
1075		+ `Sample`: a sample identifier
1076		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1077
1078		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1079		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Any of the working-gas
1080		deltas `d47`, `d48` and `d49` left unspecified is set to NaN by default.
1081
1082		**Parameters**
1083
1084		+ `filename`: the path of the file to read
1085		+ `sep`: csv separator delimiting the fields
1086		+ `session`: set `Session` field to this string for all analyses
1087		'''
1088		with open(filename) as fid:
1089			self.input(fid.read(), sep = sep, session = session)
1090
1091
1092	def input(self, txt, sep = '', session = ''):
1093		'''
1094		Read `txt` string in csv format to load analysis data into a `D47data` object.
1095
1096		In the csv string, spaces before and after field separators (`','` by default)
1097		are optional. Each line corresponds to a single analysis.
1098
1099		The required fields are:
1100
1101		+ `UID`: a unique identifier
1102		+ `Session`: an identifier for the analytical session
1103		+ `Sample`: a sample identifier
1104		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1105
1106		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1107		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Any of the working-gas
1108		deltas `d47`, `d48` and `d49` left unspecified is set to NaN by default.
1109
1110		**Parameters**
1111
1112		+ `txt`: the csv string to read
1113		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1114		whichever appears most often in `txt`.
1115		+ `session`: set `Session` field to this string for all analyses
1116		'''
1117		if sep == '':
1118			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1119		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1120		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1121
1122		if session != '':
1123			for r in data:
1124				r['Session'] = session
1125
1126		self += data
1127		self.refresh()
1128
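	# Editor's note — a minimal usage sketch for `input()` (hypothetical data).
	# The field separator is auto-detected, so ',', ';' or tabs all work:
	#
	#     mydata = D47data()
	#     mydata.input('UID;Session;Sample;d45;d46;d47\nA01;S1;ETH-1;5.795;11.628;16.894')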
1129
1130	@make_verbal
1131	def wg(self,
1132		samples = None,
1133		session_groups = None,
1134	):
1135		'''
1136		Compute bulk composition of the working gas for each session based (by default)
1137		on the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1138		`self.Nominal_d18O_VPDB`.
1139
1140		**Parameters**
1141
1142		+ `samples`: A list of samples specifying the subset of samples (defined in both
1143		`self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`) which will be considered
1144		when computing the working gas. By default, use all samples defined both in
1145		`self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.
1146		+ `session_groups`: a list of lists of sessions
1147		(e.g., `[['session1', 'session2'], ['session3', 'session4', 'session5']]`)
1148	specifying which session groups, if any, share the exact same WG composition.
1149	If set to `'all'`, force all sessions to have the same WG composition (use with
1150	caution and only over short time scales, since the WG composition may drift slowly over longer time scales).
1151		'''
1152
1153		self.msg('Computing WG composition:')
1154
1155		a18_acid = self.ALPHA_18O_ACID_REACTION
1156		
1157		if samples is None:
1158			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1159		if session_groups is None:
1160			session_groups = [[s] for s in self.sessions]
1161		elif session_groups == 'all':
1162			session_groups = [[s for s in self.sessions]]
1163
1164		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1165		R45R46_standards = {}
1166		for sample in samples:
1167			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1168			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1169			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1170			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1171			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1172
1173			C12_s = 1 / (1 + R13_s)
1174			C13_s = R13_s / (1 + R13_s)
1175			C16_s = 1 / (1 + R17_s + R18_s)
1176			C17_s = R17_s / (1 + R17_s + R18_s)
1177			C18_s = R18_s / (1 + R17_s + R18_s)
1178
1179			C626_s = C12_s * C16_s ** 2
1180			C627_s = 2 * C12_s * C16_s * C17_s
1181			C628_s = 2 * C12_s * C16_s * C18_s
1182			C636_s = C13_s * C16_s ** 2
1183			C637_s = 2 * C13_s * C16_s * C17_s
1184			C727_s = C12_s * C17_s ** 2
1185
1186			R45_s = (C627_s + C636_s) / C626_s
1187			R46_s = (C628_s + C637_s + C727_s) / C626_s
1188			R45R46_standards[sample] = (R45_s, R46_s)
1189		
1190		for sg in session_groups:
1191			db = [r for s in sg for r in self.sessions[s]['data'] if r['Sample'] in samples]
1192			assert db, f'No sample from {samples} found in session group {sg}.'
1193
1194			X = [r['d45'] for r in db]
1195			Y = [R45R46_standards[r['Sample']][0] for r in db]
1196			x1, x2 = np.min(X), np.max(X)
1197
1198			if x1 < x2:
1199				wgcoord = x1/(x1-x2)
1200			else:
1201				wgcoord = 999
1202
1203			if wgcoord < -.5 or wgcoord > 1.5:
1204				# unreasonable to extrapolate to d45 = 0
1205				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1206			else :
1207				# d45 = 0 is reasonably well bracketed
1208				R45_wg = np.polyfit(X, Y, 1)[1]
1209
1210			X = [r['d46'] for r in db]
1211			Y = [R45R46_standards[r['Sample']][1] for r in db]
1212			x1, x2 = np.min(X), np.max(X)
1213
1214			if x1 < x2:
1215				wgcoord = x1/(x1-x2)
1216			else:
1217				wgcoord = 999
1218
1219			if wgcoord < -.5 or wgcoord > 1.5:
1220				# unreasonable to extrapolate to d46 = 0
1221				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1222			else :
1223				# d46 = 0 is reasonably well bracketed
1224				R46_wg = np.polyfit(X, Y, 1)[1]
1225
1226			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1227
1228			for s in sg:
1229				self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1230	
1231				self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1232				self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1233				for r in self.sessions[s]['data']:
1234					r['d13Cwg_VPDB'] = d13Cwg_VPDB
1235					r['d18Owg_VSMOW'] = d18Owg_VSMOW
1236
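	# Editor's note — usage sketches for `wg()` (hypothetical session names; note
	# that every session should appear in exactly one of the supplied groups):
	#
	#     mydata.wg()                                         # one WG per session (default)
	#     mydata.wg(session_groups = 'all')                   # a single WG shared by all sessions
	#     mydata.wg(session_groups = [['S1', 'S2'], ['S3']])  # S1 and S2 share one WG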
1237
1238	def compute_bulk_delta(self, R45, R46, D17O = 0):
1239		'''
1240		Compute δ13C_VPDB and δ18O_VSMOW,
1241		by solving the generalized form of equation (17) from
1242		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1243	assuming that δ18O_VSMOW is close to zero (within ± 50 ‰) and
1244		solving the corresponding second-order Taylor polynomial.
1245		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
1246		'''
1247
1248		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1249
1250		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1251		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1252		C = 2 * self.R18_VSMOW
1253		D = -R46
1254
1255		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1256		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1257		cc = A + B + C + D
1258
1259		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1260
1261		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1262		R17 = K * R18 ** self.LAMBDA_17
1263		R13 = R45 - 2 * R17
1264
1265		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1266
1267		return d13C_VPDB, d18O_VSMOW
1268
1269
1270	@make_verbal
1271	def crunch(self, verbose = ''):
1272		'''
1273		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1274		'''
1275		for r in self:
1276			self.compute_bulk_and_clumping_deltas(r)
1277		self.standardize_d13C()
1278		self.standardize_d18O()
1279		self.msg(f"Crunched {len(self)} analyses.")
1280
1281
1282	def fill_in_missing_info(self, session = 'mySession'):
1283		'''
1284		Fill in optional fields with default values
1285		'''
1286		for i,r in enumerate(self):
1287			if 'D17O' not in r:
1288				r['D17O'] = 0.
1289			if 'UID' not in r:
1290				r['UID'] = f'{i+1}'
1291			if 'Session' not in r:
1292				r['Session'] = session
1293			for k in ['d47', 'd48', 'd49']:
1294				if k not in r:
1295					r[k] = np.nan
1296
1297
1298	def standardize_d13C(self):
1299		'''
1300		Perform δ13C standardization within each session `s` according to
1301		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1302		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1303		may be redefined arbitrarily at a later stage.
1304		'''
1305		for s in self.sessions:
1306			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1307				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1308				X,Y = zip(*XY)
1309				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1310					offset = np.mean(Y) - np.mean(X)
1311					for r in self.sessions[s]['data']:
1312						r['d13C_VPDB'] += offset				
1313				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1314					a,b = np.polyfit(X,Y,1)
1315					for r in self.sessions[s]['data']:
1316						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b
1317
1318	def standardize_d18O(self):
1319		'''
1320		Perform δ18O standardization within each session `s` according to
1321		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1322		which is defined by default by `D47data.refresh_sessions()` as equal to
1323		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1324		'''
1325		for s in self.sessions:
1326			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1327				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1328				X,Y = zip(*XY)
1329				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1330				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1331					offset = np.mean(Y) - np.mean(X)
1332					for r in self.sessions[s]['data']:
1333						r['d18O_VSMOW'] += offset				
1334				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1335					a,b = np.polyfit(X,Y,1)
1336					for r in self.sessions[s]['data']:
1337						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
1338	
1339
1340	def compute_bulk_and_clumping_deltas(self, r):
1341		'''
1342		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1343		'''
1344
1345		# Compute working gas R13, R18, and isobar ratios
1346		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1347		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1348		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1349
1350		# Compute analyte isobar ratios
1351		R45 = (1 + r['d45'] / 1000) * R45_wg
1352		R46 = (1 + r['d46'] / 1000) * R46_wg
1353		R47 = (1 + r['d47'] / 1000) * R47_wg
1354		R48 = (1 + r['d48'] / 1000) * R48_wg
1355		R49 = (1 + r['d49'] / 1000) * R49_wg
1356
1357		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1358		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1359		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1360
1361		# Compute stochastic isobar ratios of the analyte
1362		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1363			R13, R18, D17O = r['D17O']
1364		)
1365
1366		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1367		# and raise a warning if the corresponding anomalies exceed 0.05 ppm.
1368		if (R45 / R45stoch - 1) > 5e-8:
1369			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1370		if (R46 / R46stoch - 1) > 5e-8:
1371			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1372
1373		# Compute raw clumped isotope anomalies
1374		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1375		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1376		r['D49raw'] = 1000 * (R49 / R49stoch - 1)
1377
1378
1379	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1380		'''
1381		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1382		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1383		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1384		'''
1385
1386		# Compute R17
1387		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1388
1389		# Compute isotope concentrations
1390		C12 = (1 + R13) ** -1
1391		C13 = C12 * R13
1392		C16 = (1 + R17 + R18) ** -1
1393		C17 = C16 * R17
1394		C18 = C16 * R18
1395
1396		# Compute stochastic isotopologue concentrations
1397		C626 = C16 * C12 * C16
1398		C627 = C16 * C12 * C17 * 2
1399		C628 = C16 * C12 * C18 * 2
1400		C636 = C16 * C13 * C16
1401		C637 = C16 * C13 * C17 * 2
1402		C638 = C16 * C13 * C18 * 2
1403		C727 = C17 * C12 * C17
1404		C728 = C17 * C12 * C18 * 2
1405		C737 = C17 * C13 * C17
1406		C738 = C17 * C13 * C18 * 2
1407		C828 = C18 * C12 * C18
1408		C838 = C18 * C13 * C18
1409
1410		# Compute stochastic isobar ratios
1411		R45 = (C636 + C627) / C626
1412		R46 = (C628 + C637 + C727) / C626
1413		R47 = (C638 + C728 + C737) / C626
1414		R48 = (C738 + C828) / C626
1415		R49 = C838 / C626
1416
1417		# Account for stochastic anomalies
1418		R47 *= 1 + D47 / 1000
1419		R48 *= 1 + D48 / 1000
1420		R49 *= 1 + D49 / 1000
1421
1422		# Return isobar ratios
1423		return R45, R46, R47, R48, R49
1424
1425
1426	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1427		'''
1428		Split unknown samples by UID (treat all analyses as different samples)
1429		or by session (treat analyses of a given sample in different sessions as
1430		different samples).
1431
1432		**Parameters**
1433
1434		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
1435		+ `grouping`: `by_uid` | `by_session`
1436		'''
1437		if samples_to_split == 'all':
1438			samples_to_split = [s for s in self.unknowns]
1439		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1440		self.grouping = grouping.lower()
1441		if self.grouping in gkeys:
1442			gkey = gkeys[self.grouping]
1443		for r in self:
1444			if r['Sample'] in samples_to_split:
1445				r['Sample_original'] = r['Sample']
1446				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1447			elif r['Sample'] in self.unknowns:
1448				r['Sample_original'] = r['Sample']
1449		self.refresh_samples()
1450
1451
1452	def unsplit_samples(self, tables = False):
1453		'''
1454		Reverse the effects of `D47data.split_samples()`.
1455		
1456		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1457		
1458		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1459		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1460		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1461		effects of `D47data.split_samples()` with `grouping='by_session'` (because in
1462		that case session-averaged Δ4x values are statistically independent).
1463		'''
1464		unknowns_old = sorted({s for s in self.unknowns})
1465		CM_old = self.standardization.covar[:,:]
1466		VD_old = self.standardization.params.valuesdict().copy()
1467		vars_old = self.standardization.var_names
1468
1469		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1470
1471		Ns = len(vars_old) - len(unknowns_old)
1472		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1473		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1474
1475		W = np.zeros((len(vars_new), len(vars_old)))
1476		W[:Ns,:Ns] = np.eye(Ns)
1477		for u in unknowns_new:
1478			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1479			if self.grouping == 'by_session':
1480				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1481			elif self.grouping == 'by_uid':
1482				weights = [1 for s in splits]
1483			sw = sum(weights)
1484			weights = [w/sw for w in weights]
1485			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1486
1487		CM_new = W @ CM_old @ W.T
1488		V = W @ np.array([[VD_old[k]] for k in vars_old])
1489		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1490
1491		self.standardization.covar = CM_new
1492		self.standardization.params.valuesdict = lambda : VD_new
1493		self.standardization.var_names = vars_new
1494
1495		for r in self:
1496			if r['Sample'] in self.unknowns:
1497				r['Sample_split'] = r['Sample']
1498				r['Sample'] = r['Sample_original']
1499
1500		self.refresh_samples()
1501		self.consolidate_samples()
1502		self.repeatabilities()
1503
1504		if tables:
1505			self.table_of_analyses()
1506			self.table_of_samples()
1507
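	# Editor's note — a typical split/standardize/unsplit sketch (assuming a
	# standardized workflow and an unknown named 'MYSAMPLE-1'):
	#
	#     mydata.split_samples(['MYSAMPLE-1'], grouping = 'by_session')
	#     mydata.standardize()      # fits each session's 'MYSAMPLE-1' separately
	#     mydata.unsplit_samples()  # recombine into a single pooled estimate
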
1508	def assign_timestamps(self):
1509		'''
1510		Assign a time field `t` of type `float` to each analysis.
1511
1512		If `TimeTag` is one of the data fields, `t` is equal within a given session
1513		to `TimeTag` minus the mean value of `TimeTag` for that session.
1514		Otherwise, `TimeTag` is by default equal to the index of each analysis
1515		in the dataset and `t` is defined as above.
1516		'''
1517		for session in self.sessions:
1518			sdata = self.sessions[session]['data']
1519			try:
1520				t0 = np.mean([r['TimeTag'] for r in sdata])
1521				for r in sdata:
1522					r['t'] = r['TimeTag'] - t0
1523			except KeyError:
1524				t0 = (len(sdata)-1)/2
1525				for t,r in enumerate(sdata):
1526					r['t'] = t - t0
1527
1528
1529	def report(self):
1530		'''
1531		Prints a report on the standardization fit.
1532		Only applicable after `D4xdata.standardize(method='pooled')`.
1533		'''
1534		report_fit(self.standardization)
1535
1536
1537	def combine_samples(self, sample_groups):
1538		'''
1539		Combine analyses of different samples to compute weighted average Δ4x
1540		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1541		dictionary.
1542		
1543		Caution: samples are weighted by number of replicate analyses, which is a
1544		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1545		correlated analytical errors for one or more samples).
1546		
1547	Returns a tuple of:
1548		
1549		+ the list of group names
1550		+ an array of the corresponding Δ4x values
1551		+ the corresponding (co)variance matrix
1552		
1553		**Parameters**
1554
1555		+ `sample_groups`: a dictionary of the form:
1556		```py
1557		{'group1': ['sample_1', 'sample_2'],
1558		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1559		```
1560		'''
1561		
1562		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1563		groups = sorted(sample_groups.keys())
1564		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1565		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1566		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1567		W = np.array([
1568			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1569			for j in groups])
1570		D4x_new = W @ D4x_old
1571		CM_new = W @ CM_old @ W.T
1572
1573		return groups, D4x_new[:,0], CM_new
1574		
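	# Editor's note — usage sketch (hypothetical sample names, after standardizing):
	#
	#     groups, D4x, CM = mydata.combine_samples({
	#         'groupA': ['MYSAMPLE-1', 'MYSAMPLE-2'],
	#         'groupB': ['MYSAMPLE-3'],
	#         })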
1575
1576	@make_verbal
1577	def standardize(self,
1578		method = 'pooled',
1579		weighted_sessions = [],
1580		consolidate = True,
1581		consolidate_tables = False,
1582		consolidate_plots = False,
1583		constraints = {},
1584		):
1585		'''
1586		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1587	If the `method` argument is set to `'pooled'`, the standardization processes all sessions
1588	in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1589	i.e. that their true Δ4x value does not change between sessions
1590	([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If the `method` argument is set to
1591	`'indep_sessions'`, the standardization processes each session independently, based only
1592	on anchor analyses.
1593		'''
1594
1595		self.standardization_method = method
1596		self.assign_timestamps()
1597
1598		if method == 'pooled':
1599			if weighted_sessions:
1600				for session_group in weighted_sessions:
1601					if self._4x == '47':
1602						X = D47data([r for r in self if r['Session'] in session_group])
1603					elif self._4x == '48':
1604						X = D48data([r for r in self if r['Session'] in session_group])
1605					X.Nominal_D4x = self.Nominal_D4x.copy()
1606					X.refresh()
1607					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1608					w = np.sqrt(result.redchi)
1609					self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
1610					for r in X:
1611						r[f'wD{self._4x}raw'] *= w
1612			else:
1613				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1614				for r in self:
1615					r[f'wD{self._4x}raw'] = 1.
1616
1617			params = Parameters()
1618			for k,session in enumerate(self.sessions):
1619				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1620				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1621				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1622				s = pf(session)
1623				params.add(f'a_{s}', value = 0.9)
1624				params.add(f'b_{s}', value = 0.)
1625				params.add(f'c_{s}', value = -0.9)
1626				params.add(f'a2_{s}', value = 0.,
1627# 					vary = self.sessions[session]['scrambling_drift'],
1628					)
1629				params.add(f'b2_{s}', value = 0.,
1630# 					vary = self.sessions[session]['slope_drift'],
1631					)
1632				params.add(f'c2_{s}', value = 0.,
1633# 					vary = self.sessions[session]['wg_drift'],
1634					)
1635				if not self.sessions[session]['scrambling_drift']:
1636					params[f'a2_{s}'].expr = '0'
1637				if not self.sessions[session]['slope_drift']:
1638					params[f'b2_{s}'].expr = '0'
1639				if not self.sessions[session]['wg_drift']:
1640					params[f'c2_{s}'].expr = '0'
1641
1642			for sample in self.unknowns:
1643				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1644
1645			for k in constraints:
1646				params[k].expr = constraints[k]
1647
1648			def residuals(p):
1649				R = []
1650				for r in self:
1651					session = pf(r['Session'])
1652					sample = pf(r['Sample'])
1653					if r['Sample'] in self.Nominal_D4x:
1654						R += [ (
1655							r[f'D{self._4x}raw'] - (
1656								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1657								+ p[f'b_{session}'] * r[f'd{self._4x}']
1658								+	p[f'c_{session}']
1659								+ r['t'] * (
1660									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1661									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1662									+	p[f'c2_{session}']
1663									)
1664								)
1665							) / r[f'wD{self._4x}raw'] ]
1666					else:
1667						R += [ (
1668							r[f'D{self._4x}raw'] - (
1669								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1670								+ p[f'b_{session}'] * r[f'd{self._4x}']
1671								+	p[f'c_{session}']
1672								+ r['t'] * (
1673									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1674									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1675									+	p[f'c2_{session}']
1676									)
1677								)
1678							) / r[f'wD{self._4x}raw'] ]
1679				return R
1680
1681			M = Minimizer(residuals, params)
1682			result = M.least_squares()
1683			self.Nf = result.nfree
1684			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1685			new_names, new_covar, new_se = _fullcovar(result)[:3]
1686			result.var_names = new_names
1687			result.covar = new_covar
1688
1689			for r in self:
1690				s = pf(r["Session"])
1691				a = result.params.valuesdict()[f'a_{s}']
1692				b = result.params.valuesdict()[f'b_{s}']
1693				c = result.params.valuesdict()[f'c_{s}']
1694				a2 = result.params.valuesdict()[f'a2_{s}']
1695				b2 = result.params.valuesdict()[f'b2_{s}']
1696				c2 = result.params.valuesdict()[f'c2_{s}']
1697				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1698				
1699
1700			self.standardization = result
1701
1702			for session in self.sessions:
1703				self.sessions[session]['Np'] = 3
1704				for k in ['scrambling', 'slope', 'wg']:
1705					if self.sessions[session][f'{k}_drift']:
1706						self.sessions[session]['Np'] += 1
1707
1708			if consolidate:
1709				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1710			return result
1711
1712
1713		elif method == 'indep_sessions':
1714
1715			if weighted_sessions:
1716				for session_group in weighted_sessions:
1717					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1718					X.Nominal_D4x = self.Nominal_D4x.copy()
1719					X.refresh()
1720					# This is only done to assign r['wD47raw'] for r in X:
1721					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1722					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1723			else:
1724				self.msg('All weights set to 1 ‰')
1725				for r in self:
1726					r[f'wD{self._4x}raw'] = 1
1727
1728			for session in self.sessions:
1729				s = self.sessions[session]
1730				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1731				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1732				s['Np'] = sum(p_active)
1733				sdata = s['data']
1734
1735				A = np.array([
1736					[
1737						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1738						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1739						1 / r[f'wD{self._4x}raw'],
1740						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1741						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1742						r['t'] / r[f'wD{self._4x}raw']
1743						]
1744					for r in sdata if r['Sample'] in self.anchors
1745					])[:,p_active] # only keep columns for the active parameters
1746				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1747				s['Na'] = Y.size
1748				CM = linalg.inv(A.T @ A)
1749				bf = (CM @ A.T @ Y).T[0,:]
1750				k = 0
1751				for n,a in zip(p_names, p_active):
1752					if a:
1753						s[n] = bf[k]
1754# 						self.msg(f'{n} = {bf[k]}')
1755						k += 1
1756					else:
1757						s[n] = 0.
1758# 						self.msg(f'{n} = 0.0')
1759
1760				for r in sdata :
1761					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1762					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1763					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1764
1765				s['CM'] = np.zeros((6,6))
1766				i = 0
1767				k_active = [j for j,a in enumerate(p_active) if a]
1768				for j,a in enumerate(p_active):
1769					if a:
1770						s['CM'][j,k_active] = CM[i,:]
1771						i += 1
1772
1773			if not weighted_sessions:
1774				w = self.rmswd()['rmswd']
1775				for r in self:
1776						r[f'wD{self._4x}'] *= w
1777						r[f'wD{self._4x}raw'] *= w
1778				for session in self.sessions:
1779					self.sessions[session]['CM'] *= w**2
1780
1781			for session in self.sessions:
1782				s = self.sessions[session]
1783				s['SE_a'] = s['CM'][0,0]**.5
1784				s['SE_b'] = s['CM'][1,1]**.5
1785				s['SE_c'] = s['CM'][2,2]**.5
1786				s['SE_a2'] = s['CM'][3,3]**.5
1787				s['SE_b2'] = s['CM'][4,4]**.5
1788				s['SE_c2'] = s['CM'][5,5]**.5
1789
1790			if not weighted_sessions:
1791				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1792			else:
1793				self.Nf = 0
1794				for sg in weighted_sessions:
1795					self.Nf += self.rmswd(sessions = sg)['Nf']
1796
1797			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1798
1799			avgD4x = {
1800				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1801				for sample in self.samples
1802				}
1803			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1804			rD4x = (chi2/self.Nf)**.5
1805			self.repeatability[f'sigma_{self._4x}'] = rD4x
1806
1807			if consolidate:
1808				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1809
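	# Editor's note — usage sketches (not part of the library; assumes `mydata`
	# holds crunched analyses):
	#
	#     mydata.standardize()                           # pooled fit (default)
	#     mydata.standardize(method = 'indep_sessions')  # session-by-session fit
	#
	# Drift terms must be enabled per session before standardizing, e.g.
	# (assuming a session named 'mySession'):
	#
	#     mydata.sessions['mySession']['wg_drift'] = True
	#     mydata.standardize()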
1810
1811	def standardization_error(self, session, d4x, D4x, t = 0):
1812		'''
1813		Compute standardization error for a given session and
1814		(δ4x, Δ4x) composition.
1815		'''
1816		a = self.sessions[session]['a']
1817		b = self.sessions[session]['b']
1818		c = self.sessions[session]['c']
1819		a2 = self.sessions[session]['a2']
1820		b2 = self.sessions[session]['b2']
1821		c2 = self.sessions[session]['c2']
1822		CM = self.sessions[session]['CM']
1823
1824		x, y = D4x, d4x
1825		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1826# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1827		dxdy = -(b+b2*t) / (a+a2*t)
1828		dxdz = 1. / (a+a2*t)
1829		dxda = -x / (a+a2*t)
1830		dxdb = -y / (a+a2*t)
1831		dxdc = -1. / (a+a2*t)
1832		dxda2 = -x * t / (a+a2*t)
1833		dxdb2 = -y * t / (a+a2*t)
1834		dxdc2 = -t / (a+a2*t)
1835		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1836		sx = (V @ CM @ V.T) ** .5
1837		return sx
1838
1839
1840	@make_verbal
1841	def summary(self,
1842		dir = 'output',
1843		filename = None,
1844		save_to_file = True,
1845		print_out = True,
1846		):
1847		'''
1848		Print out and/or save to disk a summary of the standardization results.
1849
1850		**Parameters**
1851
1852		+ `dir`: the directory in which to save the table
1853		+ `filename`: the name of the csv file to write to
1854		+ `save_to_file`: whether to save the table to disk
1855		+ `print_out`: whether to print out the table
1856		'''
1857
1858		out = []
1859		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1860		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1861		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1862		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1863		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1864		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1865		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1866		out += [['Model degrees of freedom', f"{self.Nf}"]]
1867		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1868		out += [['Standardization method', self.standardization_method]]
1869
1870		if save_to_file:
1871			if not os.path.exists(dir):
1872				os.makedirs(dir)
1873			if filename is None:
1874				filename = f'D{self._4x}_summary.csv'
1875			with open(f'{dir}/{filename}', 'w') as fid:
1876				fid.write(make_csv(out))
1877		if print_out:
1878			self.msg('\n' + pretty_table(out, header = 0))
1879
1880
1881	@make_verbal
1882	def table_of_sessions(self,
1883		dir = 'output',
1884		filename = None,
1885		save_to_file = True,
1886		print_out = True,
1887		output = None,
1888		):
1889		'''
1890		Print out and/or save to disk a table of sessions.
1891
1892		**Parameters**
1893
1894		+ `dir`: the directory in which to save the table
1895		+ `filename`: the name of the csv file to write to
1896		+ `save_to_file`: whether to save the table to disk
1897		+ `print_out`: whether to print out the table
1898		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1899		    if set to `'raw'`: return a list of lists of strings
1900		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1901		'''
1902		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1903		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1904		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1905
1906		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1907		if include_a2:
1908			out[-1] += ['a2 ± SE']
1909		if include_b2:
1910			out[-1] += ['b2 ± SE']
1911		if include_c2:
1912			out[-1] += ['c2 ± SE']
1913		for session in self.sessions:
1914			out += [[
1915				session,
1916				f"{self.sessions[session]['Na']}",
1917				f"{self.sessions[session]['Nu']}",
1918				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1919				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1920				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1921				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1922				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1923				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1924				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1925				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1926				]]
1927			if include_a2:
1928				if self.sessions[session]['scrambling_drift']:
1929					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1930				else:
1931					out[-1] += ['']
1932			if include_b2:
1933				if self.sessions[session]['slope_drift']:
1934					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1935				else:
1936					out[-1] += ['']
1937			if include_c2:
1938				if self.sessions[session]['wg_drift']:
1939					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1940				else:
1941					out[-1] += ['']
1942
1943		if save_to_file:
1944			if not os.path.exists(dir):
1945				os.makedirs(dir)
1946			if filename is None:
1947				filename = f'D{self._4x}_sessions.csv'
1948			with open(f'{dir}/{filename}', 'w') as fid:
1949				fid.write(make_csv(out))
1950		if print_out:
1951			self.msg('\n' + pretty_table(out))
1952		if output == 'raw':
1953			return out
1954		elif output == 'pretty':
1955			return pretty_table(out)
1956
1957
1958	@make_verbal
1959	def table_of_analyses(
1960		self,
1961		dir = 'output',
1962		filename = None,
1963		save_to_file = True,
1964		print_out = True,
1965		output = None,
1966		):
1967		'''
1968		Print out and/or save to disk a table of analyses.
1969
1970		**Parameters**
1971
1972		+ `dir`: the directory in which to save the table
1973		+ `filename`: the name of the csv file to write to
1974		+ `save_to_file`: whether to save the table to disk
1975		+ `print_out`: whether to print out the table
1976		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1977		    if set to `'raw'`: return a list of lists of strings
1978		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1979		'''
1980
1981		out = [['UID','Session','Sample']]
1982		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1983		for f in extra_fields:
1984			out[-1] += [f[0]]
1985		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1986		for r in self:
1987			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
1988			for f in extra_fields:
1989				out[-1] += [f"{r[f[0]]:{f[1]}}"]
1990			out[-1] += [
1991				f"{r['d13Cwg_VPDB']:.3f}",
1992				f"{r['d18Owg_VSMOW']:.3f}",
1993				f"{r['d45']:.6f}",
1994				f"{r['d46']:.6f}",
1995				f"{r['d47']:.6f}",
1996				f"{r['d48']:.6f}",
1997				f"{r['d49']:.6f}",
1998				f"{r['d13C_VPDB']:.6f}",
1999				f"{r['d18O_VSMOW']:.6f}",
2000				f"{r['D47raw']:.6f}",
2001				f"{r['D48raw']:.6f}",
2002				f"{r['D49raw']:.6f}",
2003				f"{r[f'D{self._4x}']:.6f}"
2004				]
2005		if save_to_file:
2006			if not os.path.exists(dir):
2007				os.makedirs(dir)
2008			if filename is None:
2009				filename = f'D{self._4x}_analyses.csv'
2010			with open(f'{dir}/{filename}', 'w') as fid:
2011				fid.write(make_csv(out))
2012		if print_out:
2013			self.msg('\n' + pretty_table(out))
2014		if output == 'raw':
			return out
		elif output == 'pretty':
			return pretty_table(out)
2015
2016	@make_verbal
2017	def covar_table(
2018		self,
2019		correl = False,
2020		dir = 'output',
2021		filename = None,
2022		save_to_file = True,
2023		print_out = True,
2024		output = None,
2025		):
2026		'''
2027		Print out, save to disk and/or return the variance-covariance matrix of D4x
2028		for all unknown samples.
2029
2030		**Parameters**
2031
2032		+ `dir`: the directory in which to save the csv
2033		+ `filename`: the name of the csv file to write to
2034		+ `save_to_file`: whether to save the csv
2035		+ `print_out`: whether to print out the matrix
2036		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
2037		    if set to `'raw'`: return a list of lists of strings
2038		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2039		'''
2040		samples = sorted([u for u in self.unknowns])
2041		out = [[''] + samples]
2042		for s1 in samples:
2043			out.append([s1])
2044			for s2 in samples:
2045				if correl:
2046					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
2047				else:
2048					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
2049
2050		if save_to_file:
2051			if not os.path.exists(dir):
2052				os.makedirs(dir)
2053			if filename is None:
2054				if correl:
2055					filename = f'D{self._4x}_correl.csv'
2056				else:
2057					filename = f'D{self._4x}_covar.csv'
2058			with open(f'{dir}/{filename}', 'w') as fid:
2059				fid.write(make_csv(out))
2060		if print_out:
2061			self.msg('\n'+pretty_table(out))
2062		if output == 'raw':
2063			return out
2064		elif output == 'pretty':
2065			return pretty_table(out)
2066
2067	@make_verbal
2068	def table_of_samples(
2069		self,
2070		dir = 'output',
2071		filename = None,
2072		save_to_file = True,
2073		print_out = True,
2074		output = None,
2075		):
2076		'''
2077		Print out, save to disk and/or return a table of samples.
2078
2079		**Parameters**
2080
2081		+ `dir`: the directory in which to save the csv
2082		+ `filename`: the name of the csv file to write to
2083		+ `save_to_file`: whether to save the csv
2084		+ `print_out`: whether to print out the table
2085		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2086		    if set to `'raw'`: return a list of lists of strings
2087		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2088		'''
2089
2090		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2091		for sample in self.anchors:
2092			out += [[
2093				f"{sample}",
2094				f"{self.samples[sample]['N']}",
2095				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2096				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2097				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2098				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2099				]]
2100		for sample in self.unknowns:
2101			out += [[
2102				f"{sample}",
2103				f"{self.samples[sample]['N']}",
2104				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2105				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2106				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2107				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2108				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2109				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2110				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2111				]]
2112		if save_to_file:
2113			if not os.path.exists(dir):
2114				os.makedirs(dir)
2115			if filename is None:
2116				filename = f'D{self._4x}_samples.csv'
2117			with open(f'{dir}/{filename}', 'w') as fid:
2118				fid.write(make_csv(out))
2119		if print_out:
2120			self.msg('\n'+pretty_table(out))
2121		if output == 'raw':
2122			return out
2123		elif output == 'pretty':
2124			return pretty_table(out)
2125
2126
2127	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2128		'''
2129		Generate session plots and save them to disk.
2130
2131		**Parameters**
2132
2133		+ `dir`: the directory in which to save the plots
2134		+ `figsize`: the width and height (in inches) of each plot
2135		+ `filetype`: 'pdf' or 'png'
2136		+ `dpi`: resolution for PNG output
2137		'''
2138		if not os.path.exists(dir):
2139			os.makedirs(dir)
2140
2141		for session in self.sessions:
2142			sp = self.plot_single_session(session, xylimits = 'constant')
2143			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2144			ppl.close(sp.fig)
2145			
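	# Editor's note — usage sketch:
	#
	#     mydata.plot_sessions(dir = 'output', filetype = 'png', dpi = 200)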
2146
2147
2148	@make_verbal
2149	def consolidate_samples(self):
2150		'''
2151		Compile various statistics for each sample.
2152
2153		For each anchor sample:
2154
2155		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2156		+ `SE_D47` or `SE_D48`: set to zero by definition
2157
2158		For each unknown sample:
2159
2160		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2161		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2162
2163		For each anchor and unknown:
2164
2165		+ `N`: the total number of analyses of this sample
2166		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2167		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2168		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2169		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2170	variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2171		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
2172		'''
2173		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2174		for sample in self.samples:
2175			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2176			if self.samples[sample]['N'] > 1:
2177				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2178
2179			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2180			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2181
2182			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2183			if len(D4x_pop) > 2:
2184				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2185			
2186		if self.standardization_method == 'pooled':
2187			for sample in self.anchors:
2188				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2189				self.samples[sample][f'SE_D{self._4x}'] = 0.
2190			for sample in self.unknowns:
2191				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2192				try:
2193					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2194				except ValueError:
2195					# when `sample` is constrained by self.standardize(constraints = {...}),
2196					# it is no longer listed in self.standardization.var_names.
2197					# Temporary fix: define SE as zero for now
2198				self.samples[sample][f'SE_D{self._4x}'] = 0.
2199
2200		elif self.standardization_method == 'indep_sessions':
2201			for sample in self.anchors:
2202				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2203				self.samples[sample][f'SE_D{self._4x}'] = 0.
2204			for sample in self.unknowns:
2205				self.msg(f'Consolidating sample {sample}')
2206				self.unknowns[sample][f'session_D{self._4x}'] = {}
2207				session_avg = []
2208				for session in self.sessions:
2209					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2210					if sdata:
2211						self.msg(f'{sample} found in session {session}')
2212						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2213						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2214						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2215						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2216						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2217						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2218						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2219				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2220				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2221				wsum = sum([weights[s] for s in weights])
2222				for s in weights:
2223					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2224
2225		for r in self:
2226			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
2227
2228
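	# `consolidate_samples()` is normally called for you (via `consolidate()`);
	# afterwards, the per-sample statistics listed above are available from
	# `self.samples`, e.g. (sketch, with a hypothetical sample name):
	#
	#     mydata.samples['MYSAMPLE-1']['D47']     # standardized Δ47 value
	#     mydata.samples['MYSAMPLE-1']['SE_D47']  # its standard error
	#     mydata.samples['MYSAMPLE-1']['N']       # number of analyses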
2229
2230	def consolidate_sessions(self):
2231		'''
2232		Compute various statistics for each session.
2233
2234		+ `Na`: Number of anchor analyses in the session
2235		+ `Nu`: Number of unknown analyses in the session
2236		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2237		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2238		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2239		+ `a`: scrambling factor
2240		+ `b`: compositional slope
2241		+ `c`: WG offset
2242		+ `SE_a`: Model standard error of `a`
2243		+ `SE_b`: Model standard error of `b`
2244		+ `SE_c`: Model standard error of `c`
2245		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2246		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2247		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2248		+ `a2`: scrambling factor drift
2249		+ `b2`: compositional slope drift
2250		+ `c2`: WG offset drift
2251		+ `Np`: Number of standardization parameters to fit
2252		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2253		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2254		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
2255		'''
2256		for session in self.sessions:
2257			if 'd13Cwg_VPDB' not in self.sessions[session]:
2258				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2259			if 'd18Owg_VSMOW' not in self.sessions[session]:
2260				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2261			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2262			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2263
2264			self.msg(f'Computing repeatabilities for session {session}')
2265			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2266			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2267			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2268
2269		if self.standardization_method == 'pooled':
2270			for session in self.sessions:
2271
2272				# different (better?) computation of D4x repeatability for each session:
2273				sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
2274				self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5
2275
2276				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2277				i = self.standardization.var_names.index(f'a_{pf(session)}')
2278				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2279
2280				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2281				i = self.standardization.var_names.index(f'b_{pf(session)}')
2282				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2283
2284				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2285				i = self.standardization.var_names.index(f'c_{pf(session)}')
2286				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2287
2288				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2289				if self.sessions[session]['scrambling_drift']:
2290					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2291					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2292				else:
2293					self.sessions[session]['SE_a2'] = 0.
2294
2295				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2296				if self.sessions[session]['slope_drift']:
2297					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2298					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2299				else:
2300					self.sessions[session]['SE_b2'] = 0.
2301
2302				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2303				if self.sessions[session]['wg_drift']:
2304					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2305					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2306				else:
2307					self.sessions[session]['SE_c2'] = 0.
2308
2309				i = self.standardization.var_names.index(f'a_{pf(session)}')
2310				j = self.standardization.var_names.index(f'b_{pf(session)}')
2311				k = self.standardization.var_names.index(f'c_{pf(session)}')
2312				CM = np.zeros((6,6))
2313				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2314				try:
2315					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2316					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2317					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2318					try:
2319						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2320						CM[3,4] = self.standardization.covar[i2,j2]
2321						CM[4,3] = self.standardization.covar[j2,i2]
2322					except ValueError:
2323						pass
2324					try:
2325						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2326						CM[3,5] = self.standardization.covar[i2,k2]
2327						CM[5,3] = self.standardization.covar[k2,i2]
2328					except ValueError:
2329						pass
2330				except ValueError:
2331					pass
2332				try:
2333					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2334					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2335					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2336					try:
2337						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2338						CM[4,5] = self.standardization.covar[j2,k2]
2339						CM[5,4] = self.standardization.covar[k2,j2]
2340					except ValueError:
2341						pass
2342				except ValueError:
2343					pass
2344				try:
2345					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2346					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2347					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2348				except ValueError:
2349					pass
2350
2351				self.sessions[session]['CM'] = CM
2352
2353		elif self.standardization_method == 'indep_sessions':
2354			pass # Not implemented yet
2355
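	# After consolidation, the session-level statistics listed above are
	# available from `self.sessions`, e.g. (sketch, hypothetical session name):
	#
	#     s = mydata.sessions['Session01']
	#     s['a'], s['b'], s['c']  # scrambling factor, compositional slope, WG offset
	#     s['CM']                 # 6x6 covariance matrix of (a, b, c, a2, b2, c2)
	#     s['r_D47']              # Δ47 repeatability within this session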
2356
2357	@make_verbal
2358	def repeatabilities(self):
2359		'''
2360		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2361		(for all samples, for anchors, and for unknowns).
2362		'''
2363		self.msg('Computing repeatabilities for all sessions')
2364
2365		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2366		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2367		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2368		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2369		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
2370
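	# The computed values end up in `self.repeatability`, e.g. for a `D47data`
	# object (sketch):
	#
	#     mydata.repeatability['r_D47']   # pooled Δ47 repeatability, all samples
	#     mydata.repeatability['r_D47a']  # anchors only
	#     mydata.repeatability['r_D47u']  # unknowns only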
2371
2372	@make_verbal
2373	def consolidate(self, tables = True, plots = True):
2374		'''
2375		Collect information about samples, sessions and repeatabilities.
2376		'''
2377		self.consolidate_samples()
2378		self.consolidate_sessions()
2379		self.repeatabilities()
2380
2381		if tables:
2382			self.summary()
2383			self.table_of_sessions()
2384			self.table_of_analyses()
2385			self.table_of_samples()
2386
2387		if plots:
2388			self.plot_sessions()
2389
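	# Usage sketch: regenerate all output tables without redoing the plots
	# (`consolidate()` only collects statistics; it does not refit the model):
	#
	#     mydata.consolidate(tables = True, plots = False)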
2390
2391	@make_verbal
2392	def rmswd(self,
2393		samples = 'all samples',
2394		sessions = 'all sessions',
2395		):
2396		'''
2397		Compute the χ2, root mean squared weighted deviation
2398		(i.e. the square root of the reduced χ2), and corresponding degrees of freedom of the
2399		Δ4x values for samples in `samples` and sessions in `sessions`.
2400		
2401		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
2402		'''
2403		if samples == 'all samples':
2404			mysamples = [k for k in self.samples]
2405		elif samples == 'anchors':
2406			mysamples = [k for k in self.anchors]
2407		elif samples == 'unknowns':
2408			mysamples = [k for k in self.unknowns]
2409		else:
2410			mysamples = samples
2411
2412		if sessions == 'all sessions':
2413			sessions = [k for k in self.sessions]
2414
2415		chisq, Nf = 0, 0
2416		for sample in mysamples :
2417			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2418			if len(G) > 1 :
2419				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2420				Nf += (len(G) - 1)
2421				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2422		r = (chisq / Nf)**.5 if Nf > 0 else 0
2423		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2424		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
2425
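	# Return value sketch: a dict holding the RMSWD itself along with the
	# underlying χ2 and degrees of freedom, e.g.:
	#
	#     out = mydata.rmswd(samples = 'anchors')
	#     out['rmswd'], out['chisq'], out['Nf']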
2426	
2427	@make_verbal
2428	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2429		'''
2430		Compute the repeatability of `[r[key] for r in self]`
2431		'''
2432
2433		if samples == 'all samples':
2434			mysamples = [k for k in self.samples]
2435		elif samples == 'anchors':
2436			mysamples = [k for k in self.anchors]
2437		elif samples == 'unknowns':
2438			mysamples = [k for k in self.unknowns]
2439		else:
2440			mysamples = samples
2441
2442		if sessions == 'all sessions':
2443			sessions = [k for k in self.sessions]
2444
2445		if key in ['D47', 'D48']:
2446			# Full disclosure: the definition of Nf is tricky/debatable
2447			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2448			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2449			Nf = len(G)
2450# 			print(f'len(G) = {Nf}')
2451			Nf -= len([s for s in mysamples if s in self.unknowns])
2452# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2453			for session in sessions:
2454				Np = len([
2455					_ for _ in self.standardization.params
2456					if (
2457						self.standardization.params[_].expr is not None
2458						and (
2459							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2460							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2461							)
2462						)
2463					])
2464# 				print(f'session {session}: {Np} parameters to consider')
2465				Na = len({
2466					r['Sample'] for r in self.sessions[session]['data']
2467					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2468					})
2469# 				print(f'session {session}: {Na} different anchors in that session')
2470				Nf -= min(Np, Na)
2471# 			print(f'Nf = {Nf}')
2472
2473# 			for sample in mysamples :
2474# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2475# 				if len(X) > 1 :
2476# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2477# 					if sample in self.unknowns:
2478# 						Nf += len(X) - 1
2479# 					else:
2480# 						Nf += len(X)
2481# 			if samples in ['anchors', 'all samples']:
2482# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2483			r = (chisq / Nf)**.5 if Nf > 0 else 0
2484
2485		else: # if key not in ['D47', 'D48']
2486			chisq, Nf = 0, 0
2487			for sample in mysamples :
2488				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2489				if len(X) > 1 :
2490					Nf += len(X) - 1
2491					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2492			r = (chisq / Nf)**.5 if Nf > 0 else 0
2493
2494		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2495		return r
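
	# Usage sketch: repeatability (in ‰) of δ13C_VPDB computed from anchor
	# analyses only:
	#
	#     r = mydata.compute_r('d13C_VPDB', samples = 'anchors')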
2496
2497	def sample_average(self, samples, weights = 'equal', normalize = True):
2498		'''
2499		Weighted average Δ4x value of a group of samples, accounting for covariance.
2500
2501		Returns the weighted average Δ4x value and associated SE
2502		of a group of samples. Weights are equal by default. If `normalize` is
2503		true, `weights` will be rescaled so that their sum equals 1.
2504
2505		**Examples**
2506
2507		```python
2508		self.sample_average(['X','Y'], [1, 2])
2509		```
2510
2511		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2512		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2513		values of samples X and Y, respectively.
2514
2515		```python
2516		self.sample_average(['X','Y'], [1, -1], normalize = False)
2517		```
2518
2519		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2520		'''
2521		if weights == 'equal':
2522			weights = [1/len(samples)] * len(samples)
2523
2524		if normalize:
2525			s = sum(weights)
2526			if s:
2527				weights = [w/s for w in weights]
2528
2529		try:
2530# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2531# 			C = self.standardization.covar[indices,:][:,indices]
2532			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2533			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2534			return correlated_sum(X, C, weights)
2535		except ValueError:
2536			return (0., 0.)
2537
2538
2539	def sample_D4x_covar(self, sample1, sample2 = None):
2540		'''
2541		Covariance between Δ4x values of samples
2542
2543		Returns the error covariance between the average Δ4x values of two
2544		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2545		returns the Δ4x variance for that sample.
2546		'''
2547		if sample2 is None:
2548			sample2 = sample1
2549		if self.standardization_method == 'pooled':
2550			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2551			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2552			return self.standardization.covar[i, j]
2553		elif self.standardization_method == 'indep_sessions':
2554			if sample1 == sample2:
2555				return self.samples[sample1][f'SE_D{self._4x}']**2
2556			else:
2557				c = 0
2558				for session in self.sessions:
2559					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2560					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2561					if sdata1 and sdata2:
2562						a = self.sessions[session]['a']
2563						# !! TODO: CM below does not account for temporal changes in standardization parameters
2564						CM = self.sessions[session]['CM'][:3,:3]
2565						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2566						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2567						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2568						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2569						c += (
2570							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2571							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2572							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2573							@ CM
2574							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2575							) / a**2
2576				return float(c)
2577
2578	def sample_D4x_correl(self, sample1, sample2 = None):
2579		'''
2580		Correlation between Δ4x errors of samples
2581
2582		Returns the error correlation between the average Δ4x values of two samples.
2583		'''
2584		if sample2 is None or sample2 == sample1:
2585			return 1.
2586		return (
2587			self.sample_D4x_covar(sample1, sample2)
2588			/ self.unknowns[sample1][f'SE_D{self._4x}']
2589			/ self.unknowns[sample2][f'SE_D{self._4x}']
2590			)
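
	# Usage sketch: error correlation between the Δ4x values of two unknowns
	# (hypothetical sample names):
	#
	#     rho = mydata.sample_D4x_correl('MYSAMPLE-1', 'MYSAMPLE-2')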
2591
2592	def plot_single_session(self,
2593		session,
2594		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2595		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2596		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2597		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2598		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2599		xylimits = 'free', # | 'constant'
2600		x_label = None,
2601		y_label = None,
2602		error_contour_interval = 'auto',
2603		fig = 'new',
2604		):
2605		'''
2606		Generate plot for a single session
2607		'''
2608		if x_label is None:
2609			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2610		if y_label is None:
2611			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2612
2613		out = _SessionPlot()
2614		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2615		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2616		anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2617		anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2618		unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2619		unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2620		anchor_avg = (np.array([ np.array([
2621				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2622				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2623				]) for sample in anchors]).T,
2624			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
2625		unknown_avg = (np.array([ np.array([
2626				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2627				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2628				]) for sample in unknowns]).T,
2629			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)
2630		
2631		
2632		if fig == 'new':
2633			out.fig = ppl.figure(figsize = (6,6))
2634			ppl.subplots_adjust(.1,.1,.9,.9)
2635
2636		out.anchor_analyses, = ppl.plot(
2637			anchors_d,
2638			anchors_D,
2639			**kw_plot_anchors)
2640		out.unknown_analyses, = ppl.plot(
2641			unknowns_d,
2642			unknowns_D,
2643			**kw_plot_unknowns)
2644		out.anchor_avg = ppl.plot(
2645			*anchor_avg,
2646			**kw_plot_anchor_avg)
2647		out.unknown_avg = ppl.plot(
2648			*unknown_avg,
2649			**kw_plot_unknown_avg)
2650		if xylimits == 'constant':
2651			x = [r[f'd{self._4x}'] for r in self]
2652			y = [r[f'D{self._4x}'] for r in self]
2653			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2654			w, h = x2-x1, y2-y1
2655			x1 -= w/20
2656			x2 += w/20
2657			y1 -= h/20
2658			y2 += h/20
2659			ppl.axis([x1, x2, y1, y2])
2660		elif xylimits == 'free':
2661			x1, x2, y1, y2 = ppl.axis()
2662		else:
2663			x1, x2, y1, y2 = ppl.axis(xylimits)
2664				
		contour = None # define `contour` even when no error contours are drawn
2665		if error_contour_interval != 'none':
2666			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2667			XI,YI = np.meshgrid(xi, yi)
2668			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2669			if error_contour_interval == 'auto':
2670				rng = np.max(SI) - np.min(SI)
2671				if rng <= 0.01:
2672					cinterval = 0.001
2673				elif rng <= 0.03:
2674					cinterval = 0.004
2675				elif rng <= 0.1:
2676					cinterval = 0.01
2677				elif rng <= 0.3:
2678					cinterval = 0.03
2679				elif rng <= 1.:
2680					cinterval = 0.1
2681				else:
2682					cinterval = 0.5
2683			else:
2684				cinterval = error_contour_interval
2685
2686			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2687			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2688			out.clabel = ppl.clabel(out.contour)
2689			contour = (XI, YI, SI, cval, cinterval)
2690
2691		if fig is None:
2692			return {
2693			'anchors':anchors,
2694			'unknowns':unknowns,
2695			'anchors_d':anchors_d,
2696			'anchors_D':anchors_D,
2697			'unknowns_d':unknowns_d,
2698			'unknowns_D':unknowns_D,
2699			'anchor_avg':anchor_avg,
2700			'unknown_avg':unknown_avg,
2701			'contour':contour,
2702			}
2703
2704		ppl.xlabel(x_label)
2705		ppl.ylabel(y_label)
2706		ppl.title(session, weight = 'bold')
2707		ppl.grid(alpha = .2)
2708		out.ax = ppl.gca()		
2709
2710		return out
2711
2712	def plot_residuals(
2713		self,
2714		kde = False,
2715		hist = False,
2716		binwidth = 2/3,
2717		dir = 'output',
2718		filename = None,
2719		highlight = [],
2720		colors = None,
2721		figsize = None,
2722		dpi = 100,
2723		yspan = None,
2724		):
2725		'''
2726		Plot residuals of each analysis as a function of time (actually, as a function of
2727		the order of analyses in the `D4xdata` object)
2728
2729		+ `kde`: whether to add a kernel density estimate of residuals
2730		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2731		+ `binwidth`: the width of histogram bins, in units of the Δ4x repeatability (SD)
2732		+ `dir`: the directory in which to save the plot
2733		+ `highlight`: a list of samples to highlight
2734		+ `colors`: a dict of `{<sample>: (r, g, b)}` for all samples
2735		+ `figsize`: (width, height) of figure
2736		+ `dpi`: resolution for PNG output
2737		+ `yspan`: factor controlling the range of y values shown in plot
2738		  (by default: `yspan = 1.5 if kde else 1.0`)
2739		'''
2740		
2741		from matplotlib import ticker
2742
2743		if yspan is None:
2744			if kde:
2745				yspan = 1.5
2746			else:
2747				yspan = 1.0
2748		
2749		# Layout
2750		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2751		if hist or kde:
2752			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2753			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2754		else:
2755			ppl.subplots_adjust(.08,.05,.78,.8)
2756			ax1 = ppl.subplot(111)
2757		
2758		# Colors
2759		N = len(self.anchors)
2760		if colors is None:
2761			if len(highlight) > 0:
2762				Nh = len(highlight)
2763				if Nh == 1:
2764					colors = {highlight[0]: (0,0,0)}
2765				elif Nh == 3:
2766					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2767				elif Nh == 4:
2768					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2769				else:
2770					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2771			else:
2772				if N == 3:
2773					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2774				elif N == 4:
2775					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2776				else:
2777					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2778
2779		ppl.sca(ax1)
2780		
2781		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2782
2783		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2784
2785		session = self[0]['Session']
2786		x1 = 0
2787# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2788		x_sessions = {}
2789		one_or_more_singlets = False
2790		one_or_more_multiplets = False
2791		multiplets = set()
2792		for k,r in enumerate(self):
2793			if r['Session'] != session:
2794				x2 = k-1
2795				x_sessions[session] = (x1+x2)/2
2796				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2797				session = r['Session']
2798				x1 = k
2799			singlet = len(self.samples[r['Sample']]['data']) == 1
2800			if not singlet:
2801				multiplets.add(r['Sample'])
2802			if r['Sample'] in self.unknowns:
2803				if singlet:
2804					one_or_more_singlets = True
2805				else:
2806					one_or_more_multiplets = True
2807			kw = dict(
2808				marker = 'x' if singlet else '+',
2809				ms = 4 if singlet else 5,
2810				ls = 'None',
2811				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2812				mew = 1,
2813				alpha = 0.2 if singlet else 1,
2814				)
2815			if highlight and r['Sample'] not in highlight:
2816				kw['alpha'] = 0.2
2817			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2818		x2 = k
2819		x_sessions[session] = (x1+x2)/2
2820
2821		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2822		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2823		if not (hist or kde):
2824			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2825			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2826
2827		xmin, xmax, ymin, ymax = ppl.axis()
2828		if yspan != 1:
2829			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2830		for s in x_sessions:
2831			ppl.text(
2832				x_sessions[s],
2833				ymax +1,
2834				s,
2835				va = 'bottom',
2836				**(
2837					dict(ha = 'center')
2838					if len(self.sessions[s]['data']) > (0.15 * len(self))
2839					else dict(ha = 'left', rotation = 45)
2840					)
2841				)
2842
2843		if hist or kde:
2844			ppl.sca(ax2)
2845
2846		for s in colors:
2847			kw['marker'] = '+'
2848			kw['ms'] = 5
2849			kw['mec'] = colors[s]
2850			kw['label'] = s
2851			kw['alpha'] = 1
2852			ppl.plot([], [], **kw)
2853
2854		kw['mec'] = (0,0,0)
2855
2856		if one_or_more_singlets:
2857			kw['marker'] = 'x'
2858			kw['ms'] = 4
2859			kw['alpha'] = .2
2860			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2861			ppl.plot([], [], **kw)
2862
2863		if one_or_more_multiplets:
2864			kw['marker'] = '+'
2865			kw['ms'] = 4
2866			kw['alpha'] = 1
2867			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2868			ppl.plot([], [], **kw)
2869
2870		if hist or kde:
2871			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2872		else:
2873			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2874		leg.set_zorder(-1000)
2875
2876		ppl.sca(ax1)
2877
2878		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2879		ppl.xticks([])
2880		ppl.axis([-1, len(self), None, None])
2881
2882		if hist or kde:
2883			ppl.sca(ax2)
2884			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2885
2886			if kde:
2887				from scipy.stats import gaussian_kde
2888				yi = np.linspace(ymin, ymax, 201)
2889				xi = gaussian_kde(X).evaluate(yi)
2890				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2891# 				ppl.plot(xi, yi, 'k-', lw = 1)
2892			elif hist:
2893				ppl.hist(
2894					X,
2895					orientation = 'horizontal',
2896					histtype = 'stepfilled',
2897					ec = [.4]*3,
2898					fc = [.25]*3,
2899					alpha = .25,
2900					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2901					)
2902			ppl.text(0, 0,
2903				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2904				size = 7.5,
2905				alpha = 1,
2906				va = 'center',
2907				ha = 'left',
2908				)
2909
2910			ppl.axis([0, None, ymin, ymax])
2911			ppl.xticks([])
2912			ppl.yticks([])
2913# 			ax2.spines['left'].set_visible(False)
2914			ax2.spines['right'].set_visible(False)
2915			ax2.spines['top'].set_visible(False)
2916			ax2.spines['bottom'].set_visible(False)
2917
2918		ax1.axis([None, None, ymin, ymax])
2919
2920		if not os.path.exists(dir):
2921			os.makedirs(dir)
2922		if filename is None:
2923			return fig
2924		elif filename == '':
2925			filename = f'D{self._4x}_residuals.pdf'
2926		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2927		ppl.close(fig)
2928				
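	# Usage sketch: save a residual plot with a kernel density estimate of the
	# residuals to 'output/D47_residuals.pdf':
	#
	#     mydata.plot_residuals(filename = '', kde = True)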
2929
2930	def simulate(self, *args, **kwargs):
2931		'''
2932		Legacy function with warning message pointing to `virtual_data()`
2933		'''
2934		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
2935
2936	def plot_anchor_residuals(
2937		self,
2938		dir = 'output',
2939		filename = '',
2940		figsize = None,
2941		subplots_adjust = (0.05, 0.1, 0.95, 0.98, .25, .25),
2942		dpi = 100,
2943		colors = None,
2944		):
2945		'''
2946		Plot a summary of the residuals for all anchors, intended to help detect systematic bias.
2947		
2948		**Parameters**
2949
2950		+ `dir`: the directory in which to save the plot
2951		+ `filename`: the file name to save to.
2952		+ `dpi`: resolution for PNG output
2953		+ `figsize`: (width, height) of figure
2954		+ `subplots_adjust`: passed to `ppl.subplots_adjust()`
2956		+ `colors`: a dict of `{<sample>: (r, g, b)}` for all samples
2957		'''
2958
2959		# Colors
2960		N = len(self.anchors)
2961		if colors is None:
2962			if N == 3:
2963				colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2964			elif N == 4:
2965				colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2966			else:
2967				colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2968
2969		if figsize is None:
2970			figsize = (4, 1.5*N+1)
2971		fig = ppl.figure(figsize = figsize)
2972		ppl.subplots_adjust(*subplots_adjust)
2973		axs = {}
2974		X = np.array([r[f'D{self._4x}_residual'] for a in self.anchors for r in self.anchors[a]['data']])*1000
2975		sigma = self.repeatability[f'r_D{self._4x}a'] * 1000
2976		D = max(np.abs(X))
2977
2978		for k,a in enumerate(self.anchors):
2979			color = colors[a]
2980			axs[a] = ppl.subplot(N, 1, 1+k)
2981			axs[a].text(
2982				0.02, 1-0.05, a,
2983				va = 'top',
2984				ha = 'left',
2985				weight = 'bold',
2986				size = 9,
2987				color = [_*0.75 for _ in color],
2988				transform = axs[a].transAxes,
2989			)
2990			X = np.array([r[f'D{self._4x}_residual'] for r in self.anchors[a]['data']])*1000
2991			axs[a].axvline(0, lw = 0.5, color = color)
2992			axs[a].plot(X, X*0, 'o', mew = 0.7, mec = (*color,.5), mfc = (*color, 0), ms = 7, clip_on = False)
2993
2994			xi = np.linspace(-3*D, 3*D, 601)
2995			yi = np.array([np.exp(-0.5 * ((xi - x)/sigma)**2) for x in X]).sum(0)
2996			ppl.fill_between(xi, yi, yi*0, fc = (*color, .15), lw = 1, ec = color)
2997			
2998			axs[a].errorbar(
2999				X.mean(), yi.max()*.2, None, 1.96*sigma/len(X)**0.5,
3000				ecolor = color,
3001				marker = 's',
3002				ls = 'None',
3003				mec = color,
3004				mew = 1,
3005				mfc = 'w',
3006				ms = 8,
3007				elinewidth = 1,
3008				capsize = 4,
3009				capthick = 1,
3010			)
3011			
3012			axs[a].axis([xi[0], xi[-1], 0, yi.max()*1.05])
3013			ppl.yticks([])
3014
3015		ppl.xlabel(f'$Δ_{{{self._4x}}}$ residuals (ppm)')		
3016
3017		if not os.path.exists(dir):
3018			os.makedirs(dir)
3019		if filename is None:
3020			return fig
3021		elif filename == '':
3022			filename = f'D{self._4x}_anchor_residuals.pdf'
3023		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
3024		ppl.close(fig)
3025		
3026
3027	def plot_distribution_of_analyses(
3028		self,
3029		dir = 'output',
3030		filename = None,
3031		vs_time = False,
3032		figsize = (6,4),
3033		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
3034		output = None,
3035		dpi = 100,
3036		):
3037		'''
3038		Plot temporal distribution of all analyses in the data set.
3039		
3040		**Parameters**
3041
3042		+ `dir`: the directory in which to save the plot
3043		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
3044		+ `dpi`: resolution for PNG output
3045		+ `figsize`: (width, height) of figure
3047		'''
3048
3049		asamples = [s for s in self.anchors]
3050		usamples = [s for s in self.unknowns]
3051		if output is None or output == 'fig':
3052			fig = ppl.figure(figsize = figsize)
3053			ppl.subplots_adjust(*subplots_adjust)
3054		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
3055		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
3056		Xmax += (Xmax-Xmin)/40
3057		Xmin -= (Xmax-Xmin)/41
3058		for k, s in enumerate(asamples + usamples):
3059			if vs_time:
3060				X = [r['TimeTag'] for r in self if r['Sample'] == s]
3061			else:
3062				X = [x for x,r in enumerate(self) if r['Sample'] == s]
3063			Y = [-k for x in X]
3064			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
3065			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
3066			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
3067		ppl.axis([Xmin, Xmax, -k-1, 1])
3068		ppl.xlabel('\ntime')
3069		ppl.gca().annotate('',
3070			xy = (0.6, -0.02),
3071			xycoords = 'axes fraction',
3072			xytext = (.4, -0.02), 
3073			arrowprops = dict(arrowstyle = "->", color = 'k'),
3074			)
3075			
3076
3077		x2 = -1
3078		for session in self.sessions:
3079			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
3080			if vs_time:
3081				ppl.axvline(x1, color = 'k', lw = .75)
3082			if x2 > -1:
3083				if not vs_time:
3084					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
3085			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
3086# 			from xlrd import xldate_as_datetime
3087# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
3088			if vs_time:
3089				ppl.axvline(x2, color = 'k', lw = .75)
3090				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
3091			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
3092
3093		ppl.xticks([])
3094		ppl.yticks([])
3095
3096		if output is None:
3097			if not os.path.exists(dir):
3098				os.makedirs(dir)
3099			if filename is None:
3100				filename = f'D{self._4x}_distribution_of_analyses.pdf'
3101			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
3102			ppl.close(fig)
3103		elif output == 'ax':
3104			return ppl.gca()
3105		elif output == 'fig':
3106			return fig
3107
3108
3109	def plot_bulk_compositions(
3110		self,
3111		samples = None,
3112		dir = 'output/bulk_compositions',
3113		figsize = (6,6),
3114		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
3115		show = False,
3116		sample_color = (0,.5,1),
3117		analysis_color = (.7,.7,.7),
3118		labeldist = 0.3,
3119		radius = 0.05,
3120		):
3121		'''
3122		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
3123		
3124		By default, creates a directory `./output/bulk_compositions` where plots for
3125		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
3126		
3127		
3128		**Parameters**
3129
3130		+ `samples`: Only these samples are processed (by default: all samples).
3131		+ `dir`: where to save the plots
3132		+ `figsize`: (width, height) of figure
3133		+ `subplots_adjust`: passed to `subplots_adjust()`
3134		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
3135		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
3136		+ `sample_color`: color used for sample markers/labels
3137		+ `analysis_color`: color used for analysis markers/labels
3138		+ `labeldist`: distance (in inches) from analysis markers to their labels
3139		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
3140		'''
3141
3142		from matplotlib.patches import Ellipse
3143
3144		if samples is None:
3145			samples = [_ for _ in self.samples]
3146
3147		saved = {}
3148
3149		for s in samples:
3150
3151			fig = ppl.figure(figsize = figsize)
3152			fig.subplots_adjust(*subplots_adjust)
3153			ax = ppl.subplot(111)
3154			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3155			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3156			ppl.title(s)
3157
3158
3159			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
3160			UID = [_['UID'] for _ in self.samples[s]['data']]
3161			XY0 = XY.mean(0)
3162
3163			for xy in XY:
3164				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
3165				
3166			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
3167			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
3168			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3169			saved[s] = [XY, XY0]
3170			
3171			x1, x2, y1, y2 = ppl.axis()
3172			x0, dx = (x1+x2)/2, (x2-x1)/2
3173			y0, dy = (y1+y2)/2, (y2-y1)/2
3174			dx, dy = [max(max(dx, dy), radius)]*2
3175
3176			ppl.axis([
3177				x0 - 1.2*dx,
3178				x0 + 1.2*dx,
3179				y0 - 1.2*dy,
3180				y0 + 1.2*dy,
3181				])			
3182
3183			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3184
3185			for xy, uid in zip(XY, UID):
3186
3187				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3188				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3189
3190				if (vector_in_display_space**2).sum() > 0:
3191
3192					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3193					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3194					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3195					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3196
3197					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3198
3199				else:
3200
3201					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3202
3203			if radius:
3204				ax.add_artist(Ellipse(
3205					xy = XY0,
3206					width = radius*2,
3207					height = radius*2,
3208					ls = (0, (2,2)),
3209					lw = .7,
3210					ec = analysis_color,
3211					fc = 'None',
3212					))
3213				ppl.text(
3214					XY0[0],
3215					XY0[1]-radius,
3216					f'\n± {radius*1e3:.0f} ppm',
3217					color = analysis_color,
3218					va = 'top',
3219					ha = 'center',
3220					linespacing = 0.4,
3221					size = 8,
3222					)
3223
3224			if not os.path.exists(dir):
3225				os.makedirs(dir)
3226			fig.savefig(f'{dir}/{s}.pdf')
3227			ppl.close(fig)
3228
3229		fig = ppl.figure(figsize = figsize)
3230		fig.subplots_adjust(*subplots_adjust)
3231		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3232		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3233
3234		for s in saved:
3235			for xy in saved[s][0]:
3236				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3237			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3238			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3239			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3240
3241		x1, x2, y1, y2 = ppl.axis()
3242		ppl.axis([
3243			x1 - (x2-x1)/10,
3244			x2 + (x2-x1)/10,
3245			y1 - (y2-y1)/10,
3246			y2 + (y2-y1)/10,
3247			])			
3248
3249
3250		if not os.path.exists(dir):
3251			os.makedirs(dir)
3252		fig.savefig(f'{dir}/__all__.pdf')
3253		if show:
3254			ppl.show()
3255		ppl.close(fig)
3256		
3257
3258	def _save_D4x_correl(
3259		self,
3260		samples = None,
3261		dir = 'output',
3262		filename = None,
3263		D4x_precision = 4,
3264		correl_precision = 4,
3265		save_to_file = True,
3266		):
3267		'''
3268		Save D4x values along with their SE and correlation matrix.
3269
3270		**Parameters**
3271
3272		+ `samples`: Only these samples are output (by default: all samples).
3273		+ `dir`: the directory in which to save the file (by default: `output`)
3274		+ `filename`: the name to the csv file to write to (by default: `D4x_correl.csv`)
3275		+ `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4)
3276		+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
3277		+ `save_to_file`: whether to write the output to a file (by default: `True`). If `False`,
3278		returns the output as a string
3279		'''
3280		if samples is None:
3281			samples = sorted([s for s in self.unknowns])
3282		
3283		out = [['Sample']] + [[s] for s in samples]
3284		out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl']
3285		for k,s in enumerate(samples):
3286			out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.{D4x_precision}f}', f'{self.samples[s][f"SE_D{self._4x}"]:.{D4x_precision}f}']
3287			for s2 in samples:
3288				out[k+1] += [f'{self.sample_D4x_correl(s,s2):.{correl_precision}f}']
3289		
3290		if save_to_file:
3291			if not os.path.exists(dir):
3292				os.makedirs(dir)
3293			if filename is None:
3294				filename = f'D{self._4x}_correl.csv'
3295			with open(f'{dir}/{filename}', 'w') as fid:
3296				fid.write(make_csv(out))
3297		else:
3298			return make_csv(out)
3299		
3300
3301class D47data(D4xdata):
3302	'''
3303	Store and process data for a large set of Δ47 analyses,
3304	usually comprising more than one analytical session.
3305	'''
3306
3307	Nominal_D4x = {
3308		'ETH-1':   0.2052,
3309		'ETH-2':   0.2085,
3310		'ETH-3':   0.6132,
3311		'ETH-4':   0.4511,
3312		'IAEA-C1': 0.3018,
3313		'IAEA-C2': 0.6409,
3314		'MERCK':   0.5135,
3315		} # I-CDES (Bernasconi et al., 2021)
3316	'''
3317	Nominal Δ47 values assigned to the Δ47 anchor samples, used by
3318	`D47data.standardize()` to normalize unknown samples to an absolute Δ47
3319	reference frame.
3320
3321	By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
3322	```py
3323	{
3324		'ETH-1'   : 0.2052,
3325		'ETH-2'   : 0.2085,
3326		'ETH-3'   : 0.6132,
3327		'ETH-4'   : 0.4511,
3328		'IAEA-C1' : 0.3018,
3329		'IAEA-C2' : 0.6409,
3330		'MERCK'   : 0.5135,
3331	}
3332	```
3333	'''
3334
3335
3336	@property
3337	def Nominal_D47(self):
3338		return self.Nominal_D4x
3339	
3340
3341	@Nominal_D47.setter
3342	def Nominal_D47(self, new):
3343		self.Nominal_D4x = dict(**new)
3344		self.refresh()
3345
3346
3347	def __init__(self, l = [], **kwargs):
3348		'''
3349		**Parameters:** same as `D4xdata.__init__()`
3350		'''
3351		D4xdata.__init__(self, l = l, mass = '47', **kwargs)
3352
3353
3354	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3355		'''
3356		Find all samples for which `Teq` is specified, compute equilibrium Δ47
3357		value for that temperature, and treat these samples as additional anchors.
3358
3359		**Parameters**
3360
3361		+ `fCo2eqD47`: Which CO2 equilibrium law to use
3362		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3363		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3364		+ `priority`: if `replace`: forget old anchors and only use the new ones;
3365		if `new`: keep pre-existing anchors but update them in case of conflict
3366		between old and new Δ47 values;
3367		if `old`: keep pre-existing anchors but preserve their original Δ47
3368		values in case of conflict.
3369		'''
3370		f = {
3371			'petersen': fCO2eqD47_Petersen,
3372			'wang': fCO2eqD47_Wang,
3373			}[fCo2eqD47]
3374		foo = {}
3375		for r in self:
3376			if 'Teq' in r:
3377				if r['Sample'] in foo:
3378					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3379				else:
3380					foo[r['Sample']] = f(r['Teq'])
3381			else:
3382				assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3383
3384		if priority == 'replace':
3385			self.Nominal_D47 = {}
3386		for s in foo:
3387			if priority != 'old' or s not in self.Nominal_D47:
3388				self.Nominal_D47[s] = foo[s]
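
	# Usage sketch: if the analyses of some sample carry a `Teq` field (its CO2
	# equilibration temperature in °C), the following treats that sample as an
	# additional anchor with the corresponding equilibrium Δ47 value:
	#
	#     mydata.D47fromTeq(fCo2eqD47 = 'petersen', priority = 'new')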
3389	
3390	def save_D47_correl(self, *args, **kwargs):
3391		return self._save_D4x_correl(*args, **kwargs)
3392
3393	save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47')
3394
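	# Usage sketch: write the Δ47 values, SEs, and full error correlation matrix
	# of all unknowns to 'output/D47_correl.csv':
	#
	#     mydata.save_D47_correl()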
3395
3396class D48data(D4xdata):
3397	'''
3398	Store and process data for a large set of Δ48 analyses,
3399	usually comprising more than one analytical session.
3400	'''
3401
3402	Nominal_D4x = {
3403		'ETH-1':  0.138,
3404		'ETH-2':  0.138,
3405		'ETH-3':  0.270,
3406		'ETH-4':  0.223,
3407		'GU-1':  -0.419,
3408		} # (Fiebig et al., 2019, 2021)
3409	'''
3410	Nominal Δ48 values assigned to the Δ48 anchor samples, used by
3411	`D48data.standardize()` to normalize unknown samples to an absolute Δ48
3412	reference frame.
3413
3414	By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
3415	[Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)):
3416
3417	```py
3418	{
3419		'ETH-1' :  0.138,
3420		'ETH-2' :  0.138,
3421		'ETH-3' :  0.270,
3422		'ETH-4' :  0.223,
3423		'GU-1'  : -0.419,
3424	}
3425	```
3426	'''
3427
3428
3429	@property
3430	def Nominal_D48(self):
3431		return self.Nominal_D4x
3432
3433	
3434	@Nominal_D48.setter
3435	def Nominal_D48(self, new):
3436		self.Nominal_D4x = dict(**new)
3437		self.refresh()
3438
3439
3440	def __init__(self, l = [], **kwargs):
3441		'''
3442		**Parameters:** same as `D4xdata.__init__()`
3443		'''
3444		D4xdata.__init__(self, l = l, mass = '48', **kwargs)
3445
3446	def save_D48_correl(self, *args, **kwargs):
3447		return self._save_D4x_correl(*args, **kwargs)
3448
3449	save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48')
3450
3451
3452class D49data(D4xdata):
3453	'''
3454	Store and process data for a large set of Δ49 analyses,
3455	usually comprising more than one analytical session.
3456	'''
3457	
3458	Nominal_D4x = {"1000C": 0.0, "25C": 2.228}  # (Wang et al., 2004)
3459	'''
3460	Nominal Δ49 values assigned to the Δ49 anchor samples, used by
3461	`D49data.standardize()` to normalize unknown samples to an absolute Δ49
3462	reference frame.
3463
3464	By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)):
3465
3466	```py
3467	{
3468		"1000C": 0.0,
3469		"25C": 2.228
3470	}
3471	```
3472	'''
3473	
3474	@property
3475	def Nominal_D49(self):
3476		return self.Nominal_D4x
3477	
3478	@Nominal_D49.setter
3479	def Nominal_D49(self, new):
3480		self.Nominal_D4x = dict(**new)
3481		self.refresh()
3482	
3483	def __init__(self, l=[], **kwargs):
3484		'''
3485		**Parameters:** same as `D4xdata.__init__()`
3486		'''
3487		D4xdata.__init__(self, l=l, mass='49', **kwargs)
3488	
3489	def save_D49_correl(self, *args, **kwargs):
3490		return self._save_D4x_correl(*args, **kwargs)
3491	
3492	save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49')
3493
3494class _SessionPlot():
3495	'''
3496	Simple placeholder class
3497	'''
3498	def __init__(self):
3499		pass
3500
3501_app = typer.Typer(
3502	add_completion = False,
3503	context_settings={'help_option_names': ['-h', '--help']},
3504	rich_markup_mode = 'rich',
3505	)
3506
3507@_app.command()
3508def _cli(
3509	rawdata: Annotated[str, typer.Argument(help = "Specify the path of a rawdata input file")],
3510	exclude: Annotated[str, typer.Option('--exclude', '-e', help = 'The path of a file specifying UIDs and/or Samples to exclude')] = 'none',
3511	anchors: Annotated[str, typer.Option('--anchors', '-a', help = 'The path of a file specifying custom anchors')] = 'none',
3512	output_dir: Annotated[str, typer.Option('--output-dir', '-o', help = 'Specify the output directory')] = 'output',
3513	run_D48: Annotated[bool, typer.Option('--D48', help = 'Also standardize D48')] = False,
3514	):
3515	"""
3516	Process raw D47 data and return standardized results.
3517	
3518	See [b]https://mdaeron.github.io/D47crunch/#3-command-line-interface-cli[/b] for more details.
3519	
3520	Reads raw data from an input file, optionally excluding some samples and/or analyses, then standardizes
3521	the data based either on the default [b]d13C_VPDB[/b], [b]d18O_VPDB[/b], [b]D47[/b], and [b]D48[/b] anchors or on different
3522	user-specified anchors. A new directory (named `output` by default) is created to store the results and
3523	the following sequence is applied:
3524	
3525	* [b]D47data.wg()[/b]
3526	* [b]D47data.crunch()[/b]
3527	* [b]D47data.standardize()[/b]
3528	* [b]D47data.summary()[/b]
3529	* [b]D47data.table_of_samples()[/b]
3530	* [b]D47data.table_of_sessions()[/b]
3531	* [b]D47data.plot_sessions()[/b]
3532	* [b]D47data.plot_residuals()[/b]
3533	* [b]D47data.table_of_analyses()[/b]
3534	* [b]D47data.plot_distribution_of_analyses()[/b]
3535	* [b]D47data.plot_bulk_compositions()[/b]
3536	* [b]D47data.save_D47_correl()[/b]
3537	
3538	Optionally, also apply similar methods for [b]D48[/b].
3539	
3540	[b]Example CSV file for --anchors option:[/b]	
3541	[i]
3542	Sample,  d13C_VPDB,  d18O_VPDB,     D47,    D48
3543	ETH-1,        2.02,      -2.19,  0.2052,  0.138
3544	ETH-2,      -10.17,     -18.69,  0.2085,  0.138
3545	ETH-3,        1.71,      -1.78,  0.6132,  0.270
3546	ETH-4,            ,           ,  0.4511,  0.223
3547	[/i]
3548	Except for [i]Sample[/i], none of the columns above are mandatory.
3549
3550	[b]Example CSV file for --exclude option:[/b]	
3551	[i]
3552	Sample,  UID
3553	 FOO-1,
3554	 BAR-2,
3555	      ,  A04
3556	      ,  A17
3557	      ,  A88
3558	[/i]
3559	This will exclude all analyses of samples [i]FOO-1[/i] and [i]BAR-2[/i],
3560	and the analyses with UIDs [i]A04[/i], [i]A17[/i], and [i]A88[/i].
3561	Neither column is mandatory.
3562	"""
3563
3564	data = D47data()
3565	data.read(rawdata)
3566
3567	if exclude != 'none':
3568		exclude = read_csv(exclude)
3569		exclude_uid = {r['UID'] for r in exclude if 'UID' in r}
3570		exclude_sample = {r['Sample'] for r in exclude if 'Sample' in r}
3571	else:
3572		exclude_uid = []
3573		exclude_sample = []
3574	
3575	data = D47data([r for r in data if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample])
3576
3577	if anchors != 'none':
3578		anchors = read_csv(anchors)
3579		if len([_ for _ in anchors if 'd13C_VPDB' in _]):
3580			data.Nominal_d13C_VPDB = {
3581				_['Sample']: _['d13C_VPDB']
3582				for _ in anchors
3583				if 'd13C_VPDB' in _
3584				}
3585		if len([_ for _ in anchors if 'd18O_VPDB' in _]):
3586			data.Nominal_d18O_VPDB = {
3587				_['Sample']: _['d18O_VPDB']
3588				for _ in anchors
3589				if 'd18O_VPDB' in _
3590				}
3591		if len([_ for _ in anchors if 'D47' in _]):
3592			data.Nominal_D4x = {
3593				_['Sample']: _['D47']
3594				for _ in anchors
3595				if 'D47' in _
3596				}
3597
3598	data.refresh()
3599	data.wg()
3600	data.crunch()
3601	data.standardize()
3602	data.summary(dir = output_dir)
3603	data.plot_residuals(dir = output_dir, filename = 'D47_residuals.pdf', kde = True)
3604	data.plot_bulk_compositions(dir = output_dir + '/bulk_compositions')
3605	data.plot_sessions(dir = output_dir)
3606	data.save_D47_correl(dir = output_dir)
3607	
3608	if not run_D48:
3609		data.table_of_samples(dir = output_dir)
3610		data.table_of_analyses(dir = output_dir)
3611		data.table_of_sessions(dir = output_dir)
3612
3613
3614	if run_D48:
3615		data2 = D48data()
3617		data2.read(rawdata)
3618
3619		data2 = D48data([r for r in data2 if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample])
3620
3621		if anchors != 'none':
3622			if len([_ for _ in anchors if 'd13C_VPDB' in _]):
3623				data2.Nominal_d13C_VPDB = {
3624					_['Sample']: _['d13C_VPDB']
3625					for _ in anchors
3626					if 'd13C_VPDB' in _
3627					}
3628			if len([_ for _ in anchors if 'd18O_VPDB' in _]):
3629				data2.Nominal_d18O_VPDB = {
3630					_['Sample']: _['d18O_VPDB']
3631					for _ in anchors
3632					if 'd18O_VPDB' in _
3633					}
3634			if len([_ for _ in anchors if 'D48' in _]):
3635				data2.Nominal_D4x = {
3636					_['Sample']: _['D48']
3637					for _ in anchors
3638					if 'D48' in _
3639					}
3640
3641		data2.refresh()
3642		data2.wg()
3643		data2.crunch()
3644		data2.standardize()
3645		data2.summary(dir = output_dir)
3646		data2.plot_sessions(dir = output_dir)
3647		data2.plot_residuals(dir = output_dir, filename = 'D48_residuals.pdf', kde = True)
3648		data2.plot_distribution_of_analyses(dir = output_dir)
3649		data2.save_D48_correl(dir = output_dir)
3650
3651		table_of_analyses(data, data2, dir = output_dir)
3652		table_of_samples(data, data2, dir = output_dir)
3653		table_of_sessions(data, data2, dir = output_dir)
3654		
3655def __cli():
3656	_app()
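
# A hypothetical command-line invocation (assuming the `D47crunch` entry point
# installed with the package), using the options defined above:
#
#     D47crunch rawdata.csv --exclude exclude.csv --anchors anchors.csv -o results --D48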
Petersen_etal_CO2eqD47 = array([[-12.0, 1.14711357], [-11.0, 1.13996122], ..., [1095.0, 0.02132687], [1100.0, 0.02109062]])

Array of [T in °C, Δ47] pairs spanning −12 °C to 1100 °C, tabulating the CO2 equilibrium Δ47 values of Petersen et al. (2019).

def fCO2eqD47_Petersen(T):
	'''
	CO2 equilibrium Δ47 value as a function of T (in degrees C)
	according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
	'''
	return float(_fCO2eqD47_Petersen(T))

CO2 equilibrium Δ47 value as a function of T (in degrees C) according to Petersen et al. (2019).

Wang_etal_CO2eqD47 = array([[-83.0, 1.8954], [-73.0, 1.753], ..., [1087.0, 0.0223], [1097.0, 0.0218]])

Array of [T in °C, Δ47] pairs spanning −83 °C to 1097 °C, tabulating the CO2 equilibrium Δ47 values of Wang et al. (2004).

def fCO2eqD47_Wang(T):
	'''
	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
	according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
	(supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
	'''
	return float(_fCO2eqD47_Wang(T))

CO2 equilibrium Δ47 value as a function of T (in degrees C) according to Wang et al. (2004) (supplementary data of Dennis et al., 2011).
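
Both helper functions make it straightforward to convert a temperature into an equilibrium Δ47 value, e.g. when assigning nominal values to equilibrated-gas anchors. A minimal sketch (the printed values are read off the tables above, so expect small interpolation differences):

from D47crunch import fCO2eqD47_Petersen, fCO2eqD47_Wang

# equilibrium Δ47 of CO2 at 25 °C according to each calibration:
print(fCO2eqD47_Petersen(25))  # ≈ 0.9196 (tabulated value at 25 °C)
print(fCO2eqD47_Wang(25))      # ≈ 0.92 (between the 17 °C and 27 °C tabulated values)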

def correlated_sum(X, C, w = None):
	'''
	Compute covariance-aware linear combinations

	**Parameters**

	+ `X`: list or 1-D array of values to sum
	+ `C`: covariance matrix for the elements of `X`
	+ `w`: list or 1-D array of weights to apply to the elements of `X`
	       (all equal to 1 by default)

	Return the sum (and its SE) of the elements of `X`, with optional weights equal
	to the elements of `w`, accounting for covariances between the elements of `X`.
	'''
	if w is None:
		w = [1 for x in X]
	return np.dot(w, X), (np.dot(w, np.dot(C, w)))**.5

Compute covariance-aware linear combinations

Parameters

  • X: list or 1-D array of values to sum
  • C: covariance matrix for the elements of X
  • w: list or 1-D array of weights to apply to the elements of X (all equal to 1 by default)

Return the sum (and its SE) of the elements of X, with optional weights equal to the elements of w, accounting for covariances between the elements of X.
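
Because the covariance term enters the error budget with the sign of the weights, correlated errors can either inflate the SE of a sum or partly cancel in a difference. A minimal sketch with made-up numbers:

import numpy as np
from D47crunch import correlated_sum

X = [0.5, 0.3]
C = np.array([
    [4e-4, 2e-4],  # hypothetical covariance matrix for X
    [2e-4, 4e-4],
    ])

print(correlated_sum(X, C))              # (0.8, ~0.035): variances add, plus twice the covariance
print(correlated_sum(X, C, w = [1, -1])) # (0.2, 0.02): the positive covariance partly cancels out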

def make_csv(x, hsep = ',', vsep = '\n'):
	'''
	Formats a list of lists of strings as a CSV

	**Parameters**

	+ `x`: the list of lists of strings to format
	+ `hsep`: the field separator (`,` by default)
	+ `vsep`: the line-ending convention to use (`\\n` by default)

	**Example**

	```py
	print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
	```

	outputs:

	```py
	a,b,c
	d,e,f
	```
	'''
	return vsep.join([hsep.join(l) for l in x])

Formats a list of lists of strings as a CSV

Parameters

  • x: the list of lists of strings to format
  • hsep: the field separator (, by default)
  • vsep: the line-ending convention to use (\n by default)

Example

print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))

outputs:

a,b,c
d,e,f

def pf(txt):
	'''
	Modify string `txt` to follow `lmfit.Parameter()` naming rules.
	'''
	return txt.replace('-','_').replace('.','_').replace(' ','_')

Modify string txt to follow lmfit.Parameter() naming rules.
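
For instance, sample or session names containing characters that are illegal in lmfit parameter names are sanitized as follows:

from D47crunch import pf

print(pf('ETH-1'))         # yields: ETH_1
print(pf('MY SAMPLE 2.0')) # yields: MY_SAMPLE_2_0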

def smart_type(x):
	'''
	Tries to convert string `x` to a float if it includes a decimal point, or
	to an integer if it does not. If the conversion fails, return the original
	string unchanged.
	'''
	try:
		y = float(x)
	except ValueError:
		return x
	if '.' not in x:
		return int(y)
	return y

Tries to convert string x to a float if it includes a decimal point, or to an integer if it does not. If the conversion fails, returns the original string unchanged.
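
Based on the logic above:

from D47crunch import smart_type

print(repr(smart_type('5')))      # 5 (int)
print(repr(smart_type('5.0')))    # 5.0 (float)
print(repr(smart_type('ETH-1')))  # 'ETH-1' (returned unchanged)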

D47crunch_defaults = <D47crunch._Defaults object>

def pretty_table(x, header = 1, hsep = '  ', vsep = None, align = '<'):
	'''
	Reads a list of lists of strings and outputs an ascii table

	**Parameters**

	+ `x`: a list of lists of strings
	+ `header`: the number of lines to treat as header lines
	+ `hsep`: the horizontal separator between columns
	+ `vsep`: the character to use as vertical separator
	+ `align`: string of left (`<`) or right (`>`) alignment characters.

	**Example**

	```py
	print(pretty_table([
		['A', 'B', 'C'],
		['1', '1.9999', 'foo'],
		['10', 'x', 'bar'],
	]))
	```
	yields:
	```
	——  ——————  ———
	A        B    C
	——  ——————  ———
	1   1.9999  foo
	10       x  bar
	——  ——————  ———
	```

	To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`:

	```py
	D47crunch_defaults.PRETTY_TABLE_VSEP = '='
	print(pretty_table([
		['A', 'B', 'C'],
		['1', '1.9999', 'foo'],
		['10', 'x', 'bar'],
	]))
	```
	yields:
	```
	==  ======  ===
	A        B    C
	==  ======  ===
	1   1.9999  foo
	10       x  bar
	==  ======  ===
	```
	'''

	if vsep is None:
		vsep = D47crunch_defaults.PRETTY_TABLE_VSEP

	txt = []
	widths = [np.max([len(e) for e in c]) for c in zip(*x)]

	if len(widths) > len(align):
		align += '>' * (len(widths) - len(align))
	sepline = hsep.join([vsep*w for w in widths])
	txt += [sepline]
	for k, l in enumerate(x):
		if k and k == header:
			txt += [sepline]
		txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
	txt += [sepline]
	txt += ['']
	return '\n'.join(txt)

Reads a list of lists of strings and outputs an ascii table

Parameters

  • x: a list of lists of strings
  • header: the number of lines to treat as header lines
  • hsep: the horizontal separator between columns
  • vsep: the character to use as vertical separator
  • align: string of left (<) or right (>) alignment characters.

Example

print(pretty_table([
        ['A', 'B', 'C'],
        ['1', '1.9999', 'foo'],
        ['10', 'x', 'bar'],
]))

yields:

——  ——————  ———
A        B    C
——  ——————  ———
1   1.9999  foo
10       x  bar
——  ——————  ———

To change the default vsep globally, redefine D47crunch_defaults.PRETTY_TABLE_VSEP:

D47crunch_defaults.PRETTY_TABLE_VSEP = '='
print(pretty_table([
        ['A', 'B', 'C'],
        ['1', '1.9999', 'foo'],
        ['10', 'x', 'bar'],
]))

yields:

==  ======  ===
A        B    C
==  ======  ===
1   1.9999  foo
10       x  bar
==  ======  ===

def transpose_table(x):
	'''
	Transpose a list of lists

	**Parameters**

	+ `x`: a list of lists

	**Example**

	```py
	x = [[1, 2], [3, 4]]
	print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
	```
	'''
	return [[e for e in c] for c in zip(*x)]

Transpose a list of lists

Parameters

  • x: a list of lists

Example

x = [[1, 2], [3, 4]]
print(transpose_table(x)) # yields: [[1, 3], [2, 4]]

def w_avg(X, sX):
	'''
	Compute variance-weighted average

	Returns the value and SE of the weighted average of the elements of `X`,
	with relative weights equal to their inverse variances (`1/sX**2`).

	**Parameters**

	+ `X`: array-like of elements to average
	+ `sX`: array-like of the corresponding SE values

	**Tip**

	If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
	they may be rearranged using `zip()`:

	```python
	foo = [(0, 1), (1, 0.5), (2, 0.5)]
	print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
	```
	'''
	X = [x for x in X]
	sX = [sx for sx in sX]
	W = [sx**-2 for sx in sX]
	W = [w/sum(W) for w in W]
	Xavg = sum([w*x for w, x in zip(W, X)])
	sXavg = sum([w**2 * sx**2 for w, sx in zip(W, sX)])**.5
	return Xavg, sXavg

Compute variance-weighted average

Returns the value and SE of the weighted average of the elements of X, with relative weights equal to their inverse variances (1/sX**2).

Parameters

  • X: array-like of elements to average
  • sX: array-like of the corresponding SE values

Tip

If X and sX are initially arranged as a list of (x, sx) doublets, they may be rearranged using zip():

foo = [(0, 1), (1, 0.5), (2, 0.5)]
print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)

def read_csv(filename, sep = ''):
	'''
	Read contents of `filename` in csv format and return a list of dictionaries.

	In the csv string, spaces before and after field separators (`','` by default)
	are optional.

	**Parameters**

	+ `filename`: the csv file to read
	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\\t`,
	whichever appears most often in the contents of `filename`.
	'''
	with open(filename) as fid:
		txt = fid.read()

	if sep == '':
		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
	return [{k: smart_type(v) for k, v in zip(txt[0], l) if v} for l in txt[1:]]

Read contents of filename in csv format and return a list of dictionaries.

In the csv string, spaces before and after field separators (',' by default) are optional.

Parameters

  • filename: the csv file to read
  • sep: csv separator delimiting the fields. By default, use ',', ';', or a tab character, whichever appears most often in the contents of filename.
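
A minimal sketch, assuming a hypothetical file foo.csv containing the three lines "UID, Sample, d45", "A01, ETH-1, 5.795" and "A02, ETH-2, -6.058":

from D47crunch import read_csv

data = read_csv('foo.csv')
print(data)
# yields:
# [{'UID': 'A01', 'Sample': 'ETH-1', 'd45': 5.795},
#  {'UID': 'A02', 'Sample': 'ETH-2', 'd45': -6.058}]
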
def simulate_single_analysis(
	sample = 'MYSAMPLE',
	d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
	d13C_VPDB = None, d18O_VPDB = None,
	D47 = None, D48 = None, D49 = 0., D17O = 0.,
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	Nominal_D47 = None,
	Nominal_D48 = None,
	Nominal_d13C_VPDB = None,
	Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	):
	'''
	Compute working-gas delta values for a single analysis, assuming a stochastic working
	gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

	**Parameters**

	+ `sample`: sample name
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(respectively –4 and +26 ‰ by default)
	+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
		of the carbonate sample
	+ `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and
		Δ48 values if `D47` or `D48` are not specified
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `D4xdata` default values)

	Returns a dictionary with fields
	`['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
	'''

	if Nominal_d13C_VPDB is None:
		Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB

	if Nominal_d18O_VPDB is None:
		Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB

	if ALPHA_18O_ACID_REACTION is None:
		ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION

	if R13_VPDB is None:
		R13_VPDB = D4xdata().R13_VPDB

	if R17_VSMOW is None:
		R17_VSMOW = D4xdata().R17_VSMOW

	if R18_VSMOW is None:
		R18_VSMOW = D4xdata().R18_VSMOW

	if LAMBDA_17 is None:
		LAMBDA_17 = D4xdata().LAMBDA_17

	if R18_VPDB is None:
		R18_VPDB = D4xdata().R18_VPDB

	R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17

	if Nominal_D47 is None:
		Nominal_D47 = D47data().Nominal_D47

	if Nominal_D48 is None:
		Nominal_D48 = D48data().Nominal_D48

	if d13C_VPDB is None:
		if sample in Nominal_d13C_VPDB:
			d13C_VPDB = Nominal_d13C_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")

	if d18O_VPDB is None:
		if sample in Nominal_d18O_VPDB:
			d18O_VPDB = Nominal_d18O_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")

	if D47 is None:
		if sample in Nominal_D47:
			D47 = Nominal_D47[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")

	if D48 is None:
		if sample in Nominal_D48:
			D48 = Nominal_D48[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")

	X = D4xdata()
	X.R13_VPDB = R13_VPDB
	X.R17_VSMOW = R17_VSMOW
	X.R18_VSMOW = R18_VSMOW
	X.LAMBDA_17 = LAMBDA_17
	X.R18_VPDB = R18_VPDB
	X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17

	R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000),
		R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
		)
	R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O = D17O, D47 = D47, D48 = D48, D49 = D49,
		)
	R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O = D17O,
		)

	d45 = 1000 * (R45/R45wg - 1)
	d46 = 1000 * (R46/R46wg - 1)
	d47 = 1000 * (R47/R47wg - 1)
	d48 = 1000 * (R48/R48wg - 1)
	d49 = 1000 * (R49/R49wg - 1)

	for k in range(3): # iterate a few times to adjust for small changes in d47/d48
		R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
		R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch
		d47 = 1000 * (R47raw/R47wg - 1)
		d48 = 1000 * (R48raw/R48wg - 1)

	return dict(
		Sample = sample,
		D17O = D17O,
		d13Cwg_VPDB = d13Cwg_VPDB,
		d18Owg_VSMOW = d18Owg_VSMOW,
		d45 = d45,
		d46 = d46,
		d47 = d47,
		d48 = d48,
		d49 = d49,
		)

Compute working-gas delta values for a single analysis, assuming a stochastic working gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

Parameters

  • sample: sample name
  • d13Cwg_VPDB, d18Owg_VSMOW: bulk composition of the working gas (respectively –4 and +26 ‰ by default)
  • d13C_VPDB, d18O_VPDB: bulk composition of the carbonate sample
  • D47, D48, D49, D17O: clumped-isotope and oxygen-17 anomalies of the carbonate sample
  • Nominal_D47, Nominal_D48: where to look up Δ47 and Δ48 values if D47 or D48 are not specified
  • Nominal_d13C_VPDB, Nominal_d18O_VPDB: where to look up δ13C and δ18O values if d13C_VPDB or d18O_VPDB are not specified
  • ALPHA_18O_ACID_REACTION: 18O/16O acid fractionation factor
  • R13_VPDB, R17_VSMOW, R18_VSMOW, LAMBDA_17, R18_VPDB: oxygen-17 correction parameters (by default equal to the D4xdata default values)

Returns a dictionary with fields ['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49'].
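
For instance, since ETH-1 appears in the default Nominal_* dictionaries, a noise-free analysis of this anchor can be simulated without specifying any compositions explicitly. A minimal sketch:

from D47crunch import simulate_single_analysis

a = simulate_single_analysis(sample = 'ETH-1')  # all standardization parameters left at their defaults
print(a['Sample'], a['d45'], a['d47'])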

def virtual_data(
	samples = [],
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	rd45 = 0.020, rd46 = 0.060,
	rD47 = 0.015, rD48 = 0.045,
	d13Cwg_VPDB = None, d18Owg_VSMOW = None,
	session = None,
	Nominal_D47 = None, Nominal_D48 = None,
	Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	seed = 0,
	shuffle = True,
	):
	'''
	Return a list of simulated analyses from a single session.

	**Parameters**

	+ `samples`: a list of entries; each entry is a dictionary with the following fields:
	    * `Sample`: the name of the sample
	    * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	    * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
	    * `N`: how many analyses to generate for this sample
	+ `a47`: scrambling factor for Δ47
	+ `b47`: compositional nonlinearity for Δ47
	+ `c47`: working gas offset for Δ47
	+ `a48`: scrambling factor for Δ48
	+ `b48`: compositional nonlinearity for Δ48
	+ `c48`: working gas offset for Δ48
	+ `rd45`: analytical repeatability of δ45
	+ `rd46`: analytical repeatability of δ46
	+ `rD47`: analytical repeatability of Δ47
	+ `rD48`: analytical repeatability of Δ48
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(by default equal to the `simulate_single_analysis` default values)
	+ `session`: name of the session (no name by default)
	+ `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and Δ48 values
		if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
		(by default equal to the `simulate_single_analysis` defaults)
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
		(by default equal to the `simulate_single_analysis` defaults)
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `simulate_single_analysis` defaults)
	+ `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
	+ `shuffle`: randomly reorder the sequence of analyses

	Here is an example of using this method to generate an arbitrary combination of
	anchors and unknowns for a bunch of sessions:

	```py
	.. include:: ../../code_examples/virtual_data/example.py
	```

	This should output something like:

	```
	.. include:: ../../code_examples/virtual_data/output.txt
	```
	'''

	kwargs = locals().copy()

	from numpy import random as nprandom
	if seed:
		nprandom.seed(seed)
		rng = nprandom.default_rng(seed)
	else:
		rng = nprandom.default_rng()

	N = sum([s['N'] for s in samples])
	errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors45 *= rd45 / stdev(errors45) # scale errors to rd45
	errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors46 *= rd46 / stdev(errors46) # scale errors to rd46
	errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors47 *= rD47 / stdev(errors47) # scale errors to rD47
	errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors48 *= rD48 / stdev(errors48) # scale errors to rD48

	k = 0
	out = []
	for s in samples:
		kw = {}
		kw['sample'] = s['Sample']
		kw = {
			**kw,
			**{var: kwargs[var]
				for var in [
					'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
					'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
					'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
					'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
					]
				if kwargs[var] is not None},
			**{var: s[var]
				for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
				if var in s},
			}

		sN = s['N']
		while sN:
			out.append(simulate_single_analysis(**kw))
			out[-1]['d45'] += errors45[k]
			out[-1]['d46'] += errors46[k]
			out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
			out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
			sN -= 1
			k += 1

		if session is not None:
			for r in out:
				r['Session'] = session

		if shuffle:
			nprandom.shuffle(out)

	return out

Return a list of simulated analyses from a single session.

Parameters

  • samples: a list of entries; each entry is a dictionary with the following fields:
    • Sample: the name of the sample
    • d13C_VPDB, d18O_VPDB: bulk composition of the carbonate sample
    • D47, D48, D49, D17O (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
    • N: how many analyses to generate for this sample
  • a47: scrambling factor for Δ47
  • b47: compositional nonlinearity for Δ47
  • c47: working gas offset for Δ47
  • a48: scrambling factor for Δ48
  • b48: compositional nonlinearity for Δ48
  • c48: working gas offset for Δ48
  • rd45: analytical repeatability of δ45
  • rd46: analytical repeatability of δ46
  • rD47: analytical repeatability of Δ47
  • rD48: analytical repeatability of Δ48
  • d13Cwg_VPDB, d18Owg_VSMOW: bulk composition of the working gas (by default equal to the simulate_single_analysis default values)
  • session: name of the session (no name by default)
  • Nominal_D47, Nominal_D48: where to look up Δ47 and Δ48 values if D47 or D48 are not specified (by default equal to the simulate_single_analysis defaults)
  • Nominal_d13C_VPDB, Nominal_d18O_VPDB: where to look up δ13C and δ18O values if d13C_VPDB or d18O_VPDB are not specified (by default equal to the simulate_single_analysis defaults)
  • ALPHA_18O_ACID_REACTION: 18O/16O acid fractionation factor (by default equal to the simulate_single_analysis defaults)
  • R13_VPDB, R17_VSMOW, R18_VSMOW, LAMBDA_17, R18_VPDB: oxygen-17 correction parameters (by default equal to the simulate_single_analysis defaults)
  • seed: explicitly set to a non-zero value to achieve random but repeatable simulations
  • shuffle: randomly reorder the sequence of analyses

Here is an example of using this method to generate an arbitrary combination of anchors and unknowns for a bunch of sessions:

from D47crunch import virtual_data, D47data

args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 3,
            d13C_VPDB = -15., d18O_VPDB = -2.,
            D47 = 0.6, D48 = 0.2),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)

D = D47data(session1 + session2 + session3 + session4)

D.crunch()
D.standardize()

D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)

This should output something like:

[table_of_sessions] 
——————————  ——  ——  ———————————  ————————————  ——————  ——————  ——————  —————————————  —————————————  ——————————————
Session     Na  Nu  d13Cwg_VPDB  d18Owg_VSMOW  r_d13C  r_d18O   r_D47         a ± SE   1e3 x b ± SE          c ± SE
——————————  ——  ——  ———————————  ————————————  ——————  ——————  ——————  —————————————  —————————————  ——————————————
Session_01   9   6       -4.000        26.000  0.0205  0.0633  0.0075  1.015 ± 0.015  0.427 ± 0.232  -0.909 ± 0.006
Session_02   9   6       -4.000        26.000  0.0210  0.0882  0.0082  0.990 ± 0.015  0.484 ± 0.232  -0.905 ± 0.006
Session_03   9   6       -4.000        26.000  0.0186  0.0505  0.0091  0.997 ± 0.015  0.167 ± 0.233  -0.901 ± 0.006
Session_04   9   6       -4.000        26.000  0.0192  0.0467  0.0070  1.017 ± 0.015  0.229 ± 0.232  -0.910 ± 0.006
——————————  ——  ——  ———————————  ————————————  ——————  ——————  ——————  —————————————  —————————————  ——————————————

[table_of_samples] 
——————  ——  —————————  ——————————  ——————  ——————  ————————  ——————  ————————
Sample   N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene
——————  ——  —————————  ——————————  ——————  ——————  ————————  ——————  ————————
ETH-1   12       2.02       37.01  0.2052                    0.0083          
ETH-2   12     -10.17       19.88  0.2085                    0.0090          
ETH-3   12       1.71       37.46  0.6132                    0.0083          
BAR     12     -15.02       37.22  0.6057  0.0042  ± 0.0085  0.0088     0.753
FOO     12      -5.00       28.89  0.3024  0.0031  ± 0.0062  0.0070     0.497
——————  ——  —————————  ——————————  ——————  ——————  ————————  ——————  ————————

[table_of_analyses] 
———  ——————————  ——————  ———————————  ————————————  —————————  —————————  ——————————  ——————————  ——————————  ——————————  ——————————  —————————  —————————  —————————  ————————
UID     Session  Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48         d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw     D49raw       D47
———  ——————————  ——————  ———————————  ————————————  —————————  —————————  ——————————  ——————————  ——————————  ——————————  ——————————  —————————  —————————  —————————  ————————
1    Session_01   ETH-1       -4.000        26.000   5.995601  10.755323   16.116087   21.285428   27.780042    1.998631   36.986704  -0.696924  -0.333640   0.008600  0.201787
2    Session_01     FOO       -4.000        26.000  -0.838118   2.819853    1.310384    5.326005    4.665655   -5.004629   28.895933  -0.593755  -0.319861   0.014956  0.309692
3    Session_01   ETH-3       -4.000        26.000   5.727341  11.211663   16.713472   22.364770   28.306614    1.695479   37.453503  -0.278056  -0.180158  -0.082015  0.614365
4    Session_01     BAR       -4.000        26.000  -9.959983  10.926995    0.053806   21.724901   10.707292  -15.041279   37.199026  -0.300066  -0.243252  -0.029371  0.599675
5    Session_01   ETH-1       -4.000        26.000   6.010276  10.840276   16.207960   21.475150   27.780042    2.011176   37.073454  -0.704188  -0.315986  -0.172089  0.194589
6    Session_01   ETH-1       -4.000        26.000   6.049381  10.706856   16.135579   21.196941   27.780042    2.057827   36.937067  -0.685751  -0.324384   0.045870  0.212791
7    Session_01   ETH-2       -4.000        26.000  -5.974124  -5.955517  -12.668784  -12.208184  -18.023381  -10.163274   19.943159  -0.694902  -0.336672  -0.063946  0.215880
8    Session_01   ETH-3       -4.000        26.000   5.755174  11.255104   16.792797   22.451660   28.306614    1.723596   37.497816  -0.270825  -0.181089  -0.195908  0.621458
9    Session_01     FOO       -4.000        26.000  -0.848028   2.874679    1.346196    5.439150    4.665655   -5.017230   28.951964  -0.601502  -0.316664  -0.081898  0.302042
10   Session_01     BAR       -4.000        26.000  -9.915975  10.968470    0.153453   21.749385   10.707292  -14.995822   37.241294  -0.286638  -0.301325  -0.157376  0.612868
11   Session_01     BAR       -4.000        26.000  -9.920507  10.903408    0.065076   21.704075   10.707292  -14.998270   37.174839  -0.307018  -0.216978  -0.026076  0.592818
12   Session_01     FOO       -4.000        26.000  -0.876454   2.906764    1.341194    5.490264    4.665655   -5.048760   28.984806  -0.608593  -0.329808  -0.114437  0.295055
13   Session_01   ETH-2       -4.000        26.000  -5.982229  -6.110437  -12.827036  -12.492272  -18.023381  -10.166188   19.784916  -0.693555  -0.312598   0.251040  0.217274
14   Session_01   ETH-2       -4.000        26.000  -5.991278  -5.995054  -12.741562  -12.184075  -18.023381  -10.180122   19.902809  -0.711697  -0.232746   0.032602  0.199357
15   Session_01   ETH-3       -4.000        26.000   5.734896  11.229855   16.740410   22.402091   28.306614    1.702875   37.472070  -0.276998  -0.179635  -0.125368  0.615396
16   Session_02   ETH-3       -4.000        26.000   5.716356  11.091821   16.582487   22.123857   28.306614    1.692901   37.370126  -0.279100  -0.178789   0.162540  0.624067
17   Session_02   ETH-2       -4.000        26.000  -5.950370  -5.959974  -12.650784  -12.197864  -18.023381  -10.143809   19.897777  -0.696916  -0.317263  -0.080604  0.216441
18   Session_02     BAR       -4.000        26.000  -9.957566  10.903888    0.031785   21.739434   10.707292  -15.048386   37.213724  -0.302139  -0.183327   0.012926  0.608897
19   Session_02   ETH-1       -4.000        26.000   6.030532  10.851030   16.245571   21.457100   27.780042    2.037466   37.122284  -0.698413  -0.354920  -0.214443  0.200795
20   Session_02     FOO       -4.000        26.000  -0.819742   2.826793    1.317044    5.330616    4.665655   -4.986618   28.903335  -0.612871  -0.329113  -0.018244  0.294481
21   Session_02     BAR       -4.000        26.000  -9.936020  10.862339    0.024660   21.563307   10.707292  -15.023836   37.171034  -0.291333  -0.273498   0.070452  0.619812
22   Session_02   ETH-3       -4.000        26.000   5.719281  11.207303   16.681693   22.370886   28.306614    1.691780   37.488633  -0.296801  -0.165556  -0.065004  0.606143
23   Session_02   ETH-1       -4.000        26.000   5.993918  10.617469   15.991900   21.070358   27.780042    2.006934   36.882679  -0.683329  -0.271476   0.278458  0.216152
24   Session_02   ETH-2       -4.000        26.000  -5.982371  -6.036210  -12.762399  -12.309944  -18.023381  -10.175178   19.819614  -0.701348  -0.277354   0.104418  0.212021
25   Session_02   ETH-1       -4.000        26.000   6.019963  10.773112   16.163825   21.331060   27.780042    2.029040   37.042346  -0.692234  -0.324161  -0.051788  0.207075
26   Session_02     BAR       -4.000        26.000  -9.963888  10.865863   -0.023549   21.615868   10.707292  -15.053743   37.174715  -0.313906  -0.229031   0.093637  0.597041
27   Session_02     FOO       -4.000        26.000  -0.835046   2.870518    1.355370    5.487896    4.665655   -5.004585   28.948243  -0.601666  -0.259900  -0.087592  0.305777
28   Session_02     FOO       -4.000        26.000  -0.848415   2.849823    1.308081    5.427767    4.665655   -5.018107   28.927036  -0.614791  -0.278426  -0.032784  0.292547
29   Session_02   ETH-3       -4.000        26.000   5.757137  11.232751   16.744567   22.398244   28.306614    1.731295   37.514660  -0.298533  -0.189123  -0.154557  0.604363
30   Session_02   ETH-2       -4.000        26.000  -5.993476  -5.944866  -12.696865  -12.149754  -18.023381  -10.190430   19.913381  -0.713779  -0.298963  -0.064251  0.199436
31   Session_03   ETH-3       -4.000        26.000   5.718991  11.146227   16.640814   22.243185   28.306614    1.689442   37.449023  -0.277332  -0.169668   0.053997  0.623187
32   Session_03   ETH-2       -4.000        26.000  -5.997147  -5.905858  -12.655382  -12.081612  -18.023381  -10.165400   19.891551  -0.706536  -0.308464  -0.137414  0.197550
33   Session_03   ETH-1       -4.000        26.000   6.040566  10.786620   16.205283   21.374963   27.780042    2.045244   37.077432  -0.685706  -0.307909  -0.099869  0.213609
34   Session_03   ETH-1       -4.000        26.000   5.994622  10.743980   16.116098   21.243734   27.780042    1.997857   37.033567  -0.684883  -0.352014   0.031692  0.214449
35   Session_03   ETH-3       -4.000        26.000   5.748546  11.079879   16.580826   22.120063   28.306614    1.723364   37.380534  -0.302133  -0.158882   0.151641  0.598318
36   Session_03   ETH-2       -4.000        26.000  -6.000290  -5.947172  -12.697463  -12.164602  -18.023381  -10.167221   19.848953  -0.705037  -0.309350  -0.052386  0.199061
37   Session_03     FOO       -4.000        26.000  -0.800284   2.851299    1.376828    5.379547    4.665655   -4.951581   28.910199  -0.597293  -0.329315  -0.087015  0.304784
38   Session_03     FOO       -4.000        26.000  -0.873798   2.820799    1.272165    5.370745    4.665655   -5.028782   28.878917  -0.596008  -0.277258   0.051165  0.306090
39   Session_03   ETH-2       -4.000        26.000  -6.008525  -5.909707  -12.647727  -12.075913  -18.023381  -10.177379   19.887608  -0.683183  -0.294956  -0.117608  0.220975
40   Session_03     BAR       -4.000        26.000  -9.928709  10.989665    0.148059   21.852677   10.707292  -14.976237   37.324152  -0.299358  -0.242185  -0.184835  0.603855
41   Session_03   ETH-1       -4.000        26.000   6.004078  10.683951   16.045192   21.214355   27.780042    2.010134   36.971642  -0.705956  -0.262026   0.138399  0.193323
42   Session_03     BAR       -4.000        26.000  -9.957114  10.898997    0.044946   21.602296   10.707292  -15.003175   37.230716  -0.284699  -0.307849   0.021944  0.618578
43   Session_03     BAR       -4.000        26.000  -9.952115  11.034508    0.169809   21.885915   10.707292  -15.002819   37.370451  -0.296804  -0.298351  -0.246731  0.606414
44   Session_03     FOO       -4.000        26.000  -0.823857   2.761300    1.258060    5.239992    4.665655   -4.973383   28.817444  -0.603327  -0.288652   0.114488  0.298751
45   Session_03   ETH-3       -4.000        26.000   5.753467  11.206589   16.719131   22.373244   28.306614    1.723960   37.511190  -0.294350  -0.161838  -0.099835  0.606103
46   Session_04     FOO       -4.000        26.000  -0.791191   2.708220    1.256167    5.145784    4.665655   -4.960004   28.750896  -0.586913  -0.276505   0.183674  0.317065
47   Session_04   ETH-1       -4.000        26.000   6.017312  10.735930   16.123043   21.270597   27.780042    2.005824   36.995214  -0.693479  -0.309795   0.023309  0.208980
48   Session_04   ETH-2       -4.000        26.000  -5.986501  -5.915157  -12.656583  -12.060382  -18.023381  -10.182247   19.889836  -0.709603  -0.268277  -0.130450  0.199604
49   Session_04     BAR       -4.000        26.000  -9.951025  10.951923    0.089386   21.738926   10.707292  -15.031949   37.254709  -0.298065  -0.278834  -0.087463  0.601230
50   Session_04   ETH-2       -4.000        26.000  -5.966627  -5.893789  -12.597717  -12.120719  -18.023381  -10.161842   19.911776  -0.691757  -0.372308  -0.193986  0.217132
51   Session_04   ETH-1       -4.000        26.000   6.029937  10.766997   16.151273   21.345479   27.780042    2.018148   37.027152  -0.708855  -0.297953  -0.050465  0.193862
52   Session_04     FOO       -4.000        26.000  -0.853969   2.805035    1.267571    5.353907    4.665655   -5.030523   28.850660  -0.605611  -0.262571   0.060903  0.298685
53   Session_04   ETH-3       -4.000        26.000   5.798016  11.254135   16.832228   22.432473   28.306614    1.752928   37.528936  -0.275047  -0.197935  -0.239408  0.620088
54   Session_04   ETH-1       -4.000        26.000   6.023822  10.730714   16.121184   21.235757   27.780042    2.012958   36.989833  -0.696908  -0.333582   0.026555  0.205610
55   Session_04   ETH-2       -4.000        26.000  -5.973623  -5.975018  -12.694278  -12.194472  -18.023381  -10.166297   19.828211  -0.701951  -0.283570  -0.025935  0.207135
56   Session_04   ETH-3       -4.000        26.000   5.739420  11.128582   16.641344   22.166106   28.306614    1.695046   37.399884  -0.280608  -0.210162   0.066645  0.614665
57   Session_04     BAR       -4.000        26.000  -9.931741  10.819830   -0.023748   21.529372   10.707292  -15.006533   37.118743  -0.302866  -0.222623   0.148462  0.596536
58   Session_04     FOO       -4.000        26.000  -0.848192   2.777763    1.251297    5.280272    4.665655   -5.023358   28.822585  -0.601094  -0.281419   0.108186  0.303128
59   Session_04   ETH-3       -4.000        26.000   5.751908  11.207110   16.726741   22.380392   28.306614    1.705481   37.480657  -0.285776  -0.155878  -0.099197  0.609567
60   Session_04     BAR       -4.000        26.000  -9.926078  10.884823    0.060864   21.650722   10.707292  -15.002880   37.185606  -0.287358  -0.232425   0.016044  0.611760
———  ——————————  ——————  ———————————  ————————————  —————————  —————————  ——————————  ——————————  ——————————  ——————————  ——————————  —————————  —————————  —————————  ————————


def table_of_samples(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of samples
	for a pair of `D47data` and `D48data` objects.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_samples.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n' + pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)

Print out, save to disk and/or return a combined table of samples for a pair of D47data and D48data objects.

Parameters

  • data47: D47data instance
  • data48: D48data instance
  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of lists of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
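
A minimal sketch, assuming mydata47 and mydata48 are already-crunched and standardized D47data and D48data instances built from the same analyses (e.g., the virtual data generated above):

from D47crunch import table_of_samples

# print a combined Δ47 + Δ48 sample table without writing it to disk:
table_of_samples(data47 = mydata47, data48 = mydata48, save_to_file = False)

The table_of_sessions() and table_of_analyses() functions below follow the same calling convention.
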
def table_of_sessions(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of sessions
	for a pair of `D47data` and `D48data` objects.
	***Only applicable if the sessions in `data47` and those in `data48`
	consist of the exact same sets of analyses.***

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_sessions(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_sessions(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
			for k, x in enumerate(out47[0]):
				if k > 7:
					out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
					out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_sessions.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n' + pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)

Print out, save to disk and/or return a combined table of sessions for a pair of D47data and D48data objects. Only applicable if the sessions in data47 and those in data48 consist of the exact same sets of analyses.

Parameters

  • data47: D47data instance
  • data48: D48data instance
  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of lists of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
def table_of_analyses( data47=None, data48=None, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
713def table_of_analyses(
714	data47 = None,
715	data48 = None,
716	dir = 'output',
717	filename = None,
718	save_to_file = True,
719	print_out = True,
720	output = None,
721	):
722	'''
723	Print out, save to disk and/or return a combined table of analyses
724	for a pair of `D47data` and `D48data` objects.
725
726	If the sessions in `data47` and those in `data48` do not consist of
727	the exact same sets of analyses, the table will have two columns
728	`Session_47` and `Session_48` instead of a single `Session` column.
729
730	**Parameters**
731
732	+ `data47`: `D47data` instance
733	+ `data48`: `D48data` instance
734	+ `dir`: the directory in which to save the table
735	+ `filename`: the name to the csv file to write to
736	+ `save_to_file`: whether to save the table to disk
737	+ `print_out`: whether to print out the table
738	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
739		if set to `'raw'`: return a list of list of strings
740		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
741	'''
742	if data47 is None:
743		if data48 is None:
744			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
745		else:
746			return data48.table_of_analyses(
747				dir = dir,
748				filename = filename,
749				save_to_file = save_to_file,
750				print_out = print_out,
751				output = output
752				)
753	else:
754		if data48 is None:
755			return data47.table_of_analyses(
756				dir = dir,
757				filename = filename,
758				save_to_file = save_to_file,
759				print_out = print_out,
760				output = output
761				)
762		else:
763			out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
764			out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
765			
766			if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
767				out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
768			else:
769				out47[0][1] = 'Session_47'
770				out48[0][1] = 'Session_48'
771				out47 = transpose_table(out47)
772				out48 = transpose_table(out48)
773				out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])
774
775			if save_to_file:
776				if not os.path.exists(dir):
777					os.makedirs(dir)
778				if filename is None:
779					filename = f'D47D48_analyses.csv'
780				with open(f'{dir}/{filename}', 'w') as fid:
781					fid.write(make_csv(out))
782			if print_out:
783				print('\n'+pretty_table(out))
784			if output == 'raw':
785				return out
786			elif output == 'pretty':
787				return pretty_table(out)

Print out, save to disk and/or return a combined table of analyses for a pair of D47data and D48data objects.

If the sessions in data47 and those in data48 do not consist of the exact same sets of analyses, the table will have two columns Session_47 and Session_48 instead of a single Session column.

Parameters

  • data47: D47data instance
  • data48: D48data instance
  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
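Along the same lines, a minimal sketch for the combined analysis table (again with hypothetical, already-standardized mydata47 and mydata48 objects covering the same analyses):

import D47crunch

# save the combined analysis table to 'output/D47D48_analyses.csv'
# without printing it out:
D47crunch.table_of_analyses(
    data47 = mydata47,
    data48 = mydata48,
    filename = 'D47D48_analyses.csv',
    print_out = False,
    )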
class D4xdata(builtins.list):
 835class D4xdata(list):
 836	'''
 837	Store and process data for a large set of Δ47 and/or Δ48
 838	analyses, usually comprising more than one analytical session.
 839	'''
 840
 841	### 17O CORRECTION PARAMETERS
 842	R13_VPDB = 0.01118  # (Chang & Li, 1990)
 843	'''
 844	Absolute (13C/12C) ratio of VPDB.
 845	By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
 846	'''
 847
 848	R18_VSMOW = 0.0020052  # (Baertschi, 1976)
 849	'''
 850	Absolute (18O/16O) ratio of VSMOW.
 851	By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
 852	'''
 853
 854	LAMBDA_17 = 0.528  # (Barkan & Luz, 2005)
 855	'''
 856	Mass-dependent exponent for triple oxygen isotopes.
 857	By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
 858	'''
 859
 860	R17_VSMOW = 0.00038475  # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
 861	'''
 862	Absolute (17O/16O) ratio of VSMOW.
 863	By default equal to 0.00038475
 864	([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
 865	rescaled to `R13_VPDB`)
 866	'''
 867
 868	R18_VPDB = R18_VSMOW * 1.03092
 869	'''
 870	Absolute (18O/16O) ratio of VPDB.
 871	By definition equal to `R18_VSMOW * 1.03092`.
 872	'''
 873
 874	R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
 875	'''
 876	Absolute (17O/16O) ratio of VPDB.
 877	By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
 878	'''
 879
 880	LEVENE_REF_SAMPLE = 'ETH-3'
 881	'''
 882	After the Δ4x standardization step, each sample is tested to
 883	assess whether the Δ4x variance within all analyses for that
 884	sample differs significantly from that observed for a given reference
 885	sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
 886	which yields a p-value corresponding to the null hypothesis that the
 887	underlying variances are equal).
 888
 889	`LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
 890	sample should be used as a reference for this test.
 891	'''
 892
 893	ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6)  # (Kim et al., 2007, calcite)
 894	'''
 895	Specifies the 18O/16O fractionation factor generally applicable
 896	to acid reactions in the dataset. Currently used by `D4xdata.wg()`
 897	and `D4xdata.standardize_d18O`.
 898
 899	By default equal to 1.008129 (calcite reacted at 90 °C,
 900	[Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
 901	'''
 902
 903	Nominal_d13C_VPDB = {
 904		'ETH-1': 2.02,
 905		'ETH-2': -10.17,
 906		'ETH-3': 1.71,
 907		}	# (Bernasconi et al., 2018)
 908	'''
 909	Nominal δ13C_VPDB values assigned to carbonate standards, used by
 910	`D4xdata.standardize_d13C()`.
 911
 912	By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
 913	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 914	'''
 915
 916	Nominal_d18O_VPDB = {
 917		'ETH-1': -2.19,
 918		'ETH-2': -18.69,
 919		'ETH-3': -1.78,
 920		}	# (Bernasconi et al., 2018)
 921	'''
 922	Nominal δ18O_VPDB values assigned to carbonate standards, used by
 923	`D4xdata.standardize_d18O()`.
 924
 925	By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
 926	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 927	'''
 928
 929	d13C_STANDARDIZATION_METHOD = '2pt'
 930	'''
 931	Method by which to standardize δ13C values:
 932	
 933	+ `'none'`: do not apply any δ13C standardization.
 934	+ `'1pt'`: within each session, offset all initial δ13C values so as to
 935	minimize the difference between final δ13C_VPDB values and
 936	`Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
 937	+ `'2pt'`: within each session, apply an affine transformation to all δ13C
 938	values so as to minimize the difference between final δ13C_VPDB
 939	values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
 940	is defined).
 941	'''
 942
 943	d18O_STANDARDIZATION_METHOD = '2pt'
 944	'''
 945	Method by which to standardize δ18O values:
 946	
 947	+ `'none'`: do not apply any δ18O standardization.
 948	+ `'1pt'`: within each session, offset all initial δ18O values so as to
 949	minimize the difference between final δ18O_VPDB values and
 950	`Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
 951	+ `'2pt'`: within each session, apply an affine transformation to all δ18O
 952	values so as to minimize the difference between final δ18O_VPDB
 953	values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
 954	is defined).
 955	'''
 956
 957	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
 958		'''
 959		**Parameters**
 960
 961		+ `l`: a list of dictionaries, with each dictionary including at least the keys
 962		`Sample`, `d45`, `d46`, and `d47` or `d48`.
 963		+ `mass`: `'47'` or `'48'`
 964		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
 965		+ `session`: define session name for analyses without a `Session` key
 966		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
 967
 968		Returns a `D4xdata` object derived from `list`.
 969		'''
 970		self._4x = mass
 971		self.verbose = verbose
 972		self.prefix = 'D4xdata'
 973		self.logfile = logfile
 974		list.__init__(self, l)
 975		self.Nf = None
 976		self.repeatability = {}
 977		self.refresh(session = session)
 978
 979
 980	def make_verbal(oldfun):
 981		'''
 982		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
 983		'''
 984		@wraps(oldfun)
 985		def newfun(*args, verbose = '', **kwargs):
 986			myself = args[0]
 987			oldprefix = myself.prefix
 988			myself.prefix = oldfun.__name__
 989			if verbose != '':
 990				oldverbose = myself.verbose
 991				myself.verbose = verbose
 992			out = oldfun(*args, **kwargs)
 993			myself.prefix = oldprefix
 994			if verbose != '':
 995				myself.verbose = oldverbose
 996			return out
 997		return newfun
 998
 999
1000	def msg(self, txt):
1001		'''
1002		Log a message to `self.logfile`, and print it out if `verbose = True`
1003		'''
1004		self.log(txt)
1005		if self.verbose:
1006			print(f'{f"[{self.prefix}]":<16} {txt}')
1007
1008
1009	def vmsg(self, txt):
1010		'''
1011		Log a message to `self.logfile` and print it out
1012		'''
1013		self.log(txt)
1014		print(txt)
1015
1016
1017	def log(self, *txts):
1018		'''
1019		Log a message to `self.logfile`
1020		'''
1021		if self.logfile:
1022			with open(self.logfile, 'a') as fid:
1023				for txt in txts:
1024					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
1025
1026
1027	def refresh(self, session = 'mySession'):
1028		'''
1029		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
1030		'''
1031		self.fill_in_missing_info(session = session)
1032		self.refresh_sessions()
1033		self.refresh_samples()
1034
1035
1036	def refresh_sessions(self):
1037		'''
1038		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1039		to `False` for all sessions.
1040		'''
1041		self.sessions = {
1042			s: {'data': [r for r in self if r['Session'] == s]}
1043			for s in sorted({r['Session'] for r in self})
1044			}
1045		for s in self.sessions:
1046			self.sessions[s]['scrambling_drift'] = False
1047			self.sessions[s]['slope_drift'] = False
1048			self.sessions[s]['wg_drift'] = False
1049			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1050			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
1051
1052
1053	def refresh_samples(self):
1054		'''
1055		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1056		'''
1057		self.samples = {
1058			s: {'data': [r for r in self if r['Sample'] == s]}
1059			for s in sorted({r['Sample'] for r in self})
1060			}
1061		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1062		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
1063
1064
1065	def read(self, filename, sep = '', session = ''):
1066		'''
1067		Read file in csv format to load data into a `D47data` object.
1068
1069		In the csv file, spaces before and after field separators (`','` by default)
1070		are optional. Each line corresponds to a single analysis.
1071
1072		The required fields are:
1073
1074		+ `UID`: a unique identifier
1075		+ `Session`: an identifier for the analytical session
1076		+ `Sample`: a sample identifier
1077		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1078
1079		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1080		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
1081		and `d49` are optional, and set to NaN by default.
1082
1083		**Parameters**
1084
1085		+ `filename`: the path of the file to read
1086		+ `sep`: csv separator delimiting the fields
1087		+ `session`: set `Session` field to this string for all analyses
1088		'''
1089		with open(filename) as fid:
1090			self.input(fid.read(), sep = sep, session = session)
1091
1092
1093	def input(self, txt, sep = '', session = ''):
1094		'''
1095		Read `txt` string in csv format to load analysis data into a `D47data` object.
1096
1097		In the csv string, spaces before and after field separators (`','` by default)
1098		are optional. Each line corresponds to a single analysis.
1099
1100		The required fields are:
1101
1102		+ `UID`: a unique identifier
1103		+ `Session`: an identifier for the analytical session
1104		+ `Sample`: a sample identifier
1105		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1106
1107		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1108		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
1109		and `d49` are optional, and set to NaN by default.
1110
1111		**Parameters**
1112
1113		+ `txt`: the csv string to read
1114		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1115		whichever appears most often in `txt`.
1116		+ `session`: set `Session` field to this string for all analyses
1117		'''
1118		if sep == '':
1119			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1120		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1121		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1122
1123		if session != '':
1124			for r in data:
1125				r['Session'] = session
1126
1127		self += data
1128		self.refresh()
1129
1130
1131	@make_verbal
1132	def wg(self,
1133		samples = None,
1134		session_groups = None,
1135	):
1136		'''
1137		Compute bulk composition of the working gas for each session based (by default)
1138		on the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1139		`self.Nominal_d18O_VPDB`.
1140
1141		**Parameters**
1142
1143		+ `samples`: A list of samples specifying the subset of samples (defined in both
1144		`self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`) which will be considered
1145		when computing the working gas. By default, use all samples defined both in
1146		`self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.
1147		+ `session_groups`: a list of lists of sessions
1148		(e.g., `[['session1', 'session2'], ['session3', 'session4', 'session5']]`)
1149		specifying which session groups, if any, have the exact same WG composition.
1150		If set to `'all'`, force all sessions to have the same WG composition (use with
1151		caution and on short time scales, since the WG may drift slowly over long time scales).
1152		'''
1153
1154		self.msg('Computing WG composition:')
1155
1156		a18_acid = self.ALPHA_18O_ACID_REACTION
1157		
1158		if samples is None:
1159			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1160		if session_groups is None:
1161			session_groups = [[s] for s in self.sessions]
1162		elif session_groups == 'all':
1163			session_groups = [[s for s in self.sessions]]
1164
1165		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1166		R45R46_standards = {}
1167		for sample in samples:
1168			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1169			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1170			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1171			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1172			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1173
1174			C12_s = 1 / (1 + R13_s)
1175			C13_s = R13_s / (1 + R13_s)
1176			C16_s = 1 / (1 + R17_s + R18_s)
1177			C17_s = R17_s / (1 + R17_s + R18_s)
1178			C18_s = R18_s / (1 + R17_s + R18_s)
1179
1180			C626_s = C12_s * C16_s ** 2
1181			C627_s = 2 * C12_s * C16_s * C17_s
1182			C628_s = 2 * C12_s * C16_s * C18_s
1183			C636_s = C13_s * C16_s ** 2
1184			C637_s = 2 * C13_s * C16_s * C17_s
1185			C727_s = C12_s * C17_s ** 2
1186
1187			R45_s = (C627_s + C636_s) / C626_s
1188			R46_s = (C628_s + C637_s + C727_s) / C626_s
1189			R45R46_standards[sample] = (R45_s, R46_s)
1190		
1191		for sg in session_groups:
1192			db = [r for s in sg for r in self.sessions[s]['data'] if r['Sample'] in samples]
1193			assert db, f'No sample from {samples} found in session group {sg}.'
1194
1195			X = [r['d45'] for r in db]
1196			Y = [R45R46_standards[r['Sample']][0] for r in db]
1197			x1, x2 = np.min(X), np.max(X)
1198
1199			if x1 < x2:
1200				wgcoord = x1/(x1-x2)
1201			else:
1202				wgcoord = 999
1203
1204			if wgcoord < -.5 or wgcoord > 1.5:
1205				# unreasonable to extrapolate to d45 = 0
1206				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1207			else :
1208				# d45 = 0 is reasonably well bracketed
1209				R45_wg = np.polyfit(X, Y, 1)[1]
1210
1211			X = [r['d46'] for r in db]
1212			Y = [R45R46_standards[r['Sample']][1] for r in db]
1213			x1, x2 = np.min(X), np.max(X)
1214
1215			if x1 < x2:
1216				wgcoord = x1/(x1-x2)
1217			else:
1218				wgcoord = 999
1219
1220			if wgcoord < -.5 or wgcoord > 1.5:
1221				# unreasonable to extrapolate to d46 = 0
1222				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1223			else :
1224				# d46 = 0 is reasonably well bracketed
1225				R46_wg = np.polyfit(X, Y, 1)[1]
1226
1227			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1228
1229			for s in sg:
1230				self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1231	
1232				self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1233				self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1234				for r in self.sessions[s]['data']:
1235					r['d13Cwg_VPDB'] = d13Cwg_VPDB
1236					r['d18Owg_VSMOW'] = d18Owg_VSMOW
1237
1238
1239	def compute_bulk_delta(self, R45, R46, D17O = 0):
1240		'''
1241		Compute δ13C_VPDB and δ18O_VSMOW,
1242		by solving the generalized form of equation (17) from
1243		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1244		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
1245		solving the corresponding second-order Taylor polynomial.
1246		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
1247		'''
1248
1249		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1250
1251		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1252		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1253		C = 2 * self.R18_VSMOW
1254		D = -R46
1255
1256		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1257		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1258		cc = A + B + C + D
1259
1260		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1261
1262		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1263		R17 = K * R18 ** self.LAMBDA_17
1264		R13 = R45 - 2 * R17
1265
1266		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1267
1268		return d13C_VPDB, d18O_VSMOW
1269
1270
1271	@make_verbal
1272	def crunch(self, verbose = ''):
1273		'''
1274		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1275		'''
1276		for r in self:
1277			self.compute_bulk_and_clumping_deltas(r)
1278		self.standardize_d13C()
1279		self.standardize_d18O()
1280		self.msg(f"Crunched {len(self)} analyses.")
1281
1282
1283	def fill_in_missing_info(self, session = 'mySession'):
1284		'''
1285		Fill in optional fields with default values
1286		'''
1287		for i,r in enumerate(self):
1288			if 'D17O' not in r:
1289				r['D17O'] = 0.
1290			if 'UID' not in r:
1291				r['UID'] = f'{i+1}'
1292			if 'Session' not in r:
1293				r['Session'] = session
1294			for k in ['d47', 'd48', 'd49']:
1295				if k not in r:
1296					r[k] = np.nan
1297
1298
1299	def standardize_d13C(self):
1300		'''
1301		Perform δ13C standardization within each session `s` according to
1302		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1303		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1304		may be redefined arbitrarily at a later stage.
1305		'''
1306		for s in self.sessions:
1307			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1308				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1309				X,Y = zip(*XY)
1310				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1311					offset = np.mean(Y) - np.mean(X)
1312					for r in self.sessions[s]['data']:
1313						r['d13C_VPDB'] += offset				
1314				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1315					a,b = np.polyfit(X,Y,1)
1316					for r in self.sessions[s]['data']:
1317						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b
1318
1319	def standardize_d18O(self):
1320		'''
1321		Perform δ18O standardization within each session `s` according to
1322		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1323		which is defined by default by `D47data.refresh_sessions()` as equal to
1324		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1325		'''
1326		for s in self.sessions:
1327			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1328				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1329				X,Y = zip(*XY)
1330				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1331				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1332					offset = np.mean(Y) - np.mean(X)
1333					for r in self.sessions[s]['data']:
1334						r['d18O_VSMOW'] += offset				
1335				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1336					a,b = np.polyfit(X,Y,1)
1337					for r in self.sessions[s]['data']:
1338						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
1339	
1340
1341	def compute_bulk_and_clumping_deltas(self, r):
1342		'''
1343		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1344		'''
1345
1346		# Compute working gas R13, R18, and isobar ratios
1347		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1348		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1349		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1350
1351		# Compute analyte isobar ratios
1352		R45 = (1 + r['d45'] / 1000) * R45_wg
1353		R46 = (1 + r['d46'] / 1000) * R46_wg
1354		R47 = (1 + r['d47'] / 1000) * R47_wg
1355		R48 = (1 + r['d48'] / 1000) * R48_wg
1356		R49 = (1 + r['d49'] / 1000) * R49_wg
1357
1358		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1359		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1360		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1361
1362		# Compute stochastic isobar ratios of the analyte
1363		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1364			R13, R18, D17O = r['D17O']
1365		)
1366
1367		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1368		# and raise a warning if the corresponding anomalies exceed 0.05 ppm (5e-8).
1369		if (R45 / R45stoch - 1) > 5e-8:
1370			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1371		if (R46 / R46stoch - 1) > 5e-8:
1372			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1373
1374		# Compute raw clumped isotope anomalies
1375		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1376		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1377		r['D49raw'] = 1000 * (R49 / R49stoch - 1)
1378
1379
1380	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1381		'''
1382		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1383		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1384		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1385		'''
1386
1387		# Compute R17
1388		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1389
1390		# Compute isotope concentrations
1391		C12 = (1 + R13) ** -1
1392		C13 = C12 * R13
1393		C16 = (1 + R17 + R18) ** -1
1394		C17 = C16 * R17
1395		C18 = C16 * R18
1396
1397		# Compute stochastic isotopologue concentrations
1398		C626 = C16 * C12 * C16
1399		C627 = C16 * C12 * C17 * 2
1400		C628 = C16 * C12 * C18 * 2
1401		C636 = C16 * C13 * C16
1402		C637 = C16 * C13 * C17 * 2
1403		C638 = C16 * C13 * C18 * 2
1404		C727 = C17 * C12 * C17
1405		C728 = C17 * C12 * C18 * 2
1406		C737 = C17 * C13 * C17
1407		C738 = C17 * C13 * C18 * 2
1408		C828 = C18 * C12 * C18
1409		C838 = C18 * C13 * C18
1410
1411		# Compute stochastic isobar ratios
1412		R45 = (C636 + C627) / C626
1413		R46 = (C628 + C637 + C727) / C626
1414		R47 = (C638 + C728 + C737) / C626
1415		R48 = (C738 + C828) / C626
1416		R49 = C838 / C626
1417
1418		# Account for stochastic anomalies
1419		R47 *= 1 + D47 / 1000
1420		R48 *= 1 + D48 / 1000
1421		R49 *= 1 + D49 / 1000
1422
1423		# Return isobar ratios
1424		return R45, R46, R47, R48, R49
1425
1426
1427	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1428		'''
1429		Split unknown samples by UID (treat all analyses as different samples)
1430		or by session (treat analyses of a given sample in different sessions as
1431		different samples).
1432
1433		**Parameters**
1434
1435		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
1436		+ `grouping`: `by_uid` | `by_session`
1437		'''
1438		if samples_to_split == 'all':
1439			samples_to_split = [s for s in self.unknowns]
1440		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1441		self.grouping = grouping.lower()
1442		if self.grouping in gkeys:
1443			gkey = gkeys[self.grouping]
1444		for r in self:
1445			if r['Sample'] in samples_to_split:
1446				r['Sample_original'] = r['Sample']
1447				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1448			elif r['Sample'] in self.unknowns:
1449				r['Sample_original'] = r['Sample']
1450		self.refresh_samples()
1451
1452
1453	def unsplit_samples(self, tables = False):
1454		'''
1455		Reverse the effects of `D47data.split_samples()`.
1456		
1457		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1458		
1459		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1460		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1461		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1462		effects of `D47data.split_samples()` with `grouping='by_session'` (because in
1463		that case session-averaged Δ4x values are statistically independent).
1464		'''
1465		unknowns_old = sorted({s for s in self.unknowns})
1466		CM_old = self.standardization.covar[:,:]
1467		VD_old = self.standardization.params.valuesdict().copy()
1468		vars_old = self.standardization.var_names
1469
1470		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1471
1472		Ns = len(vars_old) - len(unknowns_old)
1473		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1474		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1475
1476		W = np.zeros((len(vars_new), len(vars_old)))
1477		W[:Ns,:Ns] = np.eye(Ns)
1478		for u in unknowns_new:
1479			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1480			if self.grouping == 'by_session':
1481				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1482			elif self.grouping == 'by_uid':
1483				weights = [1 for s in splits]
1484			sw = sum(weights)
1485			weights = [w/sw for w in weights]
1486			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1487
1488		CM_new = W @ CM_old @ W.T
1489		V = W @ np.array([[VD_old[k]] for k in vars_old])
1490		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1491
1492		self.standardization.covar = CM_new
1493		self.standardization.params.valuesdict = lambda : VD_new
1494		self.standardization.var_names = vars_new
1495
1496		for r in self:
1497			if r['Sample'] in self.unknowns:
1498				r['Sample_split'] = r['Sample']
1499				r['Sample'] = r['Sample_original']
1500
1501		self.refresh_samples()
1502		self.consolidate_samples()
1503		self.repeatabilities()
1504
1505		if tables:
1506			self.table_of_analyses()
1507			self.table_of_samples()
1508
1509	def assign_timestamps(self):
1510		'''
1511		Assign a time field `t` of type `float` to each analysis.
1512
1513		If `TimeTag` is one of the data fields, `t` is equal within a given session
1514		to `TimeTag` minus the mean value of `TimeTag` for that session.
1515		Otherwise, `TimeTag` is by default equal to the index of each analysis
1516		in the dataset and `t` is defined as above.
1517		'''
1518		for session in self.sessions:
1519			sdata = self.sessions[session]['data']
1520			try:
1521				t0 = np.mean([r['TimeTag'] for r in sdata])
1522				for r in sdata:
1523					r['t'] = r['TimeTag'] - t0
1524			except KeyError:
1525				t0 = (len(sdata)-1)/2
1526				for t,r in enumerate(sdata):
1527					r['t'] = t - t0
1528
1529
1530	def report(self):
1531		'''
1532		Prints a report on the standardization fit.
1533		Only applicable after `D4xdata.standardize(method='pooled')`.
1534		'''
1535		report_fit(self.standardization)
1536
1537
1538	def combine_samples(self, sample_groups):
1539		'''
1540		Combine analyses of different samples to compute weighted average Δ4x
1541		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1542		dictionary.
1543		
1544		Caution: samples are weighted by number of replicate analyses, which is a
1545		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1546		correlated analytical errors for one or more samples).
1547		
1548		Returns a tuple of:
1549		
1550		+ the list of group names
1551		+ an array of the corresponding Δ4x values
1552		+ the corresponding (co)variance matrix
1553		
1554		**Parameters**
1555
1556		+ `sample_groups`: a dictionary of the form:
1557		```py
1558		{'group1': ['sample_1', 'sample_2'],
1559		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1560		```
1561		'''
1562		
1563		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1564		groups = sorted(sample_groups.keys())
1565		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1566		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1567		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1568		W = np.array([
1569			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1570			for j in groups])
1571		D4x_new = W @ D4x_old
1572		CM_new = W @ CM_old @ W.T
1573
1574		return groups, D4x_new[:,0], CM_new
1575		
1576
1577	@make_verbal
1578	def standardize(self,
1579		method = 'pooled',
1580		weighted_sessions = [],
1581		consolidate = True,
1582		consolidate_tables = False,
1583		consolidate_plots = False,
1584		constraints = {},
1585		):
1586		'''
1587		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1588		If `method` argument is set to `'pooled'`, the standardization processes all sessions
1589		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1590		i.e. that their true Δ4x value does not change between sessions,
1591		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
1592		`'indep_sessions'`, the standardization processes each session independently, based only
1593		on anchor analyses.
1594		'''
1595
1596		self.standardization_method = method
1597		self.assign_timestamps()
1598
1599		if method == 'pooled':
1600			if weighted_sessions:
1601				for session_group in weighted_sessions:
1602					if self._4x == '47':
1603						X = D47data([r for r in self if r['Session'] in session_group])
1604					elif self._4x == '48':
1605						X = D48data([r for r in self if r['Session'] in session_group])
1606					X.Nominal_D4x = self.Nominal_D4x.copy()
1607					X.refresh()
1608					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1609					w = np.sqrt(result.redchi)
1610					self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
1611					for r in X:
1612						r[f'wD{self._4x}raw'] *= w
1613			else:
1614				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1615				for r in self:
1616					r[f'wD{self._4x}raw'] = 1.
1617
1618			params = Parameters()
1619			for k,session in enumerate(self.sessions):
1620				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1621				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1622				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1623				s = pf(session)
1624				params.add(f'a_{s}', value = 0.9)
1625				params.add(f'b_{s}', value = 0.)
1626				params.add(f'c_{s}', value = -0.9)
1627				params.add(f'a2_{s}', value = 0.,
1628# 					vary = self.sessions[session]['scrambling_drift'],
1629					)
1630				params.add(f'b2_{s}', value = 0.,
1631# 					vary = self.sessions[session]['slope_drift'],
1632					)
1633				params.add(f'c2_{s}', value = 0.,
1634# 					vary = self.sessions[session]['wg_drift'],
1635					)
1636				if not self.sessions[session]['scrambling_drift']:
1637					params[f'a2_{s}'].expr = '0'
1638				if not self.sessions[session]['slope_drift']:
1639					params[f'b2_{s}'].expr = '0'
1640				if not self.sessions[session]['wg_drift']:
1641					params[f'c2_{s}'].expr = '0'
1642
1643			for sample in self.unknowns:
1644				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1645
1646			for k in constraints:
1647				params[k].expr = constraints[k]
1648
1649			def residuals(p):
1650				R = []
1651				for r in self:
1652					session = pf(r['Session'])
1653					sample = pf(r['Sample'])
1654					if r['Sample'] in self.Nominal_D4x:
1655						R += [ (
1656							r[f'D{self._4x}raw'] - (
1657								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1658								+ p[f'b_{session}'] * r[f'd{self._4x}']
1659								+	p[f'c_{session}']
1660								+ r['t'] * (
1661									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1662									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1663									+	p[f'c2_{session}']
1664									)
1665								)
1666							) / r[f'wD{self._4x}raw'] ]
1667					else:
1668						R += [ (
1669							r[f'D{self._4x}raw'] - (
1670								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1671								+ p[f'b_{session}'] * r[f'd{self._4x}']
1672								+	p[f'c_{session}']
1673								+ r['t'] * (
1674									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1675									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1676									+	p[f'c2_{session}']
1677									)
1678								)
1679							) / r[f'wD{self._4x}raw'] ]
1680				return R
1681
1682			M = Minimizer(residuals, params)
1683			result = M.least_squares()
1684			self.Nf = result.nfree
1685			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1686			new_names, new_covar, new_se = _fullcovar(result)[:3]
1687			result.var_names = new_names
1688			result.covar = new_covar
1689
1690			for r in self:
1691				s = pf(r["Session"])
1692				a = result.params.valuesdict()[f'a_{s}']
1693				b = result.params.valuesdict()[f'b_{s}']
1694				c = result.params.valuesdict()[f'c_{s}']
1695				a2 = result.params.valuesdict()[f'a2_{s}']
1696				b2 = result.params.valuesdict()[f'b2_{s}']
1697				c2 = result.params.valuesdict()[f'c2_{s}']
1698				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1699				
1700
1701			self.standardization = result
1702
1703			for session in self.sessions:
1704				self.sessions[session]['Np'] = 3
1705				for k in ['scrambling', 'slope', 'wg']:
1706					if self.sessions[session][f'{k}_drift']:
1707						self.sessions[session]['Np'] += 1
1708
1709			if consolidate:
1710				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1711			return result
1712
1713
1714		elif method == 'indep_sessions':
1715
1716			if weighted_sessions:
1717				for session_group in weighted_sessions:
1718					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1719					X.Nominal_D4x = self.Nominal_D4x.copy()
1720					X.refresh()
1721					# This is only done to assign r['wD47raw'] for r in X:
1722					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1723					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1724			else:
1725				self.msg('All weights set to 1 ‰')
1726				for r in self:
1727					r[f'wD{self._4x}raw'] = 1
1728
1729			for session in self.sessions:
1730				s = self.sessions[session]
1731				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1732				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1733				s['Np'] = sum(p_active)
1734				sdata = s['data']
1735
1736				A = np.array([
1737					[
1738						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1739						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1740						1 / r[f'wD{self._4x}raw'],
1741						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1742						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1743						r['t'] / r[f'wD{self._4x}raw']
1744						]
1745					for r in sdata if r['Sample'] in self.anchors
1746					])[:,p_active] # only keep columns for the active parameters
1747				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1748				s['Na'] = Y.size
1749				CM = linalg.inv(A.T @ A)
1750				bf = (CM @ A.T @ Y).T[0,:]
1751				k = 0
1752				for n,a in zip(p_names, p_active):
1753					if a:
1754						s[n] = bf[k]
1755# 						self.msg(f'{n} = {bf[k]}')
1756						k += 1
1757					else:
1758						s[n] = 0.
1759# 						self.msg(f'{n} = 0.0')
1760
1761				for r in sdata :
1762					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1763					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1764					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1765
1766				s['CM'] = np.zeros((6,6))
1767				i = 0
1768				k_active = [j for j,a in enumerate(p_active) if a]
1769				for j,a in enumerate(p_active):
1770					if a:
1771						s['CM'][j,k_active] = CM[i,:]
1772						i += 1
1773
1774			if not weighted_sessions:
1775				w = self.rmswd()['rmswd']
1776				for r in self:
1777						r[f'wD{self._4x}'] *= w
1778						r[f'wD{self._4x}raw'] *= w
1779				for session in self.sessions:
1780					self.sessions[session]['CM'] *= w**2
1781
1782			for session in self.sessions:
1783				s = self.sessions[session]
1784				s['SE_a'] = s['CM'][0,0]**.5
1785				s['SE_b'] = s['CM'][1,1]**.5
1786				s['SE_c'] = s['CM'][2,2]**.5
1787				s['SE_a2'] = s['CM'][3,3]**.5
1788				s['SE_b2'] = s['CM'][4,4]**.5
1789				s['SE_c2'] = s['CM'][5,5]**.5
1790
1791			if not weighted_sessions:
1792				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1793			else:
1794				self.Nf = 0
1795				for sg in weighted_sessions:
1796					self.Nf += self.rmswd(sessions = sg)['Nf']
1797
1798			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1799
1800			avgD4x = {
1801				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1802				for sample in self.samples
1803				}
1804			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1805			rD4x = (chi2/self.Nf)**.5
1806			self.repeatability[f'sigma_{self._4x}'] = rD4x
1807
1808			if consolidate:
1809				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1810
1811
1812	def standardization_error(self, session, d4x, D4x, t = 0):
1813		'''
1814		Compute standardization error for a given session and
1815		(δ47, Δ47) composition.
1816		'''
1817		a = self.sessions[session]['a']
1818		b = self.sessions[session]['b']
1819		c = self.sessions[session]['c']
1820		a2 = self.sessions[session]['a2']
1821		b2 = self.sessions[session]['b2']
1822		c2 = self.sessions[session]['c2']
1823		CM = self.sessions[session]['CM']
1824
1825		x, y = D4x, d4x
1826		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1827# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1828		dxdy = -(b+b2*t) / (a+a2*t)
1829		dxdz = 1. / (a+a2*t)
1830		dxda = -x / (a+a2*t)
1831		dxdb = -y / (a+a2*t)
1832		dxdc = -1. / (a+a2*t)
1833		dxda2 = -x * t / (a+a2*t)
1834		dxdb2 = -y * t / (a+a2*t)
1835		dxdc2 = -t / (a+a2*t)
1836		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1837		sx = (V @ CM @ V.T) ** .5
1838		return sx
1839
1840
1841	@make_verbal
1842	def summary(self,
1843		dir = 'output',
1844		filename = None,
1845		save_to_file = True,
1846		print_out = True,
1847		):
1848		'''
1849		Print out and/or save to disk a summary of the standardization results.
1850
1851		**Parameters**
1852
1853		+ `dir`: the directory in which to save the table
1854		+ `filename`: the name of the csv file to write to
1855		+ `save_to_file`: whether to save the table to disk
1856		+ `print_out`: whether to print out the table
1857		'''
1858
1859		out = []
1860		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1861		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1862		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1863		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1864		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1865		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1866		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1867		out += [['Model degrees of freedom', f"{self.Nf}"]]
1868		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1869		out += [['Standardization method', self.standardization_method]]
1870
1871		if save_to_file:
1872			if not os.path.exists(dir):
1873				os.makedirs(dir)
1874			if filename is None:
1875				filename = f'D{self._4x}_summary.csv'
1876			with open(f'{dir}/{filename}', 'w') as fid:
1877				fid.write(make_csv(out))
1878		if print_out:
1879			self.msg('\n' + pretty_table(out, header = 0))
1880
1881
1882	@make_verbal
1883	def table_of_sessions(self,
1884		dir = 'output',
1885		filename = None,
1886		save_to_file = True,
1887		print_out = True,
1888		output = None,
1889		):
1890		'''
1891		Print out and/or save to disk a table of sessions.
1892
1893		**Parameters**
1894
1895		+ `dir`: the directory in which to save the table
1896		+ `filename`: the name of the csv file to write to
1897		+ `save_to_file`: whether to save the table to disk
1898		+ `print_out`: whether to print out the table
1899		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1900		    if set to `'raw'`: return a list of list of strings
1901		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1902		'''
1903		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1904		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1905		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1906
1907		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1908		if include_a2:
1909			out[-1] += ['a2 ± SE']
1910		if include_b2:
1911			out[-1] += ['b2 ± SE']
1912		if include_c2:
1913			out[-1] += ['c2 ± SE']
1914		for session in self.sessions:
1915			out += [[
1916				session,
1917				f"{self.sessions[session]['Na']}",
1918				f"{self.sessions[session]['Nu']}",
1919				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1920				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1921				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1922				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1923				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1924				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1925				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1926				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1927				]]
1928			if include_a2:
1929				if self.sessions[session]['scrambling_drift']:
1930					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1931				else:
1932					out[-1] += ['']
1933			if include_b2:
1934				if self.sessions[session]['slope_drift']:
1935					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1936				else:
1937					out[-1] += ['']
1938			if include_c2:
1939				if self.sessions[session]['wg_drift']:
1940					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1941				else:
1942					out[-1] += ['']
1943
1944		if save_to_file:
1945			if not os.path.exists(dir):
1946				os.makedirs(dir)
1947			if filename is None:
1948				filename = f'D{self._4x}_sessions.csv'
1949			with open(f'{dir}/{filename}', 'w') as fid:
1950				fid.write(make_csv(out))
1951		if print_out:
1952			self.msg('\n' + pretty_table(out))
1953		if output == 'raw':
1954			return out
1955		elif output == 'pretty':
1956			return pretty_table(out)
1957
1958
1959	@make_verbal
1960	def table_of_analyses(
1961		self,
1962		dir = 'output',
1963		filename = None,
1964		save_to_file = True,
1965		print_out = True,
1966		output = None,
1967		):
1968		'''
1969		Print out and/or save to disk a table of analyses.
1970
1971		**Parameters**
1972
1973		+ `dir`: the directory in which to save the table
1974		+ `filename`: the name of the csv file to write to
1975		+ `save_to_file`: whether to save the table to disk
1976		+ `print_out`: whether to print out the table
1977		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1978		    if set to `'raw'`: return a list of list of strings
1979		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1980		'''
1981
1982		out = [['UID','Session','Sample']]
1983		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1984		for f in extra_fields:
1985			out[-1] += [f[0]]
1986		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1987		for r in self:
1988			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
1989			for f in extra_fields:
1990				out[-1] += [f"{r[f[0]]:{f[1]}}"]
1991			out[-1] += [
1992				f"{r['d13Cwg_VPDB']:.3f}",
1993				f"{r['d18Owg_VSMOW']:.3f}",
1994				f"{r['d45']:.6f}",
1995				f"{r['d46']:.6f}",
1996				f"{r['d47']:.6f}",
1997				f"{r['d48']:.6f}",
1998				f"{r['d49']:.6f}",
1999				f"{r['d13C_VPDB']:.6f}",
2000				f"{r['d18O_VSMOW']:.6f}",
2001				f"{r['D47raw']:.6f}",
2002				f"{r['D48raw']:.6f}",
2003				f"{r['D49raw']:.6f}",
2004				f"{r[f'D{self._4x}']:.6f}"
2005				]
2006		if save_to_file:
2007			if not os.path.exists(dir):
2008				os.makedirs(dir)
2009			if filename is None:
2010				filename = f'D{self._4x}_analyses.csv'
2011			with open(f'{dir}/{filename}', 'w') as fid:
2012				fid.write(make_csv(out))
2013		if print_out:
2014			self.msg('\n' + pretty_table(out))
2015		return out
2016
2017	@make_verbal
2018	def covar_table(
2019		self,
2020		correl = False,
2021		dir = 'output',
2022		filename = None,
2023		save_to_file = True,
2024		print_out = True,
2025		output = None,
2026		):
2027		'''
2028		Print out, save to disk and/or return the variance-covariance matrix of D4x
2029		for all unknown samples.
2030
2031		**Parameters**
2032
2033		+ `dir`: the directory in which to save the csv
2034		+ `filename`: the name of the csv file to write to
2035		+ `save_to_file`: whether to save the csv
2036		+ `print_out`: whether to print out the matrix
2037		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
2038		    if set to `'raw'`: return a list of list of strings
2039		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2040		'''
2041		samples = sorted([u for u in self.unknowns])
2042		out = [[''] + samples]
2043		for s1 in samples:
2044			out.append([s1])
2045			for s2 in samples:
2046				if correl:
2047					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
2048				else:
2049					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
2050
2051		if save_to_file:
2052			if not os.path.exists(dir):
2053				os.makedirs(dir)
2054			if filename is None:
2055				if correl:
2056					filename = f'D{self._4x}_correl.csv'
2057				else:
2058					filename = f'D{self._4x}_covar.csv'
2059			with open(f'{dir}/{filename}', 'w') as fid:
2060				fid.write(make_csv(out))
2061		if print_out:
2062			self.msg('\n'+pretty_table(out))
2063		if output == 'raw':
2064			return out
2065		elif output == 'pretty':
2066			return pretty_table(out)
2067
2068	@make_verbal
2069	def table_of_samples(
2070		self,
2071		dir = 'output',
2072		filename = None,
2073		save_to_file = True,
2074		print_out = True,
2075		output = None,
2076		):
2077		'''
2078		Print out, save to disk and/or return a table of samples.
2079
2080		**Parameters**
2081
2082		+ `dir`: the directory in which to save the csv
2083		+ `filename`: the name of the csv file to write to
2084		+ `save_to_file`: whether to save the csv
2085		+ `print_out`: whether to print out the table
2086		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2087		    if set to `'raw'`: return a list of list of strings
2088		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2089		'''
2090
2091		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2092		for sample in self.anchors:
2093			out += [[
2094				f"{sample}",
2095				f"{self.samples[sample]['N']}",
2096				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2097				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2098				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2099				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2100				]]
2101		for sample in self.unknowns:
2102			out += [[
2103				f"{sample}",
2104				f"{self.samples[sample]['N']}",
2105				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2106				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2107				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2108				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2109				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2110				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2111				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2112				]]
2113		if save_to_file:
2114			if not os.path.exists(dir):
2115				os.makedirs(dir)
2116			if filename is None:
2117				filename = f'D{self._4x}_samples.csv'
2118			with open(f'{dir}/{filename}', 'w') as fid:
2119				fid.write(make_csv(out))
2120		if print_out:
2121			self.msg('\n'+pretty_table(out))
2122		if output == 'raw':
2123			return out
2124		elif output == 'pretty':
2125			return pretty_table(out)
2126
2127
2128	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2129		'''
2130		Generate session plots and save them to disk.
2131
2132		**Parameters**
2133
2134		+ `dir`: the directory in which to save the plots
2135		+ `figsize`: the width and height (in inches) of each plot
2136		+ `filetype`: 'pdf' or 'png'
2137		+ `dpi`: resolution for PNG output
2138		'''
2139		if not os.path.exists(dir):
2140			os.makedirs(dir)
2141
2142		for session in self.sessions:
2143			sp = self.plot_single_session(session, xylimits = 'constant')
2144			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2145			ppl.close(sp.fig)
2146			
2147
2148
2149	@make_verbal
2150	def consolidate_samples(self):
2151		'''
2152		Compile various statistics for each sample.
2153
2154		For each anchor sample:
2155
2156		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2157		+ `SE_D47` or `SE_D48`: set to zero by definition
2158
2159		For each unknown sample:
2160
2161		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2162		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2163
2164		For each anchor and unknown:
2165
2166		+ `N`: the total number of analyses of this sample
2167		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2168		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2169		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2170		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2171		variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2172		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
2173		'''
2174		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2175		for sample in self.samples:
2176			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2177			if self.samples[sample]['N'] > 1:
2178				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2179
2180			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2181			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2182
2183			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2184			if len(D4x_pop) > 2:
2185				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2186			
2187		if self.standardization_method == 'pooled':
2188			for sample in self.anchors:
2189				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2190				self.samples[sample][f'SE_D{self._4x}'] = 0.
2191			for sample in self.unknowns:
2192				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2193				try:
2194					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2195				except ValueError:
2196					# when `sample` is constrained by self.standardize(constraints = {...}),
2197					# it is no longer listed in self.standardization.var_names.
2198					# Temporary fix: define SE as zero for now
2199					self.samples[sample][f'SE_D{self._4x}'] = 0.
2200
2201		elif self.standardization_method == 'indep_sessions':
2202			for sample in self.anchors:
2203				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2204				self.samples[sample][f'SE_D{self._4x}'] = 0.
2205			for sample in self.unknowns:
2206				self.msg(f'Consolidating sample {sample}')
2207				self.unknowns[sample][f'session_D{self._4x}'] = {}
2208				session_avg = []
2209				for session in self.sessions:
2210					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2211					if sdata:
2212						self.msg(f'{sample} found in session {session}')
2213						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2214						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2215						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2216						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2217						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2218						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2219						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2220				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2221				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2222				wsum = sum([weights[s] for s in weights])
2223				for s in weights:
2224					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2225
2226		for r in self:
2227			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
2228
2229
2230
2231	def consolidate_sessions(self):
2232		'''
2233		Compute various statistics for each session.
2234
2235		+ `Na`: Number of anchor analyses in the session
2236		+ `Nu`: Number of unknown analyses in the session
2237		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2238		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2239		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2240		+ `a`: scrambling factor
2241		+ `b`: compositional slope
2242		+ `c`: WG offset
2243		+ `SE_a`: Model standard error of `a`
2244		+ `SE_b`: Model standard error of `b`
2245		+ `SE_c`: Model standard error of `c`
2246		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2247		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2248		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2249		+ `a2`: scrambling factor drift
2250		+ `b2`: compositional slope drift
2251		+ `c2`: WG offset drift
2252		+ `Np`: Number of standardization parameters to fit
2253		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2254		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2255		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
2256		'''
2257		for session in self.sessions:
2258			if 'd13Cwg_VPDB' not in self.sessions[session]:
2259				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2260			if 'd18Owg_VSMOW' not in self.sessions[session]:
2261				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2262			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2263			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2264
2265			self.msg(f'Computing repeatabilities for session {session}')
2266			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2267			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2268			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2269
2270		if self.standardization_method == 'pooled':
2271			for session in self.sessions:
2272
2273				# different (better?) computation of D4x repeatability for each session:
2274				sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
2275				self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5
2276
2277				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2278				i = self.standardization.var_names.index(f'a_{pf(session)}')
2279				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2280
2281				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2282				i = self.standardization.var_names.index(f'b_{pf(session)}')
2283				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2284
2285				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2286				i = self.standardization.var_names.index(f'c_{pf(session)}')
2287				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2288
2289				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2290				if self.sessions[session]['scrambling_drift']:
2291					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2292					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2293				else:
2294					self.sessions[session]['SE_a2'] = 0.
2295
2296				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2297				if self.sessions[session]['slope_drift']:
2298					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2299					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2300				else:
2301					self.sessions[session]['SE_b2'] = 0.
2302
2303				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2304				if self.sessions[session]['wg_drift']:
2305					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2306					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2307				else:
2308					self.sessions[session]['SE_c2'] = 0.
2309
2310				i = self.standardization.var_names.index(f'a_{pf(session)}')
2311				j = self.standardization.var_names.index(f'b_{pf(session)}')
2312				k = self.standardization.var_names.index(f'c_{pf(session)}')
2313				CM = np.zeros((6,6))
2314				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2315				try:
2316					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2317					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2318					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2319					try:
2320						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2321						CM[3,4] = self.standardization.covar[i2,j2]
2322						CM[4,3] = self.standardization.covar[j2,i2]
2323					except ValueError:
2324						pass
2325					try:
2326						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2327						CM[3,5] = self.standardization.covar[i2,k2]
2328						CM[5,3] = self.standardization.covar[k2,i2]
2329					except ValueError:
2330						pass
2331				except ValueError:
2332					pass
2333				try:
2334					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2335					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2336					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2337					try:
2338						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2339						CM[4,5] = self.standardization.covar[j2,k2]
2340						CM[5,4] = self.standardization.covar[k2,j2]
2341					except ValueError:
2342						pass
2343				except ValueError:
2344					pass
2345				try:
2346					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2347					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2348					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2349				except ValueError:
2350					pass
2351
2352				self.sessions[session]['CM'] = CM
2353
2354		elif self.standardization_method == 'indep_sessions':
2355			pass # Not implemented yet
2356
2357
2358	@make_verbal
2359	def repeatabilities(self):
2360		'''
2361		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2362		(for all samples, for anchors, and for unknowns).
2363		'''
2364		self.msg('Computing repeatabilities for all sessions')
2365
2366		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2367		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2368		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2369		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2370		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
2371
2372
2373	@make_verbal
2374	def consolidate(self, tables = True, plots = True):
2375		'''
2376		Collect information about samples, sessions and repeatabilities.
2377		'''
2378		self.consolidate_samples()
2379		self.consolidate_sessions()
2380		self.repeatabilities()
2381
2382		if tables:
2383			self.summary()
2384			self.table_of_sessions()
2385			self.table_of_analyses()
2386			self.table_of_samples()
2387
2388		if plots:
2389			self.plot_sessions()
2390
2391
2392	@make_verbal
2393	def rmswd(self,
2394		samples = 'all samples',
2395		sessions = 'all sessions',
2396		):
2397		'''
2398		Compute the χ2, root mean squared weighted deviation
2399		(i.e. the square root of the reduced χ2), and corresponding degrees of
2400		freedom of the Δ4x values for samples in `samples` and sessions in `sessions`.
2401		
2402		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
2403		'''
2404		if samples == 'all samples':
2405			mysamples = [k for k in self.samples]
2406		elif samples == 'anchors':
2407			mysamples = [k for k in self.anchors]
2408		elif samples == 'unknowns':
2409			mysamples = [k for k in self.unknowns]
2410		else:
2411			mysamples = samples
2412
2413		if sessions == 'all sessions':
2414			sessions = [k for k in self.sessions]
2415
2416		chisq, Nf = 0, 0
2417		for sample in mysamples :
2418			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2419			if len(G) > 1 :
2420				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2421				Nf += (len(G) - 1)
2422				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2423		r = (chisq / Nf)**.5 if Nf > 0 else 0
2424		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2425		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
2426
2427	
2428	@make_verbal
2429	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2430		'''
2431		Compute the repeatability of `[r[key] for r in self]`
2432		'''
2433
2434		if samples == 'all samples':
2435			mysamples = [k for k in self.samples]
2436		elif samples == 'anchors':
2437			mysamples = [k for k in self.anchors]
2438		elif samples == 'unknowns':
2439			mysamples = [k for k in self.unknowns]
2440		else:
2441			mysamples = samples
2442
2443		if sessions == 'all sessions':
2444			sessions = [k for k in self.sessions]
2445
2446		if key in ['D47', 'D48']:
2447			# Full disclosure: the definition of Nf is tricky/debatable
2448			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2449			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2450			Nf = len(G)
2451# 			print(f'len(G) = {Nf}')
2452			Nf -= len([s for s in mysamples if s in self.unknowns])
2453# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2454			for session in sessions:
2455				Np = len([
2456					_ for _ in self.standardization.params
2457					if (
2458						self.standardization.params[_].expr is not None
2459						and (
2460							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2461							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2462							)
2463						)
2464					])
2465# 				print(f'session {session}: {Np} parameters to consider')
2466				Na = len({
2467					r['Sample'] for r in self.sessions[session]['data']
2468					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2469					})
2470# 				print(f'session {session}: {Na} different anchors in that session')
2471				Nf -= min(Np, Na)
2472# 			print(f'Nf = {Nf}')
2473
2474# 			for sample in mysamples :
2475# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2476# 				if len(X) > 1 :
2477# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2478# 					if sample in self.unknowns:
2479# 						Nf += len(X) - 1
2480# 					else:
2481# 						Nf += len(X)
2482# 			if samples in ['anchors', 'all samples']:
2483# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2484			r = (chisq / Nf)**.5 if Nf > 0 else 0
2485
2486		else: # if key not in ['D47', 'D48']
2487			chisq, Nf = 0, 0
2488			for sample in mysamples :
2489				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2490				if len(X) > 1 :
2491					Nf += len(X) - 1
2492					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2493			r = (chisq / Nf)**.5 if Nf > 0 else 0
2494
2495		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2496		return r
2497
2498	def sample_average(self, samples, weights = 'equal', normalize = True):
2499		'''
2500		Weighted average Δ4x value of a group of samples, accounting for covariance.
2501
2502		Returns the weighted average Δ4x value and associated SE
2503		of a group of samples. Weights are equal by default. If `normalize` is
2504		true, `weights` will be rescaled so that their sum equals 1.
2505
2506		**Examples**
2507
2508		```python
2509		self.sample_average(['X','Y'], [1, 2])
2510		```
2511
2512		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2513		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2514		values of samples X and Y, respectively.
2515
2516		```python
2517		self.sample_average(['X','Y'], [1, -1], normalize = False)
2518		```
2519
2520		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2521		'''
2522		if weights == 'equal':
2523			weights = [1/len(samples)] * len(samples)
2524
2525		if normalize:
2526			s = sum(weights)
2527			if s:
2528				weights = [w/s for w in weights]
2529
2530		try:
2531# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2532# 			C = self.standardization.covar[indices,:][:,indices]
2533			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2534			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2535			return correlated_sum(X, C, weights)
2536		except ValueError:
2537			return (0., 0.)
2538
2539
2540	def sample_D4x_covar(self, sample1, sample2 = None):
2541		'''
2542		Covariance between Δ4x values of samples
2543
2544		Returns the error covariance between the average Δ4x values of two
2545		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2546		returns the Δ4x variance for that sample.
2547		'''
2548		if sample2 is None:
2549			sample2 = sample1
2550		if self.standardization_method == 'pooled':
2551			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2552			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2553			return self.standardization.covar[i, j]
2554		elif self.standardization_method == 'indep_sessions':
2555			if sample1 == sample2:
2556				return self.samples[sample1][f'SE_D{self._4x}']**2
2557			else:
2558				c = 0
2559				for session in self.sessions:
2560					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2561					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2562					if sdata1 and sdata2:
2563						a = self.sessions[session]['a']
2564						# !! TODO: CM below does not account for temporal changes in standardization parameters
2565						CM = self.sessions[session]['CM'][:3,:3]
2566						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2567						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2568						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2569						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2570						c += (
2571							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2572							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2573							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2574							@ CM
2575							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2576							) / a**2
2577				return float(c)
2578
2579	def sample_D4x_correl(self, sample1, sample2 = None):
2580		'''
2581		Correlation between Δ4x errors of samples
2582
2583		Returns the error correlation between the average Δ4x values of two samples.
2584		'''
2585		if sample2 is None or sample2 == sample1:
2586			return 1.
2587		return (
2588			self.sample_D4x_covar(sample1, sample2)
2589			/ self.unknowns[sample1][f'SE_D{self._4x}']
2590			/ self.unknowns[sample2][f'SE_D{self._4x}']
2591			)
2592
2593	def plot_single_session(self,
2594		session,
2595		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2596		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2597		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2598		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2599		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2600		xylimits = 'free', # | 'constant'
2601		x_label = None,
2602		y_label = None,
2603		error_contour_interval = 'auto',
2604		fig = 'new',
2605		):
2606		'''
2607		Generate plot for a single session
2608		'''
2609		if x_label is None:
2610			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2611		if y_label is None:
2612			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2613
2614		out = _SessionPlot()
2615		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2616		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2617		anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2618		anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2619		unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2620		unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2621		anchor_avg = (np.array([ np.array([
2622				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2623				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2624				]) for sample in anchors]).T,
2625			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
2626		unknown_avg = (np.array([ np.array([
2627				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2628				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2629				]) for sample in unknowns]).T,
2630			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)
2631		
2632		
2633		if fig == 'new':
2634			out.fig = ppl.figure(figsize = (6,6))
2635			ppl.subplots_adjust(.1,.1,.9,.9)
2636
2637		out.anchor_analyses, = ppl.plot(
2638			anchors_d,
2639			anchors_D,
2640			**kw_plot_anchors)
2641		out.unknown_analyses, = ppl.plot(
2642			unknowns_d,
2643			unknowns_D,
2644			**kw_plot_unknowns)
2645		out.anchor_avg = ppl.plot(
2646			*anchor_avg,
2647			**kw_plot_anchor_avg)
2648		out.unknown_avg = ppl.plot(
2649			*unknown_avg,
2650			**kw_plot_unknown_avg)
2651		if xylimits == 'constant':
2652			x = [r[f'd{self._4x}'] for r in self]
2653			y = [r[f'D{self._4x}'] for r in self]
2654			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2655			w, h = x2-x1, y2-y1
2656			x1 -= w/20
2657			x2 += w/20
2658			y1 -= h/20
2659			y2 += h/20
2660			ppl.axis([x1, x2, y1, y2])
2661		elif xylimits == 'free':
2662			x1, x2, y1, y2 = ppl.axis()
2663		else:
2664			x1, x2, y1, y2 = ppl.axis(xylimits)
2665		contour = None  # ensure 'contour' is defined for the fig == None return below even if contours are disabled
2666		if error_contour_interval != 'none':
2667			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2668			XI,YI = np.meshgrid(xi, yi)
2669			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2670			if error_contour_interval == 'auto':
2671				rng = np.max(SI) - np.min(SI)
2672				if rng <= 0.01:
2673					cinterval = 0.001
2674				elif rng <= 0.03:
2675					cinterval = 0.004
2676				elif rng <= 0.1:
2677					cinterval = 0.01
2678				elif rng <= 0.3:
2679					cinterval = 0.03
2680				elif rng <= 1.:
2681					cinterval = 0.1
2682				else:
2683					cinterval = 0.5
2684			else:
2685				cinterval = error_contour_interval
2686
2687			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2688			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2689			out.clabel = ppl.clabel(out.contour)
2690			contour = (XI, YI, SI, cval, cinterval)
2691
2692		if fig is None:
2693			return {
2694			'anchors':anchors,
2695			'unknowns':unknowns,
2696			'anchors_d':anchors_d,
2697			'anchors_D':anchors_D,
2698			'unknowns_d':unknowns_d,
2699			'unknowns_D':unknowns_D,
2700			'anchor_avg':anchor_avg,
2701			'unknown_avg':unknown_avg,
2702			'contour':contour,
2703			}
2704
2705		ppl.xlabel(x_label)
2706		ppl.ylabel(y_label)
2707		ppl.title(session, weight = 'bold')
2708		ppl.grid(alpha = .2)
2709		out.ax = ppl.gca()		
2710
2711		return out
2712
2713	def plot_residuals(
2714		self,
2715		kde = False,
2716		hist = False,
2717		binwidth = 2/3,
2718		dir = 'output',
2719		filename = None,
2720		highlight = [],
2721		colors = None,
2722		figsize = None,
2723		dpi = 100,
2724		yspan = None,
2725		):
2726		'''
2727		Plot residuals of each analysis as a function of time (actually, as a function of
2728		the order of analyses in the `D4xdata` object)
2729
2730		+ `kde`: whether to add a kernel density estimate of residuals
2731		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2732		+ `binwidth`: histogram bin width, in units of the overall Δ4x repeatability (by default: 2/3)
2733		+ `dir`: the directory in which to save the plot
2734		+ `highlight`: a list of samples to highlight
2735		+ `colors`: a dict of `{<sample>: (r, g, b)}` for all samples
2736		+ `figsize`: (width, height) of figure
2737		+ `dpi`: resolution for PNG output
2738		+ `yspan`: factor controlling the range of y values shown in plot
2739		  (by default: `yspan = 1.5 if kde else 1.0`)
2740		'''
2741		
2742		from matplotlib import ticker
2743
2744		if yspan is None:
2745			if kde:
2746				yspan = 1.5
2747			else:
2748				yspan = 1.0
2749		
2750		# Layout
2751		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2752		if hist or kde:
2753			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2754			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2755		else:
2756			ppl.subplots_adjust(.08,.05,.78,.8)
2757			ax1 = ppl.subplot(111)
2758		
2759		# Colors
2760		N = len(self.anchors)
2761		if colors is None:
2762			if len(highlight) > 0:
2763				Nh = len(highlight)
2764				if Nh == 1:
2765					colors = {highlight[0]: (0,0,0)}
2766				elif Nh == 3:
2767					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2768				elif Nh == 4:
2769					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2770				else:
2771					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2772			else:
2773				if N == 3:
2774					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2775				elif N == 4:
2776					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2777				else:
2778					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2779
2780		ppl.sca(ax1)
2781		
2782		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2783
2784		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2785
2786		session = self[0]['Session']
2787		x1 = 0
2788# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2789		x_sessions = {}
2790		one_or_more_singlets = False
2791		one_or_more_multiplets = False
2792		multiplets = set()
2793		for k,r in enumerate(self):
2794			if r['Session'] != session:
2795				x2 = k-1
2796				x_sessions[session] = (x1+x2)/2
2797				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2798				session = r['Session']
2799				x1 = k
2800			singlet = len(self.samples[r['Sample']]['data']) == 1
2801			if not singlet:
2802				multiplets.add(r['Sample'])
2803			if r['Sample'] in self.unknowns:
2804				if singlet:
2805					one_or_more_singlets = True
2806				else:
2807					one_or_more_multiplets = True
2808			kw = dict(
2809				marker = 'x' if singlet else '+',
2810				ms = 4 if singlet else 5,
2811				ls = 'None',
2812				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2813				mew = 1,
2814				alpha = 0.2 if singlet else 1,
2815				)
2816			if highlight and r['Sample'] not in highlight:
2817				kw['alpha'] = 0.2
2818			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2819		x2 = k
2820		x_sessions[session] = (x1+x2)/2
2821
2822		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2823		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2824		if not (hist or kde):
2825			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2826			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2827
2828		xmin, xmax, ymin, ymax = ppl.axis()
2829		if yspan != 1:
2830			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2831		for s in x_sessions:
2832			ppl.text(
2833				x_sessions[s],
2834				ymax +1,
2835				s,
2836				va = 'bottom',
2837				**(
2838					dict(ha = 'center')
2839					if len(self.sessions[s]['data']) > (0.15 * len(self))
2840					else dict(ha = 'left', rotation = 45)
2841					)
2842				)
2843
2844		if hist or kde:
2845			ppl.sca(ax2)
2846
2847		for s in colors:
2848			kw['marker'] = '+'
2849			kw['ms'] = 5
2850			kw['mec'] = colors[s]
2851			kw['label'] = s
2852			kw['alpha'] = 1
2853			ppl.plot([], [], **kw)
2854
2855		kw['mec'] = (0,0,0)
2856
2857		if one_or_more_singlets:
2858			kw['marker'] = 'x'
2859			kw['ms'] = 4
2860			kw['alpha'] = .2
2861			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2862			ppl.plot([], [], **kw)
2863
2864		if one_or_more_multiplets:
2865			kw['marker'] = '+'
2866			kw['ms'] = 4
2867			kw['alpha'] = 1
2868			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2869			ppl.plot([], [], **kw)
2870
2871		if hist or kde:
2872			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2873		else:
2874			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2875		leg.set_zorder(-1000)
2876
2877		ppl.sca(ax1)
2878
2879		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2880		ppl.xticks([])
2881		ppl.axis([-1, len(self), None, None])
2882
2883		if hist or kde:
2884			ppl.sca(ax2)
2885			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2886
2887			if kde:
2888				from scipy.stats import gaussian_kde
2889				yi = np.linspace(ymin, ymax, 201)
2890				xi = gaussian_kde(X).evaluate(yi)
2891				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2892# 				ppl.plot(xi, yi, 'k-', lw = 1)
2893			elif hist:
2894				ppl.hist(
2895					X,
2896					orientation = 'horizontal',
2897					histtype = 'stepfilled',
2898					ec = [.4]*3,
2899					fc = [.25]*3,
2900					alpha = .25,
2901					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2902					)
2903			ppl.text(0, 0,
2904				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2905				size = 7.5,
2906				alpha = 1,
2907				va = 'center',
2908				ha = 'left',
2909				)
2910
2911			ppl.axis([0, None, ymin, ymax])
2912			ppl.xticks([])
2913			ppl.yticks([])
2914# 			ax2.spines['left'].set_visible(False)
2915			ax2.spines['right'].set_visible(False)
2916			ax2.spines['top'].set_visible(False)
2917			ax2.spines['bottom'].set_visible(False)
2918
2919		ax1.axis([None, None, ymin, ymax])
2920
2921		if not os.path.exists(dir):
2922			os.makedirs(dir)
2923		if filename is None:
2924			return fig
2925		elif filename == '':
2926			filename = f'D{self._4x}_residuals.pdf'
2927		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2928		ppl.close(fig)
2929				
2930
2931	def simulate(self, *args, **kwargs):
2932		'''
2933		Legacy function: raises a `DeprecationWarning` pointing to `virtual_data()`.
2934		'''
2935		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
2936
2937	def plot_anchor_residuals(
2938		self,
2939		dir = 'output',
2940		filename = '',
2941		figsize = None,
2942		subplots_adjust = (0.05, 0.1, 0.95, 0.98, .25, .25),
2943		dpi = 100,
2944		colors = None,
2945		):
2946		'''
2947		Plot a summary of the residuals for all anchors, intended to help detect systematic bias.
2948		
2949		**Parameters**
2950
2951		+ `dir`: the directory in which to save the plot
2952		+ `filename`: the file name to save to; if `None`, return the figure instead of saving
2953		+ `figsize`: (width, height) of figure
2954		+ `subplots_adjust`: passed to `subplots_adjust()`
2955		+ `dpi`: resolution for PNG output
2956		+ `colors`: a dict of `{<sample>: (r, g, b)}` for all samples
2958		'''
2959
2960		# Colors
2961		N = len(self.anchors)
2962		if colors is None:
2963			if N == 3:
2964				colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2965			elif N == 4:
2966				colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2967			else:
2968				colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2969
2970		if figsize is None:
2971			figsize = (4, 1.5*N+1)
2972		fig = ppl.figure(figsize = figsize)
2973		ppl.subplots_adjust(*subplots_adjust)
2974		axs = {}
2975		X = np.array([r[f'D{self._4x}_residual'] for a in self.anchors for r in self.anchors[a]['data']])*1000
2976		sigma = self.repeatability[f'r_D{self._4x}a'] * 1000
2977		D = max(np.abs(X))
2978
2979		for k,a in enumerate(self.anchors):
2980			color = colors[a]
2981			axs[a] = ppl.subplot(N, 1, 1+k)
2982			axs[a].text(
2983				0.02, 1-0.05, a,
2984				va = 'top',
2985				ha = 'left',
2986				weight = 'bold',
2987				size = 9,
2988				color = [_*0.75 for _ in color],
2989				transform = axs[a].transAxes,
2990			)
2991			X = np.array([r[f'D{self._4x}_residual'] for r in self.anchors[a]['data']])*1000
2992			axs[a].axvline(0, lw = 0.5, color = color)
2993			axs[a].plot(X, X*0, 'o', mew = 0.7, mec = (*color,.5), mfc = (*color, 0), ms = 7, clip_on = False)
2994
2995			xi = np.linspace(-3*D, 3*D, 601)
2996			yi = np.array([np.exp(-0.5 * ((xi - x)/sigma)**2) for x in X]).sum(0)
2997			ppl.fill_between(xi, yi, yi*0, fc = (*color, .15), lw = 1, ec = color)
2998			
2999			axs[a].errorbar(
3000				X.mean(), yi.max()*.2, None, 1.96*sigma/len(X)**0.5,
3001				ecolor = color,
3002				marker = 's',
3003				ls = 'None',
3004				mec = color,
3005				mew = 1,
3006				mfc = 'w',
3007				ms = 8,
3008				elinewidth = 1,
3009				capsize = 4,
3010				capthick = 1,
3011			)
3012			
3013			axs[a].axis([xi[0], xi[-1], 0, yi.max()*1.05])
3014			ppl.yticks([])
3015
3016		ppl.xlabel(f'$Δ_{{{self._4x}}}$ residuals (ppm)')		
3017
3018		if not os.path.exists(dir):
3019			os.makedirs(dir)
3020		if filename is None:
3021			return fig
3022		elif filename == '':
3023			filename = f'D{self._4x}_anchor_residuals.pdf'
3024		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
3025		ppl.close(fig)
3026		
3027
3028	def plot_distribution_of_analyses(
3029		self,
3030		dir = 'output',
3031		filename = None,
3032		vs_time = False,
3033		figsize = (6,4),
3034		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
3035		output = None,
3036		dpi = 100,
3037		):
3038		'''
3039		Plot temporal distribution of all analyses in the data set.
3040		
3041		**Parameters**
3042
3043		+ `dir`: the directory in which to save the plot
3044		+ `filename`: the file name to save to (by default: `D4x_distribution_of_analyses.pdf`)
3045		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
3046		+ `figsize`: (width, height) of figure
3047		+ `dpi`: resolution for PNG output
3048		'''
3049
3050		asamples = [s for s in self.anchors]
3051		usamples = [s for s in self.unknowns]
3052		if output is None or output == 'fig':
3053			fig = ppl.figure(figsize = figsize)
3054			ppl.subplots_adjust(*subplots_adjust)
3055		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
3056		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
3057		Xmax += (Xmax-Xmin)/40
3058		Xmin -= (Xmax-Xmin)/41
3059		for k, s in enumerate(asamples + usamples):
3060			if vs_time:
3061				X = [r['TimeTag'] for r in self if r['Sample'] == s]
3062			else:
3063				X = [x for x,r in enumerate(self) if r['Sample'] == s]
3064			Y = [-k for x in X]
3065			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
3066			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
3067			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
3068		ppl.axis([Xmin, Xmax, -k-1, 1])
3069		ppl.xlabel('\ntime')
3070		ppl.gca().annotate('',
3071			xy = (0.6, -0.02),
3072			xycoords = 'axes fraction',
3073			xytext = (.4, -0.02),
3074			arrowprops = dict(arrowstyle = "->", color = 'k'),
3075			)
3076			
3077
3078		x2 = -1
3079		for session in self.sessions:
3080			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
3081			if vs_time:
3082				ppl.axvline(x1, color = 'k', lw = .75)
3083			if x2 > -1:
3084				if not vs_time:
3085					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
3086			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
3087# 			from xlrd import xldate_as_datetime
3088# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
3089			if vs_time:
3090				ppl.axvline(x2, color = 'k', lw = .75)
3091				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
3092			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
3093
3094		ppl.xticks([])
3095		ppl.yticks([])
3096
3097		if output is None:
3098			if not os.path.exists(dir):
3099				os.makedirs(dir)
3100			if filename is None:
3101				filename = f'D{self._4x}_distribution_of_analyses.pdf'
3102			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
3103			ppl.close(fig)
3104		elif output == 'ax':
3105			return ppl.gca()
3106		elif output == 'fig':
3107			return fig
3108
3109
3110	def plot_bulk_compositions(
3111		self,
3112		samples = None,
3113		dir = 'output/bulk_compositions',
3114		figsize = (6,6),
3115		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
3116		show = False,
3117		sample_color = (0,.5,1),
3118		analysis_color = (.7,.7,.7),
3119		labeldist = 0.3,
3120		radius = 0.05,
3121		):
3122		'''
3123		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
3124		
3125		By default, creates a directory `./output/bulk_compositions` where plots for
3126		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
3127		
3128		
3129		**Parameters**
3130
3131		+ `samples`: Only these samples are processed (by default: all samples).
3132		+ `dir`: where to save the plots
3133		+ `figsize`: (width, height) of figure
3134		+ `subplots_adjust`: passed to `subplots_adjust()`
3135		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
3136		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
3137		+ `sample_color`: color used for sample (average composition) markers/labels
3138		+ `analysis_color`: color used for the markers/labels of individual analyses
3139		+ `labeldist`: distance (in inches) from analysis markers to their UID labels
3140		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
3141		'''
3142
3143		from matplotlib.patches import Ellipse
3144
3145		if samples is None:
3146			samples = [_ for _ in self.samples]
3147
3148		saved = {}
3149
3150		for s in samples:
3151
3152			fig = ppl.figure(figsize = figsize)
3153			fig.subplots_adjust(*subplots_adjust)
3154			ax = ppl.subplot(111)
3155			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3156			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3157			ppl.title(s)
3158
3159
3160			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
3161			UID = [_['UID'] for _ in self.samples[s]['data']]
3162			XY0 = XY.mean(0)
3163
3164			for xy in XY:
3165				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
3166				
3167			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
3168			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
3169			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3170			saved[s] = [XY, XY0]
3171			
3172			x1, x2, y1, y2 = ppl.axis()
3173			x0, dx = (x1+x2)/2, (x2-x1)/2
3174			y0, dy = (y1+y2)/2, (y2-y1)/2
3175			dx, dy = [max(max(dx, dy), radius)]*2
3176
3177			ppl.axis([
3178				x0 - 1.2*dx,
3179				x0 + 1.2*dx,
3180				y0 - 1.2*dy,
3181				y0 + 1.2*dy,
3182				])			
3183
3184			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3185
3186			for xy, uid in zip(XY, UID):
3187
3188				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3189				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3190
3191				if (vector_in_display_space**2).sum() > 0:
3192
3193					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3194					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3195					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3196					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3197
3198					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3199
3200				else:
3201
3202					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3203
3204			if radius:
3205				ax.add_artist(Ellipse(
3206					xy = XY0,
3207					width = radius*2,
3208					height = radius*2,
3209					ls = (0, (2,2)),
3210					lw = .7,
3211					ec = analysis_color,
3212					fc = 'None',
3213					))
3214				ppl.text(
3215					XY0[0],
3216					XY0[1]-radius,
3217					f'\n± {radius*1e3:.0f} ppm',
3218					color = analysis_color,
3219					va = 'top',
3220					ha = 'center',
3221					linespacing = 0.4,
3222					size = 8,
3223					)
3224
3225			if not os.path.exists(dir):
3226				os.makedirs(dir)
3227			fig.savefig(f'{dir}/{s}.pdf')
3228			ppl.close(fig)
3229
3230		fig = ppl.figure(figsize = figsize)
3231		fig.subplots_adjust(*subplots_adjust)
3232		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3233		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3234
3235		for s in saved:
3236			for xy in saved[s][0]:
3237				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3238			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3239			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3240			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3241
3242		x1, x2, y1, y2 = ppl.axis()
3243		ppl.axis([
3244			x1 - (x2-x1)/10,
3245			x2 + (x2-x1)/10,
3246			y1 - (y2-y1)/10,
3247			y2 + (y2-y1)/10,
3248			])			
3249
3250
3251		if not os.path.exists(dir):
3252			os.makedirs(dir)
3253		fig.savefig(f'{dir}/__all__.pdf')
3254		if show:
3255			ppl.show()
3256		ppl.close(fig)
3257		
3258
3259	def _save_D4x_correl(
3260		self,
3261		samples = None,
3262		dir = 'output',
3263		filename = None,
3264		D4x_precision = 4,
3265		correl_precision = 4,
3266		save_to_file = True,
3267		):
3268		'''
3269		Save D4x values along with their SE and correlation matrix.
3270
3271		**Parameters**
3272
3273		+ `samples`: Only these samples are output (by default: all unknowns).
3274		+ `dir`: the directory in which to save the file (by default: `output`)
3275		+ `filename`: the name of the csv file to write to (by default: `D4x_correl.csv`)
3276		+ `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4)
3277		+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
3278		+ `save_to_file`: whether to write the output to a file (by default: `True`). If `False`,
3279		returns the output as a string.
3280		'''
3281		if samples is None:
3282			samples = sorted([s for s in self.unknowns])
3283		
3284		out = [['Sample']] + [[s] for s in samples]
3285		out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl']
3286		for k,s in enumerate(samples):
3287			out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.{D4x_precision}f}', f'{self.samples[s][f"SE_D{self._4x}"]:.{D4x_precision}f}']
3288			for s2 in samples:
3289				out[k+1] += [f'{self.sample_D4x_correl(s,s2):.{correl_precision}f}']
3290		
3291		if save_to_file:
3292			if not os.path.exists(dir):
3293				os.makedirs(dir)
3294			if filename is None:
3295				filename = f'D{self._4x}_correl.csv'
3296			with open(f'{dir}/{filename}', 'w') as fid:
3297				fid.write(make_csv(out))
3298		else:
3299			return make_csv(out)

Store and process data for a large set of Δ47 and/or Δ48 analyses, usually comprising more than one analytical session.

D4xdata(l=[], mass='47', logfile='', session='mySession', verbose=False)
957	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
958		'''
959		**Parameters**
960
961		+ `l`: a list of dictionaries, with each dictionary including at least the keys
962		`Sample`, `d45`, `d46`, and `d47` or `d48`.
963		+ `mass`: `'47'` or `'48'`
964		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
965		+ `session`: define session name for analyses without a `Session` key
966		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
967
968		Returns a `D4xdata` object derived from `list`.
969		'''
970		self._4x = mass
971		self.verbose = verbose
972		self.prefix = 'D4xdata'
973		self.logfile = logfile
974		list.__init__(self, l)
975		self.Nf = None
976		self.repeatability = {}
977		self.refresh(session = session)

Parameters

  • l: a list of dictionaries, with each dictionary including at least the keys Sample, d45, d46, and d47 or d48.
  • mass: '47' or '48'
  • logfile: if specified, write detailed logs to this file path when calling D4xdata methods.
  • session: define session name for analyses without a Session key
  • verbose: if True, print out detailed logs when calling D4xdata methods.

Returns a D4xdata object derived from list.
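For instance, instead of reading a csv file, a `D47data` object may be built directly from such a list of dictionaries. A minimal sketch, with purely illustrative delta values and a hypothetical session name:

```python
import D47crunch

# All delta values below are illustrative only:
rawdata = [
	{'Sample': 'ETH-1',      'd45':  5.795, 'd46':  11.628, 'd47':  16.894},
	{'Sample': 'ETH-2',      'd45': -6.059, 'd46':  -4.817, 'd47': -11.635},
	{'Sample': 'MYSAMPLE-1', 'd45':  6.219, 'd46':  11.491, 'd47':  17.277},
	]

# Missing optional fields (UID, Session, D17O, d48, d49) are filled in with defaults:
mydata = D47crunch.D47data(rawdata, session = 'Session01', verbose = True)
```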

R13_VPDB = 0.01118

Absolute (13C/12C) ratio of VPDB. By default equal to 0.01118 (Chang & Li, 1990)

R18_VSMOW = 0.0020052

Absolute (18O/16O) ratio of VSMOW. By default equal to 0.0020052 (Baertschi, 1976)

LAMBDA_17 = 0.528

Mass-dependent exponent for triple oxygen isotopes. By default equal to 0.528 (Barkan & Luz, 2005)

R17_VSMOW = 0.00038475

Absolute (17O/16O) ratio of VSMOW. By default equal to 0.00038475 (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)

R18_VPDB = 0.0020672007840000003

Absolute (18O/16O) ratio of VPDB. By definition equal to R18_VSMOW * 1.03092.

R17_VPDB = 0.0003909861828790272

Absolute (17O/16O) ratio of VPDB. By definition equal to R17_VSMOW * 1.03092 ** LAMBDA_17.

LEVENE_REF_SAMPLE = 'ETH-3'

After the Δ4x standardization step, each sample is tested to assess whether the Δ4x variance within all analyses for that sample differs significantly from that observed for a given reference sample (using Levene's test, which yields a p-value corresponding to the null hypothesis that the underlying variances are equal).

LEVENE_REF_SAMPLE (by default equal to 'ETH-3') specifies which sample should be used as a reference for this test.
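To use a different anchor as the reference, reassign this attribute before calling `standardize()`; the resulting p-values are stored as `samples[sample]['p_Levene']`. A minimal sketch (the choice of ETH-1 here is arbitrary):

```python
mydata.LEVENE_REF_SAMPLE = 'ETH-1'  # arbitrary choice of reference sample
```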

ALPHA_18O_ACID_REACTION = np.float64(1.008129)

Specifies the 18O/16O fractionation factor generally applicable to acid reactions in the dataset. Currently used by D4xdata.wg(), D4xdata.standardize_d13C(), and D4xdata.standardize_d18O().

By default equal to 1.008129 (calcite reacted at 90 °C, Kim et al., 2007).
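This factor may be overridden before calling `D4xdata.wg()` and the δ13C/δ18O standardization steps. A minimal sketch, where the numerical value is a placeholder rather than a recommendation:

```python
# Placeholder value: substitute the 18O/16O acid fractionation factor
# appropriate to your mineralogy and acid reaction temperature.
mydata.ALPHA_18O_ACID_REACTION = 1.00813
```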

Nominal_d13C_VPDB = {'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}

Nominal δ13CVPDB values assigned to carbonate standards, used by D4xdata.standardize_d13C().

By default equal to {'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71} after Bernasconi et al. (2018).

Nominal_d18O_VPDB = {'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}

Nominal δ18OVPDB values assigned to carbonate standards, used by D4xdata.standardize_d18O().

By default equal to {'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78} after Bernasconi et al. (2018).
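Both nominal dictionaries may be redefined, e.g. to include an in-house carbonate standard. A minimal sketch, where `MY-STD` and its assigned values are hypothetical:

```python
mydata.Nominal_d13C_VPDB = {
	'ETH-1': 2.02,
	'ETH-2': -10.17,
	'ETH-3': 1.71,
	'MY-STD': 1.23,   # hypothetical in-house standard
	}
mydata.Nominal_d18O_VPDB = {
	'ETH-1': -2.19,
	'ETH-2': -18.69,
	'ETH-3': -1.78,
	'MY-STD': -4.56,  # hypothetical in-house standard
	}
```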

d13C_STANDARDIZATION_METHOD = '2pt'

Method by which to standardize δ13C values:

  • 'none': do not apply any δ13C standardization.
  • '1pt': within each session, offset all initial δ13C values so as to minimize the difference between final δ13CVPDB values and Nominal_d13C_VPDB (averaged over all analyses for which Nominal_d13C_VPDB is defined).
  • '2pt': within each session, apply an affine transformation to all δ13C values so as to minimize the difference between final δ13CVPDB values and Nominal_d13C_VPDB (averaged over all analyses for which Nominal_d13C_VPDB is defined).

d18O_STANDARDIZATION_METHOD = '2pt'

Method by which to standardize δ18O values:

  • 'none': do not apply any δ18O standardization.
  • '1pt': within each session, offset all initial δ18O values so as to minimize the difference between final δ18OVPDB values and Nominal_d18O_VPDB (averaged over all analyses for which Nominal_d18O_VPDB is defined).
  • '2pt': within each session, apply an affine transformation to all δ18O values so as to minimize the difference between final δ18OVPDB values and Nominal_d18O_VPDB (averaged over all analyses for which Nominal_d18O_VPDB is defined).
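Either method may be changed per data set, as in the sketch below. Because `refresh_sessions()` copies these settings into each session, set them before loading data or call `refresh()` afterwards:

```python
mydata.d13C_STANDARDIZATION_METHOD = '1pt'
mydata.d18O_STANDARDIZATION_METHOD = 'none'
mydata.refresh()  # propagate the new settings to all sessions
```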
verbose
prefix
logfile
Nf
repeatability
def make_verbal(oldfun):
980	def make_verbal(oldfun):
981		'''
982		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
983		'''
984		@wraps(oldfun)
985		def newfun(*args, verbose = '', **kwargs):
986			myself = args[0]
987			oldprefix = myself.prefix
988			myself.prefix = oldfun.__name__
989			if verbose != '':
990				oldverbose = myself.verbose
991				myself.verbose = verbose
992			out = oldfun(*args, **kwargs)
993			myself.prefix = oldprefix
994			if verbose != '':
995				myself.verbose = oldverbose
996			return out
997		return newfun

Decorator: allow temporarily changing self.prefix and overriding self.verbose.
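In practice, this means that any decorated method accepts an extra `verbose` keyword. A minimal sketch:

```python
mydata.wg(verbose = True)  # print detailed logs for this call only, regardless of mydata.verbose
```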

def msg(self, txt):
1000	def msg(self, txt):
1001		'''
1002		Log a message to `self.logfile`, and print it out if `verbose = True`
1003		'''
1004		self.log(txt)
1005		if self.verbose:
1006			print(f'{f"[{self.prefix}]":<16} {txt}')

Log a message to self.logfile, and print it out if verbose = True

def vmsg(self, txt):
1009	def vmsg(self, txt):
1010		'''
1011		Log a message to `self.logfile` and print it out
1012		'''
1013		self.log(txt)
1014		print(txt)

Log a message to self.logfile and print it out

def log(self, *txts):
1017	def log(self, *txts):
1018		'''
1019		Log a message to `self.logfile`
1020		'''
1021		if self.logfile:
1022			with open(self.logfile, 'a') as fid:
1023				for txt in txts:
1024					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')

Log a message to self.logfile

def refresh(self, session='mySession'):
1027	def refresh(self, session = 'mySession'):
1028		'''
1029		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
1030		'''
1031		self.fill_in_missing_info(session = session)
1032		self.refresh_sessions()
1033		self.refresh_samples()

Update self.sessions, self.samples, self.anchors, and self.unknowns.

def refresh_sessions(self):
1036	def refresh_sessions(self):
1037		'''
1038		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1039		to `False` for all sessions.
1040		'''
1041		self.sessions = {
1042			s: {'data': [r for r in self if r['Session'] == s]}
1043			for s in sorted({r['Session'] for r in self})
1044			}
1045		for s in self.sessions:
1046			self.sessions[s]['scrambling_drift'] = False
1047			self.sessions[s]['slope_drift'] = False
1048			self.sessions[s]['wg_drift'] = False
1049			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1050			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD

Update self.sessions and set scrambling_drift, slope_drift, and wg_drift to False for all sessions.
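These per-session flags may then be switched on before standardization to allow temporal drifts of the corresponding parameters. A minimal sketch, with a hypothetical session name:

```python
mydata.sessions['Session01']['wg_drift'] = True          # allow the WG offset (c) to drift within this session
mydata.sessions['Session01']['scrambling_drift'] = True  # allow the scrambling factor (a) to drift
```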

def refresh_samples(self):
1053	def refresh_samples(self):
1054		'''
1055		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1056		'''
1057		self.samples = {
1058			s: {'data': [r for r in self if r['Sample'] == s]}
1059			for s in sorted({r['Sample'] for r in self})
1060			}
1061		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1062		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}

Define self.samples, self.anchors, and self.unknowns.

def read(self, filename, sep='', session=''):
1065	def read(self, filename, sep = '', session = ''):
1066		'''
1067		Read file in csv format to load data into a `D47data` object.
1068
1069		In the csv file, spaces before and after field separators (`','` by default)
1070		are optional. Each line corresponds to a single analysis.
1071
1072		The required fields are:
1073
1074		+ `UID`: a unique identifier
1075		+ `Session`: an identifier for the analytical session
1076		+ `Sample`: a sample identifier
1077		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1078
1079		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1080		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
1081		and `d49` are optional, and set to NaN by default.
1082
1083		**Parameters**
1084
1085		+ `filename`: the path of the file to read
1086		+ `sep`: csv separator delimiting the fields
1087		+ `session`: set `Session` field to this string for all analyses
1088		'''
1089		with open(filename) as fid:
1090			self.input(fid.read(), sep = sep, session = session)

Read file in csv format to load data into a D47data object.

In the csv file, spaces before and after field separators (',' by default) are optional. Each line corresponds to a single analysis.

The required fields are:

  • UID: a unique identifier
  • Session: an identifier for the analytical session
  • Sample: a sample identifier
  • d45, d46, and at least one of d47 or d48: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as D17O (in ‰ relative to VSMOW, λ = self.LAMBDA_17), and are otherwise assumed to be zero. Working-gas deltas d47, d48 and d49 are optional, and set to NaN by default.

Parameters

  • filename: the path of the file to read
  • sep: csv separator delimiting the fields
  • session: set Session field to this string for all analyses
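For example, to read a semicolon-delimited file and assign all analyses to a single session (file and session names hypothetical):

```python
mydata.read('rawdata_2023.csv', sep = ';', session = 'Session01')
```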
def input(self, txt, sep='', session=''):
1093	def input(self, txt, sep = '', session = ''):
1094		'''
1095		Read `txt` string in csv format to load analysis data into a `D47data` object.
1096
1097		In the csv string, spaces before and after field separators (`','` by default)
1098		are optional. Each line corresponds to a single analysis.
1099
1100		The required fields are:
1101
1102		+ `UID`: a unique identifier
1103		+ `Session`: an identifier for the analytical session
1104		+ `Sample`: a sample identifier
1105		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1106
1107		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1108		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
1109		and `d49` are optional, and set to NaN by default.
1110
1111		**Parameters**
1112
1113		+ `txt`: the csv string to read
1114		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1115		whichever appears most often in `txt`.
1116		+ `session`: set `Session` field to this string for all analyses
1117		'''
1118		if sep == '':
1119			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1120		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1121		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1122
1123		if session != '':
1124			for r in data:
1125				r['Session'] = session
1126
1127		self += data
1128		self.refresh()

Read txt string in csv format to load analysis data into a D47data object.

In the csv string, spaces before and after field separators (',' by default) are optional. Each line corresponds to a single analysis.

The required fields are:

  • UID: a unique identifier
  • Session: an identifier for the analytical session
  • Sample: a sample identifier
  • d45, d46, and at least one of d47 or d48: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as D17O (in ‰ relative to VSMOW, λ = self.LAMBDA_17), and are otherwise assumed to be zero. Working-gas deltas d47, d48 and d49 are optional, and set to NaN by default.

Parameters

  • txt: the csv string to read
  • sep: csv separator delimiting the fields. By default, use ',', ';', or tab, whichever appears most often in txt.
  • session: set Session field to this string for all analyses
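For example, a csv string may be passed directly. A minimal sketch, with purely illustrative values:

```python
mydata.input('''UID, Session, Sample, d45, d46, d47
B01, S02, ETH-1, 5.795, 11.628, 16.894
B02, S02, ETH-2, -6.059, -4.817, -11.635''')
```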
@make_verbal
def wg(self, samples=None, session_groups=None):
1131	@make_verbal
1132	def wg(self,
1133		samples = None,
1134		session_groups = None,
1135	):
1136		'''
1137		Compute bulk composition of the working gas for each session based (by default)
1138		on the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1139		`self.Nominal_d18O_VPDB`.
1140
1141		**Parameters**
1142
1143		+ `samples`: A list of samples specifying the subset of samples (defined in both
1144		`self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`) which will be considered
1145		when computing the working gas. By default, use all samples defined both in
1146		`self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.
1147		+ `session_groups`: a list of lists of sessions
1148		(e.g., `[['session1', 'session2'], ['session3', 'session4', 'session5']]`)
1149		specifying which session groups, if any, have the exact same WG composition.
1150		If set to `'all'`, force all sessions to have the same WG composition (use with
1151		caution and only on short time scales, since the WG may drift slowly over long time scales).
1152		'''
1153
1154		self.msg('Computing WG composition:')
1155
1156		a18_acid = self.ALPHA_18O_ACID_REACTION
1157		
1158		if samples is None:
1159			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1160		if session_groups is None:
1161			session_groups = [[s] for s in self.sessions]
1162		elif session_groups == 'all':
1163			session_groups = [[s for s in self.sessions]]
1164
1165		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1166		R45R46_standards = {}
1167		for sample in samples:
1168			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1169			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1170			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1171			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1172			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1173
1174			C12_s = 1 / (1 + R13_s)
1175			C13_s = R13_s / (1 + R13_s)
1176			C16_s = 1 / (1 + R17_s + R18_s)
1177			C17_s = R17_s / (1 + R17_s + R18_s)
1178			C18_s = R18_s / (1 + R17_s + R18_s)
1179
1180			C626_s = C12_s * C16_s ** 2
1181			C627_s = 2 * C12_s * C16_s * C17_s
1182			C628_s = 2 * C12_s * C16_s * C18_s
1183			C636_s = C13_s * C16_s ** 2
1184			C637_s = 2 * C13_s * C16_s * C17_s
1185			C727_s = C12_s * C17_s ** 2
1186
1187			R45_s = (C627_s + C636_s) / C626_s
1188			R46_s = (C628_s + C637_s + C727_s) / C626_s
1189			R45R46_standards[sample] = (R45_s, R46_s)
1190		
1191		for sg in session_groups:
1192			db = [r for s in sg for r in self.sessions[s]['data'] if r['Sample'] in samples]
1193			assert db, f'No sample from {samples} found in session group {sg}.'
1194
1195			X = [r['d45'] for r in db]
1196			Y = [R45R46_standards[r['Sample']][0] for r in db]
1197			x1, x2 = np.min(X), np.max(X)
1198
1199			if x1 < x2:
1200				wgcoord = x1/(x1-x2)
1201			else:
1202				wgcoord = 999
1203
1204			if wgcoord < -.5 or wgcoord > 1.5:
1205				# unreasonable to extrapolate to d45 = 0
1206				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1207			else :
1208				# d45 = 0 is reasonably well bracketed
1209				R45_wg = np.polyfit(X, Y, 1)[1]
1210
1211			X = [r['d46'] for r in db]
1212			Y = [R45R46_standards[r['Sample']][1] for r in db]
1213			x1, x2 = np.min(X), np.max(X)
1214
1215			if x1 < x2:
1216				wgcoord = x1/(x1-x2)
1217			else:
1218				wgcoord = 999
1219
1220			if wgcoord < -.5 or wgcoord > 1.5:
1221				# unreasonable to extrapolate to d46 = 0
1222				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1223			else :
1224				# d46 = 0 is reasonably well bracketed
1225				R46_wg = np.polyfit(X, Y, 1)[1]
1226
1227			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1228
1229			for s in sg:
1230				self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1231	
1232				self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1233				self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1234				for r in self.sessions[s]['data']:
1235					r['d13Cwg_VPDB'] = d13Cwg_VPDB
1236					r['d18Owg_VSMOW'] = d18Owg_VSMOW

Compute bulk composition of the working gas for each session based (by default) on the carbonate standards defined in both self.Nominal_d13C_VPDB and self.Nominal_d18O_VPDB.

Parameters

  • samples: A list of samples specifying the subset of samples (defined in both self.Nominal_d13C_VPDB and self.Nominal_d18O_VPDB) which will be considered when computing the working gas. By default, use all samples defined both in self.Nominal_d13C_VPDB and self.Nominal_d18O_VPDB.
  • session_groups: a list of lists of sessions (e.g., [['session1', 'session2'], ['session3', 'session4', 'session5']]) specifying which session groups, if any, have the exact same WG composition. If set to 'all', force all sessions to have the same WG composition (use with caution and only over short time scales, since the WG may drift slowly over long time scales).
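For instance, assuming `mydata` is a `D47data` object with analyses already loaded, and with hypothetical session names `Session01`–`Session03`, one could force the first two sessions to share a single WG composition:

```py
# compute WG compositions, with 'Session01' and 'Session02' sharing one WG
# (session names are assumptions for illustration):
mydata.wg(session_groups = [['Session01', 'Session02'], ['Session03']])
```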
def compute_bulk_delta(self, R45, R46, D17O=0):
1239	def compute_bulk_delta(self, R45, R46, D17O = 0):
1240		'''
1241		Compute δ13C_VPDB and δ18O_VSMOW,
1242		by solving the generalized form of equation (17) from
1243		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1244		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
1245		solving the corresponding second-order Taylor polynomial.
1246		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
1247		'''
1248
1249		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1250
1251		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1252		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1253		C = 2 * self.R18_VSMOW
1254		D = -R46
1255
1256		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1257		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1258		cc = A + B + C + D
1259
1260		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1261
1262		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1263		R17 = K * R18 ** self.LAMBDA_17
1264		R13 = R45 - 2 * R17
1265
1266		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1267
1268		return d13C_VPDB, d18O_VSMOW

Compute δ13C_VPDB and δ18O_VSMOW, by solving the generalized form of equation (17) from Brand et al. (2010), assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and solving the corresponding second-order Taylor polynomial. (Appendix A of Daëron et al., 2016)
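A minimal sketch; the isobar ratios below are illustrative values chosen only to fall within the stated ±50 ‰ validity range:

```py
import D47crunch

mydata = D47crunch.D47data()
# illustrative (assumed) isobar ratios for a CO2 analyte:
d13C_VPDB, d18O_VSMOW = mydata.compute_bulk_delta(R45 = 0.01194, R46 = 0.00417)
print(f'{d13C_VPDB:.2f} permil VPDB, {d18O_VSMOW:.2f} permil VSMOW')
```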

@make_verbal
def crunch(self, verbose=''):
1271	@make_verbal
1272	def crunch(self, verbose = ''):
1273		'''
1274		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1275		'''
1276		for r in self:
1277			self.compute_bulk_and_clumping_deltas(r)
1278		self.standardize_d13C()
1279		self.standardize_d18O()
1280		self.msg(f"Crunched {len(self)} analyses.")

Compute bulk composition and raw clumped isotope anomalies for all analyses.

def fill_in_missing_info(self, session='mySession'):
1283	def fill_in_missing_info(self, session = 'mySession'):
1284		'''
1285		Fill in optional fields with default values
1286		'''
1287		for i,r in enumerate(self):
1288			if 'D17O' not in r:
1289				r['D17O'] = 0.
1290			if 'UID' not in r:
1291				r['UID'] = f'{i+1}'
1292			if 'Session' not in r:
1293				r['Session'] = session
1294			for k in ['d47', 'd48', 'd49']:
1295				if k not in r:
1296					r[k] = np.nan

Fill in optional fields with default values
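A minimal sketch, using a deliberately incomplete, hypothetical analysis:

```py
import D47crunch

mydata = D47crunch.D47data([{'Sample': 'FOO', 'd45': 1.0, 'd46': 2.0, 'd47': 3.0}])
mydata.fill_in_missing_info(session = 'Session01')
# the analysis now has default 'UID', 'Session' and 'D17O' fields,
# plus NaN placeholders for the missing 'd48' and 'd49' values:
print(mydata[0])
```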

def standardize_d13C(self):
1299	def standardize_d13C(self):
1300		'''
1301		Perform δ13C standardization within each session `s` according to
1302		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1303		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1304		may be redefined arbitrarily at a later stage.
1305		'''
1306		for s in self.sessions:
1307			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1308				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1309				X,Y = zip(*XY)
1310				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1311					offset = np.mean(Y) - np.mean(X)
1312					for r in self.sessions[s]['data']:
1313						r['d13C_VPDB'] += offset				
1314				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1315					a,b = np.polyfit(X,Y,1)
1316					for r in self.sessions[s]['data']:
1317						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b

Perform δ13C standardization within each session s according to self.sessions[s]['d13C_standardization_method'], which is defined by default by D47data.refresh_sessions() as equal to self.d13C_STANDARDIZATION_METHOD, but may be redefined arbitrarily at a later stage.
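For example, to switch one session (name assumed) to a single-point offset correction, assuming `mydata` already holds crunched analyses:

```py
# use a single-point offset correction for δ13C in one session:
mydata.sessions['Session01']['d13C_standardization_method'] = '1pt'
mydata.standardize_d13C()
```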

def standardize_d18O(self):
1319	def standardize_d18O(self):
1320		'''
1321		Perform δ18O standardization within each session `s` according to
1322		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1323		which is defined by default by `D47data.refresh_sessions()` as equal to
1324		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1325		'''
1326		for s in self.sessions:
1327			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1328				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1329				X,Y = zip(*XY)
1330				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1331				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1332					offset = np.mean(Y) - np.mean(X)
1333					for r in self.sessions[s]['data']:
1334						r['d18O_VSMOW'] += offset				
1335				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1336					a,b = np.polyfit(X,Y,1)
1337					for r in self.sessions[s]['data']:
1338						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b

Perform δ18O standardization within each session s according to self.ALPHA_18O_ACID_REACTION and self.sessions[s]['d18O_standardization_method'], which is defined by default by D47data.refresh_sessions() as equal to self.d18O_STANDARDIZATION_METHOD, but may be redefined arbitrarily at a later stage.
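Similarly, a sketch switching every session to a two-point (slope and offset) δ18O correction:

```py
for s in mydata.sessions:
	mydata.sessions[s]['d18O_standardization_method'] = '2pt'
mydata.standardize_d18O()
```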

def compute_bulk_and_clumping_deltas(self, r):
1341	def compute_bulk_and_clumping_deltas(self, r):
1342		'''
1343		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1344		'''
1345
1346		# Compute working gas R13, R18, and isobar ratios
1347		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1348		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1349		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1350
1351		# Compute analyte isobar ratios
1352		R45 = (1 + r['d45'] / 1000) * R45_wg
1353		R46 = (1 + r['d46'] / 1000) * R46_wg
1354		R47 = (1 + r['d47'] / 1000) * R47_wg
1355		R48 = (1 + r['d48'] / 1000) * R48_wg
1356		R49 = (1 + r['d49'] / 1000) * R49_wg
1357
1358		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1359		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1360		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1361
1362		# Compute stochastic isobar ratios of the analyte
1363		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1364			R13, R18, D17O = r['D17O']
1365		)
1366
1367		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1368		# and raise a warning if the corresponding anomalies exceed 0.05 ppm.
1369		if (R45 / R45stoch - 1) > 5e-8:
1370			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1371		if (R46 / R46stoch - 1) > 5e-8:
1372			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1373
1374		# Compute raw clumped isotope anomalies
1375		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1376		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1377		r['D49raw'] = 1000 * (R49 / R49stoch - 1)

Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis r.

def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1380	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1381		'''
1382		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1383		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1384		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1385		'''
1386
1387		# Compute R17
1388		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1389
1390		# Compute isotope concentrations
1391		C12 = (1 + R13) ** -1
1392		C13 = C12 * R13
1393		C16 = (1 + R17 + R18) ** -1
1394		C17 = C16 * R17
1395		C18 = C16 * R18
1396
1397		# Compute stochastic isotopologue concentrations
1398		C626 = C16 * C12 * C16
1399		C627 = C16 * C12 * C17 * 2
1400		C628 = C16 * C12 * C18 * 2
1401		C636 = C16 * C13 * C16
1402		C637 = C16 * C13 * C17 * 2
1403		C638 = C16 * C13 * C18 * 2
1404		C727 = C17 * C12 * C17
1405		C728 = C17 * C12 * C18 * 2
1406		C737 = C17 * C13 * C17
1407		C738 = C17 * C13 * C18 * 2
1408		C828 = C18 * C12 * C18
1409		C838 = C18 * C13 * C18
1410
1411		# Compute stochastic isobar ratios
1412		R45 = (C636 + C627) / C626
1413		R46 = (C628 + C637 + C727) / C626
1414		R47 = (C638 + C728 + C737) / C626
1415		R48 = (C738 + C828) / C626
1416		R49 = C838 / C626
1417
1418		# Account for stochastic anomalies
1419		R47 *= 1 + D47 / 1000
1420		R48 *= 1 + D48 / 1000
1421		R49 *= 1 + D49 / 1000
1422
1423		# Return isobar ratios
1424		return R45, R46, R47, R48, R49

Compute isobar ratios for a sample with isotopic ratios R13 and R18, optionally accounting for non-zero values of Δ17O (D17O) and clumped isotope anomalies (D47, D48, D49), all expressed in permil.
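A minimal sketch; the `R13` and `R18` values below are assumptions for illustration:

```py
import D47crunch

mydata = D47crunch.D47data()
# stochastic isobar ratios, with a +0.3 permil clumped anomaly added to R47:
R45, R46, R47, R48, R49 = mydata.compute_isobar_ratios(
	R13 = 0.01118,
	R18 = 0.00208,
	D47 = 0.3,
	)
```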

def split_samples(self, samples_to_split='all', grouping='by_session'):
1427	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1428		'''
1429		Split unknown samples by UID (treat all analyses as different samples)
1430		or by session (treat analyses of a given sample in different sessions as
1431		different samples).
1432
1433		**Parameters**
1434
1435		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
1436		+ `grouping`: `by_uid` | `by_session`
1437		'''
1438		if samples_to_split == 'all':
1439			samples_to_split = [s for s in self.unknowns]
1440		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1441		self.grouping = grouping.lower()
1442		if self.grouping in gkeys:
1443			gkey = gkeys[self.grouping]
1444		for r in self:
1445			if r['Sample'] in samples_to_split:
1446				r['Sample_original'] = r['Sample']
1447				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1448			elif r['Sample'] in self.unknowns:
1449				r['Sample_original'] = r['Sample']
1450		self.refresh_samples()

Split unknown samples by UID (treat all analyses as different samples) or by session (treat analyses of a given sample in different sessions as different samples).

Parameters

  • samples_to_split: a list of samples to split, e.g., ['IAEA-C1', 'IAEA-C2']
  • grouping: by_uid | by_session
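A typical round trip, assuming `'IAEA-C1'` is one of the unknowns in `mydata` (see `unsplit_samples()` below):

```py
# treat each session's analyses of 'IAEA-C1' as a separate sample:
mydata.split_samples(['IAEA-C1'], grouping = 'by_session')
mydata.standardize()
# then merge the per-session results back into a single sample:
mydata.unsplit_samples()
```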
def unsplit_samples(self, tables=False):
1453	def unsplit_samples(self, tables = False):
1454		'''
1455		Reverse the effects of `D47data.split_samples()`.
1456		
1457		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1458		
1459		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1460		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1461		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1462		effects of `D47data.split_samples()` with `grouping='by_session'` (because in
1463		that case session-averaged Δ4x values are statistically independent).
1464		'''
1465		unknowns_old = sorted({s for s in self.unknowns})
1466		CM_old = self.standardization.covar[:,:]
1467		VD_old = self.standardization.params.valuesdict().copy()
1468		vars_old = self.standardization.var_names
1469
1470		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1471
1472		Ns = len(vars_old) - len(unknowns_old)
1473		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1474		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1475
1476		W = np.zeros((len(vars_new), len(vars_old)))
1477		W[:Ns,:Ns] = np.eye(Ns)
1478		for u in unknowns_new:
1479			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1480			if self.grouping == 'by_session':
1481				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1482			elif self.grouping == 'by_uid':
1483				weights = [1 for s in splits]
1484			sw = sum(weights)
1485			weights = [w/sw for w in weights]
1486			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1487
1488		CM_new = W @ CM_old @ W.T
1489		V = W @ np.array([[VD_old[k]] for k in vars_old])
1490		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1491
1492		self.standardization.covar = CM_new
1493		self.standardization.params.valuesdict = lambda : VD_new
1494		self.standardization.var_names = vars_new
1495
1496		for r in self:
1497			if r['Sample'] in self.unknowns:
1498				r['Sample_split'] = r['Sample']
1499				r['Sample'] = r['Sample_original']
1500
1501		self.refresh_samples()
1502		self.consolidate_samples()
1503		self.repeatabilities()
1504
1505		if tables:
1506			self.table_of_analyses()
1507			self.table_of_samples()

Reverse the effects of D47data.split_samples().

This should only be used after D4xdata.standardize() with method='pooled'.

After D4xdata.standardize() with method='indep_sessions', one should probably use D4xdata.combine_samples() instead to reverse the effects of D47data.split_samples() with grouping='by_uid', or w_avg() to reverse the effects of D47data.split_samples() with grouping='by_session' (because in that case session-averaged Δ4x values are statistically independent).

def assign_timestamps(self):
1509	def assign_timestamps(self):
1510		'''
1511		Assign a time field `t` of type `float` to each analysis.
1512
1513		If `TimeTag` is one of the data fields, `t` is equal within a given session
1514		to `TimeTag` minus the mean value of `TimeTag` for that session.
1515		Otherwise, `TimeTag` defaults to the index of each analysis
1516		within its session and `t` is defined as above.
1517		'''
1518		for session in self.sessions:
1519			sdata = self.sessions[session]['data']
1520			try:
1521				t0 = np.mean([r['TimeTag'] for r in sdata])
1522				for r in sdata:
1523					r['t'] = r['TimeTag'] - t0
1524			except KeyError:
1525				t0 = (len(sdata)-1)/2
1526				for t,r in enumerate(sdata):
1527					r['t'] = t - t0

Assign a time field t of type float to each analysis.

If TimeTag is one of the data fields, t is equal within a given session to TimeTag minus the mean value of TimeTag for that session. Otherwise, TimeTag defaults to the index of each analysis within its session and t is defined as above.
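`D4xdata.standardize()` calls this method automatically, so explicit calls are rarely needed; a minimal sketch nonetheless:

```py
# assuming each analysis carries a numeric 'TimeTag' field:
mydata.assign_timestamps()
print(mydata[0]['t'])  # session-centered time of the first analysis
```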

def report(self):
1530	def report(self):
1531		'''
1532		Prints a report on the standardization fit.
1533		Only applicable after `D4xdata.standardize(method='pooled')`.
1534		'''
1535		report_fit(self.standardization)

Prints a report on the standardization fit. Only applicable after D4xdata.standardize(method='pooled').

def combine_samples(self, sample_groups):
1538	def combine_samples(self, sample_groups):
1539		'''
1540		Combine analyses of different samples to compute weighted average Δ4x
1541		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1542		dictionary.
1543		
1544		Caution: samples are weighted by number of replicate analyses, which is a
1545		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1546		correlated analytical errors for one or more samples).
1547		
1548		Returns a tuple of:
1549		
1550		+ the list of group names
1551		+ an array of the corresponding Δ4x values
1552		+ the corresponding (co)variance matrix
1553		
1554		**Parameters**
1555
1556		+ `sample_groups`: a dictionary of the form:
1557		```py
1558		{'group1': ['sample_1', 'sample_2'],
1559		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1560		```
1561		'''
1562		
1563		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1564		groups = sorted(sample_groups.keys())
1565		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1566		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1567		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1568		W = np.array([
1569			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1570			for j in groups])
1571		D4x_new = W @ D4x_old
1572		CM_new = W @ CM_old @ W.T
1573
1574		return groups, D4x_new[:,0], CM_new

Combine analyses of different samples to compute weighted average Δ4x and new error (co)variances corresponding to the groups defined by the sample_groups dictionary.

Caution: samples are weighted by number of replicate analyses, which is a reasonable default behavior but is not always optimal (e.g., in the case of strongly correlated analytical errors for one or more samples).

Returns a tuple of:

  • the list of group names
  • an array of the corresponding Δ4x values
  • the corresponding (co)variance matrix

Parameters

  • sample_groups: a dictionary of the form:
{'group1': ['sample_1', 'sample_2'],
 'group2': ['sample_3', 'sample_4', 'sample_5']}
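A sketch, assuming `mydata` has already been standardized and that the sample names below exist among its unknowns:

```py
groups, D4x, CM = mydata.combine_samples({
	'group1': ['sample_1', 'sample_2'],
	'group2': ['sample_3'],
	})
```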
@make_verbal
def standardize( self, method='pooled', weighted_sessions=[], consolidate=True, consolidate_tables=False, consolidate_plots=False, constraints={}):
1577	@make_verbal
1578	def standardize(self,
1579		method = 'pooled',
1580		weighted_sessions = [],
1581		consolidate = True,
1582		consolidate_tables = False,
1583		consolidate_plots = False,
1584		constraints = {},
1585		):
1586		'''
1587		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1588		If `method` argument is set to `'pooled'`, the standardization processes all sessions
1589		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1590		i.e. that their true Δ4x value does not change between sessions,
1591		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
1592		`'indep_sessions'`, the standardization processes each session independently, based only
1593		on anchors analyses.
1594		'''
1595
1596		self.standardization_method = method
1597		self.assign_timestamps()
1598
1599		if method == 'pooled':
1600			if weighted_sessions:
1601				for session_group in weighted_sessions:
1602					if self._4x == '47':
1603						X = D47data([r for r in self if r['Session'] in session_group])
1604					elif self._4x == '48':
1605						X = D48data([r for r in self if r['Session'] in session_group])
1606					X.Nominal_D4x = self.Nominal_D4x.copy()
1607					X.refresh()
1608					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1609					w = np.sqrt(result.redchi)
1610					self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
1611					for r in X:
1612						r[f'wD{self._4x}raw'] *= w
1613			else:
1614				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1615				for r in self:
1616					r[f'wD{self._4x}raw'] = 1.
1617
1618			params = Parameters()
1619			for k,session in enumerate(self.sessions):
1620				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1621				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1622				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1623				s = pf(session)
1624				params.add(f'a_{s}', value = 0.9)
1625				params.add(f'b_{s}', value = 0.)
1626				params.add(f'c_{s}', value = -0.9)
1627				params.add(f'a2_{s}', value = 0.,
1628# 					vary = self.sessions[session]['scrambling_drift'],
1629					)
1630				params.add(f'b2_{s}', value = 0.,
1631# 					vary = self.sessions[session]['slope_drift'],
1632					)
1633				params.add(f'c2_{s}', value = 0.,
1634# 					vary = self.sessions[session]['wg_drift'],
1635					)
1636				if not self.sessions[session]['scrambling_drift']:
1637					params[f'a2_{s}'].expr = '0'
1638				if not self.sessions[session]['slope_drift']:
1639					params[f'b2_{s}'].expr = '0'
1640				if not self.sessions[session]['wg_drift']:
1641					params[f'c2_{s}'].expr = '0'
1642
1643			for sample in self.unknowns:
1644				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1645
1646			for k in constraints:
1647				params[k].expr = constraints[k]
1648
1649			def residuals(p):
1650				R = []
1651				for r in self:
1652					session = pf(r['Session'])
1653					sample = pf(r['Sample'])
1654					if r['Sample'] in self.Nominal_D4x:
1655						R += [ (
1656							r[f'D{self._4x}raw'] - (
1657								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1658								+ p[f'b_{session}'] * r[f'd{self._4x}']
1659								+	p[f'c_{session}']
1660								+ r['t'] * (
1661									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1662									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1663									+	p[f'c2_{session}']
1664									)
1665								)
1666							) / r[f'wD{self._4x}raw'] ]
1667					else:
1668						R += [ (
1669							r[f'D{self._4x}raw'] - (
1670								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1671								+ p[f'b_{session}'] * r[f'd{self._4x}']
1672								+	p[f'c_{session}']
1673								+ r['t'] * (
1674									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1675									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1676									+	p[f'c2_{session}']
1677									)
1678								)
1679							) / r[f'wD{self._4x}raw'] ]
1680				return R
1681
1682			M = Minimizer(residuals, params)
1683			result = M.least_squares()
1684			self.Nf = result.nfree
1685			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1686			new_names, new_covar, new_se = _fullcovar(result)[:3]
1687			result.var_names = new_names
1688			result.covar = new_covar
1689
1690			for r in self:
1691				s = pf(r["Session"])
1692				a = result.params.valuesdict()[f'a_{s}']
1693				b = result.params.valuesdict()[f'b_{s}']
1694				c = result.params.valuesdict()[f'c_{s}']
1695				a2 = result.params.valuesdict()[f'a2_{s}']
1696				b2 = result.params.valuesdict()[f'b2_{s}']
1697				c2 = result.params.valuesdict()[f'c2_{s}']
1698				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1699				
1700
1701			self.standardization = result
1702
1703			for session in self.sessions:
1704				self.sessions[session]['Np'] = 3
1705				for k in ['scrambling', 'slope', 'wg']:
1706					if self.sessions[session][f'{k}_drift']:
1707						self.sessions[session]['Np'] += 1
1708
1709			if consolidate:
1710				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1711			return result
1712
1713
1714		elif method == 'indep_sessions':
1715
1716			if weighted_sessions:
1717				for session_group in weighted_sessions:
1718					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1719					X.Nominal_D4x = self.Nominal_D4x.copy()
1720					X.refresh()
1721					# This is only done to assign r['wD47raw'] for r in X:
1722					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1723					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1724			else:
1725				self.msg('All weights set to 1 ‰')
1726				for r in self:
1727					r[f'wD{self._4x}raw'] = 1
1728
1729			for session in self.sessions:
1730				s = self.sessions[session]
1731				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1732				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1733				s['Np'] = sum(p_active)
1734				sdata = s['data']
1735
1736				A = np.array([
1737					[
1738						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1739						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1740						1 / r[f'wD{self._4x}raw'],
1741						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1742						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1743						r['t'] / r[f'wD{self._4x}raw']
1744						]
1745					for r in sdata if r['Sample'] in self.anchors
1746					])[:,p_active] # only keep columns for the active parameters
1747				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1748				s['Na'] = Y.size
1749				CM = linalg.inv(A.T @ A)
1750				bf = (CM @ A.T @ Y).T[0,:]
1751				k = 0
1752				for n,a in zip(p_names, p_active):
1753					if a:
1754						s[n] = bf[k]
1755# 						self.msg(f'{n} = {bf[k]}')
1756						k += 1
1757					else:
1758						s[n] = 0.
1759# 						self.msg(f'{n} = 0.0')
1760
1761				for r in sdata :
1762					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1763					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1764					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1765
1766				s['CM'] = np.zeros((6,6))
1767				i = 0
1768				k_active = [j for j,a in enumerate(p_active) if a]
1769				for j,a in enumerate(p_active):
1770					if a:
1771						s['CM'][j,k_active] = CM[i,:]
1772						i += 1
1773
1774			if not weighted_sessions:
1775				w = self.rmswd()['rmswd']
1776				for r in self:
1777						r[f'wD{self._4x}'] *= w
1778						r[f'wD{self._4x}raw'] *= w
1779				for session in self.sessions:
1780					self.sessions[session]['CM'] *= w**2
1781
1782			for session in self.sessions:
1783				s = self.sessions[session]
1784				s['SE_a'] = s['CM'][0,0]**.5
1785				s['SE_b'] = s['CM'][1,1]**.5
1786				s['SE_c'] = s['CM'][2,2]**.5
1787				s['SE_a2'] = s['CM'][3,3]**.5
1788				s['SE_b2'] = s['CM'][4,4]**.5
1789				s['SE_c2'] = s['CM'][5,5]**.5
1790
1791			if not weighted_sessions:
1792				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1793			else:
1794				self.Nf = 0
1795				for sg in weighted_sessions:
1796					self.Nf += self.rmswd(sessions = sg)['Nf']
1797
1798			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1799
1800			avgD4x = {
1801				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1802				for sample in self.samples
1803				}
1804			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1805			rD4x = (chi2/self.Nf)**.5
1806			self.repeatability[f'sigma_{self._4x}'] = rD4x
1807
1808			if consolidate:
1809				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)

Compute absolute Δ4x values for all replicate analyses and for sample averages. If method argument is set to 'pooled', the standardization processes all sessions in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous, i.e. that their true Δ4x value does not change between sessions, (Daëron, 2021). If method argument is set to 'indep_sessions', the standardization processes each session independently, based only on anchors analyses.
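Typical calls (the constraint example is hypothetical and assumes sessions named `Session01` and `Session02`):

```py
# pooled standardization (the default):
mydata.standardize()

# session-by-session standardization, based on anchor analyses only:
mydata.standardize(method = 'indep_sessions')

# hypothetical constraint forcing two sessions to share the same
# scrambling factor in a pooled fit:
mydata.standardize(constraints = {'a_Session02': 'a_Session01'})
```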

def standardization_error(self, session, d4x, D4x, t=0):
1812	def standardization_error(self, session, d4x, D4x, t = 0):
1813		'''
1814		Compute standardization error for a given session and
1815		(δ47, Δ47) composition.
1816		'''
1817		a = self.sessions[session]['a']
1818		b = self.sessions[session]['b']
1819		c = self.sessions[session]['c']
1820		a2 = self.sessions[session]['a2']
1821		b2 = self.sessions[session]['b2']
1822		c2 = self.sessions[session]['c2']
1823		CM = self.sessions[session]['CM']
1824
1825		x, y = D4x, d4x
1826		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1827# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1828		dxdy = -(b+b2*t) / (a+a2*t)
1829		dxdz = 1. / (a+a2*t)
1830		dxda = -x / (a+a2*t)
1831		dxdb = -y / (a+a2*t)
1832		dxdc = -1. / (a+a2*t)
1833		dxda2 = -x * a2 / (a+a2*t)
1834		dxdb2 = -y * t / (a+a2*t)
1835		dxdc2 = -t / (a+a2*t)
1836		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1837		sx = (V @ CM @ V.T) ** .5
1838		return sx

Compute standardization error for a given session and (δ47, Δ47) composition.
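A sketch, assuming `mydata` has been standardized so that the session's covariance matrix `CM` is defined (the session name is hypothetical):

```py
# standardization error at (δ47 = 20 ‰, Δ47 = 0.6 ‰):
sx = mydata.standardization_error('Session01', d4x = 20., D4x = 0.6)
```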

@make_verbal
def summary(self, dir='output', filename=None, save_to_file=True, print_out=True):
1841	@make_verbal
1842	def summary(self,
1843		dir = 'output',
1844		filename = None,
1845		save_to_file = True,
1846		print_out = True,
1847		):
1848		'''
1849		Print out and/or save to disk a summary of the standardization results.
1850
1851		**Parameters**
1852
1853		+ `dir`: the directory in which to save the table
1854		+ `filename`: the name of the csv file to write to
1855		+ `save_to_file`: whether to save the table to disk
1856		+ `print_out`: whether to print out the table
1857		'''
1858
1859		out = []
1860		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1861		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1862		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1863		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1864		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1865		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1866		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1867		out += [['Model degrees of freedom', f"{self.Nf}"]]
1868		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1869		out += [['Standardization method', self.standardization_method]]
1870
1871		if save_to_file:
1872			if not os.path.exists(dir):
1873				os.makedirs(dir)
1874			if filename is None:
1875				filename = f'D{self._4x}_summary.csv'
1876			with open(f'{dir}/{filename}', 'w') as fid:
1877				fid.write(make_csv(out))
1878		if print_out:
1879			self.msg('\n' + pretty_table(out, header = 0))

Print out and/or save to disk a summary of the standardization results.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
@make_verbal
def table_of_sessions( self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
1882	@make_verbal
1883	def table_of_sessions(self,
1884		dir = 'output',
1885		filename = None,
1886		save_to_file = True,
1887		print_out = True,
1888		output = None,
1889		):
1890		'''
1891		Print out and/or save to disk a table of sessions.
1892
1893		**Parameters**
1894
1895		+ `dir`: the directory in which to save the table
1896		+ `filename`: the name of the csv file to write to
1897		+ `save_to_file`: whether to save the table to disk
1898		+ `print_out`: whether to print out the table
1899		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1900		    if set to `'raw'`: return a list of list of strings
1901		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1902		'''
1903		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1904		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1905		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1906
1907		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1908		if include_a2:
1909			out[-1] += ['a2 ± SE']
1910		if include_b2:
1911			out[-1] += ['b2 ± SE']
1912		if include_c2:
1913			out[-1] += ['c2 ± SE']
1914		for session in self.sessions:
1915			out += [[
1916				session,
1917				f"{self.sessions[session]['Na']}",
1918				f"{self.sessions[session]['Nu']}",
1919				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1920				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1921				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1922				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1923				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1924				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1925				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1926				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1927				]]
1928			if include_a2:
1929				if self.sessions[session]['scrambling_drift']:
1930					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1931				else:
1932					out[-1] += ['']
1933			if include_b2:
1934				if self.sessions[session]['slope_drift']:
1935					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1936				else:
1937					out[-1] += ['']
1938			if include_c2:
1939				if self.sessions[session]['wg_drift']:
1940					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1941				else:
1942					out[-1] += ['']
1943
1944		if save_to_file:
1945			if not os.path.exists(dir):
1946				os.makedirs(dir)
1947			if filename is None:
1948				filename = f'D{self._4x}_sessions.csv'
1949			with open(f'{dir}/{filename}', 'w') as fid:
1950				fid.write(make_csv(out))
1951		if print_out:
1952			self.msg('\n' + pretty_table(out))
1953		if output == 'raw':
1954			return out
1955		elif output == 'pretty':
1956			return pretty_table(out)

Print out and/or save to disk a table of sessions.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
@make_verbal
def table_of_analyses( self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
1959	@make_verbal
1960	def table_of_analyses(
1961		self,
1962		dir = 'output',
1963		filename = None,
1964		save_to_file = True,
1965		print_out = True,
1966		output = None,
1967		):
1968		'''
1969		Print out and/or save to disk a table of analyses.
1970
1971		**Parameters**
1972
1973		+ `dir`: the directory in which to save the table
1974		+ `filename`: the name of the csv file to write to
1975		+ `save_to_file`: whether to save the table to disk
1976		+ `print_out`: whether to print out the table
1977		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1978		    if set to `'raw'`: return a list of list of strings
1979		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1980		'''
1981
1982		out = [['UID','Session','Sample']]
1983		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1984		for f in extra_fields:
1985			out[-1] += [f[0]]
1986		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1987		for r in self:
1988			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
1989			for f in extra_fields:
1990				out[-1] += [f"{r[f[0]]:{f[1]}}"]
1991			out[-1] += [
1992				f"{r['d13Cwg_VPDB']:.3f}",
1993				f"{r['d18Owg_VSMOW']:.3f}",
1994				f"{r['d45']:.6f}",
1995				f"{r['d46']:.6f}",
1996				f"{r['d47']:.6f}",
1997				f"{r['d48']:.6f}",
1998				f"{r['d49']:.6f}",
1999				f"{r['d13C_VPDB']:.6f}",
2000				f"{r['d18O_VSMOW']:.6f}",
2001				f"{r['D47raw']:.6f}",
2002				f"{r['D48raw']:.6f}",
2003				f"{r['D49raw']:.6f}",
2004				f"{r[f'D{self._4x}']:.6f}"
2005				]
2006		if save_to_file:
2007			if not os.path.exists(dir):
2008				os.makedirs(dir)
2009			if filename is None:
2010				filename = f'D{self._4x}_analyses.csv'
2011			with open(f'{dir}/{filename}', 'w') as fid:
2012				fid.write(make_csv(out))
2013		if print_out:
2014			self.msg('\n' + pretty_table(out))
2015		return out

Print out and/or save to disk a table of analyses.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
@make_verbal
def covar_table( self, correl=False, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
2017	@make_verbal
2018	def covar_table(
2019		self,
2020		correl = False,
2021		dir = 'output',
2022		filename = None,
2023		save_to_file = True,
2024		print_out = True,
2025		output = None,
2026		):
2027		'''
2028		Print out, save to disk and/or return the variance-covariance matrix of D4x
2029		for all unknown samples.
2030
2031		**Parameters**
2032
2033		+ `dir`: the directory in which to save the csv
2034		+ `filename`: the name of the csv file to write to
2035		+ `save_to_file`: whether to save the csv
2036		+ `print_out`: whether to print out the matrix
2037		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
2038		    if set to `'raw'`: return a list of list of strings
2039		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2040		'''
2041		samples = sorted([u for u in self.unknowns])
2042		out = [[''] + samples]
2043		for s1 in samples:
2044			out.append([s1])
2045			for s2 in samples:
2046				if correl:
2047					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
2048				else:
2049					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
2050
2051		if save_to_file:
2052			if not os.path.exists(dir):
2053				os.makedirs(dir)
2054			if filename is None:
2055				if correl:
2056					filename = f'D{self._4x}_correl.csv'
2057				else:
2058					filename = f'D{self._4x}_covar.csv'
2059			with open(f'{dir}/{filename}', 'w') as fid:
2060				fid.write(make_csv(out))
2061		if print_out:
2062			self.msg('\n'+pretty_table(out))
2063		if output == 'raw':
2064			return out
2065		elif output == 'pretty':
2066			return pretty_table(out)

Print out, save to disk and/or return the variance-covariance matrix of D4x for all unknown samples.

Parameters

  • dir: the directory in which to save the csv
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the csv
  • print_out: whether to print out the matrix
  • output: if set to 'pretty': return a pretty text matrix (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
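For instance, after standardization:

```py
# print the Δ47 correlation matrix of all unknowns without saving it:
mydata.covar_table(correl = True, save_to_file = False)
```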
@make_verbal
def table_of_samples( self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
2068	@make_verbal
2069	def table_of_samples(
2070		self,
2071		dir = 'output',
2072		filename = None,
2073		save_to_file = True,
2074		print_out = True,
2075		output = None,
2076		):
2077		'''
2078		Print out, save to disk and/or return a table of samples.
2079
2080		**Parameters**
2081
2082		+ `dir`: the directory in which to save the csv
2083		+ `filename`: the name of the csv file to write to
2084		+ `save_to_file`: whether to save the csv
2085		+ `print_out`: whether to print out the table
2086		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2087		    if set to `'raw'`: return a list of list of strings
2088		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2089		'''
2090
2091		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2092		for sample in self.anchors:
2093			out += [[
2094				f"{sample}",
2095				f"{self.samples[sample]['N']}",
2096				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2097				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2098				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2099				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2100				]]
2101		for sample in self.unknowns:
2102			out += [[
2103				f"{sample}",
2104				f"{self.samples[sample]['N']}",
2105				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2106				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2107				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2108				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2109				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2110				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2111				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2112				]]
2113		if save_to_file:
2114			if not os.path.exists(dir):
2115				os.makedirs(dir)
2116			if filename is None:
2117				filename = f'D{self._4x}_samples.csv'
2118			with open(f'{dir}/{filename}', 'w') as fid:
2119				fid.write(make_csv(out))
2120		if print_out:
2121			self.msg('\n'+pretty_table(out))
2122		if output == 'raw':
2123			return out
2124		elif output == 'pretty':
2125			return pretty_table(out)

Print out, save to disk and/or return a table of samples.

Parameters

  • dir: the directory in which to save the csv
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the csv
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
def plot_sessions(self, dir='output', figsize=(8, 8), filetype='pdf', dpi=100):
2128	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2129		'''
2130		Generate session plots and save them to disk.
2131
2132		**Parameters**
2133
2134		+ `dir`: the directory in which to save the plots
2135		+ `figsize`: the width and height (in inches) of each plot
2136		+ `filetype`: 'pdf' or 'png'
2137		+ `dpi`: resolution for PNG output
2138		'''
2139		if not os.path.exists(dir):
2140			os.makedirs(dir)
2141
2142		for session in self.sessions:
2143			sp = self.plot_single_session(session, xylimits = 'constant')
2144			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2145			ppl.close(sp.fig)

Generate session plots and save them to disk.

Parameters

  • dir: the directory in which to save the plots
  • figsize: the width and height (in inches) of each plot
  • filetype: 'pdf' or 'png'
  • dpi: resolution for PNG output
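For instance:

```py
# save one PNG plot per session, at 200 dpi, in a custom directory:
mydata.plot_sessions(dir = 'myplots', filetype = 'png', dpi = 200)
```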
@make_verbal
def consolidate_samples(self):
2149	@make_verbal
2150	def consolidate_samples(self):
2151		'''
2152		Compile various statistics for each sample.
2153
2154		For each anchor sample:
2155
2156		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2157		+ `SE_D47` or `SE_D48`: set to zero by definition
2158
2159		For each unknown sample:
2160
2161		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2162		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2163
2164		For each anchor and unknown:
2165
2166		+ `N`: the total number of analyses of this sample
2167		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2168		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2169		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2170		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2171		variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2172		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
2173		'''
2174		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2175		for sample in self.samples:
2176			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2177			if self.samples[sample]['N'] > 1:
2178				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2179
2180			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2181			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2182
2183			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2184			if len(D4x_pop) > 2:
2185				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2186			
2187		if self.standardization_method == 'pooled':
2188			for sample in self.anchors:
2189				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2190				self.samples[sample][f'SE_D{self._4x}'] = 0.
2191			for sample in self.unknowns:
2192				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2193				try:
2194					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2195				except ValueError:
2196					# when `sample` is constrained by self.standardize(constraints = {...}),
2197					# it is no longer listed in self.standardization.var_names.
2198					# Temporary fix: define SE as zero for now
2199					self.samples[sample][f'SE_D{self._4x}'] = 0.
2200
2201		elif self.standardization_method == 'indep_sessions':
2202			for sample in self.anchors:
2203				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2204				self.samples[sample][f'SE_D{self._4x}'] = 0.
2205			for sample in self.unknowns:
2206				self.msg(f'Consolidating sample {sample}')
2207				self.unknowns[sample][f'session_D{self._4x}'] = {}
2208				session_avg = []
2209				for session in self.sessions:
2210					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2211					if sdata:
2212						self.msg(f'{sample} found in session {session}')
2213						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2214						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2215						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2216						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2217						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2218						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2219						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2220				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2221				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2222				wsum = sum([weights[s] for s in weights])
2223				for s in weights:
2224					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2225
2226		for r in self:
2227			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']

Compile various statistics for each sample.

For each anchor sample:

  • D47 or D48: the nominal Δ4x value for this anchor, specified by self.Nominal_D4x
  • SE_D47 or SE_D48: set to zero by definition

For each unknown sample:

  • D47 or D48: the standardized Δ4x value for this unknown
  • SE_D47 or SE_D48: the standard error of Δ4x for this unknown

For each anchor and unknown:

  • N: the total number of analyses of this sample
  • SD_D47 or SD_D48: the “sample” (in the statistical sense) standard deviation for this sample
  • d13C_VPDB: the average δ13C_VPDB value for this sample
  • d18O_VSMOW: the average δ18O_VSMOW value for this sample (as CO2)
  • p_Levene: the p-value from a Levene test of equal variance, indicating whether the Δ4x repeatability of this sample differs significantly from that observed for the reference sample specified by self.LEVENE_REF_SAMPLE.
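A sketch of reading these statistics back, for a `D47data` object that has been standardized and consolidated:

```py
for sample in mydata.unknowns:
	print(sample, mydata.samples[sample]['D47'], mydata.samples[sample]['SE_D47'])
```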
def consolidate_sessions(self):
2231	def consolidate_sessions(self):
2232		'''
2233		Compute various statistics for each session.
2234
2235		+ `Na`: Number of anchor analyses in the session
2236		+ `Nu`: Number of unknown analyses in the session
2237		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2238		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2239		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2240		+ `a`: scrambling factor
2241		+ `b`: compositional slope
2242		+ `c`: WG offset
2243		+ `SE_a`: Model standard error of `a`
2244		+ `SE_b`: Model standard error of `b`
2245		+ `SE_c`: Model standard error of `c`
2246		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2247		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2248		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2249		+ `a2`: scrambling factor drift
2250		+ `b2`: compositional slope drift
2251		+ `c2`: WG offset drift
2252		+ `Np`: Number of standardization parameters to fit
2253		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2254		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2255		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
2256		'''
2257		for session in self.sessions:
2258			if 'd13Cwg_VPDB' not in self.sessions[session]:
2259				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2260			if 'd18Owg_VSMOW' not in self.sessions[session]:
2261				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2262			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2263			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2264
2265			self.msg(f'Computing repeatabilities for session {session}')
2266			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2267			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2268			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2269
2270		if self.standardization_method == 'pooled':
2271			for session in self.sessions:
2272
2273				# different (better?) computation of D4x repeatability for each session:
2274				sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
2275				self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5
2276
2277				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2278				i = self.standardization.var_names.index(f'a_{pf(session)}')
2279				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2280
2281				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2282				i = self.standardization.var_names.index(f'b_{pf(session)}')
2283				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2284
2285				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2286				i = self.standardization.var_names.index(f'c_{pf(session)}')
2287				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2288
2289				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2290				if self.sessions[session]['scrambling_drift']:
2291					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2292					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2293				else:
2294					self.sessions[session]['SE_a2'] = 0.
2295
2296				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2297				if self.sessions[session]['slope_drift']:
2298					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2299					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2300				else:
2301					self.sessions[session]['SE_b2'] = 0.
2302
2303				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2304				if self.sessions[session]['wg_drift']:
2305					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2306					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2307				else:
2308					self.sessions[session]['SE_c2'] = 0.
2309
2310				i = self.standardization.var_names.index(f'a_{pf(session)}')
2311				j = self.standardization.var_names.index(f'b_{pf(session)}')
2312				k = self.standardization.var_names.index(f'c_{pf(session)}')
2313				CM = np.zeros((6,6))
2314				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2315				try:
2316					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2317					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2318					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2319					try:
2320						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2321						CM[3,4] = self.standardization.covar[i2,j2]
2322						CM[4,3] = self.standardization.covar[j2,i2]
2323					except ValueError:
2324						pass
2325					try:
2326						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2327						CM[3,5] = self.standardization.covar[i2,k2]
2328						CM[5,3] = self.standardization.covar[k2,i2]
2329					except ValueError:
2330						pass
2331				except ValueError:
2332					pass
2333				try:
2334					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2335					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2336					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2337					try:
2338						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2339						CM[4,5] = self.standardization.covar[j2,k2]
2340						CM[5,4] = self.standardization.covar[k2,j2]
2341					except ValueError:
2342						pass
2343				except ValueError:
2344					pass
2345				try:
2346					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2347					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2348					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2349				except ValueError:
2350					pass
2351
2352				self.sessions[session]['CM'] = CM
2353
2354		elif self.standardization_method == 'indep_sessions':
2355			pass # Not implemented yet

Compute various statistics for each session.

  • Na: Number of anchor analyses in the session
  • Nu: Number of unknown analyses in the session
  • r_d13C_VPDB: δ13C_VPDB repeatability of analyses within the session
  • r_d18O_VSMOW: δ18O_VSMOW repeatability of analyses within the session
  • r_D47 or r_D48: Δ4x repeatability of analyses within the session
  • a: scrambling factor
  • b: compositional slope
  • c: WG offset
  • SE_a: Model standard error of a
  • SE_b: Model standard error of b
  • SE_c: Model standard error of c
  • scrambling_drift (boolean): whether to allow a temporal drift in the scrambling factor (a)
  • slope_drift (boolean): whether to allow a temporal drift in the compositional slope (b)
  • wg_drift (boolean): whether to allow a temporal drift in the WG offset (c)
  • a2: scrambling factor drift
  • b2: compositional slope drift
  • c2: WG offset drift
  • Np: Number of standardization parameters to fit
  • CM: model covariance matrix for (a, b, c, a2, b2, c2)
  • d13Cwg_VPDB: δ13C_VPDB of WG
  • d18Owg_VSMOW: δ18O_VSMOW of WG
@make_verbal
def repeatabilities(self):
2358	@make_verbal
2359	def repeatabilities(self):
2360		'''
2361		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2362		(for all samples, for anchors, and for unknowns).
2363		'''
2364		self.msg('Computing repeatabilities for all sessions')
2365
2366		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2367		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2368		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2369		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2370		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')

Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x (for all samples, for anchors, and for unknowns).

@make_verbal
def consolidate(self, tables=True, plots=True):
2373	@make_verbal
2374	def consolidate(self, tables = True, plots = True):
2375		'''
2376		Collect information about samples, sessions and repeatabilities.
2377		'''
2378		self.consolidate_samples()
2379		self.consolidate_sessions()
2380		self.repeatabilities()
2381
2382		if tables:
2383			self.summary()
2384			self.table_of_sessions()
2385			self.table_of_analyses()
2386			self.table_of_samples()
2387
2388		if plots:
2389			self.plot_sessions()

Collect information about samples, sessions and repeatabilities.

@make_verbal
def rmswd(self, samples='all samples', sessions='all sessions'):
2392	@make_verbal
2393	def rmswd(self,
2394		samples = 'all samples',
2395		sessions = 'all sessions',
2396		):
2397		'''
2398		Compute the χ2, root mean squared weighted deviation
2399		(i.e. reduced χ2), and corresponding degrees of freedom of the
2400		Δ4x values for samples in `samples` and sessions in `sessions`.
2401		
2402		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
2403		'''
2404		if samples == 'all samples':
2405			mysamples = [k for k in self.samples]
2406		elif samples == 'anchors':
2407			mysamples = [k for k in self.anchors]
2408		elif samples == 'unknowns':
2409			mysamples = [k for k in self.unknowns]
2410		else:
2411			mysamples = samples
2412
2413		if sessions == 'all sessions':
2414			sessions = [k for k in self.sessions]
2415
2416		chisq, Nf = 0, 0
2417		for sample in mysamples :
2418			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2419			if len(G) > 1 :
2420				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2421				Nf += (len(G) - 1)
2422				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2423		r = (chisq / Nf)**.5 if Nf > 0 else 0
2424		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2425		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}

Compute the χ2, root mean squared weighted deviation (i.e. reduced χ2), and corresponding degrees of freedom of the Δ4x values for samples in samples and sessions in sessions.

Only used in D4xdata.standardize() with method='indep_sessions'.
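
A sketch of how the returned dictionary may be used, assuming mydata was standardized with method = 'indep_sessions':

stats = mydata.rmswd(samples = 'unknowns')
print(f"RMSWD = {stats['rmswd']:.3f} (χ² = {stats['chisq']:.2f}, Nf = {stats['Nf']})")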

@make_verbal
def compute_r(self, key, samples='all samples', sessions='all sessions'):
2428	@make_verbal
2429	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2430		'''
2431		Compute the repeatability of `[r[key] for r in self]`
2432		'''
2433
2434		if samples == 'all samples':
2435			mysamples = [k for k in self.samples]
2436		elif samples == 'anchors':
2437			mysamples = [k for k in self.anchors]
2438		elif samples == 'unknowns':
2439			mysamples = [k for k in self.unknowns]
2440		else:
2441			mysamples = samples
2442
2443		if sessions == 'all sessions':
2444			sessions = [k for k in self.sessions]
2445
2446		if key in ['D47', 'D48']:
2447			# Full disclosure: the definition of Nf is tricky/debatable
2448			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2449			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2450			Nf = len(G)
2451# 			print(f'len(G) = {Nf}')
2452			Nf -= len([s for s in mysamples if s in self.unknowns])
2453# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2454			for session in sessions:
2455				Np = len([
2456					_ for _ in self.standardization.params
2457					if (
2458						self.standardization.params[_].expr is not None
2459						and (
2460							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2461							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2462							)
2463						)
2464					])
2465# 				print(f'session {session}: {Np} parameters to consider')
2466				Na = len({
2467					r['Sample'] for r in self.sessions[session]['data']
2468					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2469					})
2470# 				print(f'session {session}: {Na} different anchors in that session')
2471				Nf -= min(Np, Na)
2472# 			print(f'Nf = {Nf}')
2473
2474# 			for sample in mysamples :
2475# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2476# 				if len(X) > 1 :
2477# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2478# 					if sample in self.unknowns:
2479# 						Nf += len(X) - 1
2480# 					else:
2481# 						Nf += len(X)
2482# 			if samples in ['anchors', 'all samples']:
2483# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2484			r = (chisq / Nf)**.5 if Nf > 0 else 0
2485
2486		else: # if key not in ['D47', 'D48']
2487			chisq, Nf = 0, 0
2488			for sample in mysamples :
2489				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2490				if len(X) > 1 :
2491					Nf += len(X) - 1
2492					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2493			r = (chisq / Nf)**.5 if Nf > 0 else 0
2494
2495		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2496		return r

Compute the repeatability of [r[key] for r in self]
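
For example, to recompute the δ13C repeatability of the anchors alone (returned in ‰, converted to ppm below), assuming mydata is an already-standardized D47data object:

r = mydata.compute_r('d13C_VPDB', samples = 'anchors')
print(f'δ13C repeatability: {1000 * r:.1f} ppm')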

def sample_average(self, samples, weights='equal', normalize=True):
2498	def sample_average(self, samples, weights = 'equal', normalize = True):
2499		'''
2500		Weighted average Δ4x value of a group of samples, accounting for covariance.
2501
2502		Returns the weighted average Δ4x value and associated SE
2503		of a group of samples. Weights are equal by default. If `normalize` is
2504		true, `weights` will be rescaled so that their sum equals 1.
2505
2506		**Examples**
2507
2508		```python
2509		self.sample_average(['X','Y'], [1, 2])
2510		```
2511
2512		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2513		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2514		values of samples X and Y, respectively.
2515
2516		```python
2517		self.sample_average(['X','Y'], [1, -1], normalize = False)
2518		```
2519
2520		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2521		'''
2522		if weights == 'equal':
2523			weights = [1/len(samples)] * len(samples)
2524
2525		if normalize:
2526			s = sum(weights)
2527			if s:
2528				weights = [w/s for w in weights]
2529
2530		try:
2531# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2532# 			C = self.standardization.covar[indices,:][:,indices]
2533			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2534			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2535			return correlated_sum(X, C, weights)
2536		except ValueError:
2537			return (0., 0.)

Weighted average Δ4x value of a group of samples, accounting for covariance.

Returns the weighted average Δ4x value and associated SE of a group of samples. Weights are equal by default. If normalize is true, weights will be rescaled so that their sum equals 1.

Examples

self.sample_average(['X','Y'], [1, 2])

returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3, where Δ4x(X) and Δ4x(Y) are the average Δ4x values of samples X and Y, respectively.

self.sample_average(['X','Y'], [1, -1], normalize = False)

returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
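
A concrete sketch combining both call patterns, using two hypothetical unknowns named 'FOO' and 'BAR':

# weighted mean of the two samples (weights rescaled to sum to 1):
avg, avg_SE = mydata.sample_average(['FOO', 'BAR'], weights = [2, 1])

# difference FOO - BAR, with a SE that accounts for error covariance:
diff, diff_SE = mydata.sample_average(['FOO', 'BAR'], [1, -1], normalize = False)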

def sample_D4x_covar(self, sample1, sample2=None):
2540	def sample_D4x_covar(self, sample1, sample2 = None):
2541		'''
2542		Covariance between Δ4x values of samples
2543
2544		Returns the error covariance between the average Δ4x values of two
2545		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2546		returns the Δ4x variance for that sample.
2547		'''
2548		if sample2 is None:
2549			sample2 = sample1
2550		if self.standardization_method == 'pooled':
2551			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2552			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2553			return self.standardization.covar[i, j]
2554		elif self.standardization_method == 'indep_sessions':
2555			if sample1 == sample2:
2556				return self.samples[sample1][f'SE_D{self._4x}']**2
2557			else:
2558				c = 0
2559				for session in self.sessions:
2560					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2561					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2562					if sdata1 and sdata2:
2563						a = self.sessions[session]['a']
2564						# !! TODO: CM below does not account for temporal changes in standardization parameters
2565						CM = self.sessions[session]['CM'][:3,:3]
2566						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2567						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2568						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2569						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2570						c += (
2571							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2572							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2573							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2574							@ CM
2575							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2576							) / a**2
2577				return float(c)

Covariance between Δ4x values of samples

Returns the error covariance between the average Δ4x values of two samples. If only sample1 is specified, or if sample1 == sample2, returns the Δ4x variance for that sample.

def sample_D4x_correl(self, sample1, sample2=None):
2579	def sample_D4x_correl(self, sample1, sample2 = None):
2580		'''
2581		Correlation between Δ4x errors of samples
2582
2583		Returns the error correlation between the average Δ4x values of two samples.
2584		'''
2585		if sample2 is None or sample2 == sample1:
2586			return 1.
2587		return (
2588			self.sample_D4x_covar(sample1, sample2)
2589			/ self.unknowns[sample1][f'SE_D{self._4x}']
2590			/ self.unknowns[sample2][f'SE_D{self._4x}']
2591			)

Correlation between Δ4x errors of samples

Returns the error correlation between the average Δ4x values of two samples.
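
Because part of the standardization error is shared between samples, these covariances matter whenever unknowns are compared with each other. A sketch with two hypothetical unknowns 'FOO' and 'BAR':

var_FOO = mydata.sample_D4x_covar('FOO')       # Δ4x variance of FOO (= SE²)
cov = mydata.sample_D4x_covar('FOO', 'BAR')    # error covariance between FOO and BAR
rho = mydata.sample_D4x_correl('FOO', 'BAR')   # corresponding error correlation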

def plot_single_session( self, session, kw_plot_anchors={'ls': 'None', 'marker': 'x', 'mec': (0.75, 0, 0), 'mew': 0.75, 'ms': 4}, kw_plot_unknowns={'ls': 'None', 'marker': 'x', 'mec': (0, 0, 0.75), 'mew': 0.75, 'ms': 4}, kw_plot_anchor_avg={'ls': '-', 'marker': 'None', 'color': (0.75, 0, 0), 'lw': 0.75}, kw_plot_unknown_avg={'ls': '-', 'marker': 'None', 'color': (0, 0, 0.75), 'lw': 0.75}, kw_contour_error={'colors': [[0, 0, 0]], 'alpha': 0.5, 'linewidths': 0.75}, xylimits='free', x_label=None, y_label=None, error_contour_interval='auto', fig='new'):
2593	def plot_single_session(self,
2594		session,
2595		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2596		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2597		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2598		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2599		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2600		xylimits = 'free', # | 'constant'
2601		x_label = None,
2602		y_label = None,
2603		error_contour_interval = 'auto',
2604		fig = 'new',
2605		):
2606		'''
2607		Generate plot for a single session
2608		'''
2609		if x_label is None:
2610			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2611		if y_label is None:
2612			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2613
2614		out = _SessionPlot()
2615		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2616		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2617		anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2618		anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2619		unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2620		unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2621		anchor_avg = (np.array([ np.array([
2622				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2623				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2624				]) for sample in anchors]).T,
2625			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
2626		unknown_avg = (np.array([ np.array([
2627				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2628				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2629				]) for sample in unknowns]).T,
2630			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)
2631		
2632		
2633		if fig == 'new':
2634			out.fig = ppl.figure(figsize = (6,6))
2635			ppl.subplots_adjust(.1,.1,.9,.9)
2636
2637		out.anchor_analyses, = ppl.plot(
2638			anchors_d,
2639			anchors_D,
2640			**kw_plot_anchors)
2641		out.unknown_analyses, = ppl.plot(
2642			unknowns_d,
2643			unknowns_D,
2644			**kw_plot_unknowns)
2645		out.anchor_avg = ppl.plot(
2646			*anchor_avg,
2647			**kw_plot_anchor_avg)
2648		out.unknown_avg = ppl.plot(
2649			*unknown_avg,
2650			**kw_plot_unknown_avg)
2651		if xylimits == 'constant':
2652			x = [r[f'd{self._4x}'] for r in self]
2653			y = [r[f'D{self._4x}'] for r in self]
2654			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2655			w, h = x2-x1, y2-y1
2656			x1 -= w/20
2657			x2 += w/20
2658			y1 -= h/20
2659			y2 += h/20
2660			ppl.axis([x1, x2, y1, y2])
2661		elif xylimits == 'free':
2662			x1, x2, y1, y2 = ppl.axis()
2663		else:
2664			x1, x2, y1, y2 = ppl.axis(xylimits)
2665				
2666		if error_contour_interval != 'none':
2667			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2668			XI,YI = np.meshgrid(xi, yi)
2669			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2670			if error_contour_interval == 'auto':
2671				rng = np.max(SI) - np.min(SI)
2672				if rng <= 0.01:
2673					cinterval = 0.001
2674				elif rng <= 0.03:
2675					cinterval = 0.004
2676				elif rng <= 0.1:
2677					cinterval = 0.01
2678				elif rng <= 0.3:
2679					cinterval = 0.03
2680				elif rng <= 1.:
2681					cinterval = 0.1
2682				else:
2683					cinterval = 0.5
2684			else:
2685				cinterval = error_contour_interval
2686
2687			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2688			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2689			out.clabel = ppl.clabel(out.contour)
2690			contour = (XI, YI, SI, cval, cinterval)
2691
2692		if fig == None:
2693			return {
2694			'anchors':anchors,
2695			'unknowns':unknowns,
2696			'anchors_d':anchors_d,
2697			'anchors_D':anchors_D,
2698			'unknowns_d':unknowns_d,
2699			'unknowns_D':unknowns_D,
2700			'anchor_avg':anchor_avg,
2701			'unknown_avg':unknown_avg,
2702			'contour':contour,
2703			}
2704
2705		ppl.xlabel(x_label)
2706		ppl.ylabel(y_label)
2707		ppl.title(session, weight = 'bold')
2708		ppl.grid(alpha = .2)
2709		out.ax = ppl.gca()		
2710
2711		return out

Generate plot for a single session
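
A sketch of typical use, assuming mydata is an already-standardized D47data object and 'Session01' is a hypothetical session name; the returned object exposes the figure and its plot elements for further tweaking:

from matplotlib import pyplot as ppl

out = mydata.plot_single_session('Session01')
out.ax.set_title('Session01 (Δ47)')
ppl.savefig('Session01.pdf')  # hypothetical file name
ppl.close(out.fig)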

def plot_residuals( self, kde=False, hist=False, binwidth=0.6666666666666666, dir='output', filename=None, highlight=[], colors=None, figsize=None, dpi=100, yspan=None):
2713	def plot_residuals(
2714		self,
2715		kde = False,
2716		hist = False,
2717		binwidth = 2/3,
2718		dir = 'output',
2719		filename = None,
2720		highlight = [],
2721		colors = None,
2722		figsize = None,
2723		dpi = 100,
2724		yspan = None,
2725		):
2726		'''
2727		Plot residuals of each analysis as a function of time (actually, as a function of
2728		the order of analyses in the `D4xdata` object)
2729
2730		+ `kde`: whether to add a kernel density estimate of residuals
2731		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2732		+ `binwidth`: width of the histogram bins, in units of the Δ4x repeatability (default: 2/3)
2733		+ `dir`: the directory in which to save the plot
2734		+ `highlight`: a list of samples to highlight
2735		+ `colors`: a dict of `{<sample>: (r, g, b)}` for all samples
2736		+ `figsize`: (width, height) of figure
2737		+ `dpi`: resolution for PNG output
2738		+ `yspan`: factor controlling the range of y values shown in plot
2739		  (by default: `yspan = 1.5 if kde else 1.0`)
2740		'''
2741		
2742		from matplotlib import ticker
2743
2744		if yspan is None:
2745			if kde:
2746				yspan = 1.5
2747			else:
2748				yspan = 1.0
2749		
2750		# Layout
2751		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2752		if hist or kde:
2753			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2754			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2755		else:
2756			ppl.subplots_adjust(.08,.05,.78,.8)
2757			ax1 = ppl.subplot(111)
2758		
2759		# Colors
2760		N = len(self.anchors)
2761		if colors is None:
2762			if len(highlight) > 0:
2763				Nh = len(highlight)
2764				if Nh == 1:
2765					colors = {highlight[0]: (0,0,0)}
2766				elif Nh == 3:
2767					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2768				elif Nh == 4:
2769					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2770				else:
2771					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2772			else:
2773				if N == 3:
2774					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2775				elif N == 4:
2776					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2777				else:
2778					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2779
2780		ppl.sca(ax1)
2781		
2782		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2783
2784		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2785
2786		session = self[0]['Session']
2787		x1 = 0
2788# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2789		x_sessions = {}
2790		one_or_more_singlets = False
2791		one_or_more_multiplets = False
2792		multiplets = set()
2793		for k,r in enumerate(self):
2794			if r['Session'] != session:
2795				x2 = k-1
2796				x_sessions[session] = (x1+x2)/2
2797				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2798				session = r['Session']
2799				x1 = k
2800			singlet = len(self.samples[r['Sample']]['data']) == 1
2801			if not singlet:
2802				multiplets.add(r['Sample'])
2803			if r['Sample'] in self.unknowns:
2804				if singlet:
2805					one_or_more_singlets = True
2806				else:
2807					one_or_more_multiplets = True
2808			kw = dict(
2809				marker = 'x' if singlet else '+',
2810				ms = 4 if singlet else 5,
2811				ls = 'None',
2812				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2813				mew = 1,
2814				alpha = 0.2 if singlet else 1,
2815				)
2816			if highlight and r['Sample'] not in highlight:
2817				kw['alpha'] = 0.2
2818			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2819		x2 = k
2820		x_sessions[session] = (x1+x2)/2
2821
2822		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2823		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2824		if not (hist or kde):
2825			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2826			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2827
2828		xmin, xmax, ymin, ymax = ppl.axis()
2829		if yspan != 1:
2830			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2831		for s in x_sessions:
2832			ppl.text(
2833				x_sessions[s],
2834				ymax +1,
2835				s,
2836				va = 'bottom',
2837				**(
2838					dict(ha = 'center')
2839					if len(self.sessions[s]['data']) > (0.15 * len(self))
2840					else dict(ha = 'left', rotation = 45)
2841					)
2842				)
2843
2844		if hist or kde:
2845			ppl.sca(ax2)
2846
2847		for s in colors:
2848			kw['marker'] = '+'
2849			kw['ms'] = 5
2850			kw['mec'] = colors[s]
2851			kw['label'] = s
2852			kw['alpha'] = 1
2853			ppl.plot([], [], **kw)
2854
2855		kw['mec'] = (0,0,0)
2856
2857		if one_or_more_singlets:
2858			kw['marker'] = 'x'
2859			kw['ms'] = 4
2860			kw['alpha'] = .2
2861			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2862			ppl.plot([], [], **kw)
2863
2864		if one_or_more_multiplets:
2865			kw['marker'] = '+'
2866			kw['ms'] = 4
2867			kw['alpha'] = 1
2868			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2869			ppl.plot([], [], **kw)
2870
2871		if hist or kde:
2872			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2873		else:
2874			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2875		leg.set_zorder(-1000)
2876
2877		ppl.sca(ax1)
2878
2879		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2880		ppl.xticks([])
2881		ppl.axis([-1, len(self), None, None])
2882
2883		if hist or kde:
2884			ppl.sca(ax2)
2885			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2886
2887			if kde:
2888				from scipy.stats import gaussian_kde
2889				yi = np.linspace(ymin, ymax, 201)
2890				xi = gaussian_kde(X).evaluate(yi)
2891				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2892# 				ppl.plot(xi, yi, 'k-', lw = 1)
2893			elif hist:
2894				ppl.hist(
2895					X,
2896					orientation = 'horizontal',
2897					histtype = 'stepfilled',
2898					ec = [.4]*3,
2899					fc = [.25]*3,
2900					alpha = .25,
2901					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2902					)
2903			ppl.text(0, 0,
2904				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2905				size = 7.5,
2906				alpha = 1,
2907				va = 'center',
2908				ha = 'left',
2909				)
2910
2911			ppl.axis([0, None, ymin, ymax])
2912			ppl.xticks([])
2913			ppl.yticks([])
2914# 			ax2.spines['left'].set_visible(False)
2915			ax2.spines['right'].set_visible(False)
2916			ax2.spines['top'].set_visible(False)
2917			ax2.spines['bottom'].set_visible(False)
2918
2919		ax1.axis([None, None, ymin, ymax])
2920
2921		if not os.path.exists(dir):
2922			os.makedirs(dir)
2923		if filename is None:
2924			return fig
2925		elif filename == '':
2926			filename = f'D{self._4x}_residuals.pdf'
2927		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2928		ppl.close(fig)

Plot residuals of each analysis as a function of time (actually, as a function of the order of analyses in the D4xdata object)

  • kde: whether to add a kernel density estimate of residuals
  • hist: whether to add a histogram of residuals (incompatible with kde)
  • binwidth: width of the histogram bins, in units of the Δ4x repeatability (default: 2/3)
  • dir: the directory in which to save the plot
  • highlight: a list of samples to highlight
  • colors: a dict of {<sample>: (r, g, b)} for all samples
  • figsize: (width, height) of figure
  • dpi: resolution for PNG output
  • yspan: factor controlling the range of y values shown in plot (by default: yspan = 1.5 if kde else 1.0)
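
Two sketches of typical calls (the file name below is hypothetical):

# save a residual plot with a kernel density estimate to 'output/residuals.pdf':
mydata.plot_residuals(kde = True, filename = 'residuals.pdf')

# or keep the figure object instead of saving it (filename defaults to None):
fig = mydata.plot_residuals(hist = True)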
def simulate(self, *args, **kwargs):
2931	def simulate(self, *args, **kwargs):
2932		'''
2933		Legacy function with warning message pointing to `virtual_data()`
2934		'''
2935		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')

Legacy function with warning message pointing to virtual_data()

def plot_anchor_residuals( self, dir='output', filename='', figsize=None, subplots_adjust=(0.05, 0.1, 0.95, 0.98, 0.25, 0.25), dpi=100, colors=None):
2937	def plot_anchor_residuals(
2938		self,
2939		dir = 'output',
2940		filename = '',
2941		figsize = None,
2942		subplots_adjust = (0.05, 0.1, 0.95, 0.98, .25, .25),
2943		dpi = 100,
2944		colors = None,
2945		):
2946		'''
2947		Plot a summary of the residuals for all anchors, intended to help detect systematic bias.
2948		
2949		**Parameters**
2950
2951		+ `dir`: the directory in which to save the plot
2952		+ `filename`: the file name to save to.
2953		+ `dpi`: resolution for PNG output
2954		+ `figsize`: (width, height) of figure
2955		+ `subplots_adjust`: passed to `subplots_adjust()`
2957		+ `colors`: a dict of `{<sample>: (r, g, b)}` for all samples
2958		'''
2959
2960		# Colors
2961		N = len(self.anchors)
2962		if colors is None:
2963			if N == 3:
2964				colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2965			elif N == 4:
2966				colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2967			else:
2968				colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2969
2970		if figsize is None:
2971			figsize = (4, 1.5*N+1)
2972		fig = ppl.figure(figsize = figsize)
2973		ppl.subplots_adjust(*subplots_adjust)
2974		axs = {}
2975		X = np.array([r[f'D{self._4x}_residual'] for a in self.anchors for r in self.anchors[a]['data']])*1000
2976		sigma = self.repeatability[f'r_D{self._4x}a'] * 1000  # anchor repeatability, in ppm
2977		D = max(np.abs(X))
2978
2979		for k,a in enumerate(self.anchors):
2980			color = colors[a]
2981			axs[a] = ppl.subplot(N, 1, 1+k)
2982			axs[a].text(
2983				0.02, 1-0.05, a,
2984				va = 'top',
2985				ha = 'left',
2986				weight = 'bold',
2987				size = 9,
2988				color = [_*0.75 for _ in color],
2989				transform = axs[a].transAxes,
2990			)
2991			X = np.array([r[f'D{self._4x}_residual'] for r in self.anchors[a]['data']])*1000
2992			axs[a].axvline(0, lw = 0.5, color = color)
2993			axs[a].plot(X, X*0, 'o', mew = 0.7, mec = (*color,.5), mfc = (*color, 0), ms = 7, clip_on = False)
2994
2995			xi = np.linspace(-3*D, 3*D, 601)
2996			yi = np.array([np.exp(-0.5 * ((xi - x)/sigma)**2) for x in X]).sum(0)
2997			ppl.fill_between(xi, yi, yi*0, fc = (*color, .15), lw = 1, ec = color)
2998			
2999			axs[a].errorbar(
3000				X.mean(), yi.max()*.2, None, 1.96*sigma/len(X)**0.5,
3001				ecolor = color,
3002				marker = 's',
3003				ls = 'None',
3004				mec = color,
3005				mew = 1,
3006				mfc = 'w',
3007				ms = 8,
3008				elinewidth = 1,
3009				capsize = 4,
3010				capthick = 1,
3011			)
3012			
3013			axs[a].axis([xi[0], xi[-1], 0, yi.max()*1.05])
3014			ppl.yticks([])
3015
3016		ppl.xlabel(f'$Δ_{{{self._4x}}}$ residuals (ppm)')		
3017
3018		if not os.path.exists(dir):
3019			os.makedirs(dir)
3020		if filename is None:
3021			return fig
3022		elif filename == '':
3023			filename = f'D{self._4x}_anchor_residuals.pdf'
3024		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
3025		ppl.close(fig)

Plot a summary of the residuals for all anchors, intended to help detect systematic bias.

Parameters

  • dir: the directory in which to save the plot
  • filename: the file name to save to.
  • dpi: resolution for PNG output
  • figsize: (width, height) of figure
  • subplots_adjust: passed to subplots_adjust()
  • colors: a dict of {<sample>: (r, g, b)} for all samples
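
A sketch of typical use:

# save under the default name ('output/D47_anchor_residuals.pdf' for Δ47 data):
mydata.plot_anchor_residuals()

# or return the figure object instead of saving it:
fig = mydata.plot_anchor_residuals(filename = None)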
def plot_distribution_of_analyses( self, dir='output', filename=None, vs_time=False, figsize=(6, 4), subplots_adjust=(0.02, 0.13, 0.85, 0.8), output=None, dpi=100):
3028	def plot_distribution_of_analyses(
3029		self,
3030		dir = 'output',
3031		filename = None,
3032		vs_time = False,
3033		figsize = (6,4),
3034		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
3035		output = None,
3036		dpi = 100,
3037		):
3038		'''
3039		Plot temporal distribution of all analyses in the data set.
3040		
3041		**Parameters**
3042
3043		+ `dir`: the directory in which to save the plot
3044		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
3046		+ `figsize`: (width, height) of figure
3047		+ `dpi`: resolution for PNG output
3048		'''
3049
3050		asamples = [s for s in self.anchors]
3051		usamples = [s for s in self.unknowns]
3052		if output is None or output == 'fig':
3053			fig = ppl.figure(figsize = figsize)
3054			ppl.subplots_adjust(*subplots_adjust)
3055		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
3056		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
3057		Xmax += (Xmax-Xmin)/40
3058		Xmin -= (Xmax-Xmin)/41
3059		for k, s in enumerate(asamples + usamples):
3060			if vs_time:
3061				X = [r['TimeTag'] for r in self if r['Sample'] == s]
3062			else:
3063				X = [x for x,r in enumerate(self) if r['Sample'] == s]
3064			Y = [-k for x in X]
3065			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
3066			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
3067			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
3068		ppl.axis([Xmin, Xmax, -k-1, 1])
3069		ppl.xlabel('\ntime')
3070		ppl.gca().annotate('',
3071			xy = (0.6, -0.02),
3072			xycoords = 'axes fraction',
3073			xytext = (.4, -0.02), 
3074			arrowprops = dict(arrowstyle = "->", color = 'k'),
3075			)
3076			
3077
3078		x2 = -1
3079		for session in self.sessions:
3080			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
3081			if vs_time:
3082				ppl.axvline(x1, color = 'k', lw = .75)
3083			if x2 > -1:
3084				if not vs_time:
3085					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
3086			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
3087# 			from xlrd import xldate_as_datetime
3088# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
3089			if vs_time:
3090				ppl.axvline(x2, color = 'k', lw = .75)
3091				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
3092			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
3093
3094		ppl.xticks([])
3095		ppl.yticks([])
3096
3097		if output is None:
3098			if not os.path.exists(dir):
3099				os.makedirs(dir)
3100			if filename == None:
3101				filename = f'D{self._4x}_distribution_of_analyses.pdf'
3102			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
3103			ppl.close(fig)
3104		elif output == 'ax':
3105			return ppl.gca()
3106		elif output == 'fig':
3107			return fig

Plot temporal distribution of all analyses in the data set.

Parameters

  • dir: the directory in which to save the plot
  • vs_time: if True, plot as a function of TimeTag rather than sequentially.
  • figsize: (width, height) of figure
  • dpi: resolution for PNG output
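
A sketch of typical use (the chronological view assumes each analysis record carries a TimeTag field):

# sequential view, saved under the default name in './output':
mydata.plot_distribution_of_analyses()

# chronological view, saved under a hypothetical file name:
mydata.plot_distribution_of_analyses(vs_time = True, filename = 'analyses_vs_time.pdf')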
def plot_bulk_compositions( self, samples=None, dir='output/bulk_compositions', figsize=(6, 6), subplots_adjust=(0.15, 0.12, 0.95, 0.92), show=False, sample_color=(0, 0.5, 1), analysis_color=(0.7, 0.7, 0.7), labeldist=0.3, radius=0.05):
3110	def plot_bulk_compositions(
3111		self,
3112		samples = None,
3113		dir = 'output/bulk_compositions',
3114		figsize = (6,6),
3115		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
3116		show = False,
3117		sample_color = (0,.5,1),
3118		analysis_color = (.7,.7,.7),
3119		labeldist = 0.3,
3120		radius = 0.05,
3121		):
3122		'''
3123		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
3124		
3125		By default, creates a directory `./output/bulk_compositions` where plots for
3126		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
3127		
3128		
3129		**Parameters**
3130
3131		+ `samples`: Only these samples are processed (by default: all samples).
3132		+ `dir`: where to save the plots
3133		+ `figsize`: (width, height) of figure
3134		+ `subplots_adjust`: passed to `subplots_adjust()`
3135		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
3136		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
3137		+ `sample_color`: color used for sample (average) markers/labels
3138		+ `analysis_color`: color used for analysis (replicate) markers/labels
3139		+ `labeldist`: distance (in inches) from replicate markers to replicate labels
3140		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
3141		'''
3142
3143		from matplotlib.patches import Ellipse
3144
3145		if samples is None:
3146			samples = [_ for _ in self.samples]
3147
3148		saved = {}
3149
3150		for s in samples:
3151
3152			fig = ppl.figure(figsize = figsize)
3153			fig.subplots_adjust(*subplots_adjust)
3154			ax = ppl.subplot(111)
3155			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3156			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3157			ppl.title(s)
3158
3159
3160			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
3161			UID = [_['UID'] for _ in self.samples[s]['data']]
3162			XY0 = XY.mean(0)
3163
3164			for xy in XY:
3165				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
3166				
3167			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
3168			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
3169			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3170			saved[s] = [XY, XY0]
3171			
3172			x1, x2, y1, y2 = ppl.axis()
3173			x0, dx = (x1+x2)/2, (x2-x1)/2
3174			y0, dy = (y1+y2)/2, (y2-y1)/2
3175			dx, dy = [max(max(dx, dy), radius)]*2
3176
3177			ppl.axis([
3178				x0 - 1.2*dx,
3179				x0 + 1.2*dx,
3180				y0 - 1.2*dy,
3181				y0 + 1.2*dy,
3182				])			
3183
3184			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3185
3186			for xy, uid in zip(XY, UID):
3187
3188				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3189				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3190
3191				if (vector_in_display_space**2).sum() > 0:
3192
3193					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3194					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3195					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3196					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3197
3198					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3199
3200				else:
3201
3202					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3203
3204			if radius:
3205				ax.add_artist(Ellipse(
3206					xy = XY0,
3207					width = radius*2,
3208					height = radius*2,
3209					ls = (0, (2,2)),
3210					lw = .7,
3211					ec = analysis_color,
3212					fc = 'None',
3213					))
3214				ppl.text(
3215					XY0[0],
3216					XY0[1]-radius,
3217					f'\n± {radius*1e3:.0f} ppm',
3218					color = analysis_color,
3219					va = 'top',
3220					ha = 'center',
3221					linespacing = 0.4,
3222					size = 8,
3223					)
3224
3225			if not os.path.exists(dir):
3226				os.makedirs(dir)
3227			fig.savefig(f'{dir}/{s}.pdf')
3228			ppl.close(fig)
3229
3230		fig = ppl.figure(figsize = figsize)
3231		fig.subplots_adjust(*subplots_adjust)
3232		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3233		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3234
3235		for s in saved:
3236			for xy in saved[s][0]:
3237				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3238			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3239			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3240			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3241
3242		x1, x2, y1, y2 = ppl.axis()
3243		ppl.axis([
3244			x1 - (x2-x1)/10,
3245			x2 + (x2-x1)/10,
3246			y1 - (y2-y1)/10,
3247			y2 + (y2-y1)/10,
3248			])			
3249
3250
3251		if not os.path.exists(dir):
3252			os.makedirs(dir)
3253		fig.savefig(f'{dir}/__all__.pdf')
3254		if show:
3255			ppl.show()
3256		ppl.close(fig)

Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

By default, creates a directory ./output/bulk_compositions where plots for each sample are saved. Another plot named __all__.pdf shows all analyses together.

Parameters

  • samples: Only these samples are processed (by default: all samples).
  • dir: where to save the plots
  • figsize: (width, height) of figure
  • subplots_adjust: passed to subplots_adjust()
  • show: whether to call matplotlib.pyplot.show() on the plot with all samples, allowing for interactive visualization/exploration in (δ13C, δ18O) space.
  • sample_color: color used for sample (average) markers/labels
  • analysis_color: color used for analysis (replicate) markers/labels
  • labeldist: distance (in inches) from replicate markers to replicate labels
  • radius: radius of the dashed circle providing scale. No circle if radius = 0.
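
A sketch of typical use:

# one plot per sample plus '__all__.pdf', displayed interactively at the end:
mydata.plot_bulk_compositions(show = True)

# restrict the plots to a subset of samples (hypothetical names):
mydata.plot_bulk_compositions(samples = ['FOO', 'BAR'])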
Inherited Members
builtins.list
clear
copy
append
insert
extend
pop
remove
index
count
reverse
sort
class D47data(D4xdata):
3302class D47data(D4xdata):
3303	'''
3304	Store and process data for a large set of Δ47 analyses,
3305	usually comprising more than one analytical session.
3306	'''
3307
3308	Nominal_D4x = {
3309		'ETH-1':   0.2052,
3310		'ETH-2':   0.2085,
3311		'ETH-3':   0.6132,
3312		'ETH-4':   0.4511,
3313		'IAEA-C1': 0.3018,
3314		'IAEA-C2': 0.6409,
3315		'MERCK':   0.5135,
3316		} # I-CDES (Bernasconi et al., 2021)
3317	'''
3318	Nominal Δ47 values assigned to the Δ47 anchor samples, used by
3319	`D47data.standardize()` to normalize unknown samples to an absolute Δ47
3320	reference frame.
3321
3322	By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
3323	```py
3324	{
3325		'ETH-1'   : 0.2052,
3326		'ETH-2'   : 0.2085,
3327		'ETH-3'   : 0.6132,
3328		'ETH-4'   : 0.4511,
3329		'IAEA-C1' : 0.3018,
3330		'IAEA-C2' : 0.6409,
3331		'MERCK'   : 0.5135,
3332	}
3333	```
3334	'''
3335
3336
3337	@property
3338	def Nominal_D47(self):
3339		return self.Nominal_D4x
3340	
3341
3342	@Nominal_D47.setter
3343	def Nominal_D47(self, new):
3344		self.Nominal_D4x = dict(**new)
3345		self.refresh()
3346
3347
3348	def __init__(self, l = [], **kwargs):
3349		'''
3350		**Parameters:** same as `D4xdata.__init__()`
3351		'''
3352		D4xdata.__init__(self, l = l, mass = '47', **kwargs)
3353
3354
3355	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3356		'''
3357		Find all samples for which `Teq` is specified, compute equilibrium Δ47
3358		value for that temperature, and treat these samples as additional anchors.
3359
3360		**Parameters**
3361
3362		+ `fCo2eqD47`: Which CO2 equilibrium law to use
3363		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3364		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3365		+ `priority`: if `replace`: forget old anchors and only use the new ones;
3366		if `new`: keep pre-existing anchors but update them in case of conflict
3367		between old and new Δ47 values;
3368		if `old`: keep pre-existing anchors but preserve their original Δ47
3369		values in case of conflict.
3370		'''
3371		f = {
3372			'petersen': fCO2eqD47_Petersen,
3373			'wang': fCO2eqD47_Wang,
3374			}[fCo2eqD47]
3375		foo = {}
3376		for r in self:
3377			if 'Teq' in r:
3378				if r['Sample'] in foo:
3379					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3380				else:
3381					foo[r['Sample']] = f(r['Teq'])
3382			else:
3383					assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3384
3385		if priority == 'replace':
3386			self.Nominal_D47 = {}
3387		for s in foo:
3388			if priority != 'old' or s not in self.Nominal_D47:
3389				self.Nominal_D47[s] = foo[s]
3390	
3391	def save_D47_correl(self, *args, **kwargs):
3392		return self._save_D4x_correl(*args, **kwargs)
3393
3394	save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47')

Store and process data for a large set of Δ47 analyses, usually comprising more than one analytical session.

D47data(l=[], **kwargs)
3348	def __init__(self, l = [], **kwargs):
3349		'''
3350		**Parameters:** same as `D4xdata.__init__()`
3351		'''
3352		D4xdata.__init__(self, l = l, mass = '47', **kwargs)

Parameters: same as D4xdata.__init__()

Nominal_D4x = {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6132, 'ETH-4': 0.4511, 'IAEA-C1': 0.3018, 'IAEA-C2': 0.6409, 'MERCK': 0.5135}

Nominal Δ47 values assigned to the Δ47 anchor samples, used by D47data.standardize() to normalize unknown samples to an absolute Δ47 reference frame.

By default equal to (after Bernasconi et al. (2021)):

{
        'ETH-1'   : 0.2052,
        'ETH-2'   : 0.2085,
        'ETH-3'   : 0.6132,
        'ETH-4'   : 0.4511,
        'IAEA-C1' : 0.3018,
        'IAEA-C2' : 0.6409,
        'MERCK'   : 0.5135,
}
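
The anchor set may be customized by assigning to Nominal_D47 before standardizing. A sketch restricting standardization to the three ETH anchors (values as in the default dictionary above):

import D47crunch

mydata = D47crunch.D47data()
mydata.Nominal_D47 = {
    'ETH-1': 0.2052,
    'ETH-2': 0.2085,
    'ETH-3': 0.6132,
    }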
Nominal_D47
3337	@property
3338	def Nominal_D47(self):
3339		return self.Nominal_D4x
def D47fromTeq(self, fCo2eqD47='petersen', priority='new'):
3355	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3356		'''
3357		Find all samples for which `Teq` is specified, compute equilibrium Δ47
3358		value for that temperature, and treat these samples as additional anchors.
3359
3360		**Parameters**
3361
3362		+ `fCo2eqD47`: Which CO2 equilibrium law to use
3363		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3364		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3365		+ `priority`: if `replace`: forget old anchors and only use the new ones;
3366		if `new`: keep pre-existing anchors but update them in case of conflict
3367		between old and new Δ47 values;
3368		if `old`: keep pre-existing anchors but preserve their original Δ47
3369		values in case of conflict.
3370		'''
3371		f = {
3372			'petersen': fCO2eqD47_Petersen,
3373			'wang': fCO2eqD47_Wang,
3374			}[fCo2eqD47]
3375		foo = {}
3376		for r in self:
3377			if 'Teq' in r:
3378				if r['Sample'] in foo:
3379					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3380				else:
3381					foo[r['Sample']] = f(r['Teq'])
3382			else:
3383					assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3384
3385		if priority == 'replace':
3386			self.Nominal_D47 = {}
3387		for s in foo:
3388			if priority != 'old' or s not in self.Nominal_D47:
3389				self.Nominal_D47[s] = foo[s]

Find all samples for which Teq is specified, compute equilibrium Δ47 value for that temperature, and treat these samples as additional anchors.

Parameters

  • fCo2eqD47: Which CO2 equilibrium law to use (petersen: Petersen et al. (2019); wang: Wang et al. (2004)).
  • priority: if replace: forget old anchors and only use the new ones; if new: keep pre-existing anchors but update them in case of conflict between old and new Δ47 values; if old: keep pre-existing anchors but preserve their original Δ47 values in case of conflict.
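
A sketch of using an equilibrated gas as an extra anchor; the sample name is hypothetical and Teq must be given in the units expected by the chosen equilibrium law (assumed here to be °C):

# tag every analysis of the equilibrated-gas sample with its equilibration temperature:
for r in mydata:
    if r['Sample'] == 'EQGAS-25':  # hypothetical sample name
        r['Teq'] = 25.

# compute the corresponding equilibrium Δ47 value and add it to the anchors:
mydata.D47fromTeq(fCo2eqD47 = 'petersen', priority = 'new')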
def save_D47_correl(self, *args, **kwargs):
3391	def save_D47_correl(self, *args, **kwargs):
3392		return self._save_D4x_correl(*args, **kwargs)

Save D47 values along with their SE and correlation matrix.

Parameters

  • samples: Only these samples are output (by default: all samples).
  • dir: the directory in which to save the file (by default: output)
  • filename: the name of the csv file to write to (by default: D47_correl.csv)
  • D47_precision: the precision to use when writing D47 and D47_SE values (by default: 4)
  • correl_precision: the precision to use when writing correlation factor values (by default: 4)
  • save_to_file: whether to write the output to a file (by default: True). If False, returns the output as a string
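
A sketch of typical use:

# write Δ47 values, SEs and their correlation matrix to 'output/D47_correl.csv' (defaults):
mydata.save_D47_correl()

# or get the same table back as a string instead of writing a file:
txt = mydata.save_D47_correl(save_to_file = False)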
class D48data(D4xdata):
3397class D48data(D4xdata):
3398	'''
3399	Store and process data for a large set of Δ48 analyses,
3400	usually comprising more than one analytical session.
3401	'''
3402
3403	Nominal_D4x = {
3404		'ETH-1':  0.138,
3405		'ETH-2':  0.138,
3406		'ETH-3':  0.270,
3407		'ETH-4':  0.223,
3408		'GU-1':  -0.419,
3409		} # (Fiebig et al., 2019, 2021)
3410	'''
3411	Nominal Δ48 values assigned to the Δ48 anchor samples, used by
3412	`D48data.standardize()` to normalize unknown samples to an absolute Δ48
3413	reference frame.
3414
3415	By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
3416	[Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)):
3417
3418	```py
3419	{
3420		'ETH-1' :  0.138,
3421		'ETH-2' :  0.138,
3422		'ETH-3' :  0.270,
3423		'ETH-4' :  0.223,
3424		'GU-1'  : -0.419,
3425	}
3426	```
3427	'''
3428
3429
3430	@property
3431	def Nominal_D48(self):
3432		return self.Nominal_D4x
3433
3434	
3435	@Nominal_D48.setter
3436	def Nominal_D48(self, new):
3437		self.Nominal_D4x = dict(**new)
3438		self.refresh()
3439
3440
3441	def __init__(self, l = [], **kwargs):
3442		'''
3443		**Parameters:** same as `D4xdata.__init__()`
3444		'''
3445		D4xdata.__init__(self, l = l, mass = '48', **kwargs)
3446
3447	def save_D48_correl(self, *args, **kwargs):
3448		return self._save_D4x_correl(*args, **kwargs)
3449
3450	save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48')

Store and process data for a large set of Δ48 analyses, usually comprising more than one analytical session.

D48data(l=[], **kwargs)
3441	def __init__(self, l = [], **kwargs):
3442		'''
3443		**Parameters:** same as `D4xdata.__init__()`
3444		'''
3445		D4xdata.__init__(self, l = l, mass = '48', **kwargs)

Parameters: same as D4xdata.__init__()

Nominal_D4x = {'ETH-1': 0.138, 'ETH-2': 0.138, 'ETH-3': 0.27, 'ETH-4': 0.223, 'GU-1': -0.419}

Nominal Δ48 values assigned to the Δ48 anchor samples, used by D48data.standardize() to normalize unknown samples to an absolute Δ48 reference frame.

By default equal to (after Fiebig et al. (2019), Fiebig et al. (2021)):

{
        'ETH-1' :  0.138,
        'ETH-2' :  0.138,
        'ETH-3' :  0.270,
        'ETH-4' :  0.223,
        'GU-1'  : -0.419,
}
Nominal_D48
3430	@property
3431	def Nominal_D48(self):
3432		return self.Nominal_D4x
def save_D48_correl(self, *args, **kwargs):
3447	def save_D48_correl(self, *args, **kwargs):
3448		return self._save_D4x_correl(*args, **kwargs)

Save D48 values along with their SE and correlation matrix.

Parameters

  • samples: Only these samples are output (by default: all samples).
  • dir: the directory in which to save the file (by default: output)
  • filename: the name of the csv file to write to (by default: D48_correl.csv)
  • D48_precision: the precision to use when writing D48 and D48_SE values (by default: 4)
  • correl_precision: the precision to use when writing correlation factor values (by default: 4)
  • save_to_file: whether to write the output to a file (by default: True). If False, returns the output as a string
class D49data(D4xdata):
3453class D49data(D4xdata):
3454	'''
3455	Store and process data for a large set of Δ49 analyses,
3456	usually comprising more than one analytical session.
3457	'''
3458	
3459	Nominal_D4x = {"1000C": 0.0, "25C": 2.228}  # Wang 2004
3460	'''
3461	Nominal Δ49 values assigned to the Δ49 anchor samples, used by
3462	`D49data.standardize()` to normalize unknown samples to an absolute Δ49
3463	reference frame.
3464
3465	By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)):
3466
3467	```py
3468	{
3469		"1000C": 0.0,
3470		"25C": 2.228
3471	}
3472	```
3473	'''
3474	
3475	@property
3476	def Nominal_D49(self):
3477		return self.Nominal_D4x
3478	
3479	@Nominal_D49.setter
3480	def Nominal_D49(self, new):
3481		self.Nominal_D4x = dict(**new)
3482		self.refresh()
3483	
3484	def __init__(self, l=[], **kwargs):
3485		'''
3486		**Parameters:** same as `D4xdata.__init__()`
3487		'''
3488		D4xdata.__init__(self, l=l, mass='49', **kwargs)
3489	
3490	def save_D49_correl(self, *args, **kwargs):
3491		return self._save_D4x_correl(*args, **kwargs)
3492	
3493	save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49')

Store and process data for a large set of Δ49 analyses, usually comprising more than one analytical session.

D49data(l=[], **kwargs)
3484	def __init__(self, l=[], **kwargs):
3485		'''
3486		**Parameters:** same as `D4xdata.__init__()`
3487		'''
3488		D4xdata.__init__(self, l=l, mass='49', **kwargs)

Parameters: same as D4xdata.__init__()

Nominal_D4x = {'1000C': 0.0, '25C': 2.228}

Nominal Δ49 values assigned to the Δ49 anchor samples, used by D49data.standardize() to normalize unknown samples to an absolute Δ49 reference frame.

By default equal to (after Wang et al. (2004)):

{
        "1000C": 0.0,
        "25C": 2.228
}
Nominal_D49
3475	@property
3476	def Nominal_D49(self):
3477		return self.Nominal_D4x
def save_D49_correl(self, *args, **kwargs):
3490	def save_D49_correl(self, *args, **kwargs):
3491		return self._save_D4x_correl(*args, **kwargs)

Save D49 values along with their SE and correlation matrix.

Parameters

  • samples: Only these samples are output (by default: all samples).
  • dir: the directory in which to save the file (by default: output)
  • filename: the name of the csv file to write to (by default: D49_correl.csv)
  • D49_precision: the precision to use when writing D49 and D49_SE values (by default: 4)
  • correl_precision: the precision to use when writing correlation factor values (by default: 4)
  • save_to_file: whether to write the output to a file (by default: True). If False, returns the output as a string