D47crunch

Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements

Process and standardize carbonate and/or CO2 clumped-isotope analyses, from low-level data out of a dual-inlet mass spectrometer to final, “absolute” Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates (Daëron, 2021).

The tutorial section takes you through a series of simple steps to import/process data and print out the results. The how-to section provides instructions applicable to various specific tasks.

1. Tutorial

1.1 Installation

The easy option is to use pip; open a shell terminal and simply type:

python -m pip install D47crunch
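
To upgrade an existing installation to the latest release later on, use pip's standard upgrade flag:

python -m pip install --upgrade D47crunch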

To experiment with the bleeding-edge development version instead, follow these steps:

  1. Download the dev branch source code here and rename it to D47crunch.py.
  2. Do any of the following:
    • copy D47crunch.py to somewhere in your Python path
    • copy D47crunch.py to a working directory (import D47crunch will only work if called within that directory)
    • copy D47crunch.py to any other location (e.g., /foo/bar) and then use the following code snippet in your own code to import D47crunch:
import sys
sys.path.append('/foo/bar')
import D47crunch

Documentation for the development version can be downloaded here (save html file and open it locally).

1.2 Usage

Start by creating a file named rawdata.csv with the following contents:

UID,  Sample,           d45,       d46,        d47,        d48,       d49
A01,  ETH-1,        5.79502,  11.62767,   16.89351,   24.56708,   0.79486
A02,  MYSAMPLE-1,   6.21907,  11.49107,   17.27749,   24.58270,   1.56318
A03,  ETH-2,       -6.05868,  -4.81718,  -11.63506,  -10.32578,   0.61352
A04,  MYSAMPLE-2,  -3.86184,   4.94184,    0.60612,   10.52732,   0.57118
A05,  ETH-3,        5.54365,  12.05228,   17.40555,   25.96919,   0.74608
A06,  ETH-2,       -6.06706,  -4.87710,  -11.69927,  -10.64421,   1.61234
A07,  ETH-1,        5.78821,  11.55910,   16.80191,   24.56423,   1.47963
A08,  MYSAMPLE-2,  -3.87692,   4.86889,    0.52185,   10.40390,   1.07032

Then instantiate a D47data object which will store and process this data:

import D47crunch
mydata = D47crunch.D47data()

For now, this object is empty:

>>> print(mydata)
[]

To load the analyses saved in rawdata.csv into our D47data object and process the data:

mydata.read('rawdata.csv')

# compute δ13C, δ18O of working gas:
mydata.wg()

# compute δ13C, δ18O, raw Δ47 values for each analysis:
mydata.crunch()

# compute absolute Δ47 values for each analysis
# as well as average Δ47 values for each sample:
mydata.standardize()

We can now print a summary of the data processing:

>>> mydata.summary(verbose = True, save_to_file = False)
[summary]        
–––––––––––––––––––––––––––––––  –––––––––
N samples (anchors + unknowns)   5 (3 + 2)
N analyses (anchors + unknowns)  8 (5 + 3)
Repeatability of δ13C_VPDB         4.2 ppm
Repeatability of δ18O_VSMOW       47.5 ppm
Repeatability of Δ47 (anchors)    13.4 ppm
Repeatability of Δ47 (unknowns)    2.5 ppm
Repeatability of Δ47 (all)         9.6 ppm
Model degrees of freedom                 3
Student's 95% t-factor                3.18
Standardization method              pooled
–––––––––––––––––––––––––––––––  –––––––––

This tells us that our data set contains 5 different samples: 3 anchors (ETH-1, ETH-2, ETH-3) and 2 unknowns (MYSAMPLE-1, MYSAMPLE-2). The total number of analyses is 8, with 5 anchor analyses and 3 unknown analyses. We get an estimate of the analytical repeatability (i.e. the overall, pooled standard deviation) for δ13C, δ18O and Δ47, as well as the number of degrees of freedom (here, 3) that these estimated standard deviations are based on, along with the corresponding Student's t-factor (here, 3.18) for 95 % confidence limits. Finally, the summary indicates that we used a “pooled” standardization approach (see [Daëron, 2021]).
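
Because a D47data object behaves like a list of per-analysis dictionaries (as hinted by the empty [] printed earlier), intermediate and final values are easy to inspect directly. As a minimal illustration, using field names as listed in the table of analyses further below:

# extract the standardized Δ47 values of all ETH-1 replicates:
eth1_D47 = [r['D47'] for r in mydata if r['Sample'] == 'ETH-1']
print(eth1_D47)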

To see the actual results:

>>> mydata.table_of_samples(verbose = True, save_to_file = False)
[table_of_samples] 
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
Sample      N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
ETH-1       2       2.01       37.01  0.2052                    0.0131          
ETH-2       2     -10.17       19.88  0.2085                    0.0026          
ETH-3       1       1.73       37.49  0.6132                                    
MYSAMPLE-1  1       2.48       36.90  0.2996  0.0091  ± 0.0291                  
MYSAMPLE-2  2      -8.17       30.05  0.6600  0.0115  ± 0.0366  0.0025          
––––––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––

This table lists, for each sample, the number of analytical replicates, average δ13C and δ18O values (for the analyte CO2, not for the carbonate itself), the average Δ47 value, and the SD of Δ47 for all replicates of this sample. For unknown samples, the SE and 95 % confidence limits for mean Δ47 are also listed. These 95 % CL take into account the number of degrees of freedom of the regression model, so that in large datasets the 95 % CL will tend to 1.96 times the SE, but in this case the applicable t-factor is much larger.
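
These confidence limits are easy to verify by hand. The following check uses only the numbers reported above and is not part of the D47crunch API:

from scipy.stats import t

# Student's t-factor for 95 % confidence and 3 degrees of freedom:
tf = t.ppf(1 - 0.05/2, 3) # ≈ 3.18, as listed in the summary
# multiplying by the SE of MYSAMPLE-2 recovers its 95 % CL:
print(tf * 0.0115) # ≈ 0.0366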

We can also generate a table of all analyses in the data set (again, note that d18O_VSMOW is the composition of the CO2 analyte):

>>> mydata.table_of_analyses(verbose = True, save_to_file = False)
[table_of_analyses] 
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
UID    Session      Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48       d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw      D49raw       D47
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
A01  mySession       ETH-1       -3.807        24.921   5.795020  11.627670   16.893510   24.567080  0.794860    2.014086   37.041843  -0.574686   1.149684  -27.690250  0.214454
A02  mySession  MYSAMPLE-1       -3.807        24.921   6.219070  11.491070   17.277490   24.582700  1.563180    2.476827   36.898281  -0.499264   1.435380  -27.122614  0.299589
A03  mySession       ETH-2       -3.807        24.921  -6.058680  -4.817180  -11.635060  -10.325780  0.613520  -10.166796   19.907706  -0.685979  -0.721617   16.716901  0.206693
A04  mySession  MYSAMPLE-2       -3.807        24.921  -3.861840   4.941840    0.606120   10.527320  0.571180   -8.159927   30.087230  -0.248531   0.613099   -4.979413  0.658270
A05  mySession       ETH-3       -3.807        24.921   5.543650  12.052280   17.405550   25.969190  0.746080    1.727029   37.485567  -0.226150   1.678699  -28.280301  0.613200
A06  mySession       ETH-2       -3.807        24.921  -6.067060  -4.877100  -11.699270  -10.644210  1.612340  -10.173599   19.845192  -0.683054  -0.922832   17.861363  0.210328
A07  mySession       ETH-1       -3.807        24.921   5.788210  11.559100   16.801910   24.564230  1.479630    2.009281   36.970298  -0.591129   1.282632  -26.888335  0.195926
A08  mySession  MYSAMPLE-2       -3.807        24.921  -3.876920   4.868890    0.521850   10.403900  1.070320   -8.173486   30.011134  -0.245768   0.636159   -4.324964  0.661803
–––  –––––––––  ––––––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––
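
To save these tables to csv files instead of only printing them out, use the save_to_file, dir, and filename arguments. They are assumed here to behave as in the module-level table functions documented in the API section below, where the per-instance methods are called with these same arguments:

mydata.table_of_samples(verbose = False, save_to_file = True, dir = 'results')
mydata.table_of_analyses(verbose = False, save_to_file = True, dir = 'results')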

2. How-to

2.1 Simulate a virtual data set to play with

It is sometimes convenient to quickly build a virtual data set of analyses, for instance to assess the final analytical precision achievable for a given combination of anchor and unknown analyses (see also Fig. 6 of Daëron, 2021).

This can be achieved with virtual_data(). The example below creates a dataset with four sessions, each of which comprises three analyses of anchor ETH-1, three of ETH-2, three of ETH-3, and three analyses each of two unknown samples named FOO and BAR with an arbitrarily defined isotopic composition. Analytical repeatabilities for Δ47 and Δ48 are also specified arbitrarily. See the virtual_data() documentation for additional configuration parameters.

from D47crunch import virtual_data, D47data

args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 3,
            d13C_VPDB = -15., d18O_VPDB = -2.,
            D47 = 0.6, D48 = 0.2),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)

D = D47data(session1 + session2 + session3 + session4)

D.crunch()
D.standardize()

D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)

2.2 Control data quality

D47crunch offers several tools to visualize processed data. The examples below use the same virtual data set, generated with:

from D47crunch import *
from random import shuffle

# generate virtual data:
args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 8),
        dict(Sample = 'ETH-2', N = 8),
        dict(Sample = 'ETH-3', N = 8),
        dict(Sample = 'FOO', N = 4,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 4,
            d13C_VPDB = -15., d18O_VPDB = -15.,
            D47 = 0.5, D48 = 0.2),
        ])

sessions = [
    virtual_data(session = f'Session_{k+1:02.0f}', seed = 123456+k, **args)
    for k in range(10)]

# shuffle the data:
data = [r for s in sessions for r in s]
shuffle(data)
data = sorted(data, key = lambda r: r['Session'])

# create D47data instance:
data47 = D47data(data)

# process D47data instance:
data47.crunch()
data47.standardize()

2.2.1 Plotting the distribution of analyses through time

data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf')

time_distribution.png

The plot above shows the succession of analyses as if they were all distributed at regular time intervals. See D4xdata.plot_distribution_of_analyses() for how to plot analyses as a function of “true” time (based on the TimeTag for each analysis).
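
For instance, the sketch below assigns an arbitrary TimeTag to each analysis before plotting. The vs_time argument is an assumption based on the D4xdata.plot_distribution_of_analyses() documentation; check its signature in your version:

# assign a made-up acquisition time (e.g., hours since session start) to each analysis:
for k, r in enumerate(data47):
    r['TimeTag'] = 1.5 * k
data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf', vs_time = True)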

2.2.2 Generating session plots

data47.plot_sessions()

Below is one of the resulting session plots. Each cross marker is an analysis. Anchors are in red and unknowns in blue. Short horizontal lines show the nominal Δ47 value for anchors (red) or the average Δ47 value for unknowns (blue; overall average for all sessions). Curved grey contours correspond to Δ47 standardization errors in this session.

D47_plot_Session_03.png

2.2.3 Plotting Δ47 or Δ48 residuals

data47.plot_residuals(filename = 'residuals.pdf', kde = True)

residuals.png

Again, note that this plot only shows the succession of analyses as if they were all distributed at regular time intervals.
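
The same residuals may be recomputed by hand for further inspection. The sketch below assumes, consistent with the plot description above, that anchor residuals are taken relative to their nominal Δ47 values and unknown residuals relative to the overall sample average:

def D47_residual(r, data):
    if r['Sample'] in data.Nominal_D47:
        return r['D47'] - data.Nominal_D47[r['Sample']]
    replicates = [s['D47'] for s in data if s['Sample'] == r['Sample']]
    return r['D47'] - sum(replicates) / len(replicates)

residuals_ppm = [1000 * D47_residual(r, data47) for r in data47]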

2.2.4 Checking δ13C and δ18O dispersion

mydata = D47data(virtual_data(
    session = 'mysession',
    samples = [
        dict(Sample = 'ETH-1', N = 4),
        dict(Sample = 'ETH-2', N = 4),
        dict(Sample = 'ETH-3', N = 4),
        dict(Sample = 'MYSAMPLE', N = 8, D47 = 0.6, D48 = 0.1, d13C_VPDB = -4.0, d18O_VPDB = -12.0),
    ], seed = 123))

mydata.refresh()
mydata.wg()
mydata.crunch()
mydata.plot_bulk_compositions()

D4xdata.plot_bulk_compositions() produces a series of plots, one for each sample, and an additional plot with all samples together. For example, here is the plot for sample MYSAMPLE:

bulk_compositions.png
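
The same dispersion can also be quantified numerically, since after calling crunch() each analysis carries d13C_VPDB and d18O_VSMOW fields (cf. the table of analyses in the tutorial):

from statistics import stdev

d13C = [r['d13C_VPDB'] for r in mydata if r['Sample'] == 'MYSAMPLE']
d18O = [r['d18O_VSMOW'] for r in mydata if r['Sample'] == 'MYSAMPLE']
print(f'SD of d13C_VPDB:  {stdev(d13C):.4f} permil')
print(f'SD of d18O_VSMOW: {stdev(d18O):.4f} permil')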

2.3 Use a different set of anchors, change anchor nominal values, and/or change oxygen-17 correction parameters

Nominal values for various carbonate standards are defined in four places:

  • D4xdata.Nominal_d13C_VPDB
  • D4xdata.Nominal_d18O_VPDB
  • D47data.Nominal_D47 (an alias for D47data.Nominal_D4x)
  • D48data.Nominal_D48 (an alias for D48data.Nominal_D4x)

17O correction parameters are defined by:

  • D4xdata.R13_VPDB
  • D4xdata.R17_VSMOW
  • D4xdata.R18_VSMOW
  • D4xdata.LAMBDA_17
  • D4xdata.R18_VPDB

When creating a new instance of D47data or D48data, the current values of these variables are copied as properties of the new object. Applying custom values for, e.g., R17_VSMOW and Nominal_D47 can thus be done in several ways:

Option 1: by redefining D4xdata.R17_VSMOW and D47data.Nominal_D47 _before_ creating a D47data object:

from D47crunch import D4xdata, D47data

# redefine R17_VSMOW:
D4xdata.R17_VSMOW = 0.00037 # new value

# redefine R17_VPDB for consistency:
D4xdata.R17_VPDB = D4xdata.R17_VSMOW * (D4xdata.R18_VPDB/D4xdata.R18_VSMOW) ** D4xdata.LAMBDA_17

# edit Nominal_D47 to only include ETH-1/2/3:
D47data.Nominal_D4x = {
    a: D47data.Nominal_D4x[a]
    for a in ['ETH-1', 'ETH-2', 'ETH-3']
    }
# redefine ETH-3:
D47data.Nominal_D4x['ETH-3'] = 0.600

# only now create D47data object:
mydata = D47data()

# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)
# NB: mydata.Nominal_D47 is just an alias for mydata.Nominal_D4x

# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}

Option 2: by redefining R17_VSMOW and Nominal_D47 _after_ creating a D47data object:

from D47crunch import D47data

# first create D47data object:
mydata = D47data()

# redefine R17_VSMOW:
mydata.R17_VSMOW = 0.00037 # new value

# redefine R17_VPDB for consistency:
mydata.R17_VPDB = mydata.R17_VSMOW * (mydata.R18_VPDB/mydata.R18_VSMOW) ** mydata.LAMBDA_17

# edit Nominal_D47 to only include ETH-1/2/3:
mydata.Nominal_D47 = {
    a: mydata.Nominal_D47[a]
    for a in ['ETH-1', 'ETH-2', 'ETH-3']
    }
# redefine ETH-3:
mydata.Nominal_D47['ETH-3'] = 0.600

# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)

# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}

The two options above are equivalent, but the latter provides a simple way to compare different data processing choices:

from D47crunch import D47data

# create two D47data objects:
foo = D47data()
bar = D47data()

# modify foo in various ways:
foo.LAMBDA_17 = 0.52
foo.R17_VSMOW = 0.00037 # new value
foo.R17_VPDB = foo.R17_VSMOW * (foo.R18_VPDB/foo.R18_VSMOW) ** foo.LAMBDA_17
foo.Nominal_D47 = {
    'ETH-1': foo.Nominal_D47['ETH-1'],
    'ETH-2': foo.Nominal_D47['ETH-2'],
    'IAEA-C2': foo.Nominal_D47['IAEA-C2'],
    'INLAB_REF_MATERIAL': 0.666,
    }

# now import the same raw data into foo and bar:
foo.read('rawdata.csv')
foo.wg()          # compute δ13C, δ18O of working gas
foo.crunch()      # compute all δ13C, δ18O and raw Δ47 values
foo.standardize() # compute absolute Δ47 values

bar.read('rawdata.csv')
bar.wg()          # compute δ13C, δ18O of working gas
bar.crunch()      # compute all δ13C, δ18O and raw Δ47 values
bar.standardize() # compute absolute Δ47 values

# and compare the final results:
foo.table_of_samples(verbose = True, save_to_file = False)
bar.table_of_samples(verbose = True, save_to_file = False)
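
To compare the two standardizations programmatically rather than by eye, one option is to request raw table output (output = 'raw', as documented for table_of_samples() in the API section below) and print the Δ47 columns side by side; this is a sketch, not an established part of the workflow:

tfoo = foo.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
tbar = bar.table_of_samples(save_to_file = False, print_out = False, output = 'raw')

# column 0 is the sample name and column 4 is D47 (cf. the table layout above):
for rowfoo, rowbar in zip(tfoo[1:], tbar[1:]):
    print(rowfoo[0], rowfoo[4], rowbar[4])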

2.4 Process paired Δ47 and Δ48 values

Purely in terms of data processing, it is not obvious why Δ47 and Δ48 data should not be handled separately. For now, D47crunch uses two independent classes — D47data and D48data — which crunch numbers and deal with standardization in very similar ways. The following example demonstrates how to print out combined outputs for D47data and D48data.

from D47crunch import *

# generate virtual data:
args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args)
session2 = virtual_data(session = 'Session_02', **args)

# create D47data instance:
data47 = D47data(session1 + session2)

# process D47data instance:
data47.crunch()
data47.standardize()

# create D48data instance:
data48 = D48data(data47) # alternatively: data48 = D48data(session1 + session2)

# process D48data instance:
data48.crunch()
data48.standardize()

# output combined results:
table_of_sessions(data47, data48)
table_of_samples(data47, data48)
table_of_analyses(data47, data48)

Expected output:

––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––
Session     Na  Nu  d13Cwg_VPDB  d18Owg_VSMOW  r_d13C  r_d18O   r_D47      a_47 ± SE  1e3 x b_47 ± SE       c_47 ± SE   r_D48      a_48 ± SE  1e3 x b_48 ± SE       c_48 ± SE
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––
Session_01   9   3       -4.000        26.000  0.0000  0.0000  0.0098  1.021 ± 0.019   -0.398 ± 0.260  -0.903 ± 0.006  0.0486  0.540 ± 0.151    1.235 ± 0.607  -0.390 ± 0.025
Session_02   9   3       -4.000        26.000  0.0000  0.0000  0.0090  1.015 ± 0.019    0.376 ± 0.260  -0.905 ± 0.006  0.0186  1.350 ± 0.156   -0.871 ± 0.608  -0.504 ± 0.027
––––––––––  ––  ––  –––––––––––  ––––––––––––  ––––––  ––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––  ––––––  –––––––––––––  –––––––––––––––  ––––––––––––––


––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
Sample  N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene     D48      SE    95% CL      SD  p_Levene
––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––
ETH-1   6       2.02       37.02  0.2052                    0.0078            0.1380                    0.0223          
ETH-2   6     -10.17       19.88  0.2085                    0.0036            0.1380                    0.0482          
ETH-3   6       1.71       37.45  0.6132                    0.0080            0.2700                    0.0176          
FOO     6      -5.00       28.91  0.3026  0.0044  ± 0.0093  0.0121     0.164  0.1397  0.0121  ± 0.0255  0.0267     0.127
––––––  –  –––––––––  ––––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––  ––––––  ––––––  ––––––––  ––––––  ––––––––


–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––
UID     Session  Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48         d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw     D49raw       D47       D48
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––
1    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.120787   21.286237   27.780042    2.020000   37.024281  -0.708176  -0.316435  -0.000013  0.197297  0.087763
2    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.132240   21.307795   27.780042    2.020000   37.024281  -0.696913  -0.295333  -0.000013  0.208328  0.126791
3    Session_01   ETH-1       -4.000        26.000   6.018962  10.747026   16.132438   21.313884   27.780042    2.020000   37.024281  -0.696718  -0.289374  -0.000013  0.208519  0.137813
4    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.700300  -12.210735  -18.023381  -10.170000   19.875825  -0.683938  -0.297902  -0.000002  0.209785  0.198705
5    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.707421  -12.270781  -18.023381  -10.170000   19.875825  -0.691145  -0.358673  -0.000002  0.202726  0.086308
6    Session_01   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.700061  -12.278310  -18.023381  -10.170000   19.875825  -0.683696  -0.366292  -0.000002  0.210022  0.072215
7    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.684379   22.225827   28.306614    1.710000   37.450394  -0.273094  -0.216392  -0.000014  0.623472  0.270873
8    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.660163   22.233729   28.306614    1.710000   37.450394  -0.296906  -0.208664  -0.000014  0.600150  0.285167
9    Session_01   ETH-3       -4.000        26.000   5.742374  11.161270   16.675191   22.215632   28.306614    1.710000   37.450394  -0.282128  -0.226363  -0.000014  0.614623  0.252432
10   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.328380    5.374933    4.665655   -5.000000   28.907344  -0.582131  -0.288924  -0.000006  0.314928  0.175105
11   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.302220    5.384454    4.665655   -5.000000   28.907344  -0.608241  -0.279457  -0.000006  0.289356  0.192614
12   Session_01     FOO       -4.000        26.000  -0.840413   2.828738    1.322530    5.372841    4.665655   -5.000000   28.907344  -0.587970  -0.291004  -0.000006  0.309209  0.171257
13   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.140853   21.267202   27.780042    2.020000   37.024281  -0.688442  -0.335067  -0.000013  0.207730  0.138730
14   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.127087   21.256983   27.780042    2.020000   37.024281  -0.701980  -0.345071  -0.000013  0.194396  0.131311
15   Session_02   ETH-1       -4.000        26.000   6.018962  10.747026   16.148253   21.287779   27.780042    2.020000   37.024281  -0.681165  -0.314926  -0.000013  0.214898  0.153668
16   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.715859  -12.204791  -18.023381  -10.170000   19.875825  -0.699685  -0.291887  -0.000002  0.207349  0.149128
17   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.709763  -12.188685  -18.023381  -10.170000   19.875825  -0.693516  -0.275587  -0.000002  0.213426  0.161217
18   Session_02   ETH-2       -4.000        26.000  -5.995859  -5.976076  -12.715427  -12.253049  -18.023381  -10.170000   19.875825  -0.699249  -0.340727  -0.000002  0.207780  0.112907
19   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.685994   22.249463   28.306614    1.710000   37.450394  -0.271506  -0.193275  -0.000014  0.618328  0.244431
20   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.681351   22.298166   28.306614    1.710000   37.450394  -0.276071  -0.145641  -0.000014  0.613831  0.279758
21   Session_02   ETH-3       -4.000        26.000   5.742374  11.161270   16.676169   22.306848   28.306614    1.710000   37.450394  -0.281167  -0.137150  -0.000014  0.608813  0.286056
22   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.324359    5.339497    4.665655   -5.000000   28.907344  -0.586144  -0.324160  -0.000006  0.314015  0.136535
23   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.297658    5.325854    4.665655   -5.000000   28.907344  -0.612794  -0.337727  -0.000006  0.287767  0.126473
24   Session_02     FOO       -4.000        26.000  -0.840413   2.828738    1.310185    5.339898    4.665655   -5.000000   28.907344  -0.600291  -0.323761  -0.000006  0.300082  0.136830
–––  ––––––––––  ––––––  –––––––––––  ––––––––––––  –––––––––  –––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  ––––––––––  –––––––––  –––––––––  –––––––––  ––––––––  ––––––––

3. Command-Line Interface (CLI)

Instead of writing Python code, you may use the CLI directly to process raw Δ47 and Δ48 data with reasonable defaults. The simplest invocation is:

D47crunch rawdata.csv

This will create a directory named output and populate it with the processing results (tables of sessions, samples, and analyses, among other outputs).

You may specify a custom set of anchors instead of the default ones using the --anchors or -a option:

D47crunch -a anchors.csv rawdata.csv

In this case, the anchors.csv file (you may use any other file name) must have the following format:

Sample, d13C_VPDB, d18O_VPDB,    D47
 ETH-1,      2.02,     -2.19, 0.2052
 ETH-2,    -10.17,    -18.69, 0.2085
 ETH-3,      1.71,     -1.78, 0.6132
 ETH-4,          ,          , 0.4511

The samples with non-empty d13C_VPDB, d18O_VPDB, and D47 values are used to standardize δ13C, δ18O, and Δ47 values, respectively.

You may also provide a list of analyses and/or samples to exclude from the input. This is done with the --exclude or -e option:

D47crunch -e badbatch.csv rawdata.csv

In this case, the badbatch.csv file (again, you may use a different file name) must have the following format:

UID, Sample
A03
A09
B06
   , MYBADSAMPLE-1
   , MYBADSAMPLE-2

This will exclude (ignore) analyses with the UIDs A03, A09, and B06, as well as all analyses of samples MYBADSAMPLE-1 and MYBADSAMPLE-2. It is possible to have an exclude file with only the UID column, or only the Sample column, or both, in any order.

The --output-dir or -o option may be used to specify a custom directory name for the output. For example, in unix-like shells the following command will create a time-stamped output directory:

D47crunch -o `date "+%Y-%m-%d-%Hh%M"` rawdata.csv

To process Δ48 as well as Δ47, just add the --D48 option.
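
For example, to standardize both Δ47 and Δ48 from the same raw data and write the results to a custom output directory:

D47crunch --D48 -o output-D47-D48 rawdata.csv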

API Documentation

'''
Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements

Process and standardize carbonate and/or CO2 clumped-isotope analyses,
from low-level data out of a dual-inlet mass spectrometer to final, “absolute”
Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates
([Daëron, 2021](https://doi.org/10.1029/2020GC009592)).

The **tutorial** section takes you through a series of simple steps to import/process data and print out the results.
The **how-to** section provides instructions applicable to various specific tasks.

.. include:: ../../docpages/tutorial.md
.. include:: ../../docpages/howto.md
.. include:: ../../docpages/cli.md

<h1>API Documentation</h1>
'''

__docformat__ = "restructuredtext"
__author__    = 'Mathieu Daëron'
__contact__   = 'daeron@lsce.ipsl.fr'
__copyright__ = 'Copyright (c) Mathieu Daëron'
__license__   = 'MIT License - https://opensource.org/licenses/MIT'
__date__      = '2025-09-04'
__version__   = '2.4.3'

import os
import numpy as np
import typer
from typing_extensions import Annotated
from statistics import stdev
from scipy.stats import t as tstudent
from scipy.stats import levene
from scipy.interpolate import interp1d
from numpy import linalg
from lmfit import Minimizer, Parameters, report_fit
from matplotlib import pyplot as ppl
from datetime import datetime as dt
from functools import wraps
from colorsys import hls_to_rgb
from matplotlib import rcParams
from typer import rich_utils

rich_utils.STYLE_HELPTEXT = ''

rcParams['font.family'] = 'sans-serif'
rcParams['font.sans-serif'] = 'Helvetica'
rcParams['font.size'] = 10
rcParams['mathtext.fontset'] = 'custom'
rcParams['mathtext.rm'] = 'sans'
rcParams['mathtext.bf'] = 'sans:bold'
rcParams['mathtext.it'] = 'sans:italic'
rcParams['mathtext.cal'] = 'sans:italic'
rcParams['mathtext.default'] = 'rm'
rcParams['xtick.major.size'] = 4
rcParams['xtick.major.width'] = 1
rcParams['ytick.major.size'] = 4
rcParams['ytick.major.width'] = 1
rcParams['axes.grid'] = False
rcParams['axes.linewidth'] = 1
rcParams['grid.linewidth'] = .75
rcParams['grid.linestyle'] = '-'
rcParams['grid.alpha'] = .15
rcParams['savefig.dpi'] = 150

Petersen_etal_CO2eqD47 = np.array([[-12, 1.147113572], [-11, 1.139961218], [-10, 1.132872856], [-9, 1.125847677], [-8, 1.118884889], [-7, 1.111983708], [-6, 1.105143366], [-5, 1.098363105], [-4, 1.091642182], [-3, 1.084979862], [-2, 1.078375423], [-1, 1.071828156], [0, 1.065337360], [1, 1.058902349], [2, 1.052522443], [3, 1.046196976], [4, 1.039925291], [5, 1.033706741], [6, 1.027540690], [7, 1.021426510], [8, 1.015363585], [9, 1.009351306], [10, 1.003389075], [11, 0.997476303], [12, 0.991612409], [13, 0.985796821], [14, 0.980028975], [15, 0.974308318], [16, 0.968634304], [17, 0.963006392], [18, 0.957424055], [19, 0.951886769], [20, 0.946394020], [21, 0.940945302], [22, 0.935540114], [23, 0.930177964], [24, 0.924858369], [25, 0.919580851], [26, 0.914344938], [27, 0.909150167], [28, 0.903996080], [29, 0.898882228], [30, 0.893808167], [31, 0.888773459], [32, 0.883777672], [33, 0.878820382], [34, 0.873901170], [35, 0.869019623], [36, 0.864175334], [37, 0.859367901], [38, 0.854596929], [39, 0.849862028], [40, 0.845162813], [41, 0.840498905], [42, 0.835869931], [43, 0.831275522], [44, 0.826715314], [45, 0.822188950], [46, 0.817696075], [47, 0.813236341], [48, 0.808809404], [49, 0.804414926], [50, 0.800052572], [51, 0.795722012], [52, 0.791422922], [53, 0.787154979], [54, 0.782917869], [55, 0.778711277], [56, 0.774534898], [57, 0.770388426], [58, 0.766271562], [59, 0.762184010], [60, 0.758125479], [61, 0.754095680], [62, 0.750094329], [63, 0.746121147], [64, 0.742175856], [65, 0.738258184], [66, 0.734367860], [67, 0.730504620], [68, 0.726668201], [69, 0.722858343], [70, 0.719074792], [71, 0.715317295], [72, 0.711585602], [73, 0.707879469], [74, 0.704198652], [75, 0.700542912], [76, 0.696912012], [77, 0.693305719], [78, 0.689723802], [79, 0.686166034], [80, 0.682632189], [81, 0.679122047], [82, 0.675635387], [83, 0.672171994], [84, 0.668731654], [85, 0.665314156], [86, 0.661919291], [87, 0.658546854], [88, 0.655196641], [89, 0.651868451], [90, 0.648562087], [91, 0.645277352], [92, 0.642014054], [93, 0.638771999], [94, 0.635551001], [95, 0.632350872], [96, 0.629171428], [97, 0.626012487], [98, 0.622873870], [99, 0.619755397], [100, 0.616656895], [102, 0.610519107], [104, 0.604459143], [106, 0.598475670], [108, 0.592567388], [110, 0.586733026], [112, 0.580971342], [114, 0.575281125], [116, 0.569661187], [118, 0.564110371], [120, 0.558627545], [122, 0.553211600], [124, 0.547861454], [126, 0.542576048], [128, 0.537354347], [130, 0.532195337], [132, 0.527098028], [134, 0.522061450], [136, 0.517084654], [138, 0.512166711], [140, 0.507306712], [142, 0.502503768], [144, 0.497757006], [146, 0.493065573], [148, 0.488428634], [150, 0.483845370], [152, 0.479314980], [154, 0.474836677], [156, 0.470409692], [158, 0.466033271], [160, 0.461706674], [162, 0.457429176], [164, 0.453200067], [166, 0.449018650], [168, 0.444884242], [170, 0.440796174], [172, 0.436753787], [174, 0.432756438], [176, 0.428803494], [178, 0.424894334], [180, 0.421028350], [182, 0.417204944], [184, 0.413423530], [186, 0.409683531], [188, 0.405984383], [190, 0.402325531], [192, 0.398706429], [194, 0.395126543], [196, 0.391585347], [198, 0.388082324], [200, 0.384616967], [202, 0.381188778], [204, 0.377797268], [206, 0.374441954], [208, 0.371122364], [210, 0.367838033], [212, 0.364588505], [214, 0.361373329], [216, 0.358192065], [218, 0.355044277], [220, 0.351929540], [222, 0.348847432], [224, 0.345797540], [226, 0.342779460], [228, 0.339792789], [230, 0.336837136], [232, 0.333912113], [234, 0.331017339], [236, 0.328152439], [238, 
0.325317046], [240, 0.322510795], [242, 0.319733329], [244, 0.316984297], [246, 0.314263352], [248, 0.311570153], [250, 0.308904364], [252, 0.306265654], [254, 0.303653699], [256, 0.301068176], [258, 0.298508771], [260, 0.295975171], [262, 0.293467070], [264, 0.290984167], [266, 0.288526163], [268, 0.286092765], [270, 0.283683684], [272, 0.281298636], [274, 0.278937339], [276, 0.276599517], [278, 0.274284898], [280, 0.271993211], [282, 0.269724193], [284, 0.267477582], [286, 0.265253121], [288, 0.263050554], [290, 0.260869633], [292, 0.258710110], [294, 0.256571741], [296, 0.254454286], [298, 0.252357508], [300, 0.250281174], [302, 0.248225053], [304, 0.246188917], [306, 0.244172542], [308, 0.242175707], [310, 0.240198194], [312, 0.238239786], [314, 0.236300272], [316, 0.234379441], [318, 0.232477087], [320, 0.230593005], [322, 0.228726993], [324, 0.226878853], [326, 0.225048388], [328, 0.223235405], [330, 0.221439711], [332, 0.219661118], [334, 0.217899439], [336, 0.216154491], [338, 0.214426091], [340, 0.212714060], [342, 0.211018220], [344, 0.209338398], [346, 0.207674420], [348, 0.206026115], [350, 0.204393315], [355, 0.200378063], [360, 0.196456139], [365, 0.192625077], [370, 0.188882487], [375, 0.185226048], [380, 0.181653511], [385, 0.178162694], [390, 0.174751478], [395, 0.171417807], [400, 0.168159686], [405, 0.164975177], [410, 0.161862398], [415, 0.158819521], [420, 0.155844772], [425, 0.152936426], [430, 0.150092806], [435, 0.147312286], [440, 0.144593281], [445, 0.141934254], [450, 0.139333710], [455, 0.136790195], [460, 0.134302294], [465, 0.131868634], [470, 0.129487876], [475, 0.127158722], [480, 0.124879906], [485, 0.122650197], [490, 0.120468398], [495, 0.118333345], [500, 0.116243903], [505, 0.114198970], [510, 0.112197471], [515, 0.110238362], [520, 0.108320625], [525, 0.106443271], [530, 0.104605335], [535, 0.102805877], [540, 0.101043985], [545, 0.099318768], [550, 0.097629359], [555, 0.095974915], [560, 0.094354612], [565, 0.092767650], [570, 0.091213248], [575, 0.089690648], [580, 0.088199108], [585, 0.086737906], [590, 0.085306341], [595, 0.083903726], [600, 0.082529395], [605, 0.081182697], [610, 0.079862998], [615, 0.078569680], [620, 0.077302141], [625, 0.076059794], [630, 0.074842066], [635, 0.073648400], [640, 0.072478251], [645, 0.071331090], [650, 0.070206399], [655, 0.069103674], [660, 0.068022424], [665, 0.066962168], [670, 0.065922439], [675, 0.064902780], [680, 0.063902748], [685, 0.062921909], [690, 0.061959837], [695, 0.061016122], [700, 0.060090360], [705, 0.059182157], [710, 0.058291131], [715, 0.057416907], [720, 0.056559120], [725, 0.055717414], [730, 0.054891440], [735, 0.054080860], [740, 0.053285343], [745, 0.052504565], [750, 0.051738210], [755, 0.050985971], [760, 0.050247546], [765, 0.049522643], [770, 0.048810974], [775, 0.048112260], [780, 0.047426227], [785, 0.046752609], [790, 0.046091145], [795, 0.045441581], [800, 0.044803668], [805, 0.044177164], [810, 0.043561831], [815, 0.042957438], [820, 0.042363759], [825, 0.041780573], [830, 0.041207664], [835, 0.040644822], [840, 0.040091839], [845, 0.039548516], [850, 0.039014654], [855, 0.038490063], [860, 0.037974554], [865, 0.037467944], [870, 0.036970054], [875, 0.036480707], [880, 0.035999734], [885, 0.035526965], [890, 0.035062238], [895, 0.034605393], [900, 0.034156272], [905, 0.033714724], [910, 0.033280598], [915, 0.032853749], [920, 0.032434032], [925, 0.032021309], [930, 0.031615443], [935, 0.031216300], [940, 0.030823749], [945, 0.030437663], [950, 0.030057915], [955, 0.029684385], 
[960, 0.029316951], [965, 0.028955498], [970, 0.028599910], [975, 0.028250075], [980, 0.027905884], [985, 0.027567229], [990, 0.027234006], [995, 0.026906112], [1000, 0.026583445], [1005, 0.026265908], [1010, 0.025953405], [1015, 0.025645841], [1020, 0.025343124], [1025, 0.025045163], [1030, 0.024751871], [1035, 0.024463160], [1040, 0.024178947], [1045, 0.023899147], [1050, 0.023623680], [1055, 0.023352467], [1060, 0.023085429], [1065, 0.022822491], [1070, 0.022563577], [1075, 0.022308615], [1080, 0.022057533], [1085, 0.021810260], [1090, 0.021566729], [1095, 0.021326872], [1100, 0.021090622]])
_fCO2eqD47_Petersen = interp1d(Petersen_etal_CO2eqD47[:,0], Petersen_etal_CO2eqD47[:,1])
def fCO2eqD47_Petersen(T):
	'''
	CO2 equilibrium Δ47 value as a function of T (in degrees C)
	according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).

	'''
	return float(_fCO2eqD47_Petersen(T))


Wang_etal_CO2eqD47 = np.array([[-83., 1.8954], [-73., 1.7530], [-63., 1.6261], [-53., 1.5126], [-43., 1.4104], [-33., 1.3182], [-23., 1.2345], [-13., 1.1584], [-3., 1.0888], [7., 1.0251], [17., 0.9665], [27., 0.9125], [37., 0.8626], [47., 0.8164], [57., 0.7734], [67., 0.7334], [87., 0.6612], [97., 0.6286], [107., 0.5980], [117., 0.5693], [127., 0.5423], [137., 0.5169], [147., 0.4930], [157., 0.4704], [167., 0.4491], [177., 0.4289], [187., 0.4098], [197., 0.3918], [207., 0.3747], [217., 0.3585], [227., 0.3431], [237., 0.3285], [247., 0.3147], [257., 0.3015], [267., 0.2890], [277., 0.2771], [287., 0.2657], [297., 0.2550], [307., 0.2447], [317., 0.2349], [327., 0.2256], [337., 0.2167], [347., 0.2083], [357., 0.2002], [367., 0.1925], [377., 0.1851], [387., 0.1781], [397., 0.1714], [407., 0.1650], [417., 0.1589], [427., 0.1530], [437., 0.1474], [447., 0.1421], [457., 0.1370], [467., 0.1321], [477., 0.1274], [487., 0.1229], [497., 0.1186], [507., 0.1145], [517., 0.1105], [527., 0.1068], [537., 0.1031], [547., 0.0997], [557., 0.0963], [567., 0.0931], [577., 0.0901], [587., 0.0871], [597., 0.0843], [607., 0.0816], [617., 0.0790], [627., 0.0765], [637., 0.0741], [647., 0.0718], [657., 0.0695], [667., 0.0674], [677., 0.0654], [687., 0.0634], [697., 0.0615], [707., 0.0597], [717., 0.0579], [727., 0.0562], [737., 0.0546], [747., 0.0530], [757., 0.0515], [767., 0.0500], [777., 0.0486], [787., 0.0472], [797., 0.0459], [807., 0.0447], [817., 0.0435], [827., 0.0423], [837., 0.0411], [847., 0.0400], [857., 0.0390], [867., 0.0380], [877., 0.0370], [887., 0.0360], [897., 0.0351], [907., 0.0342], [917., 0.0333], [927., 0.0325], [937., 0.0317], [947., 0.0309], [957., 0.0302], [967., 0.0294], [977., 0.0287], [987., 0.0281], [997., 0.0274], [1007., 0.0268], [1017., 0.0261], [1027., 0.0255], [1037., 0.0249], [1047., 0.0244], [1057., 0.0238], [1067., 0.0233], [1077., 0.0228], [1087., 0.0223], [1097., 0.0218]])
_fCO2eqD47_Wang = interp1d(Wang_etal_CO2eqD47[:,0] - 0.15, Wang_etal_CO2eqD47[:,1])
def fCO2eqD47_Wang(T):
	'''
	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
	according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
	(supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
	'''
	return float(_fCO2eqD47_Wang(T))


def correlated_sum(X, C, w = None):
	'''
	Compute covariance-aware linear combinations

	**Parameters**

	+ `X`: list or 1-D array of values to sum
	+ `C`: covariance matrix for the elements of `X`
	+ `w`: list or 1-D array of weights to apply to the elements of `X`
	       (all equal to 1 by default)

	Return the sum (and its SE) of the elements of `X`, with optional weights equal
	to the elements of `w`, accounting for covariances between the elements of `X`.
	'''
	if w is None:
		w = [1 for x in X]
	return np.dot(w,X), (np.dot(w,np.dot(C,w)))**.5


def make_csv(x, hsep = ',', vsep = '\n'):
	'''
	Formats a list of lists of strings as a CSV

	**Parameters**

	+ `x`: the list of lists of strings to format
	+ `hsep`: the field separator (`,` by default)
	+ `vsep`: the line-ending convention to use (`\\n` by default)

	**Example**

	```py
	print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
	```

	outputs:

	```py
	a,b,c
	d,e,f
	```
	'''
	return vsep.join([hsep.join(l) for l in x])


def pf(txt):
	'''
	Modify string `txt` to follow `lmfit.Parameter()` naming rules.
	'''
	return txt.replace('-','_').replace('.','_').replace(' ','_')


def smart_type(x):
	'''
	Tries to convert string `x` to a float if it includes a decimal point, or
	to an integer if it does not. If both attempts fail, return the original
	string unchanged.
	'''
	try:
		y = float(x)
	except ValueError:
		return x
	if '.' not in x:
		return int(y)
	return y

class _Defaults():
	def __init__(self):
		pass

D47crunch_defaults = _Defaults()
D47crunch_defaults.PRETTY_TABLE_VSEP = '—'

def pretty_table(x, header = 1, hsep = '  ', vsep = None, align = '<'):
	'''
	Reads a list of lists of strings and outputs an ascii table

	**Parameters**

	+ `x`: a list of lists of strings
	+ `header`: the number of lines to treat as header lines
	+ `hsep`: the horizontal separator between columns
	+ `vsep`: the character to use as vertical separator
	+ `align`: string of left (`<`) or right (`>`) alignment characters.

	**Example**

	```py
	print(pretty_table([
		['A', 'B', 'C'],
		['1', '1.9999', 'foo'],
		['10', 'x', 'bar'],
	]))
	```
	yields:
	```
	——  ——————  ———
	A        B    C
	——  ——————  ———
	1   1.9999  foo
	10       x  bar
	——  ——————  ———
	```

	To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`:

	```py
	D47crunch_defaults.PRETTY_TABLE_VSEP = '='
	print(pretty_table([
		['A', 'B', 'C'],
		['1', '1.9999', 'foo'],
		['10', 'x', 'bar'],
	]))
	```
	yields:
	```
	==  ======  ===
	A        B    C
	==  ======  ===
	1   1.9999  foo
	10       x  bar
	==  ======  ===
	```
	'''

	if vsep is None:
		vsep = D47crunch_defaults.PRETTY_TABLE_VSEP

	txt = []
	widths = [np.max([len(e) for e in c]) for c in zip(*x)]

	if len(widths) > len(align):
		align += '>' * (len(widths)-len(align))
	sepline = hsep.join([vsep*w for w in widths])
	txt += [sepline]
	for k,l in enumerate(x):
		if k and k == header:
			txt += [sepline]
		txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
	txt += [sepline]
	txt += ['']
	return '\n'.join(txt)


def transpose_table(x):
	'''
	Transpose a list of lists

	**Parameters**

	+ `x`: a list of lists

	**Example**

	```py
	x = [[1, 2], [3, 4]]
	print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
	```
	'''
	return [[e for e in c] for c in zip(*x)]


def w_avg(X, sX):
	'''
	Compute variance-weighted average

	Returns the value and SE of the weighted average of the elements of `X`,
	with relative weights equal to their inverse variances (`1/sX**2`).

	**Parameters**

	+ `X`: array-like of elements to average
	+ `sX`: array-like of the corresponding SE values

	**Tip**

	If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
	they may be rearranged using `zip()`:

	```python
	foo = [(0, 1), (1, 0.5), (2, 0.5)]
	print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
	```
	'''
	X = [ x for x in X ]
	sX = [ sx for sx in sX ]
	W = [ sx**-2 for sx in sX ]
	W = [ w/sum(W) for w in W ]
	Xavg = sum([ w*x for w,x in zip(W,X) ])
	sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5
	return Xavg, sXavg


def read_csv(filename, sep = ''):
	'''
	Read contents of `filename` in csv format and return a list of dictionaries.

	In the csv string, spaces before and after field separators (`','` by default)
	are optional.

	**Parameters**

	+ `filename`: the csv file to read
	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
	whichever appears most often in the contents of `filename`.
	'''
	with open(filename) as fid:
		txt = fid.read()

	if sep == '':
		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
	return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]]


def simulate_single_analysis(
	sample = 'MYSAMPLE',
	d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
	d13C_VPDB = None, d18O_VPDB = None,
	D47 = None, D48 = None, D49 = 0., D17O = 0.,
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	Nominal_D47 = None,
	Nominal_D48 = None,
	Nominal_d13C_VPDB = None,
	Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	):
	'''
	Compute working-gas delta values for a single analysis, assuming a stochastic working
	gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

	**Parameters**

	+ `sample`: sample name
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(respectively –4 and +26 ‰ by default)
	+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
		of the carbonate sample
	+ `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and
		Δ48 values if `D47` or `D48` are not specified
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `D4xdata` default values)

	Returns a dictionary with fields
	`['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
	'''

	if Nominal_d13C_VPDB is None:
		Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB

	if Nominal_d18O_VPDB is None:
		Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB

	if ALPHA_18O_ACID_REACTION is None:
		ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION

	if R13_VPDB is None:
		R13_VPDB = D4xdata().R13_VPDB

	if R17_VSMOW is None:
		R17_VSMOW = D4xdata().R17_VSMOW

	if R18_VSMOW is None:
		R18_VSMOW = D4xdata().R18_VSMOW

	if LAMBDA_17 is None:
		LAMBDA_17 = D4xdata().LAMBDA_17

	if R18_VPDB is None:
		R18_VPDB = D4xdata().R18_VPDB

	R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17

	if Nominal_D47 is None:
		Nominal_D47 = D47data().Nominal_D47

	if Nominal_D48 is None:
		Nominal_D48 = D48data().Nominal_D48

	if d13C_VPDB is None:
		if sample in Nominal_d13C_VPDB:
			d13C_VPDB = Nominal_d13C_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")

	if d18O_VPDB is None:
		if sample in Nominal_d18O_VPDB:
			d18O_VPDB = Nominal_d18O_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")

	if D47 is None:
		if sample in Nominal_D47:
			D47 = Nominal_D47[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")

	if D48 is None:
		if sample in Nominal_D48:
			D48 = Nominal_D48[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")

	X = D4xdata()
	X.R13_VPDB = R13_VPDB
	X.R17_VSMOW = R17_VSMOW
	X.R18_VSMOW = R18_VSMOW
	X.LAMBDA_17 = LAMBDA_17
	X.R18_VPDB = R18_VPDB
	X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17

	R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000),
		R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
		)
	R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O=D17O, D47=D47, D48=D48, D49=D49,
		)
	R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O=D17O,
		)

	d45 = 1000 * (R45/R45wg - 1)
	d46 = 1000 * (R46/R46wg - 1)
	d47 = 1000 * (R47/R47wg - 1)
	d48 = 1000 * (R48/R48wg - 1)
	d49 = 1000 * (R49/R49wg - 1)

	for k in range(3): # dumb iteration to adjust for small changes in d47
		R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
		R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch
		d47 = 1000 * (R47raw/R47wg - 1)
		d48 = 1000 * (R48raw/R48wg - 1)

	return dict(
		Sample = sample,
		D17O = D17O,
		d13Cwg_VPDB = d13Cwg_VPDB,
		d18Owg_VSMOW = d18Owg_VSMOW,
		d45 = d45,
		d46 = d46,
		d47 = d47,
		d48 = d48,
		d49 = d49,
		)


def virtual_data(
	samples = [],
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	rd45 = 0.020, rd46 = 0.060,
	rD47 = 0.015, rD48 = 0.045,
	d13Cwg_VPDB = None, d18Owg_VSMOW = None,
	session = None,
	Nominal_D47 = None, Nominal_D48 = None,
	Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	seed = 0,
	shuffle = True,
	):
	'''
	Return list with simulated analyses from a single session.

	**Parameters**

	+ `samples`: a list of entries; each entry is a dictionary with the following fields:
	    * `Sample`: the name of the sample
	    * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	    * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
	    * `N`: how many analyses to generate for this sample
	+ `a47`: scrambling factor for Δ47
	+ `b47`: compositional nonlinearity for Δ47
	+ `c47`: working gas offset for Δ47
	+ `a48`: scrambling factor for Δ48
	+ `b48`: compositional nonlinearity for Δ48
	+ `c48`: working gas offset for Δ48
	+ `rd45`: analytical repeatability of δ45
	+ `rd46`: analytical repeatability of δ46
	+ `rD47`: analytical repeatability of Δ47
	+ `rD48`: analytical repeatability of Δ48
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(by default equal to the `simulate_single_analysis` default values)
	+ `session`: name of the session (no name by default)
	+ `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and Δ48 values
		if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
		(by default equal to the `simulate_single_analysis` defaults)
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
		(by default equal to the `simulate_single_analysis` defaults)
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `simulate_single_analysis` default)
	+ `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
	+ `shuffle`: randomly reorder the sequence of analyses

	Here is an example of using this method to generate an arbitrary combination of
	anchors and unknowns for a bunch of sessions:

	```py
	.. include:: ../../code_examples/virtual_data/example.py
	```

	This should output something like:

	```
	.. include:: ../../code_examples/virtual_data/output.txt
	```
	'''

	kwargs = locals().copy()

	from numpy import random as nprandom
	if seed:
		nprandom.seed(seed)
		rng = nprandom.default_rng(seed)
	else:
		rng = nprandom.default_rng()

	N = sum([s['N'] for s in samples])
	errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors45 *= rd45 / stdev(errors45) # scale errors to rd45
	errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors46 *= rd46 / stdev(errors46) # scale errors to rd46
	errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors47 *= rD47 / stdev(errors47) # scale errors to rD47
	errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors48 *= rD48 / stdev(errors48) # scale errors to rD48

	k = 0
	out = []
	for s in samples:
		kw = {}
		kw['sample'] = s['Sample']
		kw = {
			**kw,
			**{var: kwargs[var]
				for var in [
					'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
					'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
					'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
					'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
					]
				if kwargs[var] is not None},
			**{var: s[var]
				for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
				if var in s},
			}

		sN = s['N']
		while sN:
			out.append(simulate_single_analysis(**kw))
			out[-1]['d45'] += errors45[k]
			out[-1]['d46'] += errors46[k]
			out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
			out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
			sN -= 1
			k += 1

		if session is not None:
			for r in out:
				r['Session'] = session

		if shuffle:
			nprandom.shuffle(out)

	return out

def table_of_samples(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of samples
	for a pair of `D47data` and `D48data` objects.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = f'D47D48_samples.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)


 641def table_of_sessions(
 642	data47 = None,
 643	data48 = None,
 644	dir = 'output',
 645	filename = None,
 646	save_to_file = True,
 647	print_out = True,
 648	output = None,
 649	):
 650	'''
 651	Print out, save to disk and/or return a combined table of sessions
 652	for a pair of `D47data` and `D48data` objects.
 653	***Only applicable if the sessions in `data47` and those in `data48`
 654	consist of the exact same sets of analyses.***
 655
 656	**Parameters**
 657
 658	+ `data47`: `D47data` instance
 659	+ `data48`: `D48data` instance
 660	+ `dir`: the directory in which to save the table
 661	+ `filename`: the name to the csv file to write to
 662	+ `save_to_file`: whether to save the table to disk
 663	+ `print_out`: whether to print out the table
 664	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
 665		if set to `'raw'`: return a list of list of strings
 666		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
 667	'''
 668	if data47 is None:
 669		if data48 is None:
 670			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
 671		else:
 672			return data48.table_of_sessions(
 673				dir = dir,
 674				filename = filename,
 675				save_to_file = save_to_file,
 676				print_out = print_out,
 677				output = output
 678				)
 679	else:
 680		if data48 is None:
 681			return data47.table_of_sessions(
 682				dir = dir,
 683				filename = filename,
 684				save_to_file = save_to_file,
 685				print_out = print_out,
 686				output = output
 687				)
 688		else:
 689			out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
 690			out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
 691			for k,x in enumerate(out47[0]):
 692				if k>7:
 693					out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
 694					out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
 695			out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])
 696
 697			if save_to_file:
 698				if not os.path.exists(dir):
 699					os.makedirs(dir)
 700				if filename is None:
 701					filename = 'D47D48_sessions.csv'
 702				with open(f'{dir}/{filename}', 'w') as fid:
 703					fid.write(make_csv(out))
 704			if print_out:
 705				print('\n'+pretty_table(out))
 706			if output == 'raw':
 707				return out
 708			elif output == 'pretty':
 709				return pretty_table(out)
 710
 711
 712def table_of_analyses(
 713	data47 = None,
 714	data48 = None,
 715	dir = 'output',
 716	filename = None,
 717	save_to_file = True,
 718	print_out = True,
 719	output = None,
 720	):
 721	'''
 722	Print out, save to disk and/or return a combined table of analyses
 723	for a pair of `D47data` and `D48data` objects.
 724
 725	If the sessions in `data47` and those in `data48` do not consist of
 726	the exact same sets of analyses, the table will have two columns
 727	`Session_47` and `Session_48` instead of a single `Session` column.
 728
 729	**Parameters**
 730
 731	+ `data47`: `D47data` instance
 732	+ `data48`: `D48data` instance
 733	+ `dir`: the directory in which to save the table
 734	+ `filename`: the name of the csv file to write to
 735	+ `save_to_file`: whether to save the table to disk
 736	+ `print_out`: whether to print out the table
 737	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
 738		if set to `'raw'`: return a list of list of strings
 739		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
 740	'''
 741	if data47 is None:
 742		if data48 is None:
 743			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
 744		else:
 745			return data48.table_of_analyses(
 746				dir = dir,
 747				filename = filename,
 748				save_to_file = save_to_file,
 749				print_out = print_out,
 750				output = output
 751				)
 752	else:
 753		if data48 is None:
 754			return data47.table_of_analyses(
 755				dir = dir,
 756				filename = filename,
 757				save_to_file = save_to_file,
 758				print_out = print_out,
 759				output = output
 760				)
 761		else:
 762			out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
 763			out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
 764			
 765			if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
 766				out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
 767			else:
 768				out47[0][1] = 'Session_47'
 769				out48[0][1] = 'Session_48'
 770				out47 = transpose_table(out47)
 771				out48 = transpose_table(out48)
 772				out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])
 773
 774			if save_to_file:
 775				if not os.path.exists(dir):
 776					os.makedirs(dir)
 777				if filename is None:
 778					filename = 'D47D48_analyses.csv'
 779				with open(f'{dir}/{filename}', 'w') as fid:
 780					fid.write(make_csv(out))
 781			if print_out:
 782				print('\n'+pretty_table(out))
 783			if output == 'raw':
 784				return out
 785			elif output == 'pretty':
 786				return pretty_table(out)
 787
 788
 789def _fullcovar(minresult, epsilon = 0.01, named = False):
 790	'''
 791	Construct full covariance matrix in the case of constrained parameters
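    
    	Example (a minimal, self-contained sketch; the linear model, the data, and the
    	`b = a/2` constraint are purely illustrative):
    
    	```py
    	import numpy as np
    	from lmfit import minimize, Parameters
    
    	x = np.linspace(0, 10, 50)
    	y = 0.8 * x + 0.4 + np.random.default_rng(0).normal(0., 0.1, x.size)
    
    	p = Parameters()
    	p.add('a', value = 1.)
    	p.add('b', expr = 'a/2') # constrained parameter, excluded from fit.covar
    
    	fit = minimize(lambda p: p['a'].value * x + p['b'].value - y, p)
    	names, covar, se, correl = _fullcovar(fit) # full covariance, including 'b'
    	```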
 792	'''
 793	
 794	import asteval
 795	
 796	def f(values):
 797		interp = asteval.Interpreter()
 798		for n,v in zip(minresult.var_names, values):
 799			interp(f'{n} = {v}')
 800		for q in minresult.params:
 801			if minresult.params[q].expr:
 802				interp(f'{q} = {minresult.params[q].expr}')
 803		return np.array([interp.symtable[q] for q in minresult.params])
 804
 805	# construct Jacobian
 806	J = np.zeros((minresult.nvarys, len(minresult.params)))
 807	X = np.array([minresult.params[p].value for p in minresult.var_names])
 808	sX = np.array([minresult.params[p].stderr for p in minresult.var_names])
 809
 810	for j in range(minresult.nvarys):
 811		x1 = [_ for _ in X]
 812		x1[j] += epsilon * sX[j]
 813		x2 = [_ for _ in X]
 814		x2[j] -= epsilon * sX[j]
 815		J[j,:] = (f(x1) - f(x2)) / (2 * epsilon * sX[j])
 816
 817	_names = [q for q in minresult.params]
 818	_covar = J.T @ minresult.covar @ J
 819	_se = np.diag(_covar)**.5
 820	_correl = _covar.copy()
 821	for k,s in enumerate(_se):
 822		if s:
 823			_correl[k,:] /= s
 824			_correl[:,k] /= s
 825
 826	if named:
 827		_covar = {i: {j: _covar[ii,jj] for jj,j in enumerate(minresult.params)} for ii,i in enumerate(minresult.params)}
 828		_se = {i: _se[ii] for ii,i in enumerate(minresult.params)}
 829		_correl = {i: {j: _correl[ii,jj] for jj,j in enumerate(minresult.params)} for ii,i in enumerate(minresult.params)}
 830
 831	return _names, _covar, _se, _correl
 832
 833
 834class D4xdata(list):
 835	'''
 836	Store and process data for a large set of Δ47 and/or Δ48
 837	analyses, usually comprising more than one analytical session.
 838	'''
 839
 840	### 17O CORRECTION PARAMETERS
 841	R13_VPDB = 0.01118  # (Chang & Li, 1990)
 842	'''
 843	Absolute (13C/12C) ratio of VPDB.
 844	By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
 845	'''
 846
 847	R18_VSMOW = 0.0020052  # (Baertschi, 1976)
 848	'''
 849	Absolute (18O/16O) ratio of VSMOW.
 850	By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
 851	'''
 852
 853	LAMBDA_17 = 0.528  # (Barkan & Luz, 2005)
 854	'''
 855	Mass-dependent exponent for triple oxygen isotopes.
 856	By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
 857	'''
 858
 859	R17_VSMOW = 0.00038475  # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
 860	'''
 861	Absolute (17O/16O) ratio of VSMOW.
 862	By default equal to 0.00038475
 863	([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
 864	rescaled to `R13_VPDB`)
 865	'''
 866
 867	R18_VPDB = R18_VSMOW * 1.03092
 868	'''
 869	Absolute (18O/16O) ratio of VPDB.
 870	By definition equal to `R18_VSMOW * 1.03092`.
 871	'''
 872
 873	R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
 874	'''
 875	Absolute (17O/16O) ratio of VPDB.
 876	By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
 877	'''
 878
 879	LEVENE_REF_SAMPLE = 'ETH-3'
 880	'''
 881	After the Δ4x standardization step, each sample is tested to
 882	assess whether the Δ4x variance within all analyses for that
 883	sample differs significantly from that observed for a given reference
 884	sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
 885	which yields a p-value corresponding to the null hypothesis that the
 886	underlying variances are equal).
 887
 888	`LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
 889	sample should be used as a reference for this test.
 890	'''
 891
 892	ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6)  # (Kim et al., 2007, calcite)
 893	'''
 894	Specifies the 18O/16O fractionation factor generally applicable
 895	to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
 896	`D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`.
 897
 898	By default equal to 1.008129 (calcite reacted at 90 °C,
 899	[Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
 900	'''
 901
 902	Nominal_d13C_VPDB = {
 903		'ETH-1': 2.02,
 904		'ETH-2': -10.17,
 905		'ETH-3': 1.71,
 906		}	# (Bernasconi et al., 2018)
 907	'''
 908	Nominal δ13C_VPDB values assigned to carbonate standards, used by
 909	`D4xdata.standardize_d13C()`.
 910
 911	By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
 912	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 913	'''
 914
 915	Nominal_d18O_VPDB = {
 916		'ETH-1': -2.19,
 917		'ETH-2': -18.69,
 918		'ETH-3': -1.78,
 919		}	# (Bernasconi et al., 2018)
 920	'''
 921	Nominal δ18O_VPDB values assigned to carbonate standards, used by
 922	`D4xdata.standardize_d18O()`.
 923
 924	By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
 925	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 926	'''
 927
 928	d13C_STANDARDIZATION_METHOD = '2pt'
 929	'''
 930	Method by which to standardize δ13C values:
 931	
 932	+ `none`: do not apply any δ13C standardization.
 933	+ `'1pt'`: within each session, offset all initial δ13C values so as to
 934	minimize the difference between final δ13C_VPDB values and
 935	`Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
 936	+ `'2pt'`: within each session, apply an affine transformation to all δ13C
 937	values so as to minimize the difference between final δ13C_VPDB
 938	values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
 939	is defined).
 940	'''
 941
 942	d18O_STANDARDIZATION_METHOD = '2pt'
 943	'''
 944	Method by which to standardize δ18O values:
 945	
 946	+ `none`: do not apply any δ18O standardization.
 947	+ `'1pt'`: within each session, offset all initial δ18O values so as to
 948	minimize the difference between final δ18O_VPDB values and
 949	`Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
 950	+ `'2pt'`: within each session, apply an affine transformation to all δ18O
 951	values so as to minimize the difference between final δ18O_VPDB
 952	values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
 953	is defined).
 954	'''
 955
 956	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
 957		'''
 958		**Parameters**
 959
 960		+ `l`: a list of dictionaries, with each dictionary including at least the keys
 961		`Sample`, `d45`, `d46`, and `d47` or `d48`.
 962		+ `mass`: `'47'` or `'48'`
 963		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
 964		+ `session`: define session name for analyses without a `Session` key
 965		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
 966
 967		Returns a `D4xdata` object derived from `list`.
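    
    		Example (a minimal sketch, with made-up working-gas deltas):
    
    		```py
    		mydata = D4xdata([
    			{'Sample': 'ETH-1', 'd45': 5.79, 'd46': 11.63, 'd47': 16.89},
    			{'Sample': 'IAEA-C2', 'd45': -4.05, 'd46': 1.27, 'd47': -4.43},
    			], mass = '47', session = 'Session_01')
    		```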
 968		'''
 969		self._4x = mass
 970		self.verbose = verbose
 971		self.prefix = 'D4xdata'
 972		self.logfile = logfile
 973		list.__init__(self, l)
 974		self.Nf = None
 975		self.repeatability = {}
 976		self.refresh(session = session)
 977
 978
 979	def make_verbal(oldfun):
 980		'''
 981		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
 982		'''
 983		@wraps(oldfun)
 984		def newfun(*args, verbose = '', **kwargs):
 985			myself = args[0]
 986			oldprefix = myself.prefix
 987			myself.prefix = oldfun.__name__
 988			if verbose != '':
 989				oldverbose = myself.verbose
 990				myself.verbose = verbose
 991			out = oldfun(*args, **kwargs)
 992			myself.prefix = oldprefix
 993			if verbose != '':
 994				myself.verbose = oldverbose
 995			return out
 996		return newfun
 997
 998
 999	def msg(self, txt):
1000		'''
1001		Log a message to `self.logfile`, and print it out if `verbose = True`
1002		'''
1003		self.log(txt)
1004		if self.verbose:
1005			print(f'{f"[{self.prefix}]":<16} {txt}')
1006
1007
1008	def vmsg(self, txt):
1009		'''
1010		Log a message to `self.logfile` and print it out
1011		'''
1012		self.log(txt)
1013		print(txt)
1014
1015
1016	def log(self, *txts):
1017		'''
1018		Log a message to `self.logfile`
1019		'''
1020		if self.logfile:
1021			with open(self.logfile, 'a') as fid:
1022				for txt in txts:
1023					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
1024
1025
1026	def refresh(self, session = 'mySession'):
1027		'''
1028		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
1029		'''
1030		self.fill_in_missing_info(session = session)
1031		self.refresh_sessions()
1032		self.refresh_samples()
1033
1034
1035	def refresh_sessions(self):
1036		'''
1037		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1038		to `False` for all sessions.
1039		'''
1040		self.sessions = {
1041			s: {'data': [r for r in self if r['Session'] == s]}
1042			for s in sorted({r['Session'] for r in self})
1043			}
1044		for s in self.sessions:
1045			self.sessions[s]['scrambling_drift'] = False
1046			self.sessions[s]['slope_drift'] = False
1047			self.sessions[s]['wg_drift'] = False
1048			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1049			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
1050
1051
1052	def refresh_samples(self):
1053		'''
1054		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1055		'''
1056		self.samples = {
1057			s: {'data': [r for r in self if r['Sample'] == s]}
1058			for s in sorted({r['Sample'] for r in self})
1059			}
1060		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1061		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
1062
1063
1064	def read(self, filename, sep = '', session = ''):
1065		'''
1066		Read file in csv format to load data into a `D47data` object.
1067
1068		In the csv file, spaces before and after field separators (`','` by default)
1069		are optional. Each line corresponds to a single analysis.
1070
1071		The required fields are:
1072
1073		+ `UID`: a unique identifier
1074		+ `Session`: an identifier for the analytical session
1075		+ `Sample`: a sample identifier
1076		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1077
1078		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1079		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Any remaining
1080		working-gas deltas among `d47`, `d48`, and `d49` are optional, and set to NaN by default.
1081
1082		**Parameters**
1083
1084		+ `filename`: the path of the file to read
1085		+ `sep`: csv separator delimiting the fields
1086		+ `session`: set `Session` field to this string for all analyses
1087		'''
1088		with open(filename) as fid:
1089			self.input(fid.read(), sep = sep, session = session)
1090
1091
1092	def input(self, txt, sep = '', session = ''):
1093		'''
1094		Read `txt` string in csv format to load analysis data into a `D47data` object.
1095
1096		In the csv string, spaces before and after field separators (`','` by default)
1097		are optional. Each line corresponds to a single analysis.
1098
1099		The required fields are:
1100
1101		+ `UID`: a unique identifier
1102		+ `Session`: an identifier for the analytical session
1103		+ `Sample`: a sample identifier
1104		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1105
1106		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1107		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Any remaining
1108		working-gas deltas among `d47`, `d48`, and `d49` are optional, and set to NaN by default.
1109
1110		**Parameters**
1111
1112		+ `txt`: the csv string to read
1113		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1114		whichever appears most often in `txt`.
1115		+ `session`: set `Session` field to this string for all analyses
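    
    		Example (a minimal sketch, with illustrative numbers):
    
    		```py
    		mydata = D47data()
    		mydata.input('UID,Session,Sample,d45,d46,d47\nA01,S1,ETH-1,5.795,11.627,16.894')
    		```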
1116		'''
1117		if sep == '':
1118			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1119		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1120		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1121
1122		if session != '':
1123			for r in data:
1124				r['Session'] = session
1125
1126		self += data
1127		self.refresh()
1128
1129
1130	@make_verbal
1131	def wg(self, samples = None, a18_acid = None):
1132		'''
1133		Compute bulk composition of the working gas for each session based on
1134		the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1135		`self.Nominal_d18O_VPDB`.
1136		'''
1137
1138		self.msg('Computing WG composition:')
1139
1140		if a18_acid is None:
1141			a18_acid = self.ALPHA_18O_ACID_REACTION
1142		if samples is None:
1143			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1144
1145		assert a18_acid, 'Acid fractionation factor should not be zero.'
1146
1147		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1148		R45R46_standards = {}
1149		for sample in samples:
1150			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1151			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1152			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1153			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1154			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1155
1156			C12_s = 1 / (1 + R13_s)
1157			C13_s = R13_s / (1 + R13_s)
1158			C16_s = 1 / (1 + R17_s + R18_s)
1159			C17_s = R17_s / (1 + R17_s + R18_s)
1160			C18_s = R18_s / (1 + R17_s + R18_s)
1161
1162			C626_s = C12_s * C16_s ** 2
1163			C627_s = 2 * C12_s * C16_s * C17_s
1164			C628_s = 2 * C12_s * C16_s * C18_s
1165			C636_s = C13_s * C16_s ** 2
1166			C637_s = 2 * C13_s * C16_s * C17_s
1167			C727_s = C12_s * C17_s ** 2
1168
1169			R45_s = (C627_s + C636_s) / C626_s
1170			R46_s = (C628_s + C637_s + C727_s) / C626_s
1171			R45R46_standards[sample] = (R45_s, R46_s)
1172		
1173		for s in self.sessions:
1174			db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
1175			assert db, f'No sample from {samples} found in session "{s}".'
1176# 			dbsamples = sorted({r['Sample'] for r in db})
1177
1178			X = [r['d45'] for r in db]
1179			Y = [R45R46_standards[r['Sample']][0] for r in db]
1180			x1, x2 = np.min(X), np.max(X)
1181
1182			if x1 < x2:
1183				wgcoord = x1/(x1-x2)
1184			else:
1185				wgcoord = 999
1186
1187			if wgcoord < -.5 or wgcoord > 1.5:
1188				# unreasonable to extrapolate to d45 = 0
1189				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1190			else :
1191				# d45 = 0 is reasonably well bracketed
1192				R45_wg = np.polyfit(X, Y, 1)[1]
1193
1194			X = [r['d46'] for r in db]
1195			Y = [R45R46_standards[r['Sample']][1] for r in db]
1196			x1, x2 = np.min(X), np.max(X)
1197
1198			if x1 < x2:
1199				wgcoord = x1/(x1-x2)
1200			else:
1201				wgcoord = 999
1202
1203			if wgcoord < -.5 or wgcoord > 1.5:
1204				# unreasonable to extrapolate to d46 = 0
1205				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1206			else :
1207				# d46 = 0 is reasonably well bracketed
1208				R46_wg = np.polyfit(X, Y, 1)[1]
1209
1210			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1211
1212			self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1213
1214			self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1215			self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1216			for r in self.sessions[s]['data']:
1217				r['d13Cwg_VPDB'] = d13Cwg_VPDB
1218				r['d18Owg_VSMOW'] = d18Owg_VSMOW
1219
1220
1221	def compute_bulk_delta(self, R45, R46, D17O = 0):
1222		'''
1223		Compute δ13C_VPDB and δ18O_VSMOW,
1224		by solving the generalized form of equation (17) from
1225		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1226		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
1227		solving the corresponding second-order Taylor polynomial.
1228		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
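    
    		Example (a minimal sketch, assuming `mydata` is a `D47data` instance;
    		the isobar ratios below are merely illustrative of a typical CO2 analyte):
    
    		```py
    		d13C_VPDB, d18O_VSMOW = mydata.compute_bulk_delta(R45 = 0.01196, R46 = 0.00402)
    		```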
1229		'''
1230
1231		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1232
1233		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1234		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1235		C = 2 * self.R18_VSMOW
1236		D = -R46
1237
1238		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1239		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1240		cc = A + B + C + D
1241
1242		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1243
1244		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1245		R17 = K * R18 ** self.LAMBDA_17
1246		R13 = R45 - 2 * R17
1247
1248		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1249
1250		return d13C_VPDB, d18O_VSMOW
1251
1252
1253	@make_verbal
1254	def crunch(self, verbose = ''):
1255		'''
1256		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1257		'''
1258		for r in self:
1259			self.compute_bulk_and_clumping_deltas(r)
1260		self.standardize_d13C()
1261		self.standardize_d18O()
1262		self.msg(f"Crunched {len(self)} analyses.")
1263
1264
1265	def fill_in_missing_info(self, session = 'mySession'):
1266		'''
1267		Fill in optional fields with default values
1268		'''
1269		for i,r in enumerate(self):
1270			if 'D17O' not in r:
1271				r['D17O'] = 0.
1272			if 'UID' not in r:
1273				r['UID'] = f'{i+1}'
1274			if 'Session' not in r:
1275				r['Session'] = session
1276			for k in ['d47', 'd48', 'd49']:
1277				if k not in r:
1278					r[k] = np.nan
1279
1280
1281	def standardize_d13C(self):
1282		'''
1283		Perform δ13C standardization within each session `s` according to
1284		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1285		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1286		may be redefined arbitrarily at a later stage.
1287		'''
1288		for s in self.sessions:
1289			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1290				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1291				X,Y = zip(*XY)
1292				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1293					offset = np.mean(Y) - np.mean(X)
1294					for r in self.sessions[s]['data']:
1295						r['d13C_VPDB'] += offset				
1296				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1297					a,b = np.polyfit(X,Y,1)
1298					for r in self.sessions[s]['data']:
1299						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b
1300
1301	def standardize_d18O(self):
1302		'''
1303		Perform δ18O standardization within each session `s` according to
1304		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1305		which is defined by default by `D47data.refresh_sessions()` as equal to
1306		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1307		'''
1308		for s in self.sessions:
1309			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1310				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1311				X,Y = zip(*XY)
1312				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1313				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1314					offset = np.mean(Y) - np.mean(X)
1315					for r in self.sessions[s]['data']:
1316						r['d18O_VSMOW'] += offset				
1317				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1318					a,b = np.polyfit(X,Y,1)
1319					for r in self.sessions[s]['data']:
1320						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
1321	
1322
1323	def compute_bulk_and_clumping_deltas(self, r):
1324		'''
1325		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1326		'''
1327
1328		# Compute working gas R13, R18, and isobar ratios
1329		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1330		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1331		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1332
1333		# Compute analyte isobar ratios
1334		R45 = (1 + r['d45'] / 1000) * R45_wg
1335		R46 = (1 + r['d46'] / 1000) * R46_wg
1336		R47 = (1 + r['d47'] / 1000) * R47_wg
1337		R48 = (1 + r['d48'] / 1000) * R48_wg
1338		R49 = (1 + r['d49'] / 1000) * R49_wg
1339
1340		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1341		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1342		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1343
1344		# Compute stochastic isobar ratios of the analyte
1345		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1346			R13, R18, D17O = r['D17O']
1347		)
1348
1349		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1350		# and raise a warning if the corresponding anomalies exceed 0.05 ppm.
1351		if (R45 / R45stoch - 1) > 5e-8:
1352			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1353		if (R46 / R46stoch - 1) > 5e-8:
1354			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1355
1356		# Compute raw clumped isotope anomalies
1357		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1358		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1359		r['D49raw'] = 1000 * (R49 / R49stoch - 1)
1360
1361
1362	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1363		'''
1364		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1365		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1366		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
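    
    		Example (a minimal sketch; the bulk ratios and Δ47 value below are illustrative):
    
    		```py
    		R45, R46, R47, R48, R49 = mydata.compute_isobar_ratios(R13 = 0.0112, R18 = 0.00206, D47 = 0.3)
    		```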
1367		'''
1368
1369		# Compute R17
1370		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1371
1372		# Compute isotope concentrations
1373		C12 = (1 + R13) ** -1
1374		C13 = C12 * R13
1375		C16 = (1 + R17 + R18) ** -1
1376		C17 = C16 * R17
1377		C18 = C16 * R18
1378
1379		# Compute stochastic isotopologue concentrations
1380		C626 = C16 * C12 * C16
1381		C627 = C16 * C12 * C17 * 2
1382		C628 = C16 * C12 * C18 * 2
1383		C636 = C16 * C13 * C16
1384		C637 = C16 * C13 * C17 * 2
1385		C638 = C16 * C13 * C18 * 2
1386		C727 = C17 * C12 * C17
1387		C728 = C17 * C12 * C18 * 2
1388		C737 = C17 * C13 * C17
1389		C738 = C17 * C13 * C18 * 2
1390		C828 = C18 * C12 * C18
1391		C838 = C18 * C13 * C18
1392
1393		# Compute stochastic isobar ratios
1394		R45 = (C636 + C627) / C626
1395		R46 = (C628 + C637 + C727) / C626
1396		R47 = (C638 + C728 + C737) / C626
1397		R48 = (C738 + C828) / C626
1398		R49 = C838 / C626
1399
1400		# Account for stochastic anomalies
1401		R47 *= 1 + D47 / 1000
1402		R48 *= 1 + D48 / 1000
1403		R49 *= 1 + D49 / 1000
1404
1405		# Return isobar ratios
1406		return R45, R46, R47, R48, R49
1407
1408
1409	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1410		'''
1411		Split unknown samples by UID (treat all analyses as different samples)
1412		or by session (treat analyses of a given sample in different sessions as
1413		different samples).
1414
1415		**Parameters**
1416
1417		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
1418		+ `grouping`: `by_uid` | `by_session`
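    
    		Example (a minimal sketch, assuming `MYSAMPLE-1` is one of the unknowns in `mydata`):
    
    		```py
    		mydata.split_samples(['MYSAMPLE-1'], grouping = 'by_session')
    		```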
1419		'''
1420		if samples_to_split == 'all':
1421			samples_to_split = [s for s in self.unknowns]
1422		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1423		self.grouping = grouping.lower()
1424		if self.grouping in gkeys:
1425			gkey = gkeys[self.grouping]
1426		for r in self:
1427			if r['Sample'] in samples_to_split:
1428				r['Sample_original'] = r['Sample']
1429				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1430			elif r['Sample'] in self.unknowns:
1431				r['Sample_original'] = r['Sample']
1432		self.refresh_samples()
1433
1434
1435	def unsplit_samples(self, tables = False):
1436		'''
1437		Reverse the effects of `D47data.split_samples()`.
1438		
1439		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1440		
1441		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1442		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1443		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1444		effects of `D47data.split_samples()` with `grouping='by_sessions'` (because in
1445		that case session-averaged Δ4x values are statistically independent).
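    
    		Example (a minimal sketch of the intended call sequence):
    
    		```py
    		mydata.split_samples(grouping = 'by_session')
    		mydata.standardize() # method = 'pooled' by default
    		mydata.unsplit_samples()
    		```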
1446		'''
1447		unknowns_old = sorted({s for s in self.unknowns})
1448		CM_old = self.standardization.covar[:,:]
1449		VD_old = self.standardization.params.valuesdict().copy()
1450		vars_old = self.standardization.var_names
1451
1452		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1453
1454		Ns = len(vars_old) - len(unknowns_old)
1455		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1456		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1457
1458		W = np.zeros((len(vars_new), len(vars_old)))
1459		W[:Ns,:Ns] = np.eye(Ns)
1460		for u in unknowns_new:
1461			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1462			if self.grouping == 'by_session':
1463				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1464			elif self.grouping == 'by_uid':
1465				weights = [1 for s in splits]
1466			sw = sum(weights)
1467			weights = [w/sw for w in weights]
1468			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1469
1470		CM_new = W @ CM_old @ W.T
1471		V = W @ np.array([[VD_old[k]] for k in vars_old])
1472		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1473
1474		self.standardization.covar = CM_new
1475		self.standardization.params.valuesdict = lambda : VD_new
1476		self.standardization.var_names = vars_new
1477
1478		for r in self:
1479			if r['Sample'] in self.unknowns:
1480				r['Sample_split'] = r['Sample']
1481				r['Sample'] = r['Sample_original']
1482
1483		self.refresh_samples()
1484		self.consolidate_samples()
1485		self.repeatabilities()
1486
1487		if tables:
1488			self.table_of_analyses()
1489			self.table_of_samples()
1490
1491	def assign_timestamps(self):
1492		'''
1493		Assign a time field `t` of type `float` to each analysis.
1494
1495		If `TimeTag` is one of the data fields, `t` is equal within a given session
1496		to `TimeTag` minus the mean value of `TimeTag` for that session.
1497		Otherwise, `TimeTag` is by default equal to the index of each analysis
1498		in the dataset and `t` is defined as above.
1499		'''
1500		for session in self.sessions:
1501			sdata = self.sessions[session]['data']
1502			try:
1503				t0 = np.mean([r['TimeTag'] for r in sdata])
1504				for r in sdata:
1505					r['t'] = r['TimeTag'] - t0
1506			except KeyError:
1507				t0 = (len(sdata)-1)/2
1508				for t,r in enumerate(sdata):
1509					r['t'] = t - t0
1510
1511
1512	def report(self):
1513		'''
1514		Prints a report on the standardization fit.
1515		Only applicable after `D4xdata.standardize(method='pooled')`.
1516		'''
1517		report_fit(self.standardization)
1518
1519
1520	def combine_samples(self, sample_groups):
1521		'''
1522		Combine analyses of different samples to compute weighted average Δ4x
1523		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1524		dictionary.
1525		
1526		Caution: samples are weighted by number of replicate analyses, which is a
1527		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1528		correlated analytical errors for one or more samples).
1529		
1530		Returns a tuple of:
1531		
1532		+ the list of group names
1533		+ an array of the corresponding Δ4x values
1534		+ the corresponding (co)variance matrix
1535		
1536		**Parameters**
1537
1538		+ `sample_groups`: a dictionary of the form:
1539		```py
1540		{'group1': ['sample_1', 'sample_2'],
1541		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1542		```
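    
    		Example (a minimal sketch, to be called after pooled standardization;
    		group and sample names are hypothetical):
    
    		```py
    		groups, D47_avg, CM = mydata.combine_samples({
    			'early': ['MYSAMPLE-1'],
    			'late': ['MYSAMPLE-2', 'MYSAMPLE-3'],
    			})
    		```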
1543		'''
1544		
1545		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1546		groups = sorted(sample_groups.keys())
1547		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1548		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1549		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1550		W = np.array([
1551			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1552			for j in groups])
1553		D4x_new = W @ D4x_old
1554		CM_new = W @ CM_old @ W.T
1555
1556		return groups, D4x_new[:,0], CM_new
1557		
1558
1559	@make_verbal
1560	def standardize(self,
1561		method = 'pooled',
1562		weighted_sessions = [],
1563		consolidate = True,
1564		consolidate_tables = False,
1565		consolidate_plots = False,
1566		constraints = {},
1567		):
1568		'''
1569		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1570		If the `method` argument is set to `'pooled'`, the standardization processes all sessions
1571		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1572		i.e. that their true Δ4x value does not change between sessions
1573		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If the `method` argument is set to
1574		`'indep_sessions'`, the standardization processes each session independently, based only
1575		on anchor analyses.
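    
    		Example (a minimal sketch, assuming `mydata` has already been crunched):
    
    		```py
    		mydata.standardize(method = 'indep_sessions', consolidate_tables = True)
    		```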
1576		'''
1577
1578		self.standardization_method = method
1579		self.assign_timestamps()
1580
1581		if method == 'pooled':
1582			if weighted_sessions:
1583				for session_group in weighted_sessions:
1584					if self._4x == '47':
1585						X = D47data([r for r in self if r['Session'] in session_group])
1586					elif self._4x == '48':
1587						X = D48data([r for r in self if r['Session'] in session_group])
1588					X.Nominal_D4x = self.Nominal_D4x.copy()
1589					X.refresh()
1590					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1591					w = np.sqrt(result.redchi)
1592					self.msg(f'Session group {session_group} MRSWD = {w:.4f}')
1593					for r in X:
1594						r[f'wD{self._4x}raw'] *= w
1595			else:
1596				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1597				for r in self:
1598					r[f'wD{self._4x}raw'] = 1.
1599
1600			params = Parameters()
1601			for k,session in enumerate(self.sessions):
1602				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1603				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1604				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1605				s = pf(session)
1606				params.add(f'a_{s}', value = 0.9)
1607				params.add(f'b_{s}', value = 0.)
1608				params.add(f'c_{s}', value = -0.9)
1609				params.add(f'a2_{s}', value = 0.,
1610# 					vary = self.sessions[session]['scrambling_drift'],
1611					)
1612				params.add(f'b2_{s}', value = 0.,
1613# 					vary = self.sessions[session]['slope_drift'],
1614					)
1615				params.add(f'c2_{s}', value = 0.,
1616# 					vary = self.sessions[session]['wg_drift'],
1617					)
1618				if not self.sessions[session]['scrambling_drift']:
1619					params[f'a2_{s}'].expr = '0'
1620				if not self.sessions[session]['slope_drift']:
1621					params[f'b2_{s}'].expr = '0'
1622				if not self.sessions[session]['wg_drift']:
1623					params[f'c2_{s}'].expr = '0'
1624
1625			for sample in self.unknowns:
1626				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1627
1628			for k in constraints:
1629				params[k].expr = constraints[k]
1630
1631			def residuals(p):
1632				R = []
1633				for r in self:
1634					session = pf(r['Session'])
1635					sample = pf(r['Sample'])
1636					if r['Sample'] in self.Nominal_D4x:
1637						R += [ (
1638							r[f'D{self._4x}raw'] - (
1639								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1640								+ p[f'b_{session}'] * r[f'd{self._4x}']
1641								+	p[f'c_{session}']
1642								+ r['t'] * (
1643									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1644									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1645									+	p[f'c2_{session}']
1646									)
1647								)
1648							) / r[f'wD{self._4x}raw'] ]
1649					else:
1650						R += [ (
1651							r[f'D{self._4x}raw'] - (
1652								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1653								+ p[f'b_{session}'] * r[f'd{self._4x}']
1654								+	p[f'c_{session}']
1655								+ r['t'] * (
1656									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1657									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1658									+	p[f'c2_{session}']
1659									)
1660								)
1661							) / r[f'wD{self._4x}raw'] ]
1662				return R
1663
1664			M = Minimizer(residuals, params)
1665			result = M.least_squares()
1666			self.Nf = result.nfree
1667			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1668			new_names, new_covar, new_se = _fullcovar(result)[:3]
1669			result.var_names = new_names
1670			result.covar = new_covar
1671
1672			for r in self:
1673				s = pf(r["Session"])
1674				a = result.params.valuesdict()[f'a_{s}']
1675				b = result.params.valuesdict()[f'b_{s}']
1676				c = result.params.valuesdict()[f'c_{s}']
1677				a2 = result.params.valuesdict()[f'a2_{s}']
1678				b2 = result.params.valuesdict()[f'b2_{s}']
1679				c2 = result.params.valuesdict()[f'c2_{s}']
1680				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1681				
1682
1683			self.standardization = result
1684
1685			for session in self.sessions:
1686				self.sessions[session]['Np'] = 3
1687				for k in ['scrambling', 'slope', 'wg']:
1688					if self.sessions[session][f'{k}_drift']:
1689						self.sessions[session]['Np'] += 1
1690
1691			if consolidate:
1692				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1693			return result
1694
1695
1696		elif method == 'indep_sessions':
1697
1698			if weighted_sessions:
1699				for session_group in weighted_sessions:
1700					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1701					X.Nominal_D4x = self.Nominal_D4x.copy()
1702					X.refresh()
1703					# This is only done to assign r['wD47raw'] for r in X:
1704					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1705					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1706			else:
1707				self.msg('All weights set to 1 ‰')
1708				for r in self:
1709					r[f'wD{self._4x}raw'] = 1
1710
1711			for session in self.sessions:
1712				s = self.sessions[session]
1713				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1714				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1715				s['Np'] = sum(p_active)
1716				sdata = s['data']
1717
1718				A = np.array([
1719					[
1720						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1721						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1722						1 / r[f'wD{self._4x}raw'],
1723						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1724						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1725						r['t'] / r[f'wD{self._4x}raw']
1726						]
1727					for r in sdata if r['Sample'] in self.anchors
1728					])[:,p_active] # only keep columns for the active parameters
1729				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1730				s['Na'] = Y.size
1731				CM = linalg.inv(A.T @ A)
1732				bf = (CM @ A.T @ Y).T[0,:]
1733				k = 0
1734				for n,a in zip(p_names, p_active):
1735					if a:
1736						s[n] = bf[k]
1737# 						self.msg(f'{n} = {bf[k]}')
1738						k += 1
1739					else:
1740						s[n] = 0.
1741# 						self.msg(f'{n} = 0.0')
1742
1743				for r in sdata :
1744					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1745					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1746					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1747
1748				s['CM'] = np.zeros((6,6))
1749				i = 0
1750				k_active = [j for j,a in enumerate(p_active) if a]
1751				for j,a in enumerate(p_active):
1752					if a:
1753						s['CM'][j,k_active] = CM[i,:]
1754						i += 1
1755
1756			if not weighted_sessions:
1757				w = self.rmswd()['rmswd']
1758				for r in self:
1759						r[f'wD{self._4x}'] *= w
1760						r[f'wD{self._4x}raw'] *= w
1761				for session in self.sessions:
1762					self.sessions[session]['CM'] *= w**2
1763
1764			for session in self.sessions:
1765				s = self.sessions[session]
1766				s['SE_a'] = s['CM'][0,0]**.5
1767				s['SE_b'] = s['CM'][1,1]**.5
1768				s['SE_c'] = s['CM'][2,2]**.5
1769				s['SE_a2'] = s['CM'][3,3]**.5
1770				s['SE_b2'] = s['CM'][4,4]**.5
1771				s['SE_c2'] = s['CM'][5,5]**.5
1772
1773			if not weighted_sessions:
1774				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1775			else:
1776				self.Nf = 0
1777				for sg in weighted_sessions:
1778					self.Nf += self.rmswd(sessions = sg)['Nf']
1779
1780			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1781
1782			avgD4x = {
1783				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1784				for sample in self.samples
1785				}
1786			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1787			rD4x = (chi2/self.Nf)**.5
1788			self.repeatability[f'sigma_{self._4x}'] = rD4x
1789
1790			if consolidate:
1791				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1792
1793
1794	def standardization_error(self, session, d4x, D4x, t = 0):
1795		'''
1796		Compute standardization error for a given session and
1797		(δ47, Δ47) composition.
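    
    		Example (a minimal sketch; the session name and composition are hypothetical,
    		and session parameters must have been computed beforehand, e.g. by
    		`standardize(method = 'indep_sessions')`):
    
    		```py
    		sx = mydata.standardization_error('Session_01', d4x = 20., D4x = 0.6)
    		```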
1798		'''
1799		a = self.sessions[session]['a']
1800		b = self.sessions[session]['b']
1801		c = self.sessions[session]['c']
1802		a2 = self.sessions[session]['a2']
1803		b2 = self.sessions[session]['b2']
1804		c2 = self.sessions[session]['c2']
1805		CM = self.sessions[session]['CM']
1806
1807		x, y = D4x, d4x
1808		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1809# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1810		dxdy = -(b+b2*t) / (a+a2*t)
1811		dxdz = 1. / (a+a2*t)
1812		dxda = -x / (a+a2*t)
1813		dxdb = -y / (a+a2*t)
1814		dxdc = -1. / (a+a2*t)
1815		dxda2 = -x * t / (a+a2*t)
1816		dxdb2 = -y * t / (a+a2*t)
1817		dxdc2 = -t / (a+a2*t)
1818		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1819		sx = (V @ CM @ V.T) ** .5
1820		return sx
1821
1822
1823	@make_verbal
1824	def summary(self,
1825		dir = 'output',
1826		filename = None,
1827		save_to_file = True,
1828		print_out = True,
1829		):
1830		'''
1831		Print out and/or save to disk a summary of the standardization results.
1832
1833		**Parameters**
1834
1835		+ `dir`: the directory in which to save the table
1836		+ `filename`: the name of the csv file to write to
1837		+ `save_to_file`: whether to save the table to disk
1838		+ `print_out`: whether to print out the table
1839		'''
1840
1841		out = []
1842		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1843		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1844		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1845		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1846		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1847		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1848		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1849		out += [['Model degrees of freedom', f"{self.Nf}"]]
1850		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1851		out += [['Standardization method', self.standardization_method]]
1852
1853		if save_to_file:
1854			if not os.path.exists(dir):
1855				os.makedirs(dir)
1856			if filename is None:
1857				filename = f'D{self._4x}_summary.csv'
1858			with open(f'{dir}/{filename}', 'w') as fid:
1859				fid.write(make_csv(out))
1860		if print_out:
1861			self.msg('\n' + pretty_table(out, header = 0))
1862
1863
1864	@make_verbal
1865	def table_of_sessions(self,
1866		dir = 'output',
1867		filename = None,
1868		save_to_file = True,
1869		print_out = True,
1870		output = None,
1871		):
1872		'''
1873		Print out and/or save to disk a table of sessions.
1874
1875		**Parameters**
1876
1877		+ `dir`: the directory in which to save the table
1878		+ `filename`: the name of the csv file to write to
1879		+ `save_to_file`: whether to save the table to disk
1880		+ `print_out`: whether to print out the table
1881		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1882		    if set to `'raw'`: return a list of list of strings
1883		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1884		'''
1885		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1886		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1887		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1888
1889		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1890		if include_a2:
1891			out[-1] += ['a2 ± SE']
1892		if include_b2:
1893			out[-1] += ['b2 ± SE']
1894		if include_c2:
1895			out[-1] += ['c2 ± SE']
1896		for session in self.sessions:
1897			out += [[
1898				session,
1899				f"{self.sessions[session]['Na']}",
1900				f"{self.sessions[session]['Nu']}",
1901				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1902				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1903				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1904				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1905				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1906				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1907				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1908				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1909				]]
1910			if include_a2:
1911				if self.sessions[session]['scrambling_drift']:
1912					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1913				else:
1914					out[-1] += ['']
1915			if include_b2:
1916				if self.sessions[session]['slope_drift']:
1917					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1918				else:
1919					out[-1] += ['']
1920			if include_c2:
1921				if self.sessions[session]['wg_drift']:
1922					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1923				else:
1924					out[-1] += ['']
1925
1926		if save_to_file:
1927			if not os.path.exists(dir):
1928				os.makedirs(dir)
1929			if filename is None:
1930				filename = f'D{self._4x}_sessions.csv'
1931			with open(f'{dir}/{filename}', 'w') as fid:
1932				fid.write(make_csv(out))
1933		if print_out:
1934			self.msg('\n' + pretty_table(out))
1935		if output == 'raw':
1936			return out
1937		elif output == 'pretty':
1938			return pretty_table(out)
1939
1940
1941	@make_verbal
1942	def table_of_analyses(
1943		self,
1944		dir = 'output',
1945		filename = None,
1946		save_to_file = True,
1947		print_out = True,
1948		output = None,
1949		):
1950		'''
1951		Print out and/or save to disk a table of analyses.
1952
1953		**Parameters**
1954
1955		+ `dir`: the directory in which to save the table
1956		+ `filename`: the name of the csv file to write to
1957		+ `save_to_file`: whether to save the table to disk
1958		+ `print_out`: whether to print out the table
1959		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1960		    if set to `'raw'`: return a list of list of strings
1961		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1962		'''
1963
1964		out = [['UID','Session','Sample']]
1965		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1966		for f in extra_fields:
1967			out[-1] += [f[0]]
1968		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1969		for r in self:
1970			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
1971			for f in extra_fields:
1972				out[-1] += [f"{r[f[0]]:{f[1]}}"]
1973			out[-1] += [
1974				f"{r['d13Cwg_VPDB']:.3f}",
1975				f"{r['d18Owg_VSMOW']:.3f}",
1976				f"{r['d45']:.6f}",
1977				f"{r['d46']:.6f}",
1978				f"{r['d47']:.6f}",
1979				f"{r['d48']:.6f}",
1980				f"{r['d49']:.6f}",
1981				f"{r['d13C_VPDB']:.6f}",
1982				f"{r['d18O_VSMOW']:.6f}",
1983				f"{r['D47raw']:.6f}",
1984				f"{r['D48raw']:.6f}",
1985				f"{r['D49raw']:.6f}",
1986				f"{r[f'D{self._4x}']:.6f}"
1987				]
1988		if save_to_file:
1989			if not os.path.exists(dir):
1990				os.makedirs(dir)
1991			if filename is None:
1992				filename = f'D{self._4x}_analyses.csv'
1993			with open(f'{dir}/{filename}', 'w') as fid:
1994				fid.write(make_csv(out))
1995		if print_out:
1996			self.msg('\n' + pretty_table(out))
1997		return out
1998
1999	@make_verbal
2000	def covar_table(
2001		self,
2002		correl = False,
2003		dir = 'output',
2004		filename = None,
2005		save_to_file = True,
2006		print_out = True,
2007		output = None,
2008		):
2009		'''
2010		Print out, save to disk and/or return the variance-covariance matrix of D4x
2011		for all unknown samples.
2012
2013		**Parameters**
2014
2015		+ `dir`: the directory in which to save the csv
2016		+ `filename`: the name of the csv file to write to
2017		+ `save_to_file`: whether to save the csv
2018		+ `print_out`: whether to print out the matrix
2019		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
2020		    if set to `'raw'`: return a list of list of strings
2021		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2022		'''
2023		samples = sorted([u for u in self.unknowns])
2024		out = [[''] + samples]
2025		for s1 in samples:
2026			out.append([s1])
2027			for s2 in samples:
2028				if correl:
2029					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
2030				else:
2031					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
2032
2033		if save_to_file:
2034			if not os.path.exists(dir):
2035				os.makedirs(dir)
2036			if filename is None:
2037				if correl:
2038					filename = f'D{self._4x}_correl.csv'
2039				else:
2040					filename = f'D{self._4x}_covar.csv'
2041			with open(f'{dir}/{filename}', 'w') as fid:
2042				fid.write(make_csv(out))
2043		if print_out:
2044			self.msg('\n'+pretty_table(out))
2045		if output == 'raw':
2046			return out
2047		elif output == 'pretty':
2048			return pretty_table(out)
2049
2050	@make_verbal
2051	def table_of_samples(
2052		self,
2053		dir = 'output',
2054		filename = None,
2055		save_to_file = True,
2056		print_out = True,
2057		output = None,
2058		):
2059		'''
2060		Print out, save to disk and/or return a table of samples.
2061
2062		**Parameters**
2063
2064		+ `dir`: the directory in which to save the csv
2065		+ `filename`: the name of the csv file to write to
2066		+ `save_to_file`: whether to save the csv
2067		+ `print_out`: whether to print out the table
2068		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2069		    if set to `'raw'`: return a list of list of strings
2070		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2071		'''
2072
2073		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2074		for sample in self.anchors:
2075			out += [[
2076				f"{sample}",
2077				f"{self.samples[sample]['N']}",
2078				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2079				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2080				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2081				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2082				]]
2083		for sample in self.unknowns:
2084			out += [[
2085				f"{sample}",
2086				f"{self.samples[sample]['N']}",
2087				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2088				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2089				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2090				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2091				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2092				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2093				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2094				]]
2095		if save_to_file:
2096			if not os.path.exists(dir):
2097				os.makedirs(dir)
2098			if filename is None:
2099				filename = f'D{self._4x}_samples.csv'
2100			with open(f'{dir}/{filename}', 'w') as fid:
2101				fid.write(make_csv(out))
2102		if print_out:
2103			self.msg('\n'+pretty_table(out))
2104		if output == 'raw':
2105			return out
2106		elif output == 'pretty':
2107			return pretty_table(out)
2108
2109
2110	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2111		'''
2112		Generate session plots and save them to disk.
2113
2114		**Parameters**
2115
2116		+ `dir`: the directory in which to save the plots
2117		+ `figsize`: the width and height (in inches) of each plot
2118		+ `filetype`: 'pdf' or 'png'
2119		+ `dpi`: resolution for PNG output
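    
    		Example (a minimal sketch, saving one PNG plot per session):
    
    		```py
    		mydata.plot_sessions(filetype = 'png', dpi = 200)
    		```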
2120		'''
2121		if not os.path.exists(dir):
2122			os.makedirs(dir)
2123
2124		for session in self.sessions:
2125			sp = self.plot_single_session(session, xylimits = 'constant')
2126			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2127			ppl.close(sp.fig)
2128			
2129
2130
2131	@make_verbal
2132	def consolidate_samples(self):
2133		'''
2134		Compile various statistics for each sample.
2135
2136		For each anchor sample:
2137
2138		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2139		+ `SE_D47` or `SE_D48`: set to zero by definition
2140
2141		For each unknown sample:
2142
2143		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2144		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2145
2146		For each anchor and unknown:
2147
2148		+ `N`: the total number of analyses of this sample
2149		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2150		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2151		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2152		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2153		variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2154		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
2155		'''
2156		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2157		for sample in self.samples:
2158			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2159			if self.samples[sample]['N'] > 1:
2160				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2161
2162			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2163			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2164
2165			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2166			if len(D4x_pop) > 2:
2167				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2168			
2169		if self.standardization_method == 'pooled':
2170			for sample in self.anchors:
2171				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2172				self.samples[sample][f'SE_D{self._4x}'] = 0.
2173			for sample in self.unknowns:
2174				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2175				try:
2176					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2177				except ValueError:
2178					# when `sample` is constrained by self.standardize(constraints = {...}),
2179					# it is no longer listed in self.standardization.var_names.
2180					# Temporary fix: define SE as zero for now
2181					self.samples[sample][f'SE_D{self._4x}'] = 0.
2182
2183		elif self.standardization_method == 'indep_sessions':
2184			for sample in self.anchors:
2185				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2186				self.samples[sample][f'SE_D{self._4x}'] = 0.
2187			for sample in self.unknowns:
2188				self.msg(f'Consolidating sample {sample}')
2189				self.unknowns[sample][f'session_D{self._4x}'] = {}
2190				session_avg = []
2191				for session in self.sessions:
2192					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2193					if sdata:
2194						self.msg(f'{sample} found in session {session}')
2195						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2196						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2197						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2198						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2199						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2200						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2201						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2202				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2203				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2204				wsum = sum([weights[s] for s in weights])
2205				for s in weights:
2206					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2207
2208		for r in self:
2209			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
2210
2211
2212
2213	def consolidate_sessions(self):
2214		'''
2215		Compute various statistics for each session.
2216
2217		+ `Na`: Number of anchor analyses in the session
2218		+ `Nu`: Number of unknown analyses in the session
2219		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2220		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2221		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2222		+ `a`: scrambling factor
2223		+ `b`: compositional slope
2224		+ `c`: WG offset
2225	+ `SE_a`: Model standard error of `a`
2226	+ `SE_b`: Model standard error of `b`
2227	+ `SE_c`: Model standard error of `c`
2228		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2229		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2230		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2231		+ `a2`: scrambling factor drift
2232		+ `b2`: compositional slope drift
2233		+ `c2`: WG offset drift
2234		+ `Np`: Number of standardization parameters to fit
2235		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2236		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2237		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
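
	**Example** (a sketch; assumes `mydata` is a standardized `D47data` instance with a
	session named `'Session01'`, which is a hypothetical name):

	```python
	mydata.consolidate_sessions()
	sess = mydata.sessions['Session01']
	print(sess['Na'], sess['Nu'], sess['a'], sess['SE_a'], sess['r_D47'])
	```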
2238		'''
2239		for session in self.sessions:
2240			if 'd13Cwg_VPDB' not in self.sessions[session]:
2241				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2242			if 'd18Owg_VSMOW' not in self.sessions[session]:
2243				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2244			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2245			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2246
2247			self.msg(f'Computing repeatabilities for session {session}')
2248			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2249			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2250			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2251
2252		if self.standardization_method == 'pooled':
2253			for session in self.sessions:
2254
2255				# different (better?) computation of D4x repeatability for each session:
2256				sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
2257				self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5
2258
2259				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2260				i = self.standardization.var_names.index(f'a_{pf(session)}')
2261				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2262
2263				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2264				i = self.standardization.var_names.index(f'b_{pf(session)}')
2265				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2266
2267				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2268				i = self.standardization.var_names.index(f'c_{pf(session)}')
2269				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2270
2271				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2272				if self.sessions[session]['scrambling_drift']:
2273					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2274					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2275				else:
2276					self.sessions[session]['SE_a2'] = 0.
2277
2278				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2279				if self.sessions[session]['slope_drift']:
2280					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2281					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2282				else:
2283					self.sessions[session]['SE_b2'] = 0.
2284
2285				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2286				if self.sessions[session]['wg_drift']:
2287					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2288					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2289				else:
2290					self.sessions[session]['SE_c2'] = 0.
2291
2292				i = self.standardization.var_names.index(f'a_{pf(session)}')
2293				j = self.standardization.var_names.index(f'b_{pf(session)}')
2294				k = self.standardization.var_names.index(f'c_{pf(session)}')
2295				CM = np.zeros((6,6))
2296				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2297				try:
2298					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2299					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2300					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2301					try:
2302						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2303						CM[3,4] = self.standardization.covar[i2,j2]
2304						CM[4,3] = self.standardization.covar[j2,i2]
2305					except ValueError:
2306						pass
2307					try:
2308						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2309						CM[3,5] = self.standardization.covar[i2,k2]
2310						CM[5,3] = self.standardization.covar[k2,i2]
2311					except ValueError:
2312						pass
2313				except ValueError:
2314					pass
2315				try:
2316					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2317					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2318					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2319					try:
2320						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2321						CM[4,5] = self.standardization.covar[j2,k2]
2322						CM[5,4] = self.standardization.covar[k2,j2]
2323					except ValueError:
2324						pass
2325				except ValueError:
2326					pass
2327				try:
2328					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2329					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2330					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2331				except ValueError:
2332					pass
2333
2334				self.sessions[session]['CM'] = CM
2335
2336		elif self.standardization_method == 'indep_sessions':
2337			pass # Not implemented yet
2338
2339
2340	@make_verbal
2341	def repeatabilities(self):
2342		'''
2343		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2344		(for all samples, for anchors, and for unknowns).
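
	**Example** (a sketch; assumes `mydata` is a standardized `D47data` instance):

	```python
	mydata.repeatabilities()
	print(mydata.repeatability['r_D47'])   # pooled Δ47 repeatability (all samples), in ‰
	print(mydata.repeatability['r_D47a'])  # anchors only
	print(mydata.repeatability['r_D47u'])  # unknowns only
	```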
2345		'''
2346		self.msg('Computing repeatabilities for all sessions')
2347
2348		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2349		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2350		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2351		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2352		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
2353
2354
2355	@make_verbal
2356	def consolidate(self, tables = True, plots = True):
2357		'''
2358		Collect information about samples, sessions and repeatabilities.
2359		'''
2360		self.consolidate_samples()
2361		self.consolidate_sessions()
2362		self.repeatabilities()
2363
2364		if tables:
2365			self.summary()
2366			self.table_of_sessions()
2367			self.table_of_analyses()
2368			self.table_of_samples()
2369
2370		if plots:
2371			self.plot_sessions()
2372
2373
2374	@make_verbal
2375	def rmswd(self,
2376		samples = 'all samples',
2377		sessions = 'all sessions',
2378		):
2379		'''
2380		Compute the χ2, root mean squared weighted deviation
2381		(i.e. the square root of the reduced χ2), and corresponding degrees of freedom of the
2382		Δ4x values for samples in `samples` and sessions in `sessions`.
2383		
2384		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
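
	**Example** (a sketch; only meaningful after standardizing with `method = 'indep_sessions'`,
	since this relies on the per-analysis weights `wD47`):

	```python
	stats = mydata.rmswd(samples = 'anchors')
	print(stats['rmswd'], stats['chisq'], stats['Nf'])
	```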
2385		'''
2386		if samples == 'all samples':
2387			mysamples = [k for k in self.samples]
2388		elif samples == 'anchors':
2389			mysamples = [k for k in self.anchors]
2390		elif samples == 'unknowns':
2391			mysamples = [k for k in self.unknowns]
2392		else:
2393			mysamples = samples
2394
2395		if sessions == 'all sessions':
2396			sessions = [k for k in self.sessions]
2397
2398		chisq, Nf = 0, 0
2399		for sample in mysamples :
2400			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2401			if len(G) > 1 :
2402				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2403				Nf += (len(G) - 1)
2404				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2405		r = (chisq / Nf)**.5 if Nf > 0 else 0
2406		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2407		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
2408
2409	
2410	@make_verbal
2411	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2412		'''
2413		Compute the repeatability of `[r[key] for r in self]`
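
	**Example** (a sketch; assumes `mydata` is a standardized `D47data` instance):

	```python
	r = mydata.compute_r('D47', samples = 'anchors')  # Δ47 repeatability of anchors, in ‰
	print(f'{1000 * r:.1f} ppm')
	```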
2414		'''
2415
2416		if samples == 'all samples':
2417			mysamples = [k for k in self.samples]
2418		elif samples == 'anchors':
2419			mysamples = [k for k in self.anchors]
2420		elif samples == 'unknowns':
2421			mysamples = [k for k in self.unknowns]
2422		else:
2423			mysamples = samples
2424
2425		if sessions == 'all sessions':
2426			sessions = [k for k in self.sessions]
2427
2428		if key in ['D47', 'D48']:
2429			# Full disclosure: the definition of Nf is tricky/debatable
2430			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2431			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2432			Nf = len(G)
2433# 			print(f'len(G) = {Nf}')
2434			Nf -= len([s for s in mysamples if s in self.unknowns])
2435# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2436			for session in sessions:
2437				Np = len([
2438					_ for _ in self.standardization.params
2439					if (
2440						self.standardization.params[_].expr is not None
2441						and (
2442							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2443							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2444							)
2445						)
2446					])
2447# 				print(f'session {session}: {Np} parameters to consider')
2448				Na = len({
2449					r['Sample'] for r in self.sessions[session]['data']
2450					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2451					})
2452# 				print(f'session {session}: {Na} different anchors in that session')
2453				Nf -= min(Np, Na)
2454# 			print(f'Nf = {Nf}')
2455
2456# 			for sample in mysamples :
2457# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2458# 				if len(X) > 1 :
2459# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2460# 					if sample in self.unknowns:
2461# 						Nf += len(X) - 1
2462# 					else:
2463# 						Nf += len(X)
2464# 			if samples in ['anchors', 'all samples']:
2465# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2466			r = (chisq / Nf)**.5 if Nf > 0 else 0
2467
2468		else: # if key not in ['D47', 'D48']
2469			chisq, Nf = 0, 0
2470			for sample in mysamples :
2471				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2472				if len(X) > 1 :
2473					Nf += len(X) - 1
2474					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2475			r = (chisq / Nf)**.5 if Nf > 0 else 0
2476
2477		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2478		return r
2479
2480	def sample_average(self, samples, weights = 'equal', normalize = True):
2481		'''
2482		Weighted average Δ4x value of a group of samples, accounting for covariance.
2483
2484		Returns the weighted average Δ4x value and associated SE
2485		of a group of samples. Weights are equal by default. If `normalize` is
2486		true, `weights` will be rescaled so that their sum equals 1.
2487
2488		**Examples**
2489
2490		```python
2491		self.sample_average(['X','Y'], [1, 2])
2492		```
2493
2494		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2495		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2496		values of samples X and Y, respectively.
2497
2498		```python
2499		self.sample_average(['X','Y'], [1, -1], normalize = False)
2500		```
2501
2502		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2503		'''
2504		if weights == 'equal':
2505			weights = [1/len(samples)] * len(samples)
2506
2507		if normalize:
2508			s = sum(weights)
2509			if s:
2510				weights = [w/s for w in weights]
2511
2512		try:
2513# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2514# 			C = self.standardization.covar[indices,:][:,indices]
2515			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2516			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2517			return correlated_sum(X, C, weights)
2518		except ValueError:
2519			return (0., 0.)
2520
2521
2522	def sample_D4x_covar(self, sample1, sample2 = None):
2523		'''
2524		Covariance between Δ4x values of samples
2525
2526		Returns the error covariance between the average Δ4x values of two
2527		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2528		returns the Δ4x variance for that sample.
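
	**Example** (a sketch; `MYSAMPLE-1` and `MYSAMPLE-2` are hypothetical unknowns, as in the tutorial):

	```python
	var_1 = mydata.sample_D4x_covar('MYSAMPLE-1')                 # Δ4x variance of one sample
	cov_12 = mydata.sample_D4x_covar('MYSAMPLE-1', 'MYSAMPLE-2')  # error covariance between two samples
	```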
2529		'''
2530		if sample2 is None:
2531			sample2 = sample1
2532		if self.standardization_method == 'pooled':
2533			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2534			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2535			return self.standardization.covar[i, j]
2536		elif self.standardization_method == 'indep_sessions':
2537			if sample1 == sample2:
2538				return self.samples[sample1][f'SE_D{self._4x}']**2
2539			else:
2540				c = 0
2541				for session in self.sessions:
2542					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2543					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2544					if sdata1 and sdata2:
2545						a = self.sessions[session]['a']
2546						# !! TODO: CM below does not account for temporal changes in standardization parameters
2547						CM = self.sessions[session]['CM'][:3,:3]
2548						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2549						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2550						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2551						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2552						c += (
2553							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2554							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2555							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2556							@ CM
2557							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2558							) / a**2
2559				return float(c)
2560
2561	def sample_D4x_correl(self, sample1, sample2 = None):
2562		'''
2563		Correlation between Δ4x errors of samples
2564
2565		Returns the error correlation between the average Δ4x values of two samples.
2566		'''
2567		if sample2 is None or sample2 == sample1:
2568			return 1.
2569		return (
2570			self.sample_D4x_covar(sample1, sample2)
2571			/ self.unknowns[sample1][f'SE_D{self._4x}']
2572			/ self.unknowns[sample2][f'SE_D{self._4x}']
2573			)
2574
2575	def plot_single_session(self,
2576		session,
2577		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2578		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2579		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2580		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2581		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2582		xylimits = 'free', # | 'constant'
2583		x_label = None,
2584		y_label = None,
2585		error_contour_interval = 'auto',
2586		fig = 'new',
2587		):
2588		'''
2589		Generate plot for a single session
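
	**Example** (a sketch; assumes `mydata` has been standardized and includes a session
	named `'Session01'`, which is a hypothetical name):

	```python
	sp = mydata.plot_single_session('Session01', xylimits = 'constant')
	sp.fig.savefig('Session01.pdf')
	```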
2590		'''
2591		if x_label is None:
2592			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2593		if y_label is None:
2594			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2595
2596		out = _SessionPlot()
2597		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2598		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2599		anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2600		anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2601		unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2602		unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2603		anchor_avg = (np.array([ np.array([
2604				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2605				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2606				]) for sample in anchors]).T,
2607			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
2608		unknown_avg = (np.array([ np.array([
2609				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2610				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2611				]) for sample in unknowns]).T,
2612			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)
2613		
2614		
2615		if fig == 'new':
2616			out.fig = ppl.figure(figsize = (6,6))
2617			ppl.subplots_adjust(.1,.1,.9,.9)
2618
2619		out.anchor_analyses, = ppl.plot(
2620			anchors_d,
2621			anchors_D,
2622			**kw_plot_anchors)
2623		out.unknown_analyses, = ppl.plot(
2624			unknowns_d,
2625			unknowns_D,
2626			**kw_plot_unknowns)
2627		out.anchor_avg = ppl.plot(
2628			*anchor_avg,
2629			**kw_plot_anchor_avg)
2630		out.unknown_avg = ppl.plot(
2631			*unknown_avg,
2632			**kw_plot_unknown_avg)
2633		if xylimits == 'constant':
2634			x = [r[f'd{self._4x}'] for r in self]
2635			y = [r[f'D{self._4x}'] for r in self]
2636			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2637			w, h = x2-x1, y2-y1
2638			x1 -= w/20
2639			x2 += w/20
2640			y1 -= h/20
2641			y2 += h/20
2642			ppl.axis([x1, x2, y1, y2])
2643		elif xylimits == 'free':
2644			x1, x2, y1, y2 = ppl.axis()
2645		else:
2646			x1, x2, y1, y2 = ppl.axis(xylimits)
2647		contour = None
2648		if error_contour_interval != 'none':
2649			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2650			XI,YI = np.meshgrid(xi, yi)
2651			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2652			if error_contour_interval == 'auto':
2653				rng = np.max(SI) - np.min(SI)
2654				if rng <= 0.01:
2655					cinterval = 0.001
2656				elif rng <= 0.03:
2657					cinterval = 0.004
2658				elif rng <= 0.1:
2659					cinterval = 0.01
2660				elif rng <= 0.3:
2661					cinterval = 0.03
2662				elif rng <= 1.:
2663					cinterval = 0.1
2664				else:
2665					cinterval = 0.5
2666			else:
2667				cinterval = error_contour_interval
2668
2669			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2670			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2671			out.clabel = ppl.clabel(out.contour)
2672			contour = (XI, YI, SI, cval, cinterval)
2673
2674		if fig is None:
2675			return {
2676			'anchors':anchors,
2677			'unknowns':unknowns,
2678			'anchors_d':anchors_d,
2679			'anchors_D':anchors_D,
2680			'unknowns_d':unknowns_d,
2681			'unknowns_D':unknowns_D,
2682			'anchor_avg':anchor_avg,
2683			'unknown_avg':unknown_avg,
2684			'contour':contour,
2685			}
2686
2687		ppl.xlabel(x_label)
2688		ppl.ylabel(y_label)
2689		ppl.title(session, weight = 'bold')
2690		ppl.grid(alpha = .2)
2691		out.ax = ppl.gca()		
2692
2693		return out
2694
2695	def plot_residuals(
2696		self,
2697		kde = False,
2698		hist = False,
2699		binwidth = 2/3,
2700		dir = 'output',
2701		filename = None,
2702		highlight = [],
2703		colors = None,
2704		figsize = None,
2705		dpi = 100,
2706		yspan = None,
2707		):
2708		'''
2709		Plot residuals of each analysis as a function of time (actually, as a function of
2710		the order of analyses in the `D4xdata` object)
2711
2712		+ `kde`: whether to add a kernel density estimate of residuals
2713		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2714		+ `binwidth`: width of the histogram bins, expressed as a multiple of the Δ4x repeatability
2715		+ `dir`: the directory in which to save the plot
2716		+ `highlight`: a list of samples to highlight
2717		+ `colors`: a dict of `{<sample>: <color>}` for all samples
2718		+ `figsize`: (width, height) of figure
2719		+ `dpi`: resolution for PNG output
2720		+ `yspan`: factor controlling the range of y values shown in plot
2721		  (by default: `yspan = 1.5 if kde else 1.0`)
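
	**Example** (a sketch; with the default `filename = None`, the figure is returned
	instead of being saved):

	```python
	fig = mydata.plot_residuals(kde = True)
	fig.savefig('my_residuals.pdf')
	```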
2722		'''
2723		
2724		from matplotlib import ticker
2725
2726		if yspan is None:
2727			if kde:
2728				yspan = 1.5
2729			else:
2730				yspan = 1.0
2731		
2732		# Layout
2733		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2734		if hist or kde:
2735			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2736			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2737		else:
2738			ppl.subplots_adjust(.08,.05,.78,.8)
2739			ax1 = ppl.subplot(111)
2740		
2741		# Colors
2742		N = len(self.anchors)
2743		if colors is None:
2744			if len(highlight) > 0:
2745				Nh = len(highlight)
2746				if Nh == 1:
2747					colors = {highlight[0]: (0,0,0)}
2748				elif Nh == 3:
2749					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2750				elif Nh == 4:
2751					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2752				else:
2753					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2754			else:
2755				if N == 3:
2756					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2757				elif N == 4:
2758					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2759				else:
2760					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2761
2762		ppl.sca(ax1)
2763		
2764		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2765
2766		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2767
2768		session = self[0]['Session']
2769		x1 = 0
2770# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2771		x_sessions = {}
2772		one_or_more_singlets = False
2773		one_or_more_multiplets = False
2774		multiplets = set()
2775		for k,r in enumerate(self):
2776			if r['Session'] != session:
2777				x2 = k-1
2778				x_sessions[session] = (x1+x2)/2
2779				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2780				session = r['Session']
2781				x1 = k
2782			singlet = len(self.samples[r['Sample']]['data']) == 1
2783			if not singlet:
2784				multiplets.add(r['Sample'])
2785			if r['Sample'] in self.unknowns:
2786				if singlet:
2787					one_or_more_singlets = True
2788				else:
2789					one_or_more_multiplets = True
2790			kw = dict(
2791				marker = 'x' if singlet else '+',
2792				ms = 4 if singlet else 5,
2793				ls = 'None',
2794				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2795				mew = 1,
2796				alpha = 0.2 if singlet else 1,
2797				)
2798			if highlight and r['Sample'] not in highlight:
2799				kw['alpha'] = 0.2
2800			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2801		x2 = k
2802		x_sessions[session] = (x1+x2)/2
2803
2804		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2805		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2806		if not (hist or kde):
2807			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2808			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2809
2810		xmin, xmax, ymin, ymax = ppl.axis()
2811		if yspan != 1:
2812			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2813		for s in x_sessions:
2814			ppl.text(
2815				x_sessions[s],
2816				ymax +1,
2817				s,
2818				va = 'bottom',
2819				**(
2820					dict(ha = 'center')
2821					if len(self.sessions[s]['data']) > (0.15 * len(self))
2822					else dict(ha = 'left', rotation = 45)
2823					)
2824				)
2825
2826		if hist or kde:
2827			ppl.sca(ax2)
2828
2829		for s in colors:
2830			kw['marker'] = '+'
2831			kw['ms'] = 5
2832			kw['mec'] = colors[s]
2833			kw['label'] = s
2834			kw['alpha'] = 1
2835			ppl.plot([], [], **kw)
2836
2837		kw['mec'] = (0,0,0)
2838
2839		if one_or_more_singlets:
2840			kw['marker'] = 'x'
2841			kw['ms'] = 4
2842			kw['alpha'] = .2
2843			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2844			ppl.plot([], [], **kw)
2845
2846		if one_or_more_multiplets:
2847			kw['marker'] = '+'
2848			kw['ms'] = 4
2849			kw['alpha'] = 1
2850			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2851			ppl.plot([], [], **kw)
2852
2853		if hist or kde:
2854			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2855		else:
2856			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2857		leg.set_zorder(-1000)
2858
2859		ppl.sca(ax1)
2860
2861		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2862		ppl.xticks([])
2863		ppl.axis([-1, len(self), None, None])
2864
2865		if hist or kde:
2866			ppl.sca(ax2)
2867			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2868
2869			if kde:
2870				from scipy.stats import gaussian_kde
2871				yi = np.linspace(ymin, ymax, 201)
2872				xi = gaussian_kde(X).evaluate(yi)
2873				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2874# 				ppl.plot(xi, yi, 'k-', lw = 1)
2875			elif hist:
2876				ppl.hist(
2877					X,
2878					orientation = 'horizontal',
2879					histtype = 'stepfilled',
2880					ec = [.4]*3,
2881					fc = [.25]*3,
2882					alpha = .25,
2883					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2884					)
2885			ppl.text(0, 0,
2886				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2887				size = 7.5,
2888				alpha = 1,
2889				va = 'center',
2890				ha = 'left',
2891				)
2892
2893			ppl.axis([0, None, ymin, ymax])
2894			ppl.xticks([])
2895			ppl.yticks([])
2896# 			ax2.spines['left'].set_visible(False)
2897			ax2.spines['right'].set_visible(False)
2898			ax2.spines['top'].set_visible(False)
2899			ax2.spines['bottom'].set_visible(False)
2900
2901		ax1.axis([None, None, ymin, ymax])
2902
2903		if not os.path.exists(dir):
2904			os.makedirs(dir)
2905		if filename is None:
2906			return fig
2907		elif filename == '':
2908			filename = f'D{self._4x}_residuals.pdf'
2909		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2910		ppl.close(fig)
2911				
2912
2913	def simulate(self, *args, **kwargs):
2914		'''
2915		Legacy function: raises a `DeprecationWarning` pointing to `virtual_data()`.
2916		'''
2917		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
2918
2919	def plot_distribution_of_analyses(
2920		self,
2921		dir = 'output',
2922		filename = None,
2923		vs_time = False,
2924		figsize = (6,4),
2925		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
2926		output = None,
2927		dpi = 100,
2928		):
2929		'''
2930		Plot temporal distribution of all analyses in the data set.
2931		
2932		**Parameters**
2933
2934		+ `dir`: the directory in which to save the plot
2935		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
2936		+ `figsize`: (width, height) of figure
2937		+ `dpi`: resolution for PNG output
2938		+ `output`: if `'fig'`, return the figure without saving it; if `'ax'`, return the axes; if `None` (default), save the plot to `dir`
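
	**Example** (a sketch; with the defaults `output = None` and `filename = None`,
	the plot is saved to `dir` under a default file name):

	```python
	mydata.plot_distribution_of_analyses(vs_time = True)
	```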
2939		'''
2940
2941		asamples = [s for s in self.anchors]
2942		usamples = [s for s in self.unknowns]
2943		if output is None or output == 'fig':
2944			fig = ppl.figure(figsize = figsize)
2945			ppl.subplots_adjust(*subplots_adjust)
2946		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2947		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2948		Xmax += (Xmax-Xmin)/40
2949		Xmin -= (Xmax-Xmin)/41
2950		for k, s in enumerate(asamples + usamples):
2951			if vs_time:
2952				X = [r['TimeTag'] for r in self if r['Sample'] == s]
2953			else:
2954				X = [x for x,r in enumerate(self) if r['Sample'] == s]
2955			Y = [-k for x in X]
2956			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
2957			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
2958			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
2959		ppl.axis([Xmin, Xmax, -k-1, 1])
2960		ppl.xlabel('\ntime')
2961		ppl.gca().annotate('',
2962			xy = (0.6, -0.02),
2963			xycoords = 'axes fraction',
2964			xytext = (.4, -0.02),
2965			arrowprops = dict(arrowstyle = "->", color = 'k'),
2966			)
2967			
2968
2969		x2 = -1
2970		for session in self.sessions:
2971			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2972			if vs_time:
2973				ppl.axvline(x1, color = 'k', lw = .75)
2974			if x2 > -1:
2975				if not vs_time:
2976					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
2977			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2978# 			from xlrd import xldate_as_datetime
2979# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
2980			if vs_time:
2981				ppl.axvline(x2, color = 'k', lw = .75)
2982				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
2983			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
2984
2985		ppl.xticks([])
2986		ppl.yticks([])
2987
2988		if output is None:
2989			if not os.path.exists(dir):
2990				os.makedirs(dir)
2991			if filename is None:
2992				filename = f'D{self._4x}_distribution_of_analyses.pdf'
2993			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2994			ppl.close(fig)
2995		elif output == 'ax':
2996			return ppl.gca()
2997		elif output == 'fig':
2998			return fig
2999
3000
3001	def plot_bulk_compositions(
3002		self,
3003		samples = None,
3004		dir = 'output/bulk_compositions',
3005		figsize = (6,6),
3006		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
3007		show = False,
3008		sample_color = (0,.5,1),
3009		analysis_color = (.7,.7,.7),
3010		labeldist = 0.3,
3011		radius = 0.05,
3012		):
3013		'''
3014		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
3015		
3016		By default, creates a directory `./output/bulk_compositions` where plots for
3017		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
3018		
3019		
3020		**Parameters**
3021
3022		+ `samples`: Only these samples are processed (by default: all samples).
3023		+ `dir`: where to save the plots
3024		+ `figsize`: (width, height) of figure
3025		+ `subplots_adjust`: passed to `subplots_adjust()`
3026		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
3027		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
3028	+ `sample_color`: color used for sample markers/labels
3029	+ `analysis_color`: color used for replicate (analysis) markers/labels
3030		+ `labeldist`: distance (in inches) from replicate markers to replicate labels
3031		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
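
	**Example** (a sketch; writes one figure per sample plus `__all__.pdf` to the specified directory):

	```python
	mydata.plot_bulk_compositions(dir = 'output/bulk_compositions')
	```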
3032		'''
3033
3034		from matplotlib.patches import Ellipse
3035
3036		if samples is None:
3037			samples = [_ for _ in self.samples]
3038
3039		saved = {}
3040
3041		for s in samples:
3042
3043			fig = ppl.figure(figsize = figsize)
3044			fig.subplots_adjust(*subplots_adjust)
3045			ax = ppl.subplot(111)
3046			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3047			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3048			ppl.title(s)
3049
3050
3051			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
3052			UID = [_['UID'] for _ in self.samples[s]['data']]
3053			XY0 = XY.mean(0)
3054
3055			for xy in XY:
3056				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
3057				
3058			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
3059			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
3060			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3061			saved[s] = [XY, XY0]
3062			
3063			x1, x2, y1, y2 = ppl.axis()
3064			x0, dx = (x1+x2)/2, (x2-x1)/2
3065			y0, dy = (y1+y2)/2, (y2-y1)/2
3066			dx, dy = [max(max(dx, dy), radius)]*2
3067
3068			ppl.axis([
3069				x0 - 1.2*dx,
3070				x0 + 1.2*dx,
3071				y0 - 1.2*dy,
3072				y0 + 1.2*dy,
3073				])			
3074
3075			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3076
3077			for xy, uid in zip(XY, UID):
3078
3079				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3080				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3081
3082				if (vector_in_display_space**2).sum() > 0:
3083
3084					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3085					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3086					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3087					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3088
3089					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3090
3091				else:
3092
3093					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3094
3095			if radius:
3096				ax.add_artist(Ellipse(
3097					xy = XY0,
3098					width = radius*2,
3099					height = radius*2,
3100					ls = (0, (2,2)),
3101					lw = .7,
3102					ec = analysis_color,
3103					fc = 'None',
3104					))
3105				ppl.text(
3106					XY0[0],
3107					XY0[1]-radius,
3108					f'\n± {radius*1e3:.0f} ppm',
3109					color = analysis_color,
3110					va = 'top',
3111					ha = 'center',
3112					linespacing = 0.4,
3113					size = 8,
3114					)
3115
3116			if not os.path.exists(dir):
3117				os.makedirs(dir)
3118			fig.savefig(f'{dir}/{s}.pdf')
3119			ppl.close(fig)
3120
3121		fig = ppl.figure(figsize = figsize)
3122		fig.subplots_adjust(*subplots_adjust)
3123		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3124		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3125
3126		for s in saved:
3127			for xy in saved[s][0]:
3128				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3129			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3130			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3131			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3132
3133		x1, x2, y1, y2 = ppl.axis()
3134		ppl.axis([
3135			x1 - (x2-x1)/10,
3136			x2 + (x2-x1)/10,
3137			y1 - (y2-y1)/10,
3138			y2 + (y2-y1)/10,
3139			])			
3140
3141
3142		if not os.path.exists(dir):
3143			os.makedirs(dir)
3144		fig.savefig(f'{dir}/__all__.pdf')
3145		if show:
3146			ppl.show()
3147		ppl.close(fig)
3148		
3149
3150	def _save_D4x_correl(
3151		self,
3152		samples = None,
3153		dir = 'output',
3154		filename = None,
3155		D4x_precision = 4,
3156		correl_precision = 4,
3157		):
3158		'''
3159		Save D4x values along with their SE and correlation matrix.
3160
3161		**Parameters**
3162
3163		+ `samples`: Only these samples are output (by default: all samples).
3164	+ `dir`: the directory in which to save the file (by default: `output`)
3165	+ `filename`: the name of the csv file to write to (by default: `D4x_correl.csv`)
3166		+ `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4)
3167		+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
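
	**Example** (a sketch; in practice one calls the public wrappers `save_D47_correl()`,
	`save_D48_correl()` or `save_D49_correl()`, which forward their arguments here):

	```python
	mydata.save_D47_correl(dir = 'output', filename = 'D47_correl.csv')
	```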
3168		'''
3169		if samples is None:
3170			samples = sorted([s for s in self.unknowns])
3171		
3172		out = [['Sample']] + [[s] for s in samples]
3173		out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl']
3174		for k,s in enumerate(samples):
3175			out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.{D4x_precision}f}', f'{self.samples[s][f"SE_D{self._4x}"]:.{D4x_precision}f}']
3176			for s2 in samples:
3177				out[k+1] += [f'{self.sample_D4x_correl(s,s2):.{correl_precision}f}']
3178		
3179		if not os.path.exists(dir):
3180			os.makedirs(dir)
3181		if filename is None:
3182			filename = f'D{self._4x}_correl.csv'
3183		with open(f'{dir}/{filename}', 'w') as fid:
3184			fid.write(make_csv(out))
3185		
3186		
3187		
3188
3189class D47data(D4xdata):
3190	'''
3191	Store and process data for a large set of Δ47 analyses,
3192	usually comprising more than one analytical session.
3193	'''
3194
3195	Nominal_D4x = {
3196		'ETH-1':   0.2052,
3197		'ETH-2':   0.2085,
3198		'ETH-3':   0.6132,
3199		'ETH-4':   0.4511,
3200		'IAEA-C1': 0.3018,
3201		'IAEA-C2': 0.6409,
3202		'MERCK':   0.5135,
3203		} # I-CDES (Bernasconi et al., 2021)
3204	'''
3205	Nominal Δ47 values assigned to the Δ47 anchor samples, used by
3206	`D47data.standardize()` to normalize unknown samples to an absolute Δ47
3207	reference frame.
3208
3209	By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
3210	```py
3211	{
3212		'ETH-1'   : 0.2052,
3213		'ETH-2'   : 0.2085,
3214		'ETH-3'   : 0.6132,
3215		'ETH-4'   : 0.4511,
3216		'IAEA-C1' : 0.3018,
3217		'IAEA-C2' : 0.6409,
3218		'MERCK'   : 0.5135,
3219	}
3220	```
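
	To standardize against a different set of anchors, assign a new dict to `Nominal_D47`
	before calling `D47data.standardize()` (a sketch, reusing a subset of the I-CDES values above):

	```py
	mydata = D47data()
	mydata.Nominal_D47 = {
		'ETH-1': 0.2052,
		'ETH-2': 0.2085,
		'ETH-3': 0.6132,
		}
	```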
3221	'''
3222
3223
3224	@property
3225	def Nominal_D47(self):
3226		return self.Nominal_D4x
3227	
3228
3229	@Nominal_D47.setter
3230	def Nominal_D47(self, new):
3231		self.Nominal_D4x = dict(**new)
3232		self.refresh()
3233
3234
3235	def __init__(self, l = [], **kwargs):
3236		'''
3237		**Parameters:** same as `D4xdata.__init__()`
3238		'''
3239		D4xdata.__init__(self, l = l, mass = '47', **kwargs)
3240
3241
3242	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3243		'''
3244		Find all samples for which `Teq` is specified, compute equilibrium Δ47
3245		value for that temperature, and treat these samples as additional anchors.
3246
3247		**Parameters**
3248
3249		+ `fCo2eqD47`: Which CO2 equilibrium law to use
3250		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3251		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3252		+ `priority`: if `replace`: forget old anchors and only use the new ones;
3253		if `new`: keep pre-existing anchors but update them in case of conflict
3254		between old and new Δ47 values;
3255		if `old`: keep pre-existing anchors but preserve their original Δ47
3256		values in case of conflict.
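
	**Example** (a sketch; assumes that some analyses carry a `Teq` field specifying the CO2
	equilibration temperature, consistently for all analyses of the same sample):

	```python
	mydata.D47fromTeq(fCo2eqD47 = 'petersen', priority = 'new')
	mydata.standardize()  # re-standardize with the enlarged set of anchors
	```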
3257		'''
3258		f = {
3259			'petersen': fCO2eqD47_Petersen,
3260			'wang': fCO2eqD47_Wang,
3261			}[fCo2eqD47]
3262		foo = {}
3263		for r in self:
3264			if 'Teq' in r:
3265				if r['Sample'] in foo:
3266					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3267				else:
3268					foo[r['Sample']] = f(r['Teq'])
3269			else:
3270				assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3271
3272		if priority == 'replace':
3273			self.Nominal_D47 = {}
3274		for s in foo:
3275			if priority != 'old' or s not in self.Nominal_D47:
3276				self.Nominal_D47[s] = foo[s]
3277	
3278	def save_D47_correl(self, *args, **kwargs):
3279		return self._save_D4x_correl(*args, **kwargs)
3280
3281	save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47')
3282
3283
3284class D48data(D4xdata):
3285	'''
3286	Store and process data for a large set of Δ48 analyses,
3287	usually comprising more than one analytical session.
3288	'''
3289
3290	Nominal_D4x = {
3291		'ETH-1':  0.138,
3292		'ETH-2':  0.138,
3293		'ETH-3':  0.270,
3294		'ETH-4':  0.223,
3295		'GU-1':  -0.419,
3296		} # (Fiebig et al., 2019, 2021)
3297	'''
3298	Nominal Δ48 values assigned to the Δ48 anchor samples, used by
3299	`D48data.standardize()` to normalize unknown samples to an absolute Δ48
3300	reference frame.
3301
3302	By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
3303	[Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)):
3304
3305	```py
3306	{
3307		'ETH-1' :  0.138,
3308		'ETH-2' :  0.138,
3309		'ETH-3' :  0.270,
3310		'ETH-4' :  0.223,
3311		'GU-1'  : -0.419,
3312	}
3313	```
3314	'''
3315
3316
3317	@property
3318	def Nominal_D48(self):
3319		return self.Nominal_D4x
3320
3321	
3322	@Nominal_D48.setter
3323	def Nominal_D48(self, new):
3324		self.Nominal_D4x = dict(**new)
3325		self.refresh()
3326
3327
3328	def __init__(self, l = [], **kwargs):
3329		'''
3330		**Parameters:** same as `D4xdata.__init__()`
3331		'''
3332		D4xdata.__init__(self, l = l, mass = '48', **kwargs)
3333
3334	def save_D48_correl(self, *args, **kwargs):
3335		return self._save_D4x_correl(*args, **kwargs)
3336
3337	save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48')
3338
3339
3340class D49data(D4xdata):
3341	'''
3342	Store and process data for a large set of Δ49 analyses,
3343	usually comprising more than one analytical session.
3344	'''
3345	
3346	Nominal_D4x = {"1000C": 0.0, "25C": 2.228}  # Wang 2004
3347	'''
3348	Nominal Δ49 values assigned to the Δ49 anchor samples, used by
3349	`D49data.standardize()` to normalize unknown samples to an absolute Δ49
3350	reference frame.
3351
3352	By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)):
3353
3354	```py
3355	{
3356		"1000C": 0.0,
3357		"25C": 2.228
3358	}
3359	```
3360	'''
3361	
3362	@property
3363	def Nominal_D49(self):
3364		return self.Nominal_D4x
3365	
3366	@Nominal_D49.setter
3367	def Nominal_D49(self, new):
3368		self.Nominal_D4x = dict(**new)
3369		self.refresh()
3370	
3371	def __init__(self, l=[], **kwargs):
3372		'''
3373		**Parameters:** same as `D4xdata.__init__()`
3374		'''
3375		D4xdata.__init__(self, l=l, mass='49', **kwargs)
3376	
3377	def save_D49_correl(self, *args, **kwargs):
3378		return self._save_D4x_correl(*args, **kwargs)
3379	
3380	save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49')
3381
3382class _SessionPlot():
3383	'''
3384	Simple placeholder class
3385	'''
3386	def __init__(self):
3387		pass
3388
3389_app = typer.Typer(
3390	add_completion = False,
3391	context_settings={'help_option_names': ['-h', '--help']},
3392	rich_markup_mode = 'rich',
3393	)
3394
3395@_app.command()
3396def _cli(
3397	rawdata: Annotated[str, typer.Argument(help = "Specify the path of a rawdata input file")],
3398	exclude: Annotated[str, typer.Option('--exclude', '-e', help = 'The path of a file specifying UIDs and/or Samples to exclude')] = 'none',
3399	anchors: Annotated[str, typer.Option('--anchors', '-a', help = 'The path of a file specifying custom anchors')] = 'none',
3400	output_dir: Annotated[str, typer.Option('--output-dir', '-o', help = 'Specify the output directory')] = 'output',
3401	run_D48: Annotated[bool, typer.Option('--D48', help = 'Also standardize D48')] = False,
3402	):
3403	"""
3404	Process raw D47 data and return standardized results.
3405	
3406	See [b]https://mdaeron.github.io/D47crunch/#3-command-line-interface-cli[/b] for more details.
3407	
3408	Reads raw data from an input file, optionally excluding some samples and/or analyses, then standardizes
3409	the data based either on the default [b]d13C_VPDB[/b], [b]d18O_VPDB[/b], [b]D47[/b], and [b]D48[/b] anchors or on different
3410	user-specified anchors. A new directory (named `output` by default) is created to store the results and
3411	the following sequence is applied:
3412	
3413	* [b]D47data.wg()[/b]
3414	* [b]D47data.crunch()[/b]
3415	* [b]D47data.standardize()[/b]
3416	* [b]D47data.summary()[/b]
3417	* [b]D47data.table_of_samples()[/b]
3418	* [b]D47data.table_of_sessions()[/b]
3419	* [b]D47data.plot_sessions()[/b]
3420	* [b]D47data.plot_residuals()[/b]
3421	* [b]D47data.table_of_analyses()[/b]
3422	* [b]D47data.plot_distribution_of_analyses()[/b]
3423	* [b]D47data.plot_bulk_compositions()[/b]
3424	* [b]D47data.save_D47_correl()[/b]
3425	
3426	Optionally, also apply similar methods for [b]D48[/b].
3427	
3428	[b]Example CSV file for --anchors option:[/b]	
3429	[i]
3430	Sample,  d13C_VPDB,  d18O_VPDB,     D47,    D48
3431	ETH-1,        2.02,      -2.19,  0.2052,  0.138
3432	ETH-2,      -10.17,     -18.69,  0.2085,  0.138
3433	ETH-3,        1.71,      -1.78,  0.6132,  0.270
3434	ETH-4,            ,           ,  0.4511,  0.223
3435	[/i]
3436	Except for [i]Sample[/i], none of the columns above are mandatory.
3437
3438	[b]Example CSV file for --exclude option:[/b]	
3439	[i]
3440	Sample,  UID
3441	 FOO-1,
3442	 BAR-2,
3443	      ,  A04
3444	      ,  A17
3445	      ,  A88
3446	[/i]
3447	This will exclude all analyses of samples [i]FOO-1[/i] and [i]BAR-2[/i],
3448	and the analyses with UIDs [i]A04[/i], [i]A17[/i], and [i]A88[/i].
3449	Neither column is mandatory.
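
	[b]Example invocation[/b] (assuming the console script is installed as [b]D47crunch[/b]):
	[i]
	D47crunch rawdata.csv -e exclude.csv -a anchors.csv -o results --D48
	[/i]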
3450	"""
3451
3452	data = D47data()
3453	data.read(rawdata)
3454
3455	if exclude != 'none':
3456		exclude = read_csv(exclude)
3457		exclude_uid = {r['UID'] for r in exclude if 'UID' in r}
3458		exclude_sample = {r['Sample'] for r in exclude if 'Sample' in r}
3459	else:
3460		exclude_uid = []
3461		exclude_sample = []
3462	
3463	data = D47data([r for r in data if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample])
3464
3465	if anchors != 'none':
3466		anchors = read_csv(anchors)
3467		if len([_ for _ in anchors if 'd13C_VPDB' in _]):
3468			data.Nominal_d13C_VPDB = {
3469				_['Sample']: _['d13C_VPDB']
3470				for _ in anchors
3471				if 'd13C_VPDB' in _
3472				}
3473		if len([_ for _ in anchors if 'd18O_VPDB' in _]):
3474			data.Nominal_d18O_VPDB = {
3475				_['Sample']: _['d18O_VPDB']
3476				for _ in anchors
3477				if 'd18O_VPDB' in _
3478				}
3479		if len([_ for _ in anchors if 'D47' in _]):
3480			data.Nominal_D4x = {
3481				_['Sample']: _['D47']
3482				for _ in anchors
3483				if 'D47' in _
3484				}
3485
3486	data.refresh()
3487	data.wg()
3488	data.crunch()
3489	data.standardize()
3490	data.summary(dir = output_dir)
3491	data.plot_residuals(dir = output_dir, filename = 'D47_residuals.pdf', kde = True)
3492	data.plot_bulk_compositions(dir = output_dir + '/bulk_compositions')
3493	data.plot_sessions(dir = output_dir)
3494	data.save_D47_correl(dir = output_dir)
3495	
3496	if not run_D48:
3497		data.table_of_samples(dir = output_dir)
3498		data.table_of_analyses(dir = output_dir)
3499		data.table_of_sessions(dir = output_dir)
3500
3501
3502	if run_D48:
3503		data2 = D48data()
3504		print(rawdata)
3505		data2.read(rawdata)
3506
3507		data2 = D48data([r for r in data2 if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample])
3508
3509		if anchors != 'none':
3510			if len([_ for _ in anchors if 'd13C_VPDB' in _]):
3511				data2.Nominal_d13C_VPDB = {
3512					_['Sample']: _['d13C_VPDB']
3513					for _ in anchors
3514					if 'd13C_VPDB' in _
3515					}
3516			if len([_ for _ in anchors if 'd18O_VPDB' in _]):
3517				data2.Nominal_d18O_VPDB = {
3518					_['Sample']: _['d18O_VPDB']
3519					for _ in anchors
3520					if 'd18O_VPDB' in _
3521					}
3522			if len([_ for _ in anchors if 'D48' in _]):
3523				data2.Nominal_D4x = {
3524					_['Sample']: _['D48']
3525					for _ in anchors
3526					if 'D48' in _
3527					}
3528
3529		data2.refresh()
3530		data2.wg()
3531		data2.crunch()
3532		data2.standardize()
3533		data2.summary(dir = output_dir)
3534		data2.plot_sessions(dir = output_dir)
3535		data2.plot_residuals(dir = output_dir, filename = 'D48_residuals.pdf', kde = True)
3536		data2.plot_distribution_of_analyses(dir = output_dir)
3537		data2.save_D48_correl(dir = output_dir)
3538
3539		table_of_analyses(data, data2, dir = output_dir)
3540		table_of_samples(data, data2, dir = output_dir)
3541		table_of_sessions(data, data2, dir = output_dir)
3542		
3543def __cli():
3544	_app()
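
# Petersen et al. (2019) CO2 equilibrium law, tabulated as (T in °C, Δ47 in ‰) pairs;
# presumably the lookup table interpolated by `fCO2eqD47_Petersen`.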
Petersen_etal_CO2eqD47 = array([[-1.20000000e+01, 1.14711357e+00], [-1.10000000e+01, 1.13996122e+00], [-1.00000000e+01, 1.13287286e+00], [-9.00000000e+00, 1.12584768e+00], [-8.00000000e+00, 1.11888489e+00], [-7.00000000e+00, 1.11198371e+00], [-6.00000000e+00, 1.10514337e+00], [-5.00000000e+00, 1.09836311e+00], [-4.00000000e+00, 1.09164218e+00], [-3.00000000e+00, 1.08497986e+00], [-2.00000000e+00, 1.07837542e+00], [-1.00000000e+00, 1.07182816e+00], [ 0.00000000e+00, 1.06533736e+00], [ 1.00000000e+00, 1.05890235e+00], [ 2.00000000e+00, 1.05252244e+00], [ 3.00000000e+00, 1.04619698e+00], [ 4.00000000e+00, 1.03992529e+00], [ 5.00000000e+00, 1.03370674e+00], [ 6.00000000e+00, 1.02754069e+00], [ 7.00000000e+00, 1.02142651e+00], [ 8.00000000e+00, 1.01536359e+00], [ 9.00000000e+00, 1.00935131e+00], [ 1.00000000e+01, 1.00338908e+00], [ 1.10000000e+01, 9.97476303e-01], [ 1.20000000e+01, 9.91612409e-01], [ 1.30000000e+01, 9.85796821e-01], [ 1.40000000e+01, 9.80028975e-01], [ 1.50000000e+01, 9.74308318e-01], [ 1.60000000e+01, 9.68634304e-01], [ 1.70000000e+01, 9.63006392e-01], [ 1.80000000e+01, 9.57424055e-01], [ 1.90000000e+01, 9.51886769e-01], [ 2.00000000e+01, 9.46394020e-01], [ 2.10000000e+01, 9.40945302e-01], [ 2.20000000e+01, 9.35540114e-01], [ 2.30000000e+01, 9.30177964e-01], [ 2.40000000e+01, 9.24858369e-01], [ 2.50000000e+01, 9.19580851e-01], [ 2.60000000e+01, 9.14344938e-01], [ 2.70000000e+01, 9.09150167e-01], [ 2.80000000e+01, 9.03996080e-01], [ 2.90000000e+01, 8.98882228e-01], [ 3.00000000e+01, 8.93808167e-01], [ 3.10000000e+01, 8.88773459e-01], [ 3.20000000e+01, 8.83777672e-01], [ 3.30000000e+01, 8.78820382e-01], [ 3.40000000e+01, 8.73901170e-01], [ 3.50000000e+01, 8.69019623e-01], [ 3.60000000e+01, 8.64175334e-01], [ 3.70000000e+01, 8.59367901e-01], [ 3.80000000e+01, 8.54596929e-01], [ 3.90000000e+01, 8.49862028e-01], [ 4.00000000e+01, 8.45162813e-01], [ 4.10000000e+01, 8.40498905e-01], [ 4.20000000e+01, 8.35869931e-01], [ 4.30000000e+01, 8.31275522e-01], [ 4.40000000e+01, 8.26715314e-01], [ 4.50000000e+01, 8.22188950e-01], [ 4.60000000e+01, 8.17696075e-01], [ 4.70000000e+01, 8.13236341e-01], [ 4.80000000e+01, 8.08809404e-01], [ 4.90000000e+01, 8.04414926e-01], [ 5.00000000e+01, 8.00052572e-01], [ 5.10000000e+01, 7.95722012e-01], [ 5.20000000e+01, 7.91422922e-01], [ 5.30000000e+01, 7.87154979e-01], [ 5.40000000e+01, 7.82917869e-01], [ 5.50000000e+01, 7.78711277e-01], [ 5.60000000e+01, 7.74534898e-01], [ 5.70000000e+01, 7.70388426e-01], [ 5.80000000e+01, 7.66271562e-01], [ 5.90000000e+01, 7.62184010e-01], [ 6.00000000e+01, 7.58125479e-01], [ 6.10000000e+01, 7.54095680e-01], [ 6.20000000e+01, 7.50094329e-01], [ 6.30000000e+01, 7.46121147e-01], [ 6.40000000e+01, 7.42175856e-01], [ 6.50000000e+01, 7.38258184e-01], [ 6.60000000e+01, 7.34367860e-01], [ 6.70000000e+01, 7.30504620e-01], [ 6.80000000e+01, 7.26668201e-01], [ 6.90000000e+01, 7.22858343e-01], [ 7.00000000e+01, 7.19074792e-01], [ 7.10000000e+01, 7.15317295e-01], [ 7.20000000e+01, 7.11585602e-01], [ 7.30000000e+01, 7.07879469e-01], [ 7.40000000e+01, 7.04198652e-01], [ 7.50000000e+01, 7.00542912e-01], [ 7.60000000e+01, 6.96912012e-01], [ 7.70000000e+01, 6.93305719e-01], [ 7.80000000e+01, 6.89723802e-01], [ 7.90000000e+01, 6.86166034e-01], [ 8.00000000e+01, 6.82632189e-01], [ 8.10000000e+01, 6.79122047e-01], [ 8.20000000e+01, 6.75635387e-01], [ 8.30000000e+01, 6.72171994e-01], [ 8.40000000e+01, 6.68731654e-01], [ 8.50000000e+01, 6.65314156e-01], [ 8.60000000e+01, 6.61919291e-01], [ 8.70000000e+01, 6.58546854e-01], [ 8.80000000e+01, 
... (remaining (T in degrees C, Δ47) interpolation nodes omitted here for readability) ..., [ 1.10000000e+03, 2.10906220e-02]])
def fCO2eqD47_Petersen(T):
def fCO2eqD47_Petersen(T):
	'''
	CO2 equilibrium Δ47 value as a function of T (in degrees C)
	according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).

	'''
	return float(_fCO2eqD47_Petersen(T))

CO2 equilibrium Δ47 value as a function of T (in degrees C) according to Petersen et al. (2019).
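For instance, to estimate the equilibrium Δ47 value of CO2 at 25 °C (a minimal sketch; the result is interpolated from the lookup table above):

from D47crunch import fCO2eqD47_Petersen

# equilibrium Δ47 (in permil) of CO2 at 25 °C:
print(fCO2eqD47_Petersen(25))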

Wang_etal_CO2eqD47 = array([[-8.3000e+01, 1.8954e+00], [-7.3000e+01, 1.7530e+00], [-6.3000e+01, 1.6261e+00], ... (remaining (T in degrees C, Δ47) values omitted here for readability) ..., [ 1.0870e+03, 2.2300e-02], [ 1.0970e+03, 2.1800e-02]])
def fCO2eqD47_Wang(T):
def fCO2eqD47_Wang(T):
	'''
	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
	according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
	(supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
	'''
	return float(_fCO2eqD47_Wang(T))

CO2 equilibrium Δ47 value as a function of T (in degrees C) according to Wang et al. (2004) (supplementary data of Dennis et al., 2011).
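As a quick sketch, the two calibrations may be compared at a few temperatures; both functions interpolate from their respective lookup tables, so their outputs differ slightly:

from D47crunch import fCO2eqD47_Petersen, fCO2eqD47_Wang

for T in (0, 25, 100, 500):
	print(T, fCO2eqD47_Petersen(T), fCO2eqD47_Wang(T))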

def correlated_sum(X, C, w=None):
def correlated_sum(X, C, w = None):
	'''
	Compute covariance-aware linear combinations

	**Parameters**

	+ `X`: list or 1-D array of values to sum
	+ `C`: covariance matrix for the elements of `X`
	+ `w`: list or 1-D array of weights to apply to the elements of `X`
	       (all equal to 1 by default)

	Return the sum (and its SE) of the elements of `X`, with optional weights equal
	to the elements of `w`, accounting for covariances between the elements of `X`.
	'''
	if w is None:
		w = [1 for x in X]
	return np.dot(w,X), (np.dot(w,np.dot(C,w)))**.5

Compute covariance-aware linear combinations

Parameters

  • X: list or 1-D array of values to sum
  • C: covariance matrix for the elements of X
  • w: list or 1-D array of weights to apply to the elements of X (all equal to 1 by default)

Return the sum (and its SE) of the elements of X, with optional weights equal to the elements of w, accounting for covariances between the elements of X.
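For example, summing two values whose uncertainties are positively correlated (a minimal sketch; the covariance values below are arbitrary):

import numpy as np
from D47crunch import correlated_sum

X = [0.5, 0.3]
C = np.array([
	[0.010, 0.004],
	[0.004, 0.010],
	])  # arbitrary covariance matrix, for illustration only

total, se = correlated_sum(X, C)
print(total, se)  # 0.8 and (0.010 + 0.010 + 2 * 0.004)**0.5 ≈ 0.167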

def make_csv(x, hsep=',', vsep='\n'):
def make_csv(x, hsep = ',', vsep = '\n'):
	'''
	Formats a list of lists of strings as a CSV

	**Parameters**

	+ `x`: the list of lists of strings to format
	+ `hsep`: the field separator (`,` by default)
	+ `vsep`: the line-ending convention to use (`\\n` by default)

	**Example**

	```py
	print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
	```

	outputs:

	```py
	a,b,c
	d,e,f
	```
	'''
	return vsep.join([hsep.join(l) for l in x])

Formats a list of lists of strings as a CSV

Parameters

  • x: the list of lists of strings to format
  • hsep: the field separator (, by default)
  • vsep: the line-ending convention to use (\n by default)

Example

print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))

outputs:

a,b,c
d,e,f

def pf(txt):
def pf(txt):
	'''
	Modify string `txt` to follow `lmfit.Parameter()` naming rules.
	'''
	return txt.replace('-','_').replace('.','_').replace(' ','_')

Modify string txt to follow lmfit.Parameter() naming rules.
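For example:

from D47crunch import pf

print(pf('ETH-1'))       # yields: ETH_1
print(pf('MY SAMPLE.2')) # yields: MY_SAMPLE_2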

def smart_type(x):
def smart_type(x):
	'''
	Try to convert string `x` to a float if it includes a decimal point, or
	to an integer if it does not. If the conversion fails, return the original
	string unchanged.
	'''
	try:
		y = float(x)
	except ValueError:
		return x
	if '.' not in x:
		return int(y)
	return y

Try to convert string x to a float if it includes a decimal point, or to an integer if it does not. If the conversion fails, return the original string unchanged.
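For example:

from D47crunch import smart_type

print(smart_type('5.79502'))  # yields the float 5.79502
print(smart_type('42'))       # yields the int 42
print(smart_type('ETH-1'))    # yields the unchanged string 'ETH-1'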

D47crunch_defaults = <D47crunch._Defaults object>
def pretty_table(x, header=1, hsep='  ', vsep=None, align='<'):
def pretty_table(x, header = 1, hsep = '  ', vsep = None, align = '<'):
	'''
	Reads a list of lists of strings and outputs an ascii table

	**Parameters**

	+ `x`: a list of lists of strings
	+ `header`: the number of lines to treat as header lines
	+ `hsep`: the horizontal separator between columns
	+ `vsep`: the character to use as vertical separator
	+ `align`: string of left (`<`) or right (`>`) alignment characters.

	**Example**

	```py
	print(pretty_table([
		['A', 'B', 'C'],
		['1', '1.9999', 'foo'],
		['10', 'x', 'bar'],
	]))
	```
	yields:
	```
	——  ——————  ———
	A        B    C
	——  ——————  ———
	1   1.9999  foo
	10       x  bar
	——  ——————  ———
	```

	To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`:

	```py
	D47crunch_defaults.PRETTY_TABLE_VSEP = '='
	print(pretty_table([
		['A', 'B', 'C'],
		['1', '1.9999', 'foo'],
		['10', 'x', 'bar'],
	]))
	```
	yields:
	```
	==  ======  ===
	A        B    C
	==  ======  ===
	1   1.9999  foo
	10       x  bar
	==  ======  ===
	```
	'''

	if vsep is None:
		vsep = D47crunch_defaults.PRETTY_TABLE_VSEP

	txt = []
	widths = [np.max([len(e) for e in c]) for c in zip(*x)]

	if len(widths) > len(align):
		align += '>' * (len(widths)-len(align))
	sepline = hsep.join([vsep*w for w in widths])
	txt += [sepline]
	for k,l in enumerate(x):
		if k and k == header:
			txt += [sepline]
		txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
	txt += [sepline]
	txt += ['']
	return '\n'.join(txt)

Reads a list of lists of strings and outputs an ascii table

Parameters

  • x: a list of lists of strings
  • header: the number of lines to treat as header lines
  • hsep: the horizontal separator between columns
  • vsep: the character to use as vertical separator
  • align: string of left (<) or right (>) alignment characters.

Example

print(pretty_table([
        ['A', 'B', 'C'],
        ['1', '1.9999', 'foo'],
        ['10', 'x', 'bar'],
]))

yields:

——  ——————  ———
A        B    C
——  ——————  ———
1   1.9999  foo
10       x  bar
——  ——————  ———

To change the default vsep globally, redefine D47crunch_defaults.PRETTY_TABLE_VSEP:

D47crunch_defaults.PRETTY_TABLE_VSEP = '='
print(pretty_table([
        ['A', 'B', 'C'],
        ['1', '1.9999', 'foo'],
        ['10', 'x', 'bar'],
]))

yields:

==  ======  ===
A        B    C
==  ======  ===
1   1.9999  foo
10       x  bar
==  ======  ===

def transpose_table(x):
def transpose_table(x):
	'''
	Transpose a list of lists

	**Parameters**

	+ `x`: a list of lists

	**Example**

	```py
	x = [[1, 2], [3, 4]]
	print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
	```
	'''
	return [[e for e in c] for c in zip(*x)]

Transpose a list of lists

Parameters

  • x: a list of lists

Example

x = [[1, 2], [3, 4]]
print(transpose_table(x)) # yields: [[1, 3], [2, 4]]

def w_avg(X, sX):
def w_avg(X, sX):
	'''
	Compute variance-weighted average

	Returns the value and SE of the weighted average of the elements of `X`,
	with relative weights equal to their inverse variances (`1/sX**2`).

	**Parameters**

	+ `X`: array-like of elements to average
	+ `sX`: array-like of the corresponding SE values

	**Tip**

	If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
	they may be rearranged using `zip()`:

	```python
	foo = [(0, 1), (1, 0.5), (2, 0.5)]
	print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
	```
	'''
	X = [ x for x in X ]
	sX = [ sx for sx in sX ]
	W = [ sx**-2 for sx in sX ]
	W = [ w/sum(W) for w in W ]
	Xavg = sum([ w*x for w,x in zip(W,X) ])
	sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5
	return Xavg, sXavg

Compute variance-weighted average

Returns the value and SE of the weighted average of the elements of X, with relative weights equal to their inverse variances (1/sX**2).

Parameters

  • X: array-like of elements to average
  • sX: array-like of the corresponding SE values

Tip

If X and sX are initially arranged as a list of (x, sx) doublets, they may be rearranged using zip():

foo = [(0, 1), (1, 0.5), (2, 0.5)]
print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)

def read_csv(filename, sep=''):
def read_csv(filename, sep = ''):
	'''
	Read contents of `filename` in csv format and return a list of dictionaries.

	In the csv string, spaces before and after field separators (`','` by default)
	are optional.

	**Parameters**

	+ `filename`: the csv file to read
	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
	whichever appears most often in the contents of `filename`.
	'''
	with open(filename) as fid:
		txt = fid.read()

	if sep == '':
		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
	return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]]

Read contents of filename in csv format and return a list of dictionaries.

In the csv string, spaces before and after field separators (',' by default) are optional.

Parameters

  • filename: the csv file to read
  • sep: csv separator delimiting the fields. By default, use ,, ;, or a tab character, whichever appears most often in the contents of filename.
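
For example, reading back the rawdata.csv file created in the tutorial section (assuming it is present in the working directory; field values are converted with smart_type(), so numeric fields come back as numbers):

from D47crunch import read_csv

data = read_csv('rawdata.csv')
print(data[0]['Sample'], data[0]['d45'])  # yields: ETH-1 5.79502
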
def simulate_single_analysis(sample='MYSAMPLE', d13Cwg_VPDB=-4.0, d18Owg_VSMOW=26.0, d13C_VPDB=None, d18O_VPDB=None, D47=None, D48=None, D49=0.0, D17O=0.0, a47=1.0, b47=0.0, c47=-0.9, a48=1.0, b48=0.0, c48=-0.45, Nominal_D47=None, Nominal_D48=None, Nominal_d13C_VPDB=None, Nominal_d18O_VPDB=None, ALPHA_18O_ACID_REACTION=None, R13_VPDB=None, R17_VSMOW=None, R18_VSMOW=None, LAMBDA_17=None, R18_VPDB=None):
def simulate_single_analysis(
	sample = 'MYSAMPLE',
	d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
	d13C_VPDB = None, d18O_VPDB = None,
	D47 = None, D48 = None, D49 = 0., D17O = 0.,
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	Nominal_D47 = None,
	Nominal_D48 = None,
	Nominal_d13C_VPDB = None,
	Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	):
	'''
	Compute working-gas delta values for a single analysis, assuming a stochastic working
	gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

	**Parameters**

	+ `sample`: sample name
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(respectively –4 and +26 ‰ by default)
	+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
		of the carbonate sample
	+ `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and
		Δ48 values if `D47` or `D48` are not specified
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `D4xdata` default values)

	Returns a dictionary with fields
	`['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
	'''

	if Nominal_d13C_VPDB is None:
		Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB

	if Nominal_d18O_VPDB is None:
		Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB

	if ALPHA_18O_ACID_REACTION is None:
		ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION

	if R13_VPDB is None:
		R13_VPDB = D4xdata().R13_VPDB

	if R17_VSMOW is None:
		R17_VSMOW = D4xdata().R17_VSMOW

	if R18_VSMOW is None:
		R18_VSMOW = D4xdata().R18_VSMOW

	if LAMBDA_17 is None:
		LAMBDA_17 = D4xdata().LAMBDA_17

	if R18_VPDB is None:
		R18_VPDB = D4xdata().R18_VPDB

	R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17

	if Nominal_D47 is None:
		Nominal_D47 = D47data().Nominal_D47

	if Nominal_D48 is None:
		Nominal_D48 = D48data().Nominal_D48

	if d13C_VPDB is None:
		if sample in Nominal_d13C_VPDB:
			d13C_VPDB = Nominal_d13C_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")

	if d18O_VPDB is None:
		if sample in Nominal_d18O_VPDB:
			d18O_VPDB = Nominal_d18O_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")

	if D47 is None:
		if sample in Nominal_D47:
			D47 = Nominal_D47[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")

	if D48 is None:
		if sample in Nominal_D48:
			D48 = Nominal_D48[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")

	X = D4xdata()
	X.R13_VPDB = R13_VPDB
	X.R17_VSMOW = R17_VSMOW
	X.R18_VSMOW = R18_VSMOW
	X.LAMBDA_17 = LAMBDA_17
	X.R18_VPDB = R18_VPDB
	X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17

	R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000),
		R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
		)
	R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O=D17O, D47=D47, D48=D48, D49=D49,
		)
	R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O=D17O,
		)

	d45 = 1000 * (R45/R45wg - 1)
	d46 = 1000 * (R46/R46wg - 1)
	d47 = 1000 * (R47/R47wg - 1)
	d48 = 1000 * (R48/R48wg - 1)
	d49 = 1000 * (R49/R49wg - 1)

	for k in range(3): # dumb iteration to adjust for small changes in d47
		R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
		R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch
		d47 = 1000 * (R47raw/R47wg - 1)
		d48 = 1000 * (R48raw/R48wg - 1)

	return dict(
		Sample = sample,
		D17O = D17O,
		d13Cwg_VPDB = d13Cwg_VPDB,
		d18Owg_VSMOW = d18Owg_VSMOW,
		d45 = d45,
		d46 = d46,
		d47 = d47,
		d48 = d48,
		d49 = d49,
		)

Compute working-gas delta values for a single analysis, assuming a stochastic working gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

Parameters

  • sample: sample name
  • d13Cwg_VPDB, d18Owg_VSMOW: bulk composition of the working gas (respectively –4 and +26 ‰ by default)
  • d13C_VPDB, d18O_VPDB: bulk composition of the carbonate sample
  • D47, D48, D49, D17O: clumped-isotope and oxygen-17 anomalies of the carbonate sample
  • Nominal_D47, Nominal_D48: where to look up Δ47 and Δ48 values if D47 or D48 are not specified
  • Nominal_d13C_VPDB, Nominal_d18O_VPDB: where to look up δ13C and δ18O values if d13C_VPDB or d18O_VPDB are not specified
  • ALPHA_18O_ACID_REACTION: 18O/16O acid fractionation factor
  • R13_VPDB, R17_VSMOW, R18_VSMOW, LAMBDA_17, R18_VPDB: oxygen-17 correction parameters (by default equal to the D4xdata default values)

Returns a dictionary with fields ['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49'].
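
Because ETH-1 has nominal δ13C, δ18O, Δ47 and Δ48 values defined by default, a single “perfect” ETH-1 analysis may for instance be simulated without specifying its composition (a minimal sketch using only default parameters):

from D47crunch import simulate_single_analysis

a = simulate_single_analysis(sample = 'ETH-1')
print(a['d45'], a['d47'])  # working-gas delta values of the simulated analysis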

def virtual_data(samples=[], a47=1.0, b47=0.0, c47=-0.9, a48=1.0, b48=0.0, c48=-0.45, rd45=0.02, rd46=0.06, rD47=0.015, rD48=0.045, d13Cwg_VPDB=None, d18Owg_VSMOW=None, session=None, Nominal_D47=None, Nominal_D48=None, Nominal_d13C_VPDB=None, Nominal_d18O_VPDB=None, ALPHA_18O_ACID_REACTION=None, R13_VPDB=None, R17_VSMOW=None, R18_VSMOW=None, LAMBDA_17=None, R18_VPDB=None, seed=0, shuffle=True):
def virtual_data(
	samples = [],
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	rd45 = 0.020, rd46 = 0.060,
	rD47 = 0.015, rD48 = 0.045,
	d13Cwg_VPDB = None, d18Owg_VSMOW = None,
	session = None,
	Nominal_D47 = None, Nominal_D48 = None,
	Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	seed = 0,
	shuffle = True,
	):
	'''
	Return list with simulated analyses from a single session.

	**Parameters**

	+ `samples`: a list of entries; each entry is a dictionary with the following fields:
	    * `Sample`: the name of the sample
	    * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	    * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
	    * `N`: how many analyses to generate for this sample
	+ `a47`: scrambling factor for Δ47
	+ `b47`: compositional nonlinearity for Δ47
	+ `c47`: working gas offset for Δ47
	+ `a48`: scrambling factor for Δ48
	+ `b48`: compositional nonlinearity for Δ48
	+ `c48`: working gas offset for Δ48
	+ `rd45`: analytical repeatability of δ45
	+ `rd46`: analytical repeatability of δ46
	+ `rD47`: analytical repeatability of Δ47
	+ `rD48`: analytical repeatability of Δ48
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
		(by default equal to the `simulate_single_analysis` default values)
	+ `session`: name of the session (no name by default)
	+ `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and Δ48 values
		if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
		δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
		(by default equal to the `simulate_single_analysis` defaults)
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
		(by default equal to the `simulate_single_analysis` defaults)
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
		correction parameters (by default equal to the `simulate_single_analysis` default)
	+ `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
	+ `shuffle`: randomly reorder the sequence of analyses


	Here is an example of using this method to generate an arbitrary combination of
	anchors and unknowns for a bunch of sessions:

	```py
	.. include:: ../../code_examples/virtual_data/example.py
	```

	This should output something like:

	```
	.. include:: ../../code_examples/virtual_data/output.txt
	```
	'''

	kwargs = locals().copy()

	from numpy import random as nprandom
	if seed:
		nprandom.seed(seed)
		rng = nprandom.default_rng(seed)
	else:
		rng = nprandom.default_rng()

	N = sum([s['N'] for s in samples])
	errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors45 *= rd45 / stdev(errors45) # scale errors to rd45
	errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors46 *= rd46 / stdev(errors46) # scale errors to rd46
	errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors47 *= rD47 / stdev(errors47) # scale errors to rD47
	errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors48 *= rD48 / stdev(errors48) # scale errors to rD48

	k = 0
	out = []
	for s in samples:
		kw = {}
		kw['sample'] = s['Sample']
		kw = {
			**kw,
			**{var: kwargs[var]
				for var in [
					'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
					'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
					'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
					'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
					]
				if kwargs[var] is not None},
			**{var: s[var]
				for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
				if var in s},
			}

		sN = s['N']
		while sN:
			out.append(simulate_single_analysis(**kw))
			out[-1]['d45'] += errors45[k]
			out[-1]['d46'] += errors46[k]
			out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
			out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
			sN -= 1
			k += 1

		if session is not None:
			for r in out:
				r['Session'] = session

		if shuffle:
			nprandom.shuffle(out)

	return out

Return list with simulated analyses from a single session.

Parameters

  • samples: a list of entries; each entry is a dictionary with the following fields:
    • Sample: the name of the sample
    • d13C_VPDB, d18O_VPDB: bulk composition of the carbonate sample
    • D47, D48, D49, D17O (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
    • N: how many analyses to generate for this sample
  • a47: scrambling factor for Δ47
  • b47: compositional nonlinearity for Δ47
  • c47: working gas offset for Δ47
  • a48: scrambling factor for Δ48
  • b48: compositional nonlinearity for Δ48
  • c48: working gas offset for Δ48
  • rd45: analytical repeatability of δ45
  • rd46: analytical repeatability of δ46
  • rD47: analytical repeatability of Δ47
  • rD48: analytical repeatability of Δ48
  • d13Cwg_VPDB, d18Owg_VSMOW: bulk composition of the working gas (by default equal to the simulate_single_analysis default values)
  • session: name of the session (no name by default)
  • Nominal_D47, Nominal_D48: where to look up Δ47 and Δ48 values if D47 or D48 are not specified (by default equal to the simulate_single_analysis defaults)
  • Nominal_d13C_VPDB, Nominal_d18O_VPDB: where to look up δ13C and δ18O values if d13C_VPDB or d18O_VPDB are not specified (by default equal to the simulate_single_analysis defaults)
  • ALPHA_18O_ACID_REACTION: 18O/16O acid fractionation factor (by default equal to the simulate_single_analysis defaults)
  • R13_VPDB, R17_VSMOW, R18_VSMOW, LAMBDA_17, R18_VPDB: oxygen-17 correction parameters (by default equal to the simulate_single_analysis default)
  • seed: explicitly set to a non-zero value to achieve random but repeatable simulations
  • shuffle: randomly reorder the sequence of analyses

Here is an example of using this method to generate an arbitrary combination of anchors and unknowns for a bunch of sessions:

from D47crunch import virtual_data, D47data

args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 3,
            d13C_VPDB = -15., d18O_VPDB = -2.,
            D47 = 0.6, D48 = 0.2),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)

D = D47data(session1 + session2 + session3 + session4)

D.crunch()
D.standardize()

D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)

This should output something like:

[table_of_sessions] 
——————————  ——  ——  ———————————  ————————————  ——————  ——————  ——————  —————————————  —————————————  ——————————————
Session     Na  Nu  d13Cwg_VPDB  d18Owg_VSMOW  r_d13C  r_d18O   r_D47         a ± SE   1e3 x b ± SE          c ± SE
——————————  ——  ——  ———————————  ————————————  ——————  ——————  ——————  —————————————  —————————————  ——————————————
Session_01   9   6       -4.000        26.000  0.0205  0.0633  0.0075  1.015 ± 0.015  0.427 ± 0.232  -0.909 ± 0.006
Session_02   9   6       -4.000        26.000  0.0210  0.0882  0.0082  0.990 ± 0.015  0.484 ± 0.232  -0.905 ± 0.006
Session_03   9   6       -4.000        26.000  0.0186  0.0505  0.0091  0.997 ± 0.015  0.167 ± 0.233  -0.901 ± 0.006
Session_04   9   6       -4.000        26.000  0.0192  0.0467  0.0070  1.017 ± 0.015  0.229 ± 0.232  -0.910 ± 0.006
——————————  ——  ——  ———————————  ————————————  ——————  ——————  ——————  —————————————  —————————————  ——————————————

[table_of_samples] 
——————  ——  —————————  ——————————  ——————  ——————  ————————  ——————  ————————
Sample   N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene
——————  ——  —————————  ——————————  ——————  ——————  ————————  ——————  ————————
ETH-1   12       2.02       37.01  0.2052                    0.0083          
ETH-2   12     -10.17       19.88  0.2085                    0.0090          
ETH-3   12       1.71       37.46  0.6132                    0.0083          
BAR     12     -15.02       37.22  0.6057  0.0042  ± 0.0085  0.0088     0.753
FOO     12      -5.00       28.89  0.3024  0.0031  ± 0.0062  0.0070     0.497
——————  ——  —————————  ——————————  ——————  ——————  ————————  ——————  ————————

[table_of_analyses] 
———  ——————————  ——————  ———————————  ————————————  —————————  —————————  ——————————  ——————————  ——————————  ——————————  ——————————  —————————  —————————  —————————  ————————
UID     Session  Sample  d13Cwg_VPDB  d18Owg_VSMOW        d45        d46         d47         d48         d49   d13C_VPDB  d18O_VSMOW     D47raw     D48raw     D49raw       D47
———  ——————————  ——————  ———————————  ————————————  —————————  —————————  ——————————  ——————————  ——————————  ——————————  ——————————  —————————  —————————  —————————  ————————
1    Session_01   ETH-1       -4.000        26.000   5.995601  10.755323   16.116087   21.285428   27.780042    1.998631   36.986704  -0.696924  -0.333640   0.008600  0.201787
2    Session_01     FOO       -4.000        26.000  -0.838118   2.819853    1.310384    5.326005    4.665655   -5.004629   28.895933  -0.593755  -0.319861   0.014956  0.309692
3    Session_01   ETH-3       -4.000        26.000   5.727341  11.211663   16.713472   22.364770   28.306614    1.695479   37.453503  -0.278056  -0.180158  -0.082015  0.614365
4    Session_01     BAR       -4.000        26.000  -9.959983  10.926995    0.053806   21.724901   10.707292  -15.041279   37.199026  -0.300066  -0.243252  -0.029371  0.599675
5    Session_01   ETH-1       -4.000        26.000   6.010276  10.840276   16.207960   21.475150   27.780042    2.011176   37.073454  -0.704188  -0.315986  -0.172089  0.194589
6    Session_01   ETH-1       -4.000        26.000   6.049381  10.706856   16.135579   21.196941   27.780042    2.057827   36.937067  -0.685751  -0.324384   0.045870  0.212791
7    Session_01   ETH-2       -4.000        26.000  -5.974124  -5.955517  -12.668784  -12.208184  -18.023381  -10.163274   19.943159  -0.694902  -0.336672  -0.063946  0.215880
8    Session_01   ETH-3       -4.000        26.000   5.755174  11.255104   16.792797   22.451660   28.306614    1.723596   37.497816  -0.270825  -0.181089  -0.195908  0.621458
9    Session_01     FOO       -4.000        26.000  -0.848028   2.874679    1.346196    5.439150    4.665655   -5.017230   28.951964  -0.601502  -0.316664  -0.081898  0.302042
10   Session_01     BAR       -4.000        26.000  -9.915975  10.968470    0.153453   21.749385   10.707292  -14.995822   37.241294  -0.286638  -0.301325  -0.157376  0.612868
11   Session_01     BAR       -4.000        26.000  -9.920507  10.903408    0.065076   21.704075   10.707292  -14.998270   37.174839  -0.307018  -0.216978  -0.026076  0.592818
12   Session_01     FOO       -4.000        26.000  -0.876454   2.906764    1.341194    5.490264    4.665655   -5.048760   28.984806  -0.608593  -0.329808  -0.114437  0.295055
13   Session_01   ETH-2       -4.000        26.000  -5.982229  -6.110437  -12.827036  -12.492272  -18.023381  -10.166188   19.784916  -0.693555  -0.312598   0.251040  0.217274
14   Session_01   ETH-2       -4.000        26.000  -5.991278  -5.995054  -12.741562  -12.184075  -18.023381  -10.180122   19.902809  -0.711697  -0.232746   0.032602  0.199357
15   Session_01   ETH-3       -4.000        26.000   5.734896  11.229855   16.740410   22.402091   28.306614    1.702875   37.472070  -0.276998  -0.179635  -0.125368  0.615396
16   Session_02   ETH-3       -4.000        26.000   5.716356  11.091821   16.582487   22.123857   28.306614    1.692901   37.370126  -0.279100  -0.178789   0.162540  0.624067
17   Session_02   ETH-2       -4.000        26.000  -5.950370  -5.959974  -12.650784  -12.197864  -18.023381  -10.143809   19.897777  -0.696916  -0.317263  -0.080604  0.216441
18   Session_02     BAR       -4.000        26.000  -9.957566  10.903888    0.031785   21.739434   10.707292  -15.048386   37.213724  -0.302139  -0.183327   0.012926  0.608897
19   Session_02   ETH-1       -4.000        26.000   6.030532  10.851030   16.245571   21.457100   27.780042    2.037466   37.122284  -0.698413  -0.354920  -0.214443  0.200795
20   Session_02     FOO       -4.000        26.000  -0.819742   2.826793    1.317044    5.330616    4.665655   -4.986618   28.903335  -0.612871  -0.329113  -0.018244  0.294481
21   Session_02     BAR       -4.000        26.000  -9.936020  10.862339    0.024660   21.563307   10.707292  -15.023836   37.171034  -0.291333  -0.273498   0.070452  0.619812
22   Session_02   ETH-3       -4.000        26.000   5.719281  11.207303   16.681693   22.370886   28.306614    1.691780   37.488633  -0.296801  -0.165556  -0.065004  0.606143
23   Session_02   ETH-1       -4.000        26.000   5.993918  10.617469   15.991900   21.070358   27.780042    2.006934   36.882679  -0.683329  -0.271476   0.278458  0.216152
24   Session_02   ETH-2       -4.000        26.000  -5.982371  -6.036210  -12.762399  -12.309944  -18.023381  -10.175178   19.819614  -0.701348  -0.277354   0.104418  0.212021
25   Session_02   ETH-1       -4.000        26.000   6.019963  10.773112   16.163825   21.331060   27.780042    2.029040   37.042346  -0.692234  -0.324161  -0.051788  0.207075
26   Session_02     BAR       -4.000        26.000  -9.963888  10.865863   -0.023549   21.615868   10.707292  -15.053743   37.174715  -0.313906  -0.229031   0.093637  0.597041
27   Session_02     FOO       -4.000        26.000  -0.835046   2.870518    1.355370    5.487896    4.665655   -5.004585   28.948243  -0.601666  -0.259900  -0.087592  0.305777
28   Session_02     FOO       -4.000        26.000  -0.848415   2.849823    1.308081    5.427767    4.665655   -5.018107   28.927036  -0.614791  -0.278426  -0.032784  0.292547
29   Session_02   ETH-3       -4.000        26.000   5.757137  11.232751   16.744567   22.398244   28.306614    1.731295   37.514660  -0.298533  -0.189123  -0.154557  0.604363
30   Session_02   ETH-2       -4.000        26.000  -5.993476  -5.944866  -12.696865  -12.149754  -18.023381  -10.190430   19.913381  -0.713779  -0.298963  -0.064251  0.199436
31   Session_03   ETH-3       -4.000        26.000   5.718991  11.146227   16.640814   22.243185   28.306614    1.689442   37.449023  -0.277332  -0.169668   0.053997  0.623187
32   Session_03   ETH-2       -4.000        26.000  -5.997147  -5.905858  -12.655382  -12.081612  -18.023381  -10.165400   19.891551  -0.706536  -0.308464  -0.137414  0.197550
33   Session_03   ETH-1       -4.000        26.000   6.040566  10.786620   16.205283   21.374963   27.780042    2.045244   37.077432  -0.685706  -0.307909  -0.099869  0.213609
34   Session_03   ETH-1       -4.000        26.000   5.994622  10.743980   16.116098   21.243734   27.780042    1.997857   37.033567  -0.684883  -0.352014   0.031692  0.214449
35   Session_03   ETH-3       -4.000        26.000   5.748546  11.079879   16.580826   22.120063   28.306614    1.723364   37.380534  -0.302133  -0.158882   0.151641  0.598318
36   Session_03   ETH-2       -4.000        26.000  -6.000290  -5.947172  -12.697463  -12.164602  -18.023381  -10.167221   19.848953  -0.705037  -0.309350  -0.052386  0.199061
37   Session_03     FOO       -4.000        26.000  -0.800284   2.851299    1.376828    5.379547    4.665655   -4.951581   28.910199  -0.597293  -0.329315  -0.087015  0.304784
38   Session_03     FOO       -4.000        26.000  -0.873798   2.820799    1.272165    5.370745    4.665655   -5.028782   28.878917  -0.596008  -0.277258   0.051165  0.306090
39   Session_03   ETH-2       -4.000        26.000  -6.008525  -5.909707  -12.647727  -12.075913  -18.023381  -10.177379   19.887608  -0.683183  -0.294956  -0.117608  0.220975
40   Session_03     BAR       -4.000        26.000  -9.928709  10.989665    0.148059   21.852677   10.707292  -14.976237   37.324152  -0.299358  -0.242185  -0.184835  0.603855
41   Session_03   ETH-1       -4.000        26.000   6.004078  10.683951   16.045192   21.214355   27.780042    2.010134   36.971642  -0.705956  -0.262026   0.138399  0.193323
42   Session_03     BAR       -4.000        26.000  -9.957114  10.898997    0.044946   21.602296   10.707292  -15.003175   37.230716  -0.284699  -0.307849   0.021944  0.618578
43   Session_03     BAR       -4.000        26.000  -9.952115  11.034508    0.169809   21.885915   10.707292  -15.002819   37.370451  -0.296804  -0.298351  -0.246731  0.606414
44   Session_03     FOO       -4.000        26.000  -0.823857   2.761300    1.258060    5.239992    4.665655   -4.973383   28.817444  -0.603327  -0.288652   0.114488  0.298751
45   Session_03   ETH-3       -4.000        26.000   5.753467  11.206589   16.719131   22.373244   28.306614    1.723960   37.511190  -0.294350  -0.161838  -0.099835  0.606103
46   Session_04     FOO       -4.000        26.000  -0.791191   2.708220    1.256167    5.145784    4.665655   -4.960004   28.750896  -0.586913  -0.276505   0.183674  0.317065
47   Session_04   ETH-1       -4.000        26.000   6.017312  10.735930   16.123043   21.270597   27.780042    2.005824   36.995214  -0.693479  -0.309795   0.023309  0.208980
48   Session_04   ETH-2       -4.000        26.000  -5.986501  -5.915157  -12.656583  -12.060382  -18.023381  -10.182247   19.889836  -0.709603  -0.268277  -0.130450  0.199604
49   Session_04     BAR       -4.000        26.000  -9.951025  10.951923    0.089386   21.738926   10.707292  -15.031949   37.254709  -0.298065  -0.278834  -0.087463  0.601230
50   Session_04   ETH-2       -4.000        26.000  -5.966627  -5.893789  -12.597717  -12.120719  -18.023381  -10.161842   19.911776  -0.691757  -0.372308  -0.193986  0.217132
51   Session_04   ETH-1       -4.000        26.000   6.029937  10.766997   16.151273   21.345479   27.780042    2.018148   37.027152  -0.708855  -0.297953  -0.050465  0.193862
52   Session_04     FOO       -4.000        26.000  -0.853969   2.805035    1.267571    5.353907    4.665655   -5.030523   28.850660  -0.605611  -0.262571   0.060903  0.298685
53   Session_04   ETH-3       -4.000        26.000   5.798016  11.254135   16.832228   22.432473   28.306614    1.752928   37.528936  -0.275047  -0.197935  -0.239408  0.620088
54   Session_04   ETH-1       -4.000        26.000   6.023822  10.730714   16.121184   21.235757   27.780042    2.012958   36.989833  -0.696908  -0.333582   0.026555  0.205610
55   Session_04   ETH-2       -4.000        26.000  -5.973623  -5.975018  -12.694278  -12.194472  -18.023381  -10.166297   19.828211  -0.701951  -0.283570  -0.025935  0.207135
56   Session_04   ETH-3       -4.000        26.000   5.739420  11.128582   16.641344   22.166106   28.306614    1.695046   37.399884  -0.280608  -0.210162   0.066645  0.614665
57   Session_04     BAR       -4.000        26.000  -9.931741  10.819830   -0.023748   21.529372   10.707292  -15.006533   37.118743  -0.302866  -0.222623   0.148462  0.596536
58   Session_04     FOO       -4.000        26.000  -0.848192   2.777763    1.251297    5.280272    4.665655   -5.023358   28.822585  -0.601094  -0.281419   0.108186  0.303128
59   Session_04   ETH-3       -4.000        26.000   5.751908  11.207110   16.726741   22.380392   28.306614    1.705481   37.480657  -0.285776  -0.155878  -0.099197  0.609567
60   Session_04     BAR       -4.000        26.000  -9.926078  10.884823    0.060864   21.650722   10.707292  -15.002880   37.185606  -0.287358  -0.232425   0.016044  0.611760
———  ——————————  ——————  ———————————  ————————————  —————————  —————————  ——————————  ——————————  ——————————  ——————————  ——————————  —————————  —————————  —————————  ————————


def table_of_samples(data47=None, data48=None, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
def table_of_samples(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of samples
	for a pair of `D47data` and `D48data` objects.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of list of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_samples.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)

Print out, save to disk and/or return a combined table of samples for a pair of D47data and D48data objects.

Parameters

  • data47: D47data instance
  • data48: D48data instance
  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
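
For example, here is a minimal sketch combining Δ47 and Δ48 results computed from the same set of simulated analyses (the sample names, session name and seed below are arbitrary):

from D47crunch import D47data, D48data, virtual_data, table_of_samples

data = virtual_data(
	session = 'Session_01',
	samples = [
		dict(Sample = 'ETH-1', N = 3),
		dict(Sample = 'ETH-2', N = 3),
		dict(Sample = 'ETH-3', N = 3),
		],
	seed = 123)

D47 = D47data(data)
D48 = D48data(data)
for D in (D47, D48):
	D.crunch()
	D.standardize()

# print out the combined table without saving it to disk:
table_of_samples(D47, D48, save_to_file = False)

The table_of_sessions() and table_of_analyses() functions documented below follow the same calling convention.
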
def table_of_sessions(data47=None, data48=None, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
def table_of_sessions(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of sessions
	for a pair of `D47data` and `D48data` objects.
	***Only applicable if the sessions in `data47` and those in `data48`
	consist of the exact same sets of analyses.***

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of list of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_sessions(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_sessions(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
			for k,x in enumerate(out47[0]):
				if k>7:
					out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
					out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_sessions.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)

Print out, save to disk and/or return a combined table of sessions for a pair of D47data and D48data objects. Only applicable if the sessions in data47 and those in data48 consist of the exact same sets of analyses.

Parameters

  • data47: D47data instance
  • data48: D48data instance
  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])

def table_of_analyses(data47=None, data48=None, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
def table_of_analyses(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of analyses
	for a pair of `D47data` and `D48data` objects.

	If the sessions in `data47` and those in `data48` do not consist of
	the exact same sets of analyses, the table will have two columns
	`Session_47` and `Session_48` instead of a single `Session` column.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of list of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_analyses(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_analyses(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')

			if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
				out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
			else:
				out47[0][1] = 'Session_47'
				out48[0][1] = 'Session_48'
				out47 = transpose_table(out47)
				out48 = transpose_table(out48)
				out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_analyses.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n'+pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)

Print out, save to disk and/or return a combined table of analyses for a pair of D47data and D48data objects.

If the sessions in data47 and those in data48 do not consist of the exact same sets of analyses, the table will have two columns Session_47 and Session_48 instead of a single Session column.

Parameters

  • data47: D47data instance
  • data48: D48data instance
  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])

class D4xdata(builtins.list):
 835class D4xdata(list):
 836	'''
 837	Store and process data for a large set of Δ47 and/or Δ48
 838	analyses, usually comprising more than one analytical session.
 839	'''
 840
 841	### 17O CORRECTION PARAMETERS
 842	R13_VPDB = 0.01118  # (Chang & Li, 1990)
 843	'''
 844	Absolute (13C/12C) ratio of VPDB.
 845	By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
 846	'''
 847
 848	R18_VSMOW = 0.0020052  # (Baertschi, 1976)
 849	'''
 850	Absolute (18O/16C) ratio of VSMOW.
 851	By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
 852	'''
 853
 854	LAMBDA_17 = 0.528  # (Barkan & Luz, 2005)
 855	'''
 856	Mass-dependent exponent for triple oxygen isotopes.
 857	By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
 858	'''
 859
 860	R17_VSMOW = 0.00038475  # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
 861	'''
 862	Absolute (17O/16C) ratio of VSMOW.
 863	By default equal to 0.00038475
 864	([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
 865	rescaled to `R13_VPDB`)
 866	'''
 867
 868	R18_VPDB = R18_VSMOW * 1.03092
 869	'''
 870	Absolute (18O/16O) ratio of VPDB.
 871	By definition equal to `R18_VSMOW * 1.03092`.
 872	'''
 873
 874	R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
 875	'''
 876	Absolute (17O/16O) ratio of VPDB.
 877	By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
 878	'''
 879
 880	LEVENE_REF_SAMPLE = 'ETH-3'
 881	'''
 882	After the Δ4x standardization step, each sample is tested to
 883	assess whether the Δ4x variance within all analyses for that
 884	sample differs significantly from that observed for a given reference
 885	sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
 886	which yields a p-value corresponding to the null hypothesis that the
 887	underlying variances are equal).
 888
 889	`LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
 890	sample should be used as a reference for this test.
 891	'''
 892
 893	ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6)  # (Kim et al., 2007, calcite)
 894	'''
 895	Specifies the 18O/16O fractionation factor generally applicable
 896	to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
 897	`D4xdata.standardize_d13C()`, and `D4xdata.standardize_d18O()`.
 898
 899	By default equal to 1.008129 (calcite reacted at 90 °C,
 900	[Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
 901	'''
 902
 903	Nominal_d13C_VPDB = {
 904		'ETH-1': 2.02,
 905		'ETH-2': -10.17,
 906		'ETH-3': 1.71,
 907		}	# (Bernasconi et al., 2018)
 908	'''
 909	Nominal δ13C_VPDB values assigned to carbonate standards, used by
 910	`D4xdata.standardize_d13C()`.
 911
 912	By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
 913	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 914	'''
 915
 916	Nominal_d18O_VPDB = {
 917		'ETH-1': -2.19,
 918		'ETH-2': -18.69,
 919		'ETH-3': -1.78,
 920		}	# (Bernasconi et al., 2018)
 921	'''
 922	Nominal δ18O_VPDB values assigned to carbonate standards, used by
 923	`D4xdata.standardize_d18O()`.
 924
 925	By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
 926	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
 927	'''
 928
 929	d13C_STANDARDIZATION_METHOD = '2pt'
 930	'''
 931	Method by which to standardize δ13C values:
 932	
 933	+ `'none'`: do not apply any δ13C standardization.
 934	+ `'1pt'`: within each session, offset all initial δ13C values so as to
 935	minimize the difference between final δ13C_VPDB values and
 936	`Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
 937	+ `'2pt'`: within each session, apply an affine transformation to all δ13C
 938	values so as to minimize the difference between final δ13C_VPDB
 939	values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
 940	is defined).
 941	'''
 942
 943	d18O_STANDARDIZATION_METHOD = '2pt'
 944	'''
 945	Method by which to standardize δ18O values:
 946	
 947	+ `'none'`: do not apply any δ18O standardization.
 948	+ `'1pt'`: within each session, offset all initial δ18O values so as to
 949	minimize the difference between final δ18O_VPDB values and
 950	`Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
 951	+ `'2pt'`: within each session, apply an affine transformation to all δ18O
 952	values so as to minimize the difference between final δ18O_VPDB
 953	values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
 954	is defined).
 955	'''
 956
 957	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
 958		'''
 959		**Parameters**
 960
 961		+ `l`: a list of dictionaries, with each dictionary including at least the keys
 962		`Sample`, `d45`, `d46`, and `d47` or `d48`.
 963		+ `mass`: `'47'` or `'48'`
 964		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
 965		+ `session`: define session name for analyses without a `Session` key
 966		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
 967
 968		Returns a `D4xdata` object derived from `list`.
 969		'''
 970		self._4x = mass
 971		self.verbose = verbose
 972		self.prefix = 'D4xdata'
 973		self.logfile = logfile
 974		list.__init__(self, l)
 975		self.Nf = None
 976		self.repeatability = {}
 977		self.refresh(session = session)
 978
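	# Usage sketch (hypothetical values): a D4xdata object may also be built
	# directly from a list of dictionaries instead of reading a csv file:
	#
	#     mydata = D4xdata([
	#         {'Sample': 'ETH-1', 'd45': 5.795, 'd46': 11.628, 'd47': 16.894},
	#         ], mass = '47', session = 'Session01')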
 979
 980	def make_verbal(oldfun):
 981		'''
 982		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
 983		'''
 984		@wraps(oldfun)
 985		def newfun(*args, verbose = '', **kwargs):
 986			myself = args[0]
 987			oldprefix = myself.prefix
 988			myself.prefix = oldfun.__name__
 989			if verbose != '':
 990				oldverbose = myself.verbose
 991				myself.verbose = verbose
 992			out = oldfun(*args, **kwargs)
 993			myself.prefix = oldprefix
 994			if verbose != '':
 995				myself.verbose = oldverbose
 996			return out
 997		return newfun
 998
 999
1000	def msg(self, txt):
1001		'''
1002		Log a message to `self.logfile`, and print it out if `verbose = True`
1003		'''
1004		self.log(txt)
1005		if self.verbose:
1006			print(f'{f"[{self.prefix}]":<16} {txt}')
1007
1008
1009	def vmsg(self, txt):
1010		'''
1011		Log a message to `self.logfile` and print it out
1012		'''
1013		self.log(txt)
1014		print(txt)
1015
1016
1017	def log(self, *txts):
1018		'''
1019		Log a message to `self.logfile`
1020		'''
1021		if self.logfile:
1022			with open(self.logfile, 'a') as fid:
1023				for txt in txts:
1024					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
1025
1026
1027	def refresh(self, session = 'mySession'):
1028		'''
1029		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
1030		'''
1031		self.fill_in_missing_info(session = session)
1032		self.refresh_sessions()
1033		self.refresh_samples()
1034
1035
1036	def refresh_sessions(self):
1037		'''
1038		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1039		to `False` for all sessions.
1040		'''
1041		self.sessions = {
1042			s: {'data': [r for r in self if r['Session'] == s]}
1043			for s in sorted({r['Session'] for r in self})
1044			}
1045		for s in self.sessions:
1046			self.sessions[s]['scrambling_drift'] = False
1047			self.sessions[s]['slope_drift'] = False
1048			self.sessions[s]['wg_drift'] = False
1049			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1050			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
1051
1052
1053	def refresh_samples(self):
1054		'''
1055		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1056		'''
1057		self.samples = {
1058			s: {'data': [r for r in self if r['Sample'] == s]}
1059			for s in sorted({r['Sample'] for r in self})
1060			}
1061		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1062		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
1063
1064
1065	def read(self, filename, sep = '', session = ''):
1066		'''
1067		Read file in csv format to load data into a `D47data` object.
1068
1069		In the csv file, spaces before and after field separators (`','` by default)
1070		are optional. Each line corresponds to a single analysis.
1071
1072		The required fields are:
1073
1074		+ `UID`: a unique identifier
1075		+ `Session`: an identifier for the analytical session
1076		+ `Sample`: a sample identifier
1077		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1078
1079		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1080	VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Any remaining working-gas
1081	deltas among `d47`, `d48` and `d49` are optional, and set to NaN by default.
1082
1083		**Parameters**
1084
1085	+ `filename`: the path of the file to read
1086		+ `sep`: csv separator delimiting the fields
1087		+ `session`: set `Session` field to this string for all analyses
1088		'''
1089		with open(filename) as fid:
1090			self.input(fid.read(), sep = sep, session = session)
1091
1092
1093	def input(self, txt, sep = '', session = ''):
1094		'''
1095		Read `txt` string in csv format to load analysis data into a `D47data` object.
1096
1097		In the csv string, spaces before and after field separators (`','` by default)
1098		are optional. Each line corresponds to a single analysis.
1099
1100		The required fields are:
1101
1102		+ `UID`: a unique identifier
1103		+ `Session`: an identifier for the analytical session
1104		+ `Sample`: a sample identifier
1105		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1106
1107		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1108	VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Any remaining working-gas
1109	deltas among `d47`, `d48` and `d49` are optional, and set to NaN by default.
1110
1111		**Parameters**
1112
1113		+ `txt`: the csv string to read
1114		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1115	whichever appears most often in `txt`.
1116		+ `session`: set `Session` field to this string for all analyses
1117		'''
1118		if sep == '':
1119			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
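		# e.g., for txt = 'UID;Sample;d45\nA01;ETH-1;5.795', the separator
		# ';' occurs four times while ',' and '\t' are absent, so sep = ';'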
1120		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1121		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1122
1123		if session != '':
1124			for r in data:
1125				r['Session'] = session
1126
1127		self += data
1128		self.refresh()
1129
1130
1131	@make_verbal
1132	def wg(self, samples = None, a18_acid = None):
1133		'''
1134		Compute bulk composition of the working gas for each session based on
1135		the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1136		`self.Nominal_d18O_VPDB`.
1137		'''
1138
1139		self.msg('Computing WG composition:')
1140
1141		if a18_acid is None:
1142			a18_acid = self.ALPHA_18O_ACID_REACTION
1143		if samples is None:
1144			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1145
1146		assert a18_acid, 'Acid fractionation factor should not be zero.'
1147
1148		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1149		R45R46_standards = {}
1150		for sample in samples:
1151			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1152			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1153			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1154			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1155			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1156
1157			C12_s = 1 / (1 + R13_s)
1158			C13_s = R13_s / (1 + R13_s)
1159			C16_s = 1 / (1 + R17_s + R18_s)
1160			C17_s = R17_s / (1 + R17_s + R18_s)
1161			C18_s = R18_s / (1 + R17_s + R18_s)
1162
1163			C626_s = C12_s * C16_s ** 2
1164			C627_s = 2 * C12_s * C16_s * C17_s
1165			C628_s = 2 * C12_s * C16_s * C18_s
1166			C636_s = C13_s * C16_s ** 2
1167			C637_s = 2 * C13_s * C16_s * C17_s
1168			C727_s = C12_s * C17_s ** 2
1169
1170			R45_s = (C627_s + C636_s) / C626_s
1171			R46_s = (C628_s + C637_s + C727_s) / C626_s
1172			R45R46_standards[sample] = (R45_s, R46_s)
1173		
1174		for s in self.sessions:
1175			db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
1176			assert db, f'No sample from {samples} found in session "{s}".'
1177# 			dbsamples = sorted({r['Sample'] for r in db})
1178
1179			X = [r['d45'] for r in db]
1180			Y = [R45R46_standards[r['Sample']][0] for r in db]
1181			x1, x2 = np.min(X), np.max(X)
1182
1183			if x1 < x2:
1184				wgcoord = x1/(x1-x2)
1185			else:
1186				wgcoord = 999
1187
1188			if wgcoord < -.5 or wgcoord > 1.5:
1189				# unreasonable to extrapolate to d45 = 0
1190				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1191			else :
1192				# d45 = 0 is reasonably well bracketed
1193				R45_wg = np.polyfit(X, Y, 1)[1]
1194
1195			X = [r['d46'] for r in db]
1196			Y = [R45R46_standards[r['Sample']][1] for r in db]
1197			x1, x2 = np.min(X), np.max(X)
1198
1199			if x1 < x2:
1200				wgcoord = x1/(x1-x2)
1201			else:
1202				wgcoord = 999
1203
1204			if wgcoord < -.5 or wgcoord > 1.5:
1205				# unreasonable to extrapolate to d46 = 0
1206				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1207			else :
1208				# d46 = 0 is reasonably well bracketed
1209				R46_wg = np.polyfit(X, Y, 1)[1]
1210
1211			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1212
1213			self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1214
1215			self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1216			self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1217			for r in self.sessions[s]['data']:
1218				r['d13Cwg_VPDB'] = d13Cwg_VPDB
1219				r['d18Owg_VSMOW'] = d18Owg_VSMOW
1220
1221
1222	def compute_bulk_delta(self, R45, R46, D17O = 0):
1223		'''
1224		Compute δ13C_VPDB and δ18O_VSMOW,
1225		by solving the generalized form of equation (17) from
1226		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1227		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
1228		solving the corresponding second-order Taylor polynomial.
1229		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
1230		'''
1231
1232		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1233
1234		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1235		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1236		C = 2 * self.R18_VSMOW
1237		D = -R46
1238
1239		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1240		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1241		cc = A + B + C + D
1242
1243		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1244
1245		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1246		R17 = K * R18 ** self.LAMBDA_17
1247		R13 = R45 - 2 * R17
1248
1249		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1250
1251		return d13C_VPDB, d18O_VSMOW
1252
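	# Consistency sketch (assuming the default isotopic parameters above):
	# isobar ratios built from a known bulk composition by
	# compute_isobar_ratios() should be inverted back by compute_bulk_delta():
	#
	#     d = D47data()
	#     R13 = d.R13_VPDB * (1 + 2. / 1000)     # i.e. d13C_VPDB = +2 permil
	#     R18 = d.R18_VSMOW * (1 + 30. / 1000)   # i.e. d18O_VSMOW = +30 permil
	#     R45, R46 = d.compute_isobar_ratios(R13, R18)[:2]
	#     d.compute_bulk_delta(R45, R46)         # returns approximately (2.0, 30.0)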
1253
1254	@make_verbal
1255	def crunch(self, verbose = ''):
1256		'''
1257		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1258		'''
1259		for r in self:
1260			self.compute_bulk_and_clumping_deltas(r)
1261		self.standardize_d13C()
1262		self.standardize_d18O()
1263		self.msg(f"Crunched {len(self)} analyses.")
1264
1265
1266	def fill_in_missing_info(self, session = 'mySession'):
1267		'''
1268		Fill in optional fields with default values
1269		'''
1270		for i,r in enumerate(self):
1271			if 'D17O' not in r:
1272				r['D17O'] = 0.
1273			if 'UID' not in r:
1274				r['UID'] = f'{i+1}'
1275			if 'Session' not in r:
1276				r['Session'] = session
1277			for k in ['d47', 'd48', 'd49']:
1278				if k not in r:
1279					r[k] = np.nan
1280
1281
1282	def standardize_d13C(self):
1283		'''
1284		Perform δ13C standardization within each session `s` according to
1285		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1286		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1287		may be redefined arbitrarily at a later stage.
1288		'''
1289		for s in self.sessions:
1290			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1291				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1292				X,Y = zip(*XY)
1293				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1294					offset = np.mean(Y) - np.mean(X)
1295					for r in self.sessions[s]['data']:
1296						r['d13C_VPDB'] += offset				
1297				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1298					a,b = np.polyfit(X,Y,1)
1299					for r in self.sessions[s]['data']:
1300						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b
1301
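	# Numeric sketch of the '2pt' case (hypothetical measured values):
	#
	#     X = [2.05, -10.21, 1.74]    # measured d13C_VPDB of ETH-1/2/3
	#     Y = [2.02, -10.17, 1.71]    # Nominal_d13C_VPDB of ETH-1/2/3
	#     a, b = np.polyfit(X, Y, 1)  # affine correction: x -> a * x + b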
1302	def standardize_d18O(self):
1303		'''
1304		Perform δ18O standardization within each session `s` according to
1305		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1306		which is defined by default by `D47data.refresh_sessions()` as equal to
1307		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1308		'''
1309		for s in self.sessions:
1310			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1311				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1312				X,Y = zip(*XY)
1313				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1314				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1315					offset = np.mean(Y) - np.mean(X)
1316					for r in self.sessions[s]['data']:
1317						r['d18O_VSMOW'] += offset				
1318				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1319					a,b = np.polyfit(X,Y,1)
1320					for r in self.sessions[s]['data']:
1321						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
1322	
1323
1324	def compute_bulk_and_clumping_deltas(self, r):
1325		'''
1326		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1327		'''
1328
1329		# Compute working gas R13, R18, and isobar ratios
1330		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1331		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1332		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1333
1334		# Compute analyte isobar ratios
1335		R45 = (1 + r['d45'] / 1000) * R45_wg
1336		R46 = (1 + r['d46'] / 1000) * R46_wg
1337		R47 = (1 + r['d47'] / 1000) * R47_wg
1338		R48 = (1 + r['d48'] / 1000) * R48_wg
1339		R49 = (1 + r['d49'] / 1000) * R49_wg
1340
1341		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1342		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1343		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1344
1345		# Compute stochastic isobar ratios of the analyte
1346		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1347			R13, R18, D17O = r['D17O']
1348		)
1349
1350		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1351		# and raise a warning if the corresponding anomalies exceed 0.05 ppm.
1352		if (R45 / R45stoch - 1) > 5e-8:
1353			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1354		if (R46 / R46stoch - 1) > 5e-8:
1355			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1356
1357		# Compute raw clumped isotope anomalies
1358		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1359		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1360		r['D49raw'] = 1000 * (R49 / R49stoch - 1)
1361
1362
1363	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1364		'''
1365		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1366		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1367		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1368		'''
1369
1370		# Compute R17
1371		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1372
1373		# Compute isotope concentrations
1374		C12 = (1 + R13) ** -1
1375		C13 = C12 * R13
1376		C16 = (1 + R17 + R18) ** -1
1377		C17 = C16 * R17
1378		C18 = C16 * R18
1379
1380		# Compute stochastic isotopologue concentrations
1381		C626 = C16 * C12 * C16
1382		C627 = C16 * C12 * C17 * 2
1383		C628 = C16 * C12 * C18 * 2
1384		C636 = C16 * C13 * C16
1385		C637 = C16 * C13 * C17 * 2
1386		C638 = C16 * C13 * C18 * 2
1387		C727 = C17 * C12 * C17
1388		C728 = C17 * C12 * C18 * 2
1389		C737 = C17 * C13 * C17
1390		C738 = C17 * C13 * C18 * 2
1391		C828 = C18 * C12 * C18
1392		C838 = C18 * C13 * C18
1393
1394		# Compute stochastic isobar ratios
1395		R45 = (C636 + C627) / C626
1396		R46 = (C628 + C637 + C727) / C626
1397		R47 = (C638 + C728 + C737) / C626
1398		R48 = (C738 + C828) / C626
1399		R49 = C838 / C626
1400
1401		# Account for stochastic anomalies
1402		R47 *= 1 + D47 / 1000
1403		R48 *= 1 + D48 / 1000
1404		R49 *= 1 + D49 / 1000
1405
1406		# Return isobar ratios
1407		return R45, R46, R47, R48, R49
1408
1409
1410	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1411		'''
1412		Split unknown samples by UID (treat all analyses as different samples)
1413		or by session (treat analyses of a given sample in different sessions as
1414		different samples).
1415
1416		**Parameters**
1417
1418		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
1419		+ `grouping`: `by_uid` | `by_session`
1420		'''
1421		if samples_to_split == 'all':
1422			samples_to_split = [s for s in self.unknowns]
1423		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1424		self.grouping = grouping.lower()
1425		if self.grouping in gkeys:
1426			gkey = gkeys[self.grouping]
1427		for r in self:
1428			if r['Sample'] in samples_to_split:
1429				r['Sample_original'] = r['Sample']
1430				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1431			elif r['Sample'] in self.unknowns:
1432				r['Sample_original'] = r['Sample']
1433		self.refresh_samples()
1434
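	# Usage sketch (hypothetical sample name): treat each session's analyses
	# of IAEA-C2 as a separate sample, standardize, then merge them back:
	#
	#     mydata.split_samples(['IAEA-C2'], grouping = 'by_session')
	#     mydata.standardize(method = 'pooled')
	#     mydata.unsplit_samples()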
1435
1436	def unsplit_samples(self, tables = False):
1437		'''
1438		Reverse the effects of `D47data.split_samples()`.
1439		
1440		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1441		
1442		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1443		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1444		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1445	effects of `D47data.split_samples()` with `grouping='by_session'` (because in
1446		that case session-averaged Δ4x values are statistically independent).
1447		'''
1448		unknowns_old = sorted({s for s in self.unknowns})
1449		CM_old = self.standardization.covar[:,:]
1450		VD_old = self.standardization.params.valuesdict().copy()
1451		vars_old = self.standardization.var_names
1452
1453		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1454
1455		Ns = len(vars_old) - len(unknowns_old)
1456		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1457		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1458
1459		W = np.zeros((len(vars_new), len(vars_old)))
1460		W[:Ns,:Ns] = np.eye(Ns)
1461		for u in unknowns_new:
1462			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1463			if self.grouping == 'by_session':
1464				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1465			elif self.grouping == 'by_uid':
1466				weights = [1 for s in splits]
1467			sw = sum(weights)
1468			weights = [w/sw for w in weights]
1469			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1470
1471		CM_new = W @ CM_old @ W.T
1472		V = W @ np.array([[VD_old[k]] for k in vars_old])
1473		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1474
1475		self.standardization.covar = CM_new
1476		self.standardization.params.valuesdict = lambda : VD_new
1477		self.standardization.var_names = vars_new
1478
1479		for r in self:
1480			if r['Sample'] in self.unknowns:
1481				r['Sample_split'] = r['Sample']
1482				r['Sample'] = r['Sample_original']
1483
1484		self.refresh_samples()
1485		self.consolidate_samples()
1486		self.repeatabilities()
1487
1488		if tables:
1489			self.table_of_analyses()
1490			self.table_of_samples()
1491
1492	def assign_timestamps(self):
1493		'''
1494		Assign a time field `t` of type `float` to each analysis.
1495
1496		If `TimeTag` is one of the data fields, `t` is equal within a given session
1497		to `TimeTag` minus the mean value of `TimeTag` for that session.
1498		Otherwise, `TimeTag` is by default equal to the index of each analysis
1499		in the dataset and `t` is defined as above.
1500		'''
1501		for session in self.sessions:
1502			sdata = self.sessions[session]['data']
1503			try:
1504				t0 = np.mean([r['TimeTag'] for r in sdata])
1505				for r in sdata:
1506					r['t'] = r['TimeTag'] - t0
1507			except KeyError:
1508				t0 = (len(sdata)-1)/2
1509				for t,r in enumerate(sdata):
1510					r['t'] = t - t0
1511
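	# e.g., in a session of five analyses lacking TimeTag fields,
	# t takes the values -2, -1, 0, +1, +2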
1512
1513	def report(self):
1514		'''
1515		Prints a report on the standardization fit.
1516		Only applicable after `D4xdata.standardize(method='pooled')`.
1517		'''
1518		report_fit(self.standardization)
1519
1520
1521	def combine_samples(self, sample_groups):
1522		'''
1523		Combine analyses of different samples to compute weighted average Δ4x
1524		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1525		dictionary.
1526		
1527		Caution: samples are weighted by number of replicate analyses, which is a
1528		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1529		correlated analytical errors for one or more samples).
1530		
1531		Returns a tuple of:
1532		
1533		+ the list of group names
1534		+ an array of the corresponding Δ4x values
1535		+ the corresponding (co)variance matrix
1536		
1537		**Parameters**
1538
1539		+ `sample_groups`: a dictionary of the form:
1540		```py
1541		{'group1': ['sample_1', 'sample_2'],
1542		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1543		```
1544		'''
1545		
1546		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1547		groups = sorted(sample_groups.keys())
1548		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1549		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1550		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1551		W = np.array([
1552			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1553			for j in groups])
1554		D4x_new = W @ D4x_old
1555		CM_new = W @ CM_old @ W.T
1556
1557		return groups, D4x_new[:,0], CM_new
1558		
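	# Usage sketch (hypothetical sample and group names):
	#
	#     groups, D4x_means, CM = mydata.combine_samples(
	#         {'group1': ['SAMPLE-A', 'SAMPLE-B'],
	#          'group2': ['SAMPLE-C']})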
1559
1560	@make_verbal
1561	def standardize(self,
1562		method = 'pooled',
1563		weighted_sessions = [],
1564		consolidate = True,
1565		consolidate_tables = False,
1566		consolidate_plots = False,
1567		constraints = {},
1568		):
1569		'''
1570		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1571		If the `method` argument is set to `'pooled'`, the standardization processes all sessions
1572		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1573		i.e. that their true Δ4x value does not change between sessions
1574		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If the `method` argument is set to
1575		`'indep_sessions'`, the standardization processes each session independently, based only
1576		on anchor analyses.
1577		'''
1578
1579		self.standardization_method = method
1580		self.assign_timestamps()
1581
1582		if method == 'pooled':
1583			if weighted_sessions:
1584				for session_group in weighted_sessions:
1585					if self._4x == '47':
1586						X = D47data([r for r in self if r['Session'] in session_group])
1587					elif self._4x == '48':
1588						X = D48data([r for r in self if r['Session'] in session_group])
1589					X.Nominal_D4x = self.Nominal_D4x.copy()
1590					X.refresh()
1591					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1592					w = np.sqrt(result.redchi)
1593					self.msg(f'Session group {session_group} MRSWD = {w:.4f}')
1594					for r in X:
1595						r[f'wD{self._4x}raw'] *= w
1596			else:
1597				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1598				for r in self:
1599					r[f'wD{self._4x}raw'] = 1.
1600
1601			params = Parameters()
1602			for k,session in enumerate(self.sessions):
1603				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1604				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1605				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1606				s = pf(session)
1607				params.add(f'a_{s}', value = 0.9)
1608				params.add(f'b_{s}', value = 0.)
1609				params.add(f'c_{s}', value = -0.9)
1610				params.add(f'a2_{s}', value = 0.,
1611# 					vary = self.sessions[session]['scrambling_drift'],
1612					)
1613				params.add(f'b2_{s}', value = 0.,
1614# 					vary = self.sessions[session]['slope_drift'],
1615					)
1616				params.add(f'c2_{s}', value = 0.,
1617# 					vary = self.sessions[session]['wg_drift'],
1618					)
1619				if not self.sessions[session]['scrambling_drift']:
1620					params[f'a2_{s}'].expr = '0'
1621				if not self.sessions[session]['slope_drift']:
1622					params[f'b2_{s}'].expr = '0'
1623				if not self.sessions[session]['wg_drift']:
1624					params[f'c2_{s}'].expr = '0'
1625
1626			for sample in self.unknowns:
1627				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1628
1629			for k in constraints:
1630				params[k].expr = constraints[k]
1631
1632			def residuals(p):
1633				R = []
1634				for r in self:
1635					session = pf(r['Session'])
1636					sample = pf(r['Sample'])
1637					if r['Sample'] in self.Nominal_D4x:
1638						R += [ (
1639							r[f'D{self._4x}raw'] - (
1640								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1641								+ p[f'b_{session}'] * r[f'd{self._4x}']
1642								+	p[f'c_{session}']
1643								+ r['t'] * (
1644									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1645									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1646									+	p[f'c2_{session}']
1647									)
1648								)
1649							) / r[f'wD{self._4x}raw'] ]
1650					else:
1651						R += [ (
1652							r[f'D{self._4x}raw'] - (
1653								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1654								+ p[f'b_{session}'] * r[f'd{self._4x}']
1655								+	p[f'c_{session}']
1656								+ r['t'] * (
1657									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1658									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1659									+	p[f'c2_{session}']
1660									)
1661								)
1662							) / r[f'wD{self._4x}raw'] ]
1663				return R
1664
1665			M = Minimizer(residuals, params)
1666			result = M.least_squares()
1667			self.Nf = result.nfree
1668			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1669			new_names, new_covar, new_se = _fullcovar(result)[:3]
1670			result.var_names = new_names
1671			result.covar = new_covar
1672
1673			for r in self:
1674				s = pf(r["Session"])
1675				a = result.params.valuesdict()[f'a_{s}']
1676				b = result.params.valuesdict()[f'b_{s}']
1677				c = result.params.valuesdict()[f'c_{s}']
1678				a2 = result.params.valuesdict()[f'a2_{s}']
1679				b2 = result.params.valuesdict()[f'b2_{s}']
1680				c2 = result.params.valuesdict()[f'c2_{s}']
1681				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1682				
1683
1684			self.standardization = result
1685
1686			for session in self.sessions:
1687				self.sessions[session]['Np'] = 3
1688				for k in ['scrambling', 'slope', 'wg']:
1689					if self.sessions[session][f'{k}_drift']:
1690						self.sessions[session]['Np'] += 1
1691
1692			if consolidate:
1693				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1694			return result
1695
1696
1697		elif method == 'indep_sessions':
1698
1699			if weighted_sessions:
1700				for session_group in weighted_sessions:
1701					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1702					X.Nominal_D4x = self.Nominal_D4x.copy()
1703					X.refresh()
1704					# This is only done to assign r['wD47raw'] for r in X:
1705					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1706					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1707			else:
1708				self.msg('All weights set to 1 ‰')
1709				for r in self:
1710					r[f'wD{self._4x}raw'] = 1
1711
1712			for session in self.sessions:
1713				s = self.sessions[session]
1714				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1715				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1716				s['Np'] = sum(p_active)
1717				sdata = s['data']
1718
1719				A = np.array([
1720					[
1721						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1722						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1723						1 / r[f'wD{self._4x}raw'],
1724						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1725						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1726						r['t'] / r[f'wD{self._4x}raw']
1727						]
1728					for r in sdata if r['Sample'] in self.anchors
1729					])[:,p_active] # only keep columns for the active parameters
1730				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1731				s['Na'] = Y.size
1732				CM = linalg.inv(A.T @ A)
1733				bf = (CM @ A.T @ Y).T[0,:]
1734				k = 0
1735				for n,a in zip(p_names, p_active):
1736					if a:
1737						s[n] = bf[k]
1738# 						self.msg(f'{n} = {bf[k]}')
1739						k += 1
1740					else:
1741						s[n] = 0.
1742# 						self.msg(f'{n} = 0.0')
1743
1744				for r in sdata :
1745					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1746					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1747					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1748
1749				s['CM'] = np.zeros((6,6))
1750				i = 0
1751				k_active = [j for j,a in enumerate(p_active) if a]
1752				for j,a in enumerate(p_active):
1753					if a:
1754						s['CM'][j,k_active] = CM[i,:]
1755						i += 1
1756
1757			if not weighted_sessions:
1758				w = self.rmswd()['rmswd']
1759				for r in self:
1760					r[f'wD{self._4x}'] *= w
1761					r[f'wD{self._4x}raw'] *= w
1762				for session in self.sessions:
1763					self.sessions[session]['CM'] *= w**2
1764
1765			for session in self.sessions:
1766				s = self.sessions[session]
1767				s['SE_a'] = s['CM'][0,0]**.5
1768				s['SE_b'] = s['CM'][1,1]**.5
1769				s['SE_c'] = s['CM'][2,2]**.5
1770				s['SE_a2'] = s['CM'][3,3]**.5
1771				s['SE_b2'] = s['CM'][4,4]**.5
1772				s['SE_c2'] = s['CM'][5,5]**.5
1773
1774			if not weighted_sessions:
1775				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1776			else:
1777				self.Nf = 0
1778				for sg in weighted_sessions:
1779					self.Nf += self.rmswd(sessions = sg)['Nf']
1780
1781			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1782
1783			avgD4x = {
1784				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1785				for sample in self.samples
1786				}
1787			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1788			rD4x = (chi2/self.Nf)**.5
1789			self.repeatability[f'sigma_{self._4x}'] = rD4x
1790
1791			if consolidate:
1792				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1793
1794
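	# Usage sketch (hypothetical session name): to allow, e.g., a linear
	# temporal drift of the WG offset in one session, set the corresponding
	# flag before standardizing:
	#
	#     mydata.sessions['Session01']['wg_drift'] = True
	#     mydata.standardize(method = 'pooled')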
1795	def standardization_error(self, session, d4x, D4x, t = 0):
1796		'''
1797		Compute standardization error for a given session and
1798		(δ4x, Δ4x) composition.
1799		'''
1800		a = self.sessions[session]['a']
1801		b = self.sessions[session]['b']
1802		c = self.sessions[session]['c']
1803		a2 = self.sessions[session]['a2']
1804		b2 = self.sessions[session]['b2']
1805		c2 = self.sessions[session]['c2']
1806		CM = self.sessions[session]['CM']
1807
1808		x, y = D4x, d4x
1809		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1810# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1811		dxdy = -(b+b2*t) / (a+a2*t)
1812		dxdz = 1. / (a+a2*t)
1813		dxda = -x / (a+a2*t)
1814		dxdb = -y / (a+a2*t)
1815		dxdc = -1. / (a+a2*t)
1816		dxda2 = -x * t / (a+a2*t)
1817		dxdb2 = -y * t / (a+a2*t)
1818		dxdc2 = -t / (a+a2*t)
1819		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1820		sx = (V @ CM @ V.T) ** .5
1821		return sx
1822
1823
1824	@make_verbal
1825	def summary(self,
1826		dir = 'output',
1827		filename = None,
1828		save_to_file = True,
1829		print_out = True,
1830		):
1831		'''
1832		Print out and/or save to disk a summary of the standardization results.
1833
1834		**Parameters**
1835
1836		+ `dir`: the directory in which to save the table
1837		+ `filename`: the name of the csv file to write to
1838		+ `save_to_file`: whether to save the table to disk
1839		+ `print_out`: whether to print out the table
1840		'''
1841
1842		out = []
1843		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1844		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1845		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1846		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1847		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1848		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1849		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1850		out += [['Model degrees of freedom', f"{self.Nf}"]]
1851		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1852		out += [['Standardization method', self.standardization_method]]
1853
1854		if save_to_file:
1855			if not os.path.exists(dir):
1856				os.makedirs(dir)
1857			if filename is None:
1858				filename = f'D{self._4x}_summary.csv'
1859			with open(f'{dir}/{filename}', 'w') as fid:
1860				fid.write(make_csv(out))
1861		if print_out:
1862			self.msg('\n' + pretty_table(out, header = 0))
1863
1864
1865	@make_verbal
1866	def table_of_sessions(self,
1867		dir = 'output',
1868		filename = None,
1869		save_to_file = True,
1870		print_out = True,
1871		output = None,
1872		):
1873		'''
1874		Print out and/or save to disk a table of sessions.
1875
1876		**Parameters**
1877
1878		+ `dir`: the directory in which to save the table
1879		+ `filename`: the name of the csv file to write to
1880		+ `save_to_file`: whether to save the table to disk
1881		+ `print_out`: whether to print out the table
1882		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1883		    if set to `'raw'`: return a list of lists of strings
1884		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1885		'''
1886		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1887		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1888		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1889
1890		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1891		if include_a2:
1892			out[-1] += ['a2 ± SE']
1893		if include_b2:
1894			out[-1] += ['b2 ± SE']
1895		if include_c2:
1896			out[-1] += ['c2 ± SE']
1897		for session in self.sessions:
1898			out += [[
1899				session,
1900				f"{self.sessions[session]['Na']}",
1901				f"{self.sessions[session]['Nu']}",
1902				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1903				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1904				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1905				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1906				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1907				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1908				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1909				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1910				]]
1911			if include_a2:
1912				if self.sessions[session]['scrambling_drift']:
1913					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1914				else:
1915					out[-1] += ['']
1916			if include_b2:
1917				if self.sessions[session]['slope_drift']:
1918					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1919				else:
1920					out[-1] += ['']
1921			if include_c2:
1922				if self.sessions[session]['wg_drift']:
1923					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1924				else:
1925					out[-1] += ['']
1926
1927		if save_to_file:
1928			if not os.path.exists(dir):
1929				os.makedirs(dir)
1930			if filename is None:
1931				filename = f'D{self._4x}_sessions.csv'
1932			with open(f'{dir}/{filename}', 'w') as fid:
1933				fid.write(make_csv(out))
1934		if print_out:
1935			self.msg('\n' + pretty_table(out))
1936		if output == 'raw':
1937			return out
1938		elif output == 'pretty':
1939			return pretty_table(out)
1940
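	# Usage sketch: return the table as a list of lists without writing to
	# disk or printing it out:
	#
	#     tbl = mydata.table_of_sessions(save_to_file = False,
	#         print_out = False, output = 'raw')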
1941
1942	@make_verbal
1943	def table_of_analyses(
1944		self,
1945		dir = 'output',
1946		filename = None,
1947		save_to_file = True,
1948		print_out = True,
1949		output = None,
1950		):
1951		'''
1952		Print out and/or save to disk a table of analyses.
1953
1954		**Parameters**
1955
1956		+ `dir`: the directory in which to save the table
1957		+ `filename`: the name of the csv file to write to
1958		+ `save_to_file`: whether to save the table to disk
1959		+ `print_out`: whether to print out the table
1960		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1961		    if set to `'raw'`: return a list of lists of strings
1962		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1963		'''
1964
1965		out = [['UID','Session','Sample']]
1966		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1967		for f in extra_fields:
1968			out[-1] += [f[0]]
1969		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1970		for r in self:
1971			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
1972			for f in extra_fields:
1973				out[-1] += [f"{r[f[0]]:{f[1]}}"]
1974			out[-1] += [
1975				f"{r['d13Cwg_VPDB']:.3f}",
1976				f"{r['d18Owg_VSMOW']:.3f}",
1977				f"{r['d45']:.6f}",
1978				f"{r['d46']:.6f}",
1979				f"{r['d47']:.6f}",
1980				f"{r['d48']:.6f}",
1981				f"{r['d49']:.6f}",
1982				f"{r['d13C_VPDB']:.6f}",
1983				f"{r['d18O_VSMOW']:.6f}",
1984				f"{r['D47raw']:.6f}",
1985				f"{r['D48raw']:.6f}",
1986				f"{r['D49raw']:.6f}",
1987				f"{r[f'D{self._4x}']:.6f}"
1988				]
1989		if save_to_file:
1990			if not os.path.exists(dir):
1991				os.makedirs(dir)
1992			if filename is None:
1993				filename = f'D{self._4x}_analyses.csv'
1994			with open(f'{dir}/{filename}', 'w') as fid:
1995				fid.write(make_csv(out))
1996		if print_out:
1997			self.msg('\n' + pretty_table(out))
1998		return out
1999
2000	@make_verbal
2001	def covar_table(
2002		self,
2003		correl = False,
2004		dir = 'output',
2005		filename = None,
2006		save_to_file = True,
2007		print_out = True,
2008		output = None,
2009		):
2010		'''
2011		Print out, save to disk and/or return the variance-covariance matrix of Δ4x
2012		for all unknown samples.
2013
2014		**Parameters**
2015
2016		+ `dir`: the directory in which to save the csv
2017		+ `filename`: the name of the csv file to write to
2018		+ `save_to_file`: whether to save the csv
2019		+ `print_out`: whether to print out the matrix
2020		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
2021		    if set to `'raw'`: return a list of lists of strings
2022		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2023		'''
2024		samples = sorted([u for u in self.unknowns])
2025		out = [[''] + samples]
2026		for s1 in samples:
2027			out.append([s1])
2028			for s2 in samples:
2029				if correl:
2030					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
2031				else:
2032					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
2033
2034		if save_to_file:
2035			if not os.path.exists(dir):
2036				os.makedirs(dir)
2037			if filename is None:
2038				if correl:
2039					filename = f'D{self._4x}_correl.csv'
2040				else:
2041					filename = f'D{self._4x}_covar.csv'
2042			with open(f'{dir}/{filename}', 'w') as fid:
2043				fid.write(make_csv(out))
2044		if print_out:
2045			self.msg('\n'+pretty_table(out))
2046		if output == 'raw':
2047			return out
2048		elif output == 'pretty':
2049			return pretty_table(out)
2050
2051	@make_verbal
2052	def table_of_samples(
2053		self,
2054		dir = 'output',
2055		filename = None,
2056		save_to_file = True,
2057		print_out = True,
2058		output = None,
2059		):
2060		'''
2061		Print out, save to disk and/or return a table of samples.
2062
2063		**Parameters**
2064
2065		+ `dir`: the directory in which to save the csv
2066		+ `filename`: the name of the csv file to write to
2067		+ `save_to_file`: whether to save the csv
2068		+ `print_out`: whether to print out the table
2069		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2070		    if set to `'raw'`: return a list of lists of strings
2071		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2072		'''
2073
2074		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2075		for sample in self.anchors:
2076			out += [[
2077				f"{sample}",
2078				f"{self.samples[sample]['N']}",
2079				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2080				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2081				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2082				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2083				]]
2084		for sample in self.unknowns:
2085			out += [[
2086				f"{sample}",
2087				f"{self.samples[sample]['N']}",
2088				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2089				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2090				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2091				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2092				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2093				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2094				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2095				]]
2096		if save_to_file:
2097			if not os.path.exists(dir):
2098				os.makedirs(dir)
2099			if filename is None:
2100				filename = f'D{self._4x}_samples.csv'
2101			with open(f'{dir}/{filename}', 'w') as fid:
2102				fid.write(make_csv(out))
2103		if print_out:
2104			self.msg('\n'+pretty_table(out))
2105		if output == 'raw':
2106			return out
2107		elif output == 'pretty':
2108			return pretty_table(out)
2109
2110
2111	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2112		'''
2113		Generate session plots and save them to disk.
2114
2115		**Parameters**
2116
2117		+ `dir`: the directory in which to save the plots
2118		+ `figsize`: the width and height (in inches) of each plot
2119		+ `filetype`: 'pdf' or 'png'
2120		+ `dpi`: resolution for PNG output
2121		'''
2122		if not os.path.exists(dir):
2123			os.makedirs(dir)
2124
2125		for session in self.sessions:
2126			sp = self.plot_single_session(session, xylimits = 'constant')
2127			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2128			ppl.close(sp.fig)
2129			
2130
2131
2132	@make_verbal
2133	def consolidate_samples(self):
2134		'''
2135		Compile various statistics for each sample.
2136
2137		For each anchor sample:
2138
2139		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2140		+ `SE_D47` or `SE_D48`: set to zero by definition
2141
2142		For each unknown sample:
2143
2144		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2145		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2146
2147		For each anchor and unknown:
2148
2149		+ `N`: the total number of analyses of this sample
2150		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2151		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2152		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2153		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2154	variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2155		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
2156		'''
2157		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2158		for sample in self.samples:
2159			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2160			if self.samples[sample]['N'] > 1:
2161				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2162
2163			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2164			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2165
2166			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2167			if len(D4x_pop) > 2:
2168				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2169			
2170		if self.standardization_method == 'pooled':
2171			for sample in self.anchors:
2172				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2173				self.samples[sample][f'SE_D{self._4x}'] = 0.
2174			for sample in self.unknowns:
2175				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2176				try:
2177					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2178				except ValueError:
2179					# when `sample` is constrained by self.standardize(constraints = {...}),
2180					# it is no longer listed in self.standardization.var_names.
2181					# Temporary fix: define SE as zero for now
2182					self.samples[sample][f'SE_D{self._4x}'] = 0.
2183
2184		elif self.standardization_method == 'indep_sessions':
2185			for sample in self.anchors:
2186				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2187				self.samples[sample][f'SE_D{self._4x}'] = 0.
2188			for sample in self.unknowns:
2189				self.msg(f'Consolidating sample {sample}')
2190				self.unknowns[sample][f'session_D{self._4x}'] = {}
2191				session_avg = []
2192				for session in self.sessions:
2193					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2194					if sdata:
2195						self.msg(f'{sample} found in session {session}')
2196						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2197						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2198						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2199						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2200						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2201						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2202						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2203				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2204				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2205				wsum = sum([weights[s] for s in weights])
2206				for s in weights:
2207					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2208
2209		for r in self:
2210			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
2211
2212
2213
2214	def consolidate_sessions(self):
2215		'''
2216		Compute various statistics for each session.
2217
2218		+ `Na`: Number of anchor analyses in the session
2219		+ `Nu`: Number of unknown analyses in the session
2220		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2221		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2222		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2223		+ `a`: scrambling factor
2224		+ `b`: compositional slope
2225		+ `c`: WG offset
2226	+ `SE_a`: Model standard error of `a`
2227	+ `SE_b`: Model standard error of `b`
2228	+ `SE_c`: Model standard error of `c`
2229		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2230		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2231		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2232		+ `a2`: scrambling factor drift
2233		+ `b2`: compositional slope drift
2234		+ `c2`: WG offset drift
2235		+ `Np`: Number of standardization parameters to fit
2236		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2237		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2238		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
2239		'''
2240		for session in self.sessions:
2241			if 'd13Cwg_VPDB' not in self.sessions[session]:
2242				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2243			if 'd18Owg_VSMOW' not in self.sessions[session]:
2244				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2245			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2246			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2247
2248			self.msg(f'Computing repeatabilities for session {session}')
2249			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2250			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2251			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2252
2253		if self.standardization_method == 'pooled':
2254			for session in self.sessions:
2255
2256				# different (better?) computation of D4x repeatability for each session:
2257				sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
2258				self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5
2259
2260				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2261				i = self.standardization.var_names.index(f'a_{pf(session)}')
2262				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2263
2264				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2265				i = self.standardization.var_names.index(f'b_{pf(session)}')
2266				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2267
2268				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2269				i = self.standardization.var_names.index(f'c_{pf(session)}')
2270				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2271
2272				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2273				if self.sessions[session]['scrambling_drift']:
2274					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2275					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2276				else:
2277					self.sessions[session]['SE_a2'] = 0.
2278
2279				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2280				if self.sessions[session]['slope_drift']:
2281					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2282					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2283				else:
2284					self.sessions[session]['SE_b2'] = 0.
2285
2286				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2287				if self.sessions[session]['wg_drift']:
2288					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2289					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2290				else:
2291					self.sessions[session]['SE_c2'] = 0.
2292
2293				i = self.standardization.var_names.index(f'a_{pf(session)}')
2294				j = self.standardization.var_names.index(f'b_{pf(session)}')
2295				k = self.standardization.var_names.index(f'c_{pf(session)}')
2296				CM = np.zeros((6,6))
2297				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2298				try:
2299					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2300					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2301					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2302					try:
2303						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2304						CM[3,4] = self.standardization.covar[i2,j2]
2305						CM[4,3] = self.standardization.covar[j2,i2]
2306					except ValueError:
2307						pass
2308					try:
2309						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2310						CM[3,5] = self.standardization.covar[i2,k2]
2311						CM[5,3] = self.standardization.covar[k2,i2]
2312					except ValueError:
2313						pass
2314				except ValueError:
2315					pass
2316				try:
2317					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2318					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2319					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2320					try:
2321						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2322						CM[4,5] = self.standardization.covar[j2,k2]
2323						CM[5,4] = self.standardization.covar[k2,j2]
2324					except ValueError:
2325						pass
2326				except ValueError:
2327					pass
2328				try:
2329					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2330					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2331					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2332				except ValueError:
2333					pass
2334
2335				self.sessions[session]['CM'] = CM
2336
2337		elif self.standardization_method == 'indep_sessions':
2338			pass # Not implemented yet
2339
2340
2341	@make_verbal
2342	def repeatabilities(self):
2343		'''
2344		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2345		(for all samples, for anchors, and for unknowns).
2346		'''
2347		self.msg('Computing repeatabilities for all sessions')
2348
2349		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2350		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2351		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2352		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2353		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
2354
2355
2356	@make_verbal
2357	def consolidate(self, tables = True, plots = True):
2358		'''
2359		Collect information about samples, sessions and repeatabilities.
2360		'''
2361		self.consolidate_samples()
2362		self.consolidate_sessions()
2363		self.repeatabilities()
2364
2365		if tables:
2366			self.summary()
2367			self.table_of_sessions()
2368			self.table_of_analyses()
2369			self.table_of_samples()
2370
2371		if plots:
2372			self.plot_sessions()
2373
2374
2375	@make_verbal
2376	def rmswd(self,
2377		samples = 'all samples',
2378		sessions = 'all sessions',
2379		):
2380		'''
2381		Compute the χ2, root mean squared weighted deviation
2382		(i.e. the square root of the reduced χ2), and corresponding degrees of freedom of the
2383		Δ4x values for samples in `samples` and sessions in `sessions`.
2384		
2385		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
2386		'''
2387		if samples == 'all samples':
2388			mysamples = [k for k in self.samples]
2389		elif samples == 'anchors':
2390			mysamples = [k for k in self.anchors]
2391		elif samples == 'unknowns':
2392			mysamples = [k for k in self.unknowns]
2393		else:
2394			mysamples = samples
2395
2396		if sessions == 'all sessions':
2397			sessions = [k for k in self.sessions]
2398
2399		chisq, Nf = 0, 0
2400		for sample in mysamples :
2401			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2402			if len(G) > 1 :
2403				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2404				Nf += (len(G) - 1)
2405				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2406		r = (chisq / Nf)**.5 if Nf > 0 else 0
2407		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2408		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
2409
2410	
2411	@make_verbal
2412	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2413		'''
2414		Compute the repeatability of `[r[key] for r in self]`
2415		'''
2416
2417		if samples == 'all samples':
2418			mysamples = [k for k in self.samples]
2419		elif samples == 'anchors':
2420			mysamples = [k for k in self.anchors]
2421		elif samples == 'unknowns':
2422			mysamples = [k for k in self.unknowns]
2423		else:
2424			mysamples = samples
2425
2426		if sessions == 'all sessions':
2427			sessions = [k for k in self.sessions]
2428
2429		if key in ['D47', 'D48']:
2430			# Full disclosure: the definition of Nf is tricky/debatable
2431			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2432			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2433			Nf = len(G)
2434# 			print(f'len(G) = {Nf}')
2435			Nf -= len([s for s in mysamples if s in self.unknowns])
2436# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2437			for session in sessions:
2438				Np = len([
2439					_ for _ in self.standardization.params
2440					if (
2441						self.standardization.params[_].expr is not None
2442						and (
2443							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2444							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2445							)
2446						)
2447					])
2448# 				print(f'session {session}: {Np} parameters to consider')
2449				Na = len({
2450					r['Sample'] for r in self.sessions[session]['data']
2451					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2452					})
2453# 				print(f'session {session}: {Na} different anchors in that session')
2454				Nf -= min(Np, Na)
2455# 			print(f'Nf = {Nf}')
2456
2457# 			for sample in mysamples :
2458# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2459# 				if len(X) > 1 :
2460# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2461# 					if sample in self.unknowns:
2462# 						Nf += len(X) - 1
2463# 					else:
2464# 						Nf += len(X)
2465# 			if samples in ['anchors', 'all samples']:
2466# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2467			r = (chisq / Nf)**.5 if Nf > 0 else 0
2468
2469		else: # if key not in ['D47', 'D48']
2470			chisq, Nf = 0, 0
2471			for sample in mysamples :
2472				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2473				if len(X) > 1 :
2474					Nf += len(X) - 1
2475					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2476			r = (chisq / Nf)**.5 if Nf > 0 else 0
2477
2478		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2479		return r
2480
2481	def sample_average(self, samples, weights = 'equal', normalize = True):
2482		'''
2483		Weighted average Δ4x value of a group of samples, accounting for covariance.
2484
2485		Returns the weighted average Δ4x value and associated SE
2486		of a group of samples. Weights are equal by default. If `normalize` is
2487		true, `weights` will be rescaled so that their sum equals 1.
2488
2489		**Examples**
2490
2491		```python
2492		self.sample_average(['X','Y'], [1, 2])
2493		```
2494
2495		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2496		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2497		values of samples X and Y, respectively.
2498
2499		```python
2500		self.sample_average(['X','Y'], [1, -1], normalize = False)
2501		```
2502
2503		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2504		'''
2505		if weights == 'equal':
2506			weights = [1/len(samples)] * len(samples)
2507
2508		if normalize:
2509			s = sum(weights)
2510			if s:
2511				weights = [w/s for w in weights]
2512
2513		try:
2514# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2515# 			C = self.standardization.covar[indices,:][:,indices]
2516			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2517			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2518			return correlated_sum(X, C, weights)
2519		except ValueError:
2520			return (0., 0.)
2521
2522
2523	def sample_D4x_covar(self, sample1, sample2 = None):
2524		'''
2525		Covariance between Δ4x values of samples
2526
2527		Returns the error covariance between the average Δ4x values of two
2528		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2529		returns the Δ4x variance for that sample.
2530		'''
2531		if sample2 is None:
2532			sample2 = sample1
2533		if self.standardization_method == 'pooled':
2534			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2535			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2536			return self.standardization.covar[i, j]
2537		elif self.standardization_method == 'indep_sessions':
2538			if sample1 == sample2:
2539				return self.samples[sample1][f'SE_D{self._4x}']**2
2540			else:
2541				c = 0
2542				for session in self.sessions:
2543					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2544					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2545					if sdata1 and sdata2:
2546						a = self.sessions[session]['a']
2547						# !! TODO: CM below does not account for temporal changes in standardization parameters
2548						CM = self.sessions[session]['CM'][:3,:3]
2549						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2550						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2551						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2552						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2553						c += (
2554							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2555							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2556							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2557							@ CM
2558							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2559							) / a**2
2560				return float(c)
2561
2562	def sample_D4x_correl(self, sample1, sample2 = None):
2563		'''
2564		Correlation between Δ4x errors of samples
2565
2566		Returns the error correlation between the average Δ4x values of two samples.
2567		'''
2568		if sample2 is None or sample2 == sample1:
2569			return 1.
2570		return (
2571			self.sample_D4x_covar(sample1, sample2)
2572			/ self.unknowns[sample1][f'SE_D{self._4x}']
2573			/ self.unknowns[sample2][f'SE_D{self._4x}']
2574			)
2575
2576	def plot_single_session(self,
2577		session,
2578		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2579		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2580		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2581		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2582		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2583		xylimits = 'free', # | 'constant'
2584		x_label = None,
2585		y_label = None,
2586		error_contour_interval = 'auto',
2587		fig = 'new',
2588		):
2589		'''
2590		Generate plot for a single session
2591		'''
2592		if x_label is None:
2593			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2594		if y_label is None:
2595			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2596
2597		out = _SessionPlot()
2598		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2599		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2600		anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2601		anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2602		unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2603		unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2604		anchor_avg = (np.array([ np.array([
2605				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2606				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2607				]) for sample in anchors]).T,
2608			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
2609		unknown_avg = (np.array([ np.array([
2610				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2611				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2612				]) for sample in unknowns]).T,
2613			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)
2614		
2615		
2616		if fig == 'new':
2617			out.fig = ppl.figure(figsize = (6,6))
2618			ppl.subplots_adjust(.1,.1,.9,.9)
2619
2620		out.anchor_analyses, = ppl.plot(
2621			anchors_d,
2622			anchors_D,
2623			**kw_plot_anchors)
2624		out.unknown_analyses, = ppl.plot(
2625			unknowns_d,
2626			unknowns_D,
2627			**kw_plot_unknowns)
2628		out.anchor_avg = ppl.plot(
2629			*anchor_avg,
2630			**kw_plot_anchor_avg)
2631		out.unknown_avg = ppl.plot(
2632			*unknown_avg,
2633			**kw_plot_unknown_avg)
2634		if xylimits == 'constant':
2635			x = [r[f'd{self._4x}'] for r in self]
2636			y = [r[f'D{self._4x}'] for r in self]
2637			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2638			w, h = x2-x1, y2-y1
2639			x1 -= w/20
2640			x2 += w/20
2641			y1 -= h/20
2642			y2 += h/20
2643			ppl.axis([x1, x2, y1, y2])
2644		elif xylimits == 'free':
2645			x1, x2, y1, y2 = ppl.axis()
2646		else:
2647			x1, x2, y1, y2 = ppl.axis(xylimits)
2648				
2649		if error_contour_interval != 'none':
2650			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2651			XI,YI = np.meshgrid(xi, yi)
2652			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2653			if error_contour_interval == 'auto':
2654				rng = np.max(SI) - np.min(SI)
2655				if rng <= 0.01:
2656					cinterval = 0.001
2657				elif rng <= 0.03:
2658					cinterval = 0.004
2659				elif rng <= 0.1:
2660					cinterval = 0.01
2661				elif rng <= 0.3:
2662					cinterval = 0.03
2663				elif rng <= 1.:
2664					cinterval = 0.1
2665				else:
2666					cinterval = 0.5
2667			else:
2668				cinterval = error_contour_interval
2669
2670			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2671			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2672			out.clabel = ppl.clabel(out.contour)
2673			contour = (XI, YI, SI, cval, cinterval)
2674
2675		if fig == None:
2676			return {
2677			'anchors':anchors,
2678			'unknowns':unknowns,
2679			'anchors_d':anchors_d,
2680			'anchors_D':anchors_D,
2681			'unknowns_d':unknowns_d,
2682			'unknowns_D':unknowns_D,
2683			'anchor_avg':anchor_avg,
2684			'unknown_avg':unknown_avg,
2685			'contour':contour,
2686			}
2687
2688		ppl.xlabel(x_label)
2689		ppl.ylabel(y_label)
2690		ppl.title(session, weight = 'bold')
2691		ppl.grid(alpha = .2)
2692		out.ax = ppl.gca()		
2693
2694		return out
2695
2696	def plot_residuals(
2697		self,
2698		kde = False,
2699		hist = False,
2700		binwidth = 2/3,
2701		dir = 'output',
2702		filename = None,
2703		highlight = [],
2704		colors = None,
2705		figsize = None,
2706		dpi = 100,
2707		yspan = None,
2708		):
2709		'''
2710		Plot residuals of each analysis as a function of time (actually, as a function of
2711		the order of analyses in the `D4xdata` object)
2712
2713		+ `kde`: whether to add a kernel density estimate of residuals
2714		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2715		+ `binwidth`: the width of the histogram bins, in units of the Δ4x repeatability
2716		+ `dir`: the directory in which to save the plot
2717		+ `highlight`: a list of samples to highlight
2718		+ `colors`: a dict of `{<sample>: <color>}` for all samples
2719		+ `figsize`: (width, height) of figure
2720		+ `dpi`: resolution for PNG output
2721		+ `yspan`: factor controlling the range of y values shown in plot
2722		  (by default: `yspan = 1.5 if kde else 1.0`)
2723		'''
2724		
2725		from matplotlib import ticker
2726
2727		if yspan is None:
2728			if kde:
2729				yspan = 1.5
2730			else:
2731				yspan = 1.0
2732		
2733		# Layout
2734		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2735		if hist or kde:
2736			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2737			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2738		else:
2739			ppl.subplots_adjust(.08,.05,.78,.8)
2740			ax1 = ppl.subplot(111)
2741		
2742		# Colors
2743		N = len(self.anchors)
2744		if colors is None:
2745			if len(highlight) > 0:
2746				Nh = len(highlight)
2747				if Nh == 1:
2748					colors = {highlight[0]: (0,0,0)}
2749				elif Nh == 3:
2750					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2751				elif Nh == 4:
2752					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2753				else:
2754					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2755			else:
2756				if N == 3:
2757					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2758				elif N == 4:
2759					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2760				else:
2761					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2762
2763		ppl.sca(ax1)
2764		
2765		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2766
2767		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2768
2769		session = self[0]['Session']
2770		x1 = 0
2771# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2772		x_sessions = {}
2773		one_or_more_singlets = False
2774		one_or_more_multiplets = False
2775		multiplets = set()
2776		for k,r in enumerate(self):
2777			if r['Session'] != session:
2778				x2 = k-1
2779				x_sessions[session] = (x1+x2)/2
2780				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2781				session = r['Session']
2782				x1 = k
2783			singlet = len(self.samples[r['Sample']]['data']) == 1
2784			if not singlet:
2785				multiplets.add(r['Sample'])
2786			if r['Sample'] in self.unknowns:
2787				if singlet:
2788					one_or_more_singlets = True
2789				else:
2790					one_or_more_multiplets = True
2791			kw = dict(
2792				marker = 'x' if singlet else '+',
2793				ms = 4 if singlet else 5,
2794				ls = 'None',
2795				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2796				mew = 1,
2797				alpha = 0.2 if singlet else 1,
2798				)
2799			if highlight and r['Sample'] not in highlight:
2800				kw['alpha'] = 0.2
2801			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2802		x2 = k
2803		x_sessions[session] = (x1+x2)/2
2804
2805		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2806		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2807		if not (hist or kde):
2808			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2809			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2810
2811		xmin, xmax, ymin, ymax = ppl.axis()
2812		if yspan != 1:
2813			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2814		for s in x_sessions:
2815			ppl.text(
2816				x_sessions[s],
2817				ymax +1,
2818				s,
2819				va = 'bottom',
2820				**(
2821					dict(ha = 'center')
2822					if len(self.sessions[s]['data']) > (0.15 * len(self))
2823					else dict(ha = 'left', rotation = 45)
2824					)
2825				)
2826
2827		if hist or kde:
2828			ppl.sca(ax2)
2829
2830		for s in colors:
2831			kw['marker'] = '+'
2832			kw['ms'] = 5
2833			kw['mec'] = colors[s]
2834			kw['label'] = s
2835			kw['alpha'] = 1
2836			ppl.plot([], [], **kw)
2837
2838		kw['mec'] = (0,0,0)
2839
2840		if one_or_more_singlets:
2841			kw['marker'] = 'x'
2842			kw['ms'] = 4
2843			kw['alpha'] = .2
2844			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2845			ppl.plot([], [], **kw)
2846
2847		if one_or_more_multiplets:
2848			kw['marker'] = '+'
2849			kw['ms'] = 4
2850			kw['alpha'] = 1
2851			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2852			ppl.plot([], [], **kw)
2853
2854		if hist or kde:
2855			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2856		else:
2857			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2858		leg.set_zorder(-1000)
2859
2860		ppl.sca(ax1)
2861
2862		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2863		ppl.xticks([])
2864		ppl.axis([-1, len(self), None, None])
2865
2866		if hist or kde:
2867			ppl.sca(ax2)
2868			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2869
2870			if kde:
2871				from scipy.stats import gaussian_kde
2872				yi = np.linspace(ymin, ymax, 201)
2873				xi = gaussian_kde(X).evaluate(yi)
2874				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2875# 				ppl.plot(xi, yi, 'k-', lw = 1)
2876			elif hist:
2877				ppl.hist(
2878					X,
2879					orientation = 'horizontal',
2880					histtype = 'stepfilled',
2881					ec = [.4]*3,
2882					fc = [.25]*3,
2883					alpha = .25,
2884					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2885					)
2886			ppl.text(0, 0,
2887				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2888				size = 7.5,
2889				alpha = 1,
2890				va = 'center',
2891				ha = 'left',
2892				)
2893
2894			ppl.axis([0, None, ymin, ymax])
2895			ppl.xticks([])
2896			ppl.yticks([])
2897# 			ax2.spines['left'].set_visible(False)
2898			ax2.spines['right'].set_visible(False)
2899			ax2.spines['top'].set_visible(False)
2900			ax2.spines['bottom'].set_visible(False)
2901
2902		ax1.axis([None, None, ymin, ymax])
2903
2904		if not os.path.exists(dir):
2905			os.makedirs(dir)
2906		if filename is None:
2907			return fig
2908		elif filename == '':
2909			filename = f'D{self._4x}_residuals.pdf'
2910		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2911		ppl.close(fig)
2912				
2913
2914	def simulate(self, *args, **kwargs):
2915		'''
2916		Legacy function: raises a `DeprecationWarning` exception pointing to `virtual_data()`
2917		'''
2918		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
2919
2920	def plot_distribution_of_analyses(
2921		self,
2922		dir = 'output',
2923		filename = None,
2924		vs_time = False,
2925		figsize = (6,4),
2926		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
2927		output = None,
2928		dpi = 100,
2929		):
2930		'''
2931		Plot temporal distribution of all analyses in the data set.
2932		
2933		**Parameters**
2934
2935		+ `dir`: the directory in which to save the plot
2936		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
2937		+ `dpi`: resolution for PNG output
2938		+ `figsize`: (width, height) of figure
2939		+ `filename`: the name of the file to save the plot to (by default: `D4x_distribution_of_analyses.pdf`)
2940		'''
2941
2942		asamples = [s for s in self.anchors]
2943		usamples = [s for s in self.unknowns]
2944		if output is None or output == 'fig':
2945			fig = ppl.figure(figsize = figsize)
2946			ppl.subplots_adjust(*subplots_adjust)
2947		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2948		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2949		Xmax += (Xmax-Xmin)/40
2950		Xmin -= (Xmax-Xmin)/41
2951		for k, s in enumerate(asamples + usamples):
2952			if vs_time:
2953				X = [r['TimeTag'] for r in self if r['Sample'] == s]
2954			else:
2955				X = [x for x,r in enumerate(self) if r['Sample'] == s]
2956			Y = [-k for x in X]
2957			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
2958			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
2959			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
2960		ppl.axis([Xmin, Xmax, -k-1, 1])
2961		ppl.xlabel('\ntime')
2962		ppl.gca().annotate('',
2963			xy = (0.6, -0.02),
2964			xycoords = 'axes fraction',
2965			xytext = (.4, -0.02), 
2966            arrowprops = dict(arrowstyle = "->", color = 'k'),
2967            )
2968			
2969
2970		x2 = -1
2971		for session in self.sessions:
2972			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2973			if vs_time:
2974				ppl.axvline(x1, color = 'k', lw = .75)
2975			if x2 > -1:
2976				if not vs_time:
2977					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
2978			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2979# 			from xlrd import xldate_as_datetime
2980# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
2981			if vs_time:
2982				ppl.axvline(x2, color = 'k', lw = .75)
2983				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
2984			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
2985
2986		ppl.xticks([])
2987		ppl.yticks([])
2988
2989		if output is None:
2990			if not os.path.exists(dir):
2991				os.makedirs(dir)
2992			if filename == None:
2993				filename = f'D{self._4x}_distribution_of_analyses.pdf'
2994			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2995			ppl.close(fig)
2996		elif output == 'ax':
2997			return ppl.gca()
2998		elif output == 'fig':
2999			return fig
3000
3001
3002	def plot_bulk_compositions(
3003		self,
3004		samples = None,
3005		dir = 'output/bulk_compositions',
3006		figsize = (6,6),
3007		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
3008		show = False,
3009		sample_color = (0,.5,1),
3010		analysis_color = (.7,.7,.7),
3011		labeldist = 0.3,
3012		radius = 0.05,
3013		):
3014		'''
3015		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
3016		
3017		By default, creates a directory `./output/bulk_compositions` where plots for
3018		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
3019		
3020		
3021		**Parameters**
3022
3023		+ `samples`: Only these samples are processed (by default: all samples).
3024		+ `dir`: where to save the plots
3025		+ `figsize`: (width, height) of figure
3026		+ `subplots_adjust`: passed to `subplots_adjust()`
3027		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
3028		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
3029		+ `sample_color`: color used for sample average markers/labels
3030		+ `analysis_color`: color used for individual analysis (replicate) markers/labels
3031		+ `labeldist`: distance (in inches) from replicate markers to replicate labels
3032		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
3033		'''
3034
3035		from matplotlib.patches import Ellipse
3036
3037		if samples is None:
3038			samples = [_ for _ in self.samples]
3039
3040		saved = {}
3041
3042		for s in samples:
3043
3044			fig = ppl.figure(figsize = figsize)
3045			fig.subplots_adjust(*subplots_adjust)
3046			ax = ppl.subplot(111)
3047			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3048			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3049			ppl.title(s)
3050
3051
3052			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
3053			UID = [_['UID'] for _ in self.samples[s]['data']]
3054			XY0 = XY.mean(0)
3055
3056			for xy in XY:
3057				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
3058				
3059			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
3060			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
3061			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3062			saved[s] = [XY, XY0]
3063			
3064			x1, x2, y1, y2 = ppl.axis()
3065			x0, dx = (x1+x2)/2, (x2-x1)/2
3066			y0, dy = (y1+y2)/2, (y2-y1)/2
3067			dx, dy = [max(max(dx, dy), radius)]*2
3068
3069			ppl.axis([
3070				x0 - 1.2*dx,
3071				x0 + 1.2*dx,
3072				y0 - 1.2*dy,
3073				y0 + 1.2*dy,
3074				])			
3075
3076			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3077
3078			for xy, uid in zip(XY, UID):
3079
3080				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3081				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3082
3083				if (vector_in_display_space**2).sum() > 0:
3084
3085					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3086					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3087					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3088					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3089
3090					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3091
3092				else:
3093
3094					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3095
3096			if radius:
3097				ax.add_artist(Ellipse(
3098					xy = XY0,
3099					width = radius*2,
3100					height = radius*2,
3101					ls = (0, (2,2)),
3102					lw = .7,
3103					ec = analysis_color,
3104					fc = 'None',
3105					))
3106				ppl.text(
3107					XY0[0],
3108					XY0[1]-radius,
3109					f'\n± {radius*1e3:.0f} ppm',
3110					color = analysis_color,
3111					va = 'top',
3112					ha = 'center',
3113					linespacing = 0.4,
3114					size = 8,
3115					)
3116
3117			if not os.path.exists(dir):
3118				os.makedirs(dir)
3119			fig.savefig(f'{dir}/{s}.pdf')
3120			ppl.close(fig)
3121
3122		fig = ppl.figure(figsize = figsize)
3123		fig.subplots_adjust(*subplots_adjust)
3124		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3125		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3126
3127		for s in saved:
3128			for xy in saved[s][0]:
3129				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3130			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3131			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3132			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3133
3134		x1, x2, y1, y2 = ppl.axis()
3135		ppl.axis([
3136			x1 - (x2-x1)/10,
3137			x2 + (x2-x1)/10,
3138			y1 - (y2-y1)/10,
3139			y2 + (y2-y1)/10,
3140			])			
3141
3142
3143		if not os.path.exists(dir):
3144			os.makedirs(dir)
3145		fig.savefig(f'{dir}/__all__.pdf')
3146		if show:
3147			ppl.show()
3148		ppl.close(fig)
3149		
3150
3151	def _save_D4x_correl(
3152		self,
3153		samples = None,
3154		dir = 'output',
3155		filename = None,
3156		D4x_precision = 4,
3157		correl_precision = 4,
3158		):
3159		'''
3160		Save D4x values along with their SE and correlation matrix.
3161
3162		**Parameters**
3163
3164		+ `samples`: Only these samples are output (by default: all samples).
3165		+ `dir`: the directory in which to save the file (by default: `output`)
3166		+ `filename`: the name of the csv file to write to (by default: `D4x_correl.csv`)
3167		+ `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4)
3168		+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
3169		'''
3170		if samples is None:
3171			samples = sorted([s for s in self.unknowns])
3172		
3173		out = [['Sample']] + [[s] for s in samples]
3174		out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl']
3175		for k,s in enumerate(samples):
3176			out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.{D4x_precision}f}', f'{self.samples[s][f"SE_D{self._4x}"]:.{D4x_precision}f}']
3177			for s2 in samples:
3178				out[k+1] += [f'{self.sample_D4x_correl(s,s2):.{correl_precision}f}']
3179		
3180		if not os.path.exists(dir):
3181			os.makedirs(dir)
3182		if filename is None:
3183			filename = f'D{self._4x}_correl.csv'
3184		with open(f'{dir}/{filename}', 'w') as fid:
3185			fid.write(make_csv(out))

Store and process data for a large set of Δ47 and/or Δ48 analyses, usually comprising more than one analytical session.

D4xdata(l=[], mass='47', logfile='', session='mySession', verbose=False)
957	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
958		'''
959		**Parameters**
960
961		+ `l`: a list of dictionaries, with each dictionary including at least the keys
962		`Sample`, `d45`, `d46`, and `d47` or `d48`.
963		+ `mass`: `'47'` or `'48'`
964		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
965		+ `session`: define session name for analyses without a `Session` key
966		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
967
968		Returns a `D4xdata` object derived from `list`.
969		'''
970		self._4x = mass
971		self.verbose = verbose
972		self.prefix = 'D4xdata'
973		self.logfile = logfile
974		list.__init__(self, l)
975		self.Nf = None
976		self.repeatability = {}
977		self.refresh(session = session)

Parameters

  • l: a list of dictionaries, with each dictionary including at least the keys Sample, d45, d46, and d47 or d48.
  • mass: '47' or '48'
  • logfile: if specified, write detailed logs to this file path when calling D4xdata methods.
  • session: define session name for analyses without a Session key
  • verbose: if True, print out detailed logs when calling D4xdata methods.

Returns a D4xdata object derived from list.
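
For instance, a minimal sketch (with made-up sample names and delta values) building a D4xdata object directly from a list of dictionaries:

import D47crunch

rawdata = [
    {'Session': 'S1', 'Sample': 'ETH-1', 'd45': 5.795, 'd46': 11.628, 'd47': 16.894},
    {'Session': 'S1', 'Sample': 'ETH-2', 'd45': -6.059, 'd46': -4.817, 'd47': -11.635},
    ]
mydata = D47crunch.D4xdata(rawdata, mass = '47', verbose = True)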

R13_VPDB = 0.01118

Absolute (13C/12C) ratio of VPDB. By default equal to 0.01118 (Chang & Li, 1990)

R18_VSMOW = 0.0020052

Absolute (18O/16O) ratio of VSMOW. By default equal to 0.0020052 (Baertschi, 1976)

LAMBDA_17 = 0.528

Mass-dependent exponent for triple oxygen isotopes. By default equal to 0.528 (Barkan & Luz, 2005)

R17_VSMOW = 0.00038475

Absolute (17O/16O) ratio of VSMOW. By default equal to 0.00038475 (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)

R18_VPDB = 0.0020672007840000003

Absolute (18O/16O) ratio of VPDB. By definition equal to R18_VSMOW * 1.03092.

R17_VPDB = 0.0003909861828790272

Absolute (17O/16O) ratio of VPDB. By definition equal to R17_VSMOW * 1.03092 ** LAMBDA_17.

LEVENE_REF_SAMPLE = 'ETH-3'

After the Δ4x standardization step, each sample is tested to assess whether the Δ4x variance within all analyses for that sample differs significantly from that observed for a given reference sample (using Levene's test, which yields a p-value corresponding to the null hypothesis that the underlying variances are equal).

LEVENE_REF_SAMPLE (by default equal to 'ETH-3') specifies which sample should be used as a reference for this test.
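
If another anchor is better replicated in a given data set, the reference sample may simply be reassigned before standardizing (a minimal sketch; using ETH-1 here is an arbitrary choice):

import D47crunch

mydata = D47crunch.D47data()
mydata.LEVENE_REF_SAMPLE = 'ETH-1'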

ALPHA_18O_ACID_REACTION = np.float64(1.008129)

Specifies the 18O/16O fractionation factor generally applicable to acid reactions in the dataset. Currently used by D4xdata.wg() and D4xdata.standardize_d18O().

By default equal to 1.008129 (calcite reacted at 90 °C, Kim et al., 2007).
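
This factor may be overridden before processing the data, e.g., for a different acid reaction temperature (a minimal sketch; the value below is a placeholder, not a recommendation):

import D47crunch

mydata = D47crunch.D47data()
mydata.ALPHA_18O_ACID_REACTION = 1.00871  # hypothetical fractionation factor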

Nominal_d13C_VPDB = {'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}

Nominal δ13C_VPDB values assigned to carbonate standards, used by D4xdata.standardize_d13C().

By default equal to {'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71} after Bernasconi et al. (2018).

Nominal_d18O_VPDB = {'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}

Nominal δ18O_VPDB values assigned to carbonate standards, used by D4xdata.standardize_d18O().

By default equal to {'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78} after Bernasconi et al. (2018).
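
Both dictionaries may be extended with in-house reference materials before crunching the data (a minimal sketch; the sample name and nominal values below are hypothetical):

import D47crunch

mydata = D47crunch.D47data()
mydata.Nominal_d13C_VPDB['MY-CARB'] = 1.23   # hypothetical nominal δ13C_VPDB value
mydata.Nominal_d18O_VPDB['MY-CARB'] = -4.56  # hypothetical nominal δ18O_VPDB value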

d13C_STANDARDIZATION_METHOD = '2pt'

Method by which to standardize δ13C values:

  • 'none': do not apply any δ13C standardization.
  • '1pt': within each session, offset all initial δ13C values so as to minimize the difference between final δ13C_VPDB values and Nominal_d13C_VPDB (averaged over all analyses for which Nominal_d13C_VPDB is defined).
  • '2pt': within each session, apply an affine transformation to all δ13C values so as to minimize the difference between final δ13C_VPDB values and Nominal_d13C_VPDB (averaged over all analyses for which Nominal_d13C_VPDB is defined).

d18O_STANDARDIZATION_METHOD = '2pt'

Method by which to standardize δ18O values:

  • 'none': do not apply any δ18O standardization.
  • '1pt': within each session, offset all initial δ18O values so as to minimize the difference between final δ18O_VPDB values and Nominal_d18O_VPDB (averaged over all analyses for which Nominal_d18O_VPDB is defined).
  • '2pt': within each session, apply an affine transformation to all δ18O values so as to minimize the difference between final δ18O_VPDB values and Nominal_d18O_VPDB (averaged over all analyses for which Nominal_d18O_VPDB is defined).
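
Because D4xdata.refresh_sessions() copies these class-level defaults into each session, either method may also be overridden per session after reading the data (a sketch assuming a session named 'Session1'):

import D47crunch

mydata = D47crunch.D47data()
mydata.read('rawdata.csv', session = 'Session1')
mydata.sessions['Session1']['d13C_standardization_method'] = '1pt'
mydata.sessions['Session1']['d18O_standardization_method'] = 'none'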

Other instance attributes (without individual docstrings): verbose, prefix, logfile, Nf, repeatability.

def make_verbal(oldfun):
980	def make_verbal(oldfun):
981		'''
982		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
983		'''
984		@wraps(oldfun)
985		def newfun(*args, verbose = '', **kwargs):
986			myself = args[0]
987			oldprefix = myself.prefix
988			myself.prefix = oldfun.__name__
989			if verbose != '':
990				oldverbose = myself.verbose
991				myself.verbose = verbose
992			out = oldfun(*args, **kwargs)
993			myself.prefix = oldprefix
994			if verbose != '':
995				myself.verbose = oldverbose
996			return out
997		return newfun

Decorator: allow temporarily changing self.prefix and overriding self.verbose.
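
In practice, any method decorated with make_verbal accepts an extra verbose argument overriding self.verbose for that call only, e.g.:

mydata.crunch(verbose = True)  # print detailed logs for this call, regardless of mydata.verbose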

def msg(self, txt):
1000	def msg(self, txt):
1001		'''
1002		Log a message to `self.logfile`, and print it out if `verbose = True`
1003		'''
1004		self.log(txt)
1005		if self.verbose:
1006			print(f'{f"[{self.prefix}]":<16} {txt}')

Log a message to self.logfile, and print it out if verbose = True

def vmsg(self, txt):
1009	def vmsg(self, txt):
1010		'''
1011		Log a message to `self.logfile` and print it out
1012		'''
1013		self.log(txt)
1014		print(txt)

Log a message to self.logfile and print it out

def log(self, *txts):
1017	def log(self, *txts):
1018		'''
1019		Log a message to `self.logfile`
1020		'''
1021		if self.logfile:
1022			with open(self.logfile, 'a') as fid:
1023				for txt in txts:
1024					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')

Log a message to self.logfile

def refresh(self, session='mySession'):
1027	def refresh(self, session = 'mySession'):
1028		'''
1029		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
1030		'''
1031		self.fill_in_missing_info(session = session)
1032		self.refresh_sessions()
1033		self.refresh_samples()

Update self.sessions, self.samples, self.anchors, and self.unknowns.

def refresh_sessions(self):
1036	def refresh_sessions(self):
1037		'''
1038		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
1039		to `False` for all sessions.
1040		'''
1041		self.sessions = {
1042			s: {'data': [r for r in self if r['Session'] == s]}
1043			for s in sorted({r['Session'] for r in self})
1044			}
1045		for s in self.sessions:
1046			self.sessions[s]['scrambling_drift'] = False
1047			self.sessions[s]['slope_drift'] = False
1048			self.sessions[s]['wg_drift'] = False
1049			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
1050			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD

Update self.sessions and set scrambling_drift, slope_drift, and wg_drift to False for all sessions.

def refresh_samples(self):
1053	def refresh_samples(self):
1054		'''
1055		Define `self.samples`, `self.anchors`, and `self.unknowns`.
1056		'''
1057		self.samples = {
1058			s: {'data': [r for r in self if r['Sample'] == s]}
1059			for s in sorted({r['Sample'] for r in self})
1060			}
1061		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
1062		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}

Define self.samples, self.anchors, and self.unknowns.

def read(self, filename, sep='', session=''):
1065	def read(self, filename, sep = '', session = ''):
1066		'''
1067		Read file in csv format to load data into a `D47data` object.
1068
1069		In the csv file, spaces before and after field separators (`','` by default)
1070		are optional. Each line corresponds to a single analysis.
1071
1072		The required fields are:
1073
1074		+ `UID`: a unique identifier
1075		+ `Session`: an identifier for the analytical session
1076		+ `Sample`: a sample identifier
1077		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1078
1079		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1080		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
1081		and `d49` are optional, and set to NaN by default.
1082
1083		**Parameters**
1084
1085		+ `filename`: the path of the file to read
1086		+ `sep`: csv separator delimiting the fields
1087		+ `session`: set `Session` field to this string for all analyses
1088		'''
1089		with open(filename) as fid:
1090			self.input(fid.read(), sep = sep, session = session)

Read file in csv format to load data into a D47data object.

In the csv file, spaces before and after field separators (',' by default) are optional. Each line corresponds to a single analysis.

The required fields are:

  • UID: a unique identifier
  • Session: an identifier for the analytical session
  • Sample: a sample identifier
  • d45, d46, and at least one of d47 or d48: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as D17O (in ‰ relative to VSMOW, λ = self.LAMBDA_17), and are otherwise assumed to be zero. Working-gas deltas d47, d48 and d49 are optional, and set to NaN by default.

Parameters

  • filename: the path of the file to read
  • sep: csv separator delimiting the fields
  • session: set Session field to this string for all analyses
def input(self, txt, sep='', session=''):
1093	def input(self, txt, sep = '', session = ''):
1094		'''
1095		Read `txt` string in csv format to load analysis data into a `D47data` object.
1096
1097		In the csv string, spaces before and after field separators (`','` by default)
1098		are optional. Each line corresponds to a single analysis.
1099
1100		The required fields are:
1101
1102		+ `UID`: a unique identifier
1103		+ `Session`: an identifier for the analytical session
1104		+ `Sample`: a sample identifier
1105		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values
1106
1107		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
1108		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
1109		and `d49` are optional, and set to NaN by default.
1110
1111		**Parameters**
1112
1113		+ `txt`: the csv string to read
1114		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
1115		whichever appears most often in `txt`.
1116		+ `session`: set `Session` field to this string for all analyses
1117		'''
1118		if sep == '':
1119			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
1120		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
1121		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]
1122
1123		if session != '':
1124			for r in data:
1125				r['Session'] = session
1126
1127		self += data
1128		self.refresh()

Read txt string in csv format to load analysis data into a D47data object.

In the csv string, spaces before and after field separators (',' by default) are optional. Each line corresponds to a single analysis.

The required fields are:

  • UID: a unique identifier
  • Session: an identifier for the analytical session
  • Sample: a sample identifier
  • d45, d46, and at least one of d47 or d48: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as D17O (in ‰ relative to VSMOW, λ = self.LAMBDA_17), and are otherwise assumed to be zero. Working-gas deltas d47, d48 and d49 are optional, and set to NaN by default.

Parameters

  • txt: the csv string to read
  • sep: csv separator delimiting the fields. By default, use ',', ';', or a tab character, whichever appears most often in txt.
  • session: set Session field to this string for all analyses
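
As a minimal sketch, the same kind of data may thus be loaded from an inline string rather than from a file (values abridged for illustration):

import D47crunch

mydata = D47crunch.D47data()
mydata.input('''UID, Session, Sample, d45, d46, d47
A01, S1, ETH-1, 5.795, 11.628, 16.894
A02, S1, ETH-2, -6.059, -4.817, -11.635''')
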
@make_verbal
def wg(self, samples=None, a18_acid=None):
1131	@make_verbal
1132	def wg(self, samples = None, a18_acid = None):
1133		'''
1134		Compute bulk composition of the working gas for each session based on
1135		the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
1136		`self.Nominal_d18O_VPDB`.
1137		'''
1138
1139		self.msg('Computing WG composition:')
1140
1141		if a18_acid is None:
1142			a18_acid = self.ALPHA_18O_ACID_REACTION
1143		if samples is None:
1144			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
1145
1146		assert a18_acid, 'Acid fractionation factor should not be zero.'
1147
1148		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
1149		R45R46_standards = {}
1150		for sample in samples:
1151			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
1152			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
1153			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
1154			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
1155			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid
1156
1157			C12_s = 1 / (1 + R13_s)
1158			C13_s = R13_s / (1 + R13_s)
1159			C16_s = 1 / (1 + R17_s + R18_s)
1160			C17_s = R17_s / (1 + R17_s + R18_s)
1161			C18_s = R18_s / (1 + R17_s + R18_s)
1162
1163			C626_s = C12_s * C16_s ** 2
1164			C627_s = 2 * C12_s * C16_s * C17_s
1165			C628_s = 2 * C12_s * C16_s * C18_s
1166			C636_s = C13_s * C16_s ** 2
1167			C637_s = 2 * C13_s * C16_s * C17_s
1168			C727_s = C12_s * C17_s ** 2
1169
1170			R45_s = (C627_s + C636_s) / C626_s
1171			R46_s = (C628_s + C637_s + C727_s) / C626_s
1172			R45R46_standards[sample] = (R45_s, R46_s)
1173		
1174		for s in self.sessions:
1175			db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
1176			assert db, f'No sample from {samples} found in session "{s}".'
1177# 			dbsamples = sorted({r['Sample'] for r in db})
1178
1179			X = [r['d45'] for r in db]
1180			Y = [R45R46_standards[r['Sample']][0] for r in db]
1181			x1, x2 = np.min(X), np.max(X)
1182
1183			if x1 < x2:
1184				wgcoord = x1/(x1-x2)
1185			else:
1186				wgcoord = 999
1187
1188			if wgcoord < -.5 or wgcoord > 1.5:
1189				# unreasonable to extrapolate to d45 = 0
1190				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1191			else :
1192				# d45 = 0 is reasonably well bracketed
1193				R45_wg = np.polyfit(X, Y, 1)[1]
1194
1195			X = [r['d46'] for r in db]
1196			Y = [R45R46_standards[r['Sample']][1] for r in db]
1197			x1, x2 = np.min(X), np.max(X)
1198
1199			if x1 < x2:
1200				wgcoord = x1/(x1-x2)
1201			else:
1202				wgcoord = 999
1203
1204			if wgcoord < -.5 or wgcoord > 1.5:
1205				# unreasonable to extrapolate to d46 = 0
1206				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
1207			else :
1208				# d46 = 0 is reasonably well bracketed
1209				R46_wg = np.polyfit(X, Y, 1)[1]
1210
1211			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)
1212
1213			self.msg(f'Session {s} WG:   δ13C_VPDB = {d13Cwg_VPDB:.3f}   δ18O_VSMOW = {d18Owg_VSMOW:.3f}')
1214
1215			self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
1216			self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
1217			for r in self.sessions[s]['data']:
1218				r['d13Cwg_VPDB'] = d13Cwg_VPDB
1219				r['d18Owg_VSMOW'] = d18Owg_VSMOW

Compute bulk composition of the working gas for each session based on the carbonate standards defined in both self.Nominal_d13C_VPDB and self.Nominal_d18O_VPDB.
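
By default all standards defined in both nominal dictionaries are used, but a subset of standards and a non-default acid fractionation factor may be passed explicitly (the argument values below are illustrative):

mydata.wg(samples = ['ETH-1', 'ETH-2'], a18_acid = 1.008129)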

def compute_bulk_delta(self, R45, R46, D17O=0):
1222	def compute_bulk_delta(self, R45, R46, D17O = 0):
1223		'''
1224		Compute δ13C_VPDB and δ18O_VSMOW,
1225		by solving the generalized form of equation (17) from
1226		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
1227		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
1228		solving the corresponding second-order Taylor polynomial.
1229		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
1230		'''
1231
1232		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17
1233
1234		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
1235		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
1236		C = 2 * self.R18_VSMOW
1237		D = -R46
1238
1239		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
1240		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
1241		cc = A + B + C + D
1242
1243		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
1244
1245		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
1246		R17 = K * R18 ** self.LAMBDA_17
1247		R13 = R45 - 2 * R17
1248
1249		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)
1250
1251		return d13C_VPDB, d18O_VSMOW

Compute δ13C_VPDB and δ18O_VSMOW, by solving the generalized form of equation (17) from Brand et al. (2010), assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and solving the corresponding second-order Taylor polynomial. (Appendix A of Daëron et al., 2016)
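
As a sketch, assuming mydata is a D47data instance and using loosely plausible isobar ratios:

d13C_VPDB, d18O_VSMOW = mydata.compute_bulk_delta(R45 = 0.01196, R46 = 0.00417)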

@make_verbal
def crunch(self, verbose=''):
1254	@make_verbal
1255	def crunch(self, verbose = ''):
1256		'''
1257		Compute bulk composition and raw clumped isotope anomalies for all analyses.
1258		'''
1259		for r in self:
1260			self.compute_bulk_and_clumping_deltas(r)
1261		self.standardize_d13C()
1262		self.standardize_d18O()
1263		self.msg(f"Crunched {len(self)} analyses.")

Compute bulk composition and raw clumped isotope anomalies for all analyses.

def fill_in_missing_info(self, session='mySession'):
1266	def fill_in_missing_info(self, session = 'mySession'):
1267		'''
1268		Fill in optional fields with default values
1269		'''
1270		for i,r in enumerate(self):
1271			if 'D17O' not in r:
1272				r['D17O'] = 0.
1273			if 'UID' not in r:
1274				r['UID'] = f'{i+1}'
1275			if 'Session' not in r:
1276				r['Session'] = session
1277			for k in ['d47', 'd48', 'd49']:
1278				if k not in r:
1279					r[k] = np.nan

Fill in optional fields with default values

def standardize_d13C(self):
1282	def standardize_d13C(self):
1283		'''
1284		Perform δ13C standardization within each session `s` according to
1285		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
1286		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
1287		may be redefined arbitrarily at a later stage.
1288		'''
1289		for s in self.sessions:
1290			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
1291				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
1292				X,Y = zip(*XY)
1293				if self.sessions[s]['d13C_standardization_method'] == '1pt':
1294					offset = np.mean(Y) - np.mean(X)
1295					for r in self.sessions[s]['data']:
1296						r['d13C_VPDB'] += offset				
1297				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
1298					a,b = np.polyfit(X,Y,1)
1299					for r in self.sessions[s]['data']:
1300						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b

Perform δ13C standardization within each session s according to self.sessions[s]['d13C_standardization_method'], which is defined by default by D47data.refresh_sessions() as equal to self.d13C_STANDARDIZATION_METHOD, but may be redefined arbitrarily at a later stage.

def standardize_d18O(self):
1302	def standardize_d18O(self):
1303		'''
1304		Perform δ18O standardization within each session `s` according to
1305		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
1306		which is defined by default by `D47data.refresh_sessions()` as equal to
1307		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
1308		'''
1309		for s in self.sessions:
1310			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
1311				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
1312				X,Y = zip(*XY)
1313				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
1314				if self.sessions[s]['d18O_standardization_method'] == '1pt':
1315					offset = np.mean(Y) - np.mean(X)
1316					for r in self.sessions[s]['data']:
1317						r['d18O_VSMOW'] += offset				
1318				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
1319					a,b = np.polyfit(X,Y,1)
1320					for r in self.sessions[s]['data']:
1321						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b

Perform δ18O standardization within each session s according to self.ALPHA_18O_ACID_REACTION and self.sessions[s]['d18O_standardization_method'], which is defined by default by D47data.refresh_sessions() as equal to self.d18O_STANDARDIZATION_METHOD, but may be redefined arbitrarily at a later stage.

def compute_bulk_and_clumping_deltas(self, r):
1324	def compute_bulk_and_clumping_deltas(self, r):
1325		'''
1326		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
1327		'''
1328
1329		# Compute working gas R13, R18, and isobar ratios
1330		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
1331		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
1332		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)
1333
1334		# Compute analyte isobar ratios
1335		R45 = (1 + r['d45'] / 1000) * R45_wg
1336		R46 = (1 + r['d46'] / 1000) * R46_wg
1337		R47 = (1 + r['d47'] / 1000) * R47_wg
1338		R48 = (1 + r['d48'] / 1000) * R48_wg
1339		R49 = (1 + r['d49'] / 1000) * R49_wg
1340
1341		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
1342		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
1343		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW
1344
1345		# Compute stochastic isobar ratios of the analyte
1346		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
1347			R13, R18, D17O = r['D17O']
1348		)
1349
1350		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
1351		# and raise a warning if the corresponding anomalies exceed 0.02 ppm.
1352		if (R45 / R45stoch - 1) > 5e-8:
1353			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
1354		if (R46 / R46stoch - 1) > 5e-8:
1355			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')
1356
1357		# Compute raw clumped isotope anomalies
1358		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
1359		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
1360		r['D49raw'] = 1000 * (R49 / R49stoch - 1)

Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis r.

def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1363	def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
1364		'''
1365		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
1366		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
1367		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1368		'''
1369
1370		# Compute R17
1371		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17
1372
1373		# Compute isotope concentrations
1374		C12 = (1 + R13) ** -1
1375		C13 = C12 * R13
1376		C16 = (1 + R17 + R18) ** -1
1377		C17 = C16 * R17
1378		C18 = C16 * R18
1379
1380		# Compute stochastic isotopologue concentrations
1381		C626 = C16 * C12 * C16
1382		C627 = C16 * C12 * C17 * 2
1383		C628 = C16 * C12 * C18 * 2
1384		C636 = C16 * C13 * C16
1385		C637 = C16 * C13 * C17 * 2
1386		C638 = C16 * C13 * C18 * 2
1387		C727 = C17 * C12 * C17
1388		C728 = C17 * C12 * C18 * 2
1389		C737 = C17 * C13 * C17
1390		C738 = C17 * C13 * C18 * 2
1391		C828 = C18 * C12 * C18
1392		C838 = C18 * C13 * C18
1393
1394		# Compute stochastic isobar ratios
1395		R45 = (C636 + C627) / C626
1396		R46 = (C628 + C637 + C727) / C626
1397		R47 = (C638 + C728 + C737) / C626
1398		R48 = (C738 + C828) / C626
1399		R49 = C838 / C626
1400
1401		# Account for stochastic anomalies
1402		R47 *= 1 + D47 / 1000
1403		R48 *= 1 + D48 / 1000
1404		R49 *= 1 + D49 / 1000
1405
1406		# Return isobar ratios
1407		return R45, R46, R47, R48, R49

Compute isobar ratios for a sample with isotopic ratios R13 and R18, optionally accounting for non-zero values of Δ17O (D17O) and clumped isotope anomalies (D47, D48, D49), all expressed in permil.
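
The following standalone sketch reproduces this arithmetic for the 45 and 46 isobars, assuming the default isotopic parameters used by D47crunch (R13_VPDB = 0.01118, R17_VSMOW = 0.00038475, R18_VSMOW = 0.0020052, λ17 = 0.528); the function name is hypothetical:

import numpy as np

R13_VPDB, R17_VSMOW, R18_VSMOW, LAMBDA_17 = 0.01118, 0.00038475, 0.0020052, 0.528

def isobar_ratios_45_46(R13, R18, D17O = 0.):
	# mass-dependent R17, with an optional 17O anomaly (in permil)
	R17 = R17_VSMOW * np.exp(D17O / 1000) * (R18 / R18_VSMOW) ** LAMBDA_17
	# isotope concentrations
	C12 = (1 + R13) ** -1
	C13 = C12 * R13
	C16 = (1 + R17 + R18) ** -1
	C17 = C16 * R17
	C18 = C16 * R18
	# stochastic isobar ratios relative to mass 44 (12C16O2)
	R45 = (C16*C13*C16 + 2*C16*C12*C17) / (C16*C12*C16)
	R46 = (2*C16*C12*C18 + 2*C16*C13*C17 + C17*C12*C17) / (C16*C12*C16)
	return R45, R46

R45, R46 = isobar_ratios_45_46(R13_VPDB, R18_VSMOW)  # ≈ 0.01195, 0.00402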

def split_samples(self, samples_to_split='all', grouping='by_session'):
1410	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
1411		'''
1412		Split unknown samples by UID (treat all analyses as different samples)
1413		or by session (treat analyses of a given sample in different sessions as
1414		different samples).
1415
1416		**Parameters**
1417
1418		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
1419		+ `grouping`: `by_uid` | `by_session`
1420		'''
1421		if samples_to_split == 'all':
1422			samples_to_split = [s for s in self.unknowns]
1423		gkeys = {'by_uid':'UID', 'by_session':'Session'}
1424		self.grouping = grouping.lower()
1425		if self.grouping in gkeys:
1426			gkey = gkeys[self.grouping]
1427		for r in self:
1428			if r['Sample'] in samples_to_split:
1429				r['Sample_original'] = r['Sample']
1430				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
1431			elif r['Sample'] in self.unknowns:
1432				r['Sample_original'] = r['Sample']
1433		self.refresh_samples()

Split unknown samples by UID (treat all analyses as different samples) or by session (treat analyses of a given sample in different sessions as different samples).

Parameters

  • samples_to_split: a list of samples to split, e.g., ['IAEA-C1', 'IAEA-C2']
  • grouping: by_uid | by_session
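
For example (hypothetical call, using an unknown from the tutorial data), to treat each replicate analysis of a given unknown as its own sample:

mydata.split_samples(['MYSAMPLE-1'], grouping = 'by_uid')
mydata.standardize()

Each analysis is then standardized under a name such as MYSAMPLE-1__A02.
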
def unsplit_samples(self, tables=False):
1436	def unsplit_samples(self, tables = False):
1437		'''
1438		Reverse the effects of `D47data.split_samples()`.
1439		
1440		This should only be used after `D4xdata.standardize()` with `method='pooled'`.
1441		
1442		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
1443		probably use `D4xdata.combine_samples()` instead to reverse the effects of
1444		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
1445		effects of `D47data.split_samples()` with `grouping='by_session'` (because in
1446		that case session-averaged Δ4x values are statistically independent).
1447		'''
1448		unknowns_old = sorted({s for s in self.unknowns})
1449		CM_old = self.standardization.covar[:,:]
1450		VD_old = self.standardization.params.valuesdict().copy()
1451		vars_old = self.standardization.var_names
1452
1453		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})
1454
1455		Ns = len(vars_old) - len(unknowns_old)
1456		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
1457		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}
1458
1459		W = np.zeros((len(vars_new), len(vars_old)))
1460		W[:Ns,:Ns] = np.eye(Ns)
1461		for u in unknowns_new:
1462			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
1463			if self.grouping == 'by_session':
1464				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
1465			elif self.grouping == 'by_uid':
1466				weights = [1 for s in splits]
1467			sw = sum(weights)
1468			weights = [w/sw for w in weights]
1469			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]
1470
1471		CM_new = W @ CM_old @ W.T
1472		V = W @ np.array([[VD_old[k]] for k in vars_old])
1473		VD_new = {k:v[0] for k,v in zip(vars_new, V)}
1474
1475		self.standardization.covar = CM_new
1476		self.standardization.params.valuesdict = lambda : VD_new
1477		self.standardization.var_names = vars_new
1478
1479		for r in self:
1480			if r['Sample'] in self.unknowns:
1481				r['Sample_split'] = r['Sample']
1482				r['Sample'] = r['Sample_original']
1483
1484		self.refresh_samples()
1485		self.consolidate_samples()
1486		self.repeatabilities()
1487
1488		if tables:
1489			self.table_of_analyses()
1490			self.table_of_samples()

Reverse the effects of D47data.split_samples().

This should only be used after D4xdata.standardize() with method='pooled'.

After D4xdata.standardize() with method='indep_sessions', one should probably use D4xdata.combine_samples() instead to reverse the effects of D47data.split_samples() with grouping='by_uid', or w_avg() to reverse the effects of D47data.split_samples() with grouping='by_session' (because in that case session-averaged Δ4x values are statistically independent).
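
A typical round trip, sketched here with the pooled method, is:

mydata.split_samples(grouping = 'by_session')  # split all unknowns
mydata.standardize(method = 'pooled')
mydata.unsplit_samples()  # recombine split samples into weighted averages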

def assign_timestamps(self):
1492	def assign_timestamps(self):
1493		'''
1494		Assign a time field `t` of type `float` to each analysis.
1495
1496		If `TimeTag` is one of the data fields, `t` is equal within a given session
1497		to `TimeTag` minus the mean value of `TimeTag` for that session.
1498		Otherwise, `TimeTag` defaults to the index of each analysis within its
1499		session, and `t` is defined as above.
1500		'''
1501		for session in self.sessions:
1502			sdata = self.sessions[session]['data']
1503			try:
1504				t0 = np.mean([r['TimeTag'] for r in sdata])
1505				for r in sdata:
1506					r['t'] = r['TimeTag'] - t0
1507			except KeyError:
1508				t0 = (len(sdata)-1)/2
1509				for t,r in enumerate(sdata):
1510					r['t'] = t - t0

Assign a time field t of type float to each analysis.

If TimeTag is one of the data fields, t is equal within a given session to TimeTag minus the mean value of TimeTag for that session. Otherwise, TimeTag defaults to the index of each analysis within its session, and t is defined as above.
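
For example, a session of four analyses with hypothetical TimeTag values would be centered as follows:

timetags = [10.0, 11.5, 12.0, 14.0]  # hypothetical TimeTag values for one session
t0 = sum(timetags) / len(timetags)   # 11.875
t = [x - t0 for x in timetags]       # [-1.875, -0.375, 0.125, 2.125]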

def report(self):
1513	def report(self):
1514		'''
1515		Prints a report on the standardization fit.
1516		Only applicable after `D4xdata.standardize(method='pooled')`.
1517		'''
1518		report_fit(self.standardization)

Prints a report on the standardization fit. Only applicable after D4xdata.standardize(method='pooled').

def combine_samples(self, sample_groups):
1521	def combine_samples(self, sample_groups):
1522		'''
1523		Combine analyses of different samples to compute weighted average Δ4x
1524		and new error (co)variances corresponding to the groups defined by the `sample_groups`
1525		dictionary.
1526		
1527		Caution: samples are weighted by number of replicate analyses, which is a
1528		reasonable default behavior but is not always optimal (e.g., in the case of strongly
1529		correlated analytical errors for one or more samples).
1530		
1531		Returns a tuple of:
1532		
1533		+ the list of group names
1534		+ an array of the corresponding Δ4x values
1535		+ the corresponding (co)variance matrix
1536		
1537		**Parameters**
1538
1539		+ `sample_groups`: a dictionary of the form:
1540		```py
1541		{'group1': ['sample_1', 'sample_2'],
1542		 'group2': ['sample_3', 'sample_4', 'sample_5']}
1543		```
1544		'''
1545		
1546		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
1547		groups = sorted(sample_groups.keys())
1548		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
1549		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
1550		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
1551		W = np.array([
1552			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
1553			for j in groups])
1554		D4x_new = W @ D4x_old
1555		CM_new = W @ CM_old @ W.T
1556
1557		return groups, D4x_new[:,0], CM_new

Combine analyses of different samples to compute weighted average Δ4x and new error (co)variances corresponding to the groups defined by the sample_groups dictionary.

Caution: samples are weighted by number of replicate analyses, which is a reasonable default behavior but is not always optimal (e.g., in the case of strongly correlated analytical errors for one or more samples).

Returns a tuple of:

  • the list of group names
  • an array of the corresponding Δ4x values
  • the corresponding (co)variance matrix

Parameters

  • sample_groups: a dictionary of the form:
{'group1': ['sample_1', 'sample_2'],
 'group2': ['sample_3', 'sample_4', 'sample_5']}
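
For example, with hypothetical sample names as in the dictionary above:

groups, D47_new, CM_new = mydata.combine_samples({
	'group1': ['sample_1', 'sample_2'],
	'group2': ['sample_3', 'sample_4', 'sample_5'],
	})
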
@make_verbal
def standardize(self, method='pooled', weighted_sessions=[], consolidate=True, consolidate_tables=False, consolidate_plots=False, constraints={}):
1560	@make_verbal
1561	def standardize(self,
1562		method = 'pooled',
1563		weighted_sessions = [],
1564		consolidate = True,
1565		consolidate_tables = False,
1566		consolidate_plots = False,
1567		constraints = {},
1568		):
1569		'''
1570		Compute absolute Δ4x values for all replicate analyses and for sample averages.
1571		If the `method` argument is set to `'pooled'`, the standardization processes all sessions
1572		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
1573		i.e. that their true Δ4x value does not change between sessions
1574		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If the `method` argument is set to
1575		`'indep_sessions'`, the standardization processes each session independently, based only
1576		on anchor analyses.
1577		'''
1578
1579		self.standardization_method = method
1580		self.assign_timestamps()
1581
1582		if method == 'pooled':
1583			if weighted_sessions:
1584				for session_group in weighted_sessions:
1585					if self._4x == '47':
1586						X = D47data([r for r in self if r['Session'] in session_group])
1587					elif self._4x == '48':
1588						X = D48data([r for r in self if r['Session'] in session_group])
1589					X.Nominal_D4x = self.Nominal_D4x.copy()
1590					X.refresh()
1591					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
1592					w = np.sqrt(result.redchi)
1593					self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
1594					for r in X:
1595						r[f'wD{self._4x}raw'] *= w
1596			else:
1597				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
1598				for r in self:
1599					r[f'wD{self._4x}raw'] = 1.
1600
1601			params = Parameters()
1602			for k,session in enumerate(self.sessions):
1603				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
1604				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
1605				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
1606				s = pf(session)
1607				params.add(f'a_{s}', value = 0.9)
1608				params.add(f'b_{s}', value = 0.)
1609				params.add(f'c_{s}', value = -0.9)
1610				params.add(f'a2_{s}', value = 0.,
1611# 					vary = self.sessions[session]['scrambling_drift'],
1612					)
1613				params.add(f'b2_{s}', value = 0.,
1614# 					vary = self.sessions[session]['slope_drift'],
1615					)
1616				params.add(f'c2_{s}', value = 0.,
1617# 					vary = self.sessions[session]['wg_drift'],
1618					)
1619				if not self.sessions[session]['scrambling_drift']:
1620					params[f'a2_{s}'].expr = '0'
1621				if not self.sessions[session]['slope_drift']:
1622					params[f'b2_{s}'].expr = '0'
1623				if not self.sessions[session]['wg_drift']:
1624					params[f'c2_{s}'].expr = '0'
1625
1626			for sample in self.unknowns:
1627				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)
1628
1629			for k in constraints:
1630				params[k].expr = constraints[k]
1631
1632			def residuals(p):
1633				R = []
1634				for r in self:
1635					session = pf(r['Session'])
1636					sample = pf(r['Sample'])
1637					if r['Sample'] in self.Nominal_D4x:
1638						R += [ (
1639							r[f'D{self._4x}raw'] - (
1640								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
1641								+ p[f'b_{session}'] * r[f'd{self._4x}']
1642								+	p[f'c_{session}']
1643								+ r['t'] * (
1644									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
1645									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1646									+	p[f'c2_{session}']
1647									)
1648								)
1649							) / r[f'wD{self._4x}raw'] ]
1650					else:
1651						R += [ (
1652							r[f'D{self._4x}raw'] - (
1653								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
1654								+ p[f'b_{session}'] * r[f'd{self._4x}']
1655								+	p[f'c_{session}']
1656								+ r['t'] * (
1657									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
1658									+ p[f'b2_{session}'] * r[f'd{self._4x}']
1659									+	p[f'c2_{session}']
1660									)
1661								)
1662							) / r[f'wD{self._4x}raw'] ]
1663				return R
1664
1665			M = Minimizer(residuals, params)
1666			result = M.least_squares()
1667			self.Nf = result.nfree
1668			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1669			new_names, new_covar, new_se = _fullcovar(result)[:3]
1670			result.var_names = new_names
1671			result.covar = new_covar
1672
1673			for r in self:
1674				s = pf(r["Session"])
1675				a = result.params.valuesdict()[f'a_{s}']
1676				b = result.params.valuesdict()[f'b_{s}']
1677				c = result.params.valuesdict()[f'c_{s}']
1678				a2 = result.params.valuesdict()[f'a2_{s}']
1679				b2 = result.params.valuesdict()[f'b2_{s}']
1680				c2 = result.params.valuesdict()[f'c2_{s}']
1681				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1682				
1683
1684			self.standardization = result
1685
1686			for session in self.sessions:
1687				self.sessions[session]['Np'] = 3
1688				for k in ['scrambling', 'slope', 'wg']:
1689					if self.sessions[session][f'{k}_drift']:
1690						self.sessions[session]['Np'] += 1
1691
1692			if consolidate:
1693				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
1694			return result
1695
1696
1697		elif method == 'indep_sessions':
1698
1699			if weighted_sessions:
1700				for session_group in weighted_sessions:
1701					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
1702					X.Nominal_D4x = self.Nominal_D4x.copy()
1703					X.refresh()
1704					# This is only done to assign r['wD47raw'] for r in X:
1705					X.standardize(method = method, weighted_sessions = [], consolidate = False)
1706					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
1707			else:
1708				self.msg('All weights set to 1 ‰')
1709				for r in self:
1710					r[f'wD{self._4x}raw'] = 1
1711
1712			for session in self.sessions:
1713				s = self.sessions[session]
1714				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
1715				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
1716				s['Np'] = sum(p_active)
1717				sdata = s['data']
1718
1719				A = np.array([
1720					[
1721						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
1722						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
1723						1 / r[f'wD{self._4x}raw'],
1724						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
1725						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
1726						r['t'] / r[f'wD{self._4x}raw']
1727						]
1728					for r in sdata if r['Sample'] in self.anchors
1729					])[:,p_active] # only keep columns for the active parameters
1730				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
1731				s['Na'] = Y.size
1732				CM = linalg.inv(A.T @ A)
1733				bf = (CM @ A.T @ Y).T[0,:]
1734				k = 0
1735				for n,a in zip(p_names, p_active):
1736					if a:
1737						s[n] = bf[k]
1738# 						self.msg(f'{n} = {bf[k]}')
1739						k += 1
1740					else:
1741						s[n] = 0.
1742# 						self.msg(f'{n} = 0.0')
1743
1744				for r in sdata :
1745					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
1746					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
1747					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])
1748
1749				s['CM'] = np.zeros((6,6))
1750				i = 0
1751				k_active = [j for j,a in enumerate(p_active) if a]
1752				for j,a in enumerate(p_active):
1753					if a:
1754						s['CM'][j,k_active] = CM[i,:]
1755						i += 1
1756
1757			if not weighted_sessions:
1758				w = self.rmswd()['rmswd']
1759				for r in self:
1760					r[f'wD{self._4x}'] *= w
1761					r[f'wD{self._4x}raw'] *= w
1762				for session in self.sessions:
1763					self.sessions[session]['CM'] *= w**2
1764
1765			for session in self.sessions:
1766				s = self.sessions[session]
1767				s['SE_a'] = s['CM'][0,0]**.5
1768				s['SE_b'] = s['CM'][1,1]**.5
1769				s['SE_c'] = s['CM'][2,2]**.5
1770				s['SE_a2'] = s['CM'][3,3]**.5
1771				s['SE_b2'] = s['CM'][4,4]**.5
1772				s['SE_c2'] = s['CM'][5,5]**.5
1773
1774			if not weighted_sessions:
1775				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
1776			else:
1777				self.Nf = 0
1778				for sg in weighted_sessions:
1779					self.Nf += self.rmswd(sessions = sg)['Nf']
1780
1781			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
1782
1783			avgD4x = {
1784				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
1785				for sample in self.samples
1786				}
1787			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
1788			rD4x = (chi2/self.Nf)**.5
1789			self.repeatability[f'sigma_{self._4x}'] = rD4x
1790
1791			if consolidate:
1792				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)

Compute absolute Δ4x values for all replicate analyses and for sample averages. If the method argument is set to 'pooled', the standardization processes all sessions in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous, i.e. that their true Δ4x value does not change between sessions (Daëron, 2021). If the method argument is set to 'indep_sessions', the standardization processes each session independently, based only on anchor analyses.
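
For reference, the model fitted to each analysis by the pooled standardization (cf. residuals() in the source above) may be written as:

\[
\Delta_{4x}^{\mathrm{raw}} = a\,\Delta_{4x} + b\,\delta_{4x} + c + t \left( a_2\,\Delta_{4x} + b_2\,\delta_{4x} + c_2 \right)
\]

where Δ4x is the nominal value for anchors and a free parameter for unknowns, and (a, b, c, a2, b2, c2) are per-session parameters.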

def standardization_error(self, session, d4x, D4x, t=0):
1795	def standardization_error(self, session, d4x, D4x, t = 0):
1796		'''
1797		Compute standardization error for a given session and
1798		(δ47, Δ47) composition.
1799		'''
1800		a = self.sessions[session]['a']
1801		b = self.sessions[session]['b']
1802		c = self.sessions[session]['c']
1803		a2 = self.sessions[session]['a2']
1804		b2 = self.sessions[session]['b2']
1805		c2 = self.sessions[session]['c2']
1806		CM = self.sessions[session]['CM']
1807
1808		x, y = D4x, d4x
1809		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
1810# 		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
1811		dxdy = -(b+b2*t) / (a+a2*t)
1812		dxdz = 1. / (a+a2*t)
1813		dxda = -x / (a+a2*t)
1814		dxdb = -y / (a+a2*t)
1815		dxdc = -1. / (a+a2*t)
1816		dxda2 = -x * t / (a+a2*t)
1817		dxdb2 = -y * t / (a+a2*t)
1818		dxdc2 = -t / (a+a2*t)
1819		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
1820		sx = (V @ CM @ V.T) ** .5
1821		return sx

Compute standardization error for a given session and (δ47, Δ47) composition.
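
Inverting the session model for x = Δ4x and propagating the session covariance matrix CM over the parameters (a, b, c, a2, b2, c2) gives:

\[
x = \frac{z - b\,y - b_2\,y\,t - c - c_2\,t}{a + a_2\,t},
\qquad
\sigma_x = \sqrt{\nabla x^{\mathsf{T}} \,\mathrm{CM}\, \nabla x}
\]

with z = Δ4x,raw and y = δ4x; the partial derivatives in the source above implement exactly this gradient.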

@make_verbal
def summary(self, dir='output', filename=None, save_to_file=True, print_out=True):
1824	@make_verbal
1825	def summary(self,
1826		dir = 'output',
1827		filename = None,
1828		save_to_file = True,
1829		print_out = True,
1830		):
1831		'''
1832		Print out and/or save to disk a summary of the standardization results.
1833
1834		**Parameters**
1835
1836		+ `dir`: the directory in which to save the table
1837		+ `filename`: the name of the csv file to write to
1838		+ `save_to_file`: whether to save the table to disk
1839		+ `print_out`: whether to print out the table
1840		'''
1841
1842		out = []
1843		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
1844		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
1845		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
1846		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
1847		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
1848		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
1849		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
1850		out += [['Model degrees of freedom', f"{self.Nf}"]]
1851		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
1852		out += [['Standardization method', self.standardization_method]]
1853
1854		if save_to_file:
1855			if not os.path.exists(dir):
1856				os.makedirs(dir)
1857			if filename is None:
1858				filename = f'D{self._4x}_summary.csv'
1859			with open(f'{dir}/{filename}', 'w') as fid:
1860				fid.write(make_csv(out))
1861		if print_out:
1862			self.msg('\n' + pretty_table(out, header = 0))

Print out and/or save to disk a summary of the standardization results.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table

@make_verbal
def table_of_sessions(self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
1865	@make_verbal
1866	def table_of_sessions(self,
1867		dir = 'output',
1868		filename = None,
1869		save_to_file = True,
1870		print_out = True,
1871		output = None,
1872		):
1873		'''
1874		Print out and/or save to disk a table of sessions.
1875
1876		**Parameters**
1877
1878		+ `dir`: the directory in which to save the table
1879		+ `filename`: the name of the csv file to write to
1880		+ `save_to_file`: whether to save the table to disk
1881		+ `print_out`: whether to print out the table
1882		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1883		    if set to `'raw'`: return a list of list of strings
1884		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1885		'''
1886		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
1887		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
1888		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])
1889
1890		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
1891		if include_a2:
1892			out[-1] += ['a2 ± SE']
1893		if include_b2:
1894			out[-1] += ['b2 ± SE']
1895		if include_c2:
1896			out[-1] += ['c2 ± SE']
1897		for session in self.sessions:
1898			out += [[
1899				session,
1900				f"{self.sessions[session]['Na']}",
1901				f"{self.sessions[session]['Nu']}",
1902				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
1903				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
1904				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
1905				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
1906				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
1907				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
1908				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
1909				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
1910				]]
1911			if include_a2:
1912				if self.sessions[session]['scrambling_drift']:
1913					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
1914				else:
1915					out[-1] += ['']
1916			if include_b2:
1917				if self.sessions[session]['slope_drift']:
1918					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
1919				else:
1920					out[-1] += ['']
1921			if include_c2:
1922				if self.sessions[session]['wg_drift']:
1923					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
1924				else:
1925					out[-1] += ['']
1926
1927		if save_to_file:
1928			if not os.path.exists(dir):
1929				os.makedirs(dir)
1930			if filename is None:
1931				filename = f'D{self._4x}_sessions.csv'
1932			with open(f'{dir}/{filename}', 'w') as fid:
1933				fid.write(make_csv(out))
1934		if print_out:
1935			self.msg('\n' + pretty_table(out))
1936		if output == 'raw':
1937			return out
1938		elif output == 'pretty':
1939			return pretty_table(out)

Print out and/or save to disk a table of sessions.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
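
For example, to retrieve the table programmatically without writing or printing anything:

tbl = mydata.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
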
@make_verbal
def table_of_analyses(self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
1942	@make_verbal
1943	def table_of_analyses(
1944		self,
1945		dir = 'output',
1946		filename = None,
1947		save_to_file = True,
1948		print_out = True,
1949		output = None,
1950		):
1951		'''
1952		Print out and/or save to disk a table of analyses.
1953
1954		**Parameters**
1955
1956		+ `dir`: the directory in which to save the table
1957		+ `filename`: the name of the csv file to write to
1958		+ `save_to_file`: whether to save the table to disk
1959		+ `print_out`: whether to print out the table
1960		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
1961		    if set to `'raw'`: return a list of list of strings
1962		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
1963		'''
1964
1965		out = [['UID','Session','Sample']]
1966		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
1967		for f in extra_fields:
1968			out[-1] += [f[0]]
1969		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
1970		for r in self:
1971			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
1972			for f in extra_fields:
1973				out[-1] += [f"{r[f[0]]:{f[1]}}"]
1974			out[-1] += [
1975				f"{r['d13Cwg_VPDB']:.3f}",
1976				f"{r['d18Owg_VSMOW']:.3f}",
1977				f"{r['d45']:.6f}",
1978				f"{r['d46']:.6f}",
1979				f"{r['d47']:.6f}",
1980				f"{r['d48']:.6f}",
1981				f"{r['d49']:.6f}",
1982				f"{r['d13C_VPDB']:.6f}",
1983				f"{r['d18O_VSMOW']:.6f}",
1984				f"{r['D47raw']:.6f}",
1985				f"{r['D48raw']:.6f}",
1986				f"{r['D49raw']:.6f}",
1987				f"{r[f'D{self._4x}']:.6f}"
1988				]
1989		if save_to_file:
1990			if not os.path.exists(dir):
1991				os.makedirs(dir)
1992			if filename is None:
1993				filename = f'D{self._4x}_analyses.csv'
1994			with open(f'{dir}/{filename}', 'w') as fid:
1995				fid.write(make_csv(out))
1996		if print_out:
1997			self.msg('\n' + pretty_table(out))
1998		return out

Print out and/or save to disk a table of analyses.

Parameters

  • dir: the directory in which to save the table
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the table to disk
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])

@make_verbal
def covar_table(self, correl=False, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
2000	@make_verbal
2001	def covar_table(
2002		self,
2003		correl = False,
2004		dir = 'output',
2005		filename = None,
2006		save_to_file = True,
2007		print_out = True,
2008		output = None,
2009		):
2010		'''
2011		Print out, save to disk and/or return the variance-covariance matrix of D4x
2012		for all unknown samples.
2013
2014		**Parameters**
2015
2016		+ `dir`: the directory in which to save the csv
2017		+ `filename`: the name of the csv file to write to
2018		+ `save_to_file`: whether to save the csv
2019		+ `print_out`: whether to print out the matrix
2020		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
2021		    if set to `'raw'`: return a list of list of strings
2022		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2023		'''
2024		samples = sorted([u for u in self.unknowns])
2025		out = [[''] + samples]
2026		for s1 in samples:
2027			out.append([s1])
2028			for s2 in samples:
2029				if correl:
2030					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
2031				else:
2032					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')
2033
2034		if save_to_file:
2035			if not os.path.exists(dir):
2036				os.makedirs(dir)
2037			if filename is None:
2038				if correl:
2039					filename = f'D{self._4x}_correl.csv'
2040				else:
2041					filename = f'D{self._4x}_covar.csv'
2042			with open(f'{dir}/{filename}', 'w') as fid:
2043				fid.write(make_csv(out))
2044		if print_out:
2045			self.msg('\n'+pretty_table(out))
2046		if output == 'raw':
2047			return out
2048		elif output == 'pretty':
2049			return pretty_table(out)

Print out, save to disk and/or return the variance-covariance matrix of D4x for all unknown samples.

Parameters

  • dir: the directory in which to save the csv
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the csv
  • print_out: whether to print out the matrix
  • output: if set to 'pretty': return a pretty text matrix (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])
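
For example, for a D47data object:

mydata.covar_table(correl = True, save_to_file = False)  # print the error correlation matrix
mydata.covar_table(print_out = False)                    # save covariances to 'output/D47_covar.csv'
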
@make_verbal
def table_of_samples(self, dir='output', filename=None, save_to_file=True, print_out=True, output=None):
2051	@make_verbal
2052	def table_of_samples(
2053		self,
2054		dir = 'output',
2055		filename = None,
2056		save_to_file = True,
2057		print_out = True,
2058		output = None,
2059		):
2060		'''
2061		Print out, save to disk and/or return a table of samples.
2062
2063		**Parameters**
2064
2065		+ `dir`: the directory in which to save the csv
2066		+ `filename`: the name of the csv file to write to
2067		+ `save_to_file`: whether to save the csv
2068		+ `print_out`: whether to print out the table
2069		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
2070		    if set to `'raw'`: return a list of list of strings
2071		    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
2072		'''
2073
2074		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
2075		for sample in self.anchors:
2076			out += [[
2077				f"{sample}",
2078				f"{self.samples[sample]['N']}",
2079				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2080				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2081				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
2082				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
2083				]]
2084		for sample in self.unknowns:
2085			out += [[
2086				f"{sample}",
2087				f"{self.samples[sample]['N']}",
2088				f"{self.samples[sample]['d13C_VPDB']:.2f}",
2089				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
2090				f"{self.samples[sample][f'D{self._4x}']:.4f}",
2091				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
2092				f"{self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
2093				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
2094				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
2095				]]
2096		if save_to_file:
2097			if not os.path.exists(dir):
2098				os.makedirs(dir)
2099			if filename is None:
2100				filename = f'D{self._4x}_samples.csv'
2101			with open(f'{dir}/{filename}', 'w') as fid:
2102				fid.write(make_csv(out))
2103		if print_out:
2104			self.msg('\n'+pretty_table(out))
2105		if output == 'raw':
2106			return out
2107		elif output == 'pretty':
2108			return pretty_table(out)

Print out, save to disk and/or return a table of samples.

Parameters

  • dir: the directory in which to save the csv
  • filename: the name of the csv file to write to
  • save_to_file: whether to save the csv
  • print_out: whether to print out the table
  • output: if set to 'pretty': return a pretty text table (see pretty_table()); if set to 'raw': return a list of list of strings (e.g., [['header1', 'header2'], ['0.1', '0.2']])

def plot_sessions(self, dir='output', figsize=(8, 8), filetype='pdf', dpi=100):
2111	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
2112		'''
2113		Generate session plots and save them to disk.
2114
2115		**Parameters**
2116
2117		+ `dir`: the directory in which to save the plots
2118		+ `figsize`: the width and height (in inches) of each plot
2119		+ `filetype`: 'pdf' or 'png'
2120		+ `dpi`: resolution for PNG output
2121		'''
2122		if not os.path.exists(dir):
2123			os.makedirs(dir)
2124
2125		for session in self.sessions:
2126			sp = self.plot_single_session(session, xylimits = 'constant')
2127			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
2128			ppl.close(sp.fig)

Generate session plots and save them to disk.

Parameters

  • dir: the directory in which to save the plots
  • figsize: the width and height (in inches) of each plot
  • filetype: 'pdf' or 'png'
  • dpi: resolution for PNG output
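
For example, to save PNG versions of the session plots at higher resolution:

mydata.plot_sessions(dir = 'sessionplots', filetype = 'png', dpi = 200)
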
@make_verbal
def consolidate_samples(self):
2132	@make_verbal
2133	def consolidate_samples(self):
2134		'''
2135		Compile various statistics for each sample.
2136
2137		For each anchor sample:
2138
2139		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
2140		+ `SE_D47` or `SE_D48`: set to zero by definition
2141
2142		For each unknown sample:
2143
2144		+ `D47` or `D48`: the standardized Δ4x value for this unknown
2145		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
2146
2147		For each anchor and unknown:
2148
2149		+ `N`: the total number of analyses of this sample
2150		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
2151		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
2152		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
2153		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
2154		variance, indicating whether the Δ4x repeatability of this sample differs significantly from
2155		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
2156		'''
2157		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
2158		for sample in self.samples:
2159			self.samples[sample]['N'] = len(self.samples[sample]['data'])
2160			if self.samples[sample]['N'] > 1:
2161				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])
2162
2163			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
2164			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])
2165
2166			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
2167			if len(D4x_pop) > 2:
2168				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]
2169			
2170		if self.standardization_method == 'pooled':
2171			for sample in self.anchors:
2172				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2173				self.samples[sample][f'SE_D{self._4x}'] = 0.
2174			for sample in self.unknowns:
2175				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
2176				try:
2177					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
2178				except ValueError:
2179					# when `sample` is constrained by self.standardize(constraints = {...}),
2180					# it is no longer listed in self.standardization.var_names.
2181					# Temporary fix: define SE as zero for now
2182					self.samples[sample][f'SE_D{self._4x}'] = 0.
2183
2184		elif self.standardization_method == 'indep_sessions':
2185			for sample in self.anchors:
2186				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
2187				self.samples[sample][f'SE_D{self._4x}'] = 0.
2188			for sample in self.unknowns:
2189				self.msg(f'Consolidating sample {sample}')
2190				self.unknowns[sample][f'session_D{self._4x}'] = {}
2191				session_avg = []
2192				for session in self.sessions:
2193					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
2194					if sdata:
2195						self.msg(f'{sample} found in session {session}')
2196						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
2197						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
2198						# !! TODO: sigma_s below does not account for temporal changes in standardization error
2199						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
2200						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
2201						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
2202						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
2203				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
2204				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
2205				wsum = sum([weights[s] for s in weights])
2206				for s in weights:
2207					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]
2208
2209		for r in self:
2210			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']

Compile various statistics for each sample.

For each anchor sample:

  • D47 or D48: the nominal Δ4x value for this anchor, specified by self.Nominal_D4x
  • SE_D47 or SE_D48: set to zero by definition

For each unknown sample:

  • D47 or D48: the standardized Δ4x value for this unknown
  • SE_D47 or SE_D48: the standard error of Δ4x for this unknown

For each anchor and unknown:

  • N: the total number of analyses of this sample
  • SD_D47 or SD_D48: the “sample” (in the statistical sense) standard deviation for this sample
  • d13C_VPDB: the average δ13C_VPDB value for this sample
  • d18O_VSMOW: the average δ18O_VSMOW value for this sample (as CO2)
  • p_Levene: the p-value from a Levene test of equal variance, indicating whether the Δ4x repeatability of this sample differs significantly from that observed for the reference sample specified by self.LEVENE_REF_SAMPLE.
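
After consolidation, these statistics can be read directly from the samples dictionary, e.g. for a D47data object:

s = mydata.samples['MYSAMPLE-1']
print(s['N'], s['D47'], s['SE_D47'])
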
def consolidate_sessions(self):
2214	def consolidate_sessions(self):
2215		'''
2216		Compute various statistics for each session.
2217
2218		+ `Na`: Number of anchor analyses in the session
2219		+ `Nu`: Number of unknown analyses in the session
2220		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
2221		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
2222		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
2223		+ `a`: scrambling factor
2224		+ `b`: compositional slope
2225		+ `c`: WG offset
2226		+ `SE_a`: Model standard error of `a`
2227		+ `SE_b`: Model standard error of `b`
2228		+ `SE_c`: Model standard error of `c`
2229		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
2230		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
2231		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
2232		+ `a2`: scrambling factor drift
2233		+ `b2`: compositional slope drift
2234		+ `c2`: WG offset drift
2235		+ `Np`: Number of standardization parameters to fit
2236		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
2237		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
2238		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
2239		'''
2240		for session in self.sessions:
2241			if 'd13Cwg_VPDB' not in self.sessions[session]:
2242				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
2243			if 'd18Owg_VSMOW' not in self.sessions[session]:
2244				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
2245			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
2246			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])
2247
2248			self.msg(f'Computing repeatabilities for session {session}')
2249			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
2250			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
2251			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
2252
2253		if self.standardization_method == 'pooled':
2254			for session in self.sessions:
2255
2256				# different (better?) computation of D4x repeatability for each session:
2257				sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
2258				self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5
2259
2260				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
2261				i = self.standardization.var_names.index(f'a_{pf(session)}')
2262				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5
2263
2264				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
2265				i = self.standardization.var_names.index(f'b_{pf(session)}')
2266				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5
2267
2268				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
2269				i = self.standardization.var_names.index(f'c_{pf(session)}')
2270				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5
2271
2272				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
2273				if self.sessions[session]['scrambling_drift']:
2274					i = self.standardization.var_names.index(f'a2_{pf(session)}')
2275					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
2276				else:
2277					self.sessions[session]['SE_a2'] = 0.
2278
2279				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
2280				if self.sessions[session]['slope_drift']:
2281					i = self.standardization.var_names.index(f'b2_{pf(session)}')
2282					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
2283				else:
2284					self.sessions[session]['SE_b2'] = 0.
2285
2286				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
2287				if self.sessions[session]['wg_drift']:
2288					i = self.standardization.var_names.index(f'c2_{pf(session)}')
2289					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
2290				else:
2291					self.sessions[session]['SE_c2'] = 0.
2292
2293				i = self.standardization.var_names.index(f'a_{pf(session)}')
2294				j = self.standardization.var_names.index(f'b_{pf(session)}')
2295				k = self.standardization.var_names.index(f'c_{pf(session)}')
2296				CM = np.zeros((6,6))
2297				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
2298				try:
2299					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
2300					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
2301					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
2302					try:
2303						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2304						CM[3,4] = self.standardization.covar[i2,j2]
2305						CM[4,3] = self.standardization.covar[j2,i2]
2306					except ValueError:
2307						pass
2308					try:
2309						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2310						CM[3,5] = self.standardization.covar[i2,k2]
2311						CM[5,3] = self.standardization.covar[k2,i2]
2312					except ValueError:
2313						pass
2314				except ValueError:
2315					pass
2316				try:
2317					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
2318					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
2319					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
2320					try:
2321						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2322						CM[4,5] = self.standardization.covar[j2,k2]
2323						CM[5,4] = self.standardization.covar[k2,j2]
2324					except ValueError:
2325						pass
2326				except ValueError:
2327					pass
2328				try:
2329					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
2330					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
2331					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
2332				except ValueError:
2333					pass
2334
2335				self.sessions[session]['CM'] = CM
2336
2337		elif self.standardization_method == 'indep_sessions':
2338			pass # Not implemented yet

Compute various statistics for each session.

  • Na: Number of anchor analyses in the session
  • Nu: Number of unknown analyses in the session
  • r_d13C_VPDB: δ13C_VPDB repeatability of analyses within the session
  • r_d18O_VSMOW: δ18O_VSMOW repeatability of analyses within the session
  • r_D47 or r_D48: Δ4x repeatability of analyses within the session
  • a: scrambling factor
  • b: compositional slope
  • c: WG offset
  • SE_a: Model standard error of a
  • SE_b: Model standard error of b
  • SE_c: Model standard error of c
  • scrambling_drift (boolean): whether to allow a temporal drift in the scrambling factor (a)
  • slope_drift (boolean): whether to allow a temporal drift in the compositional slope (b)
  • wg_drift (boolean): whether to allow a temporal drift in the WG offset (c)
  • a2: scrambling factor drift
  • b2: compositional slope drift
  • c2: WG offset drift
  • Np: Number of standardization parameters to fit
  • CM: model covariance matrix for (a, b, c, a2, b2, c2)
  • d13Cwg_VPDB: δ13C_VPDB of WG
  • d18Owg_VSMOW: δ18O_VSMOW of WG

@make_verbal
def repeatabilities(self):
2341	@make_verbal
2342	def repeatabilities(self):
2343		'''
2344		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
2345		(for all samples, for anchors, and for unknowns).
2346		'''
2347		self.msg('Computing repeatabilities for all sessions')
2348
2349		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
2350		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
2351		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
2352		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
2353		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')

Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x (for all samples, for anchors, and for unknowns).

@make_verbal
def consolidate(self, tables=True, plots=True):
2356	@make_verbal
2357	def consolidate(self, tables = True, plots = True):
2358		'''
2359		Collect information about samples, sessions and repeatabilities.
2360		'''
2361		self.consolidate_samples()
2362		self.consolidate_sessions()
2363		self.repeatabilities()
2364
2365		if tables:
2366			self.summary()
2367			self.table_of_sessions()
2368			self.table_of_analyses()
2369			self.table_of_samples()
2370
2371		if plots:
2372			self.plot_sessions()

Collect information about samples, sessions and repeatabilities.

@make_verbal
def rmswd(self, samples='all samples', sessions='all sessions'):
2375	@make_verbal
2376	def rmswd(self,
2377		samples = 'all samples',
2378		sessions = 'all sessions',
2379		):
2380		'''
2381		Compute the χ2, the root mean squared weighted deviation
2382		(i.e. the square root of the reduced χ2), and the corresponding degrees of
2383		freedom of the Δ4x values for samples in `samples` and sessions in `sessions`.
2384		
2385		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
2386		'''
2387		if samples == 'all samples':
2388			mysamples = [k for k in self.samples]
2389		elif samples == 'anchors':
2390			mysamples = [k for k in self.anchors]
2391		elif samples == 'unknowns':
2392			mysamples = [k for k in self.unknowns]
2393		else:
2394			mysamples = samples
2395
2396		if sessions == 'all sessions':
2397			sessions = [k for k in self.sessions]
2398
2399		chisq, Nf = 0, 0
2400		for sample in mysamples :
2401			G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2402			if len(G) > 1 :
2403				X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
2404				Nf += (len(G) - 1)
2405				chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
2406		r = (chisq / Nf)**.5 if Nf > 0 else 0
2407		self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
2408		return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}

Compute the χ2, the root mean squared weighted deviation (i.e. the square root of the reduced χ2), and the corresponding degrees of freedom of the Δ4x values for samples in samples and sessions in sessions.

Only used in D4xdata.standardize() with method='indep_sessions'.
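
In other words, pooling over all requested samples and sessions:

\[
\mathrm{RMSWD} = \sqrt{ \frac{1}{N_f} \sum_{\mathrm{samples}} \sum_{r} \left( \frac{\Delta_{4x,r} - \bar{\Delta}_{4x}}{w_r} \right)^2 }
\]

where Δ̄4x is the weighted average for each sample and Nf accumulates (N − 1) for every sample with more than one analysis.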

@make_verbal
def compute_r(self, key, samples='all samples', sessions='all sessions'):
2411	@make_verbal
2412	def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
2413		'''
2414		Compute the repeatability of `[r[key] for r in self]`
2415		'''
2416
2417		if samples == 'all samples':
2418			mysamples = [k for k in self.samples]
2419		elif samples == 'anchors':
2420			mysamples = [k for k in self.anchors]
2421		elif samples == 'unknowns':
2422			mysamples = [k for k in self.unknowns]
2423		else:
2424			mysamples = samples
2425
2426		if sessions == 'all sessions':
2427			sessions = [k for k in self.sessions]
2428
2429		if key in ['D47', 'D48']:
2430			# Full disclosure: the definition of Nf is tricky/debatable
2431			G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
2432			chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
2433			Nf = len(G)
2434# 			print(f'len(G) = {Nf}')
2435			Nf -= len([s for s in mysamples if s in self.unknowns])
2436# 			print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider')
2437			for session in sessions:
2438				Np = len([
2439					_ for _ in self.standardization.params
2440					if (
2441						self.standardization.params[_].expr is None
2442						and (
2443							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
2444							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
2445							)
2446						)
2447					])
2448# 				print(f'session {session}: {Np} parameters to consider')
2449				Na = len({
2450					r['Sample'] for r in self.sessions[session]['data']
2451					if r['Sample'] in self.anchors and r['Sample'] in mysamples
2452					})
2453# 				print(f'session {session}: {Na} different anchors in that session')
2454				Nf -= min(Np, Na)
2455# 			print(f'Nf = {Nf}')
2456
2457# 			for sample in mysamples :
2458# 				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2459# 				if len(X) > 1 :
2460# 					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
2461# 					if sample in self.unknowns:
2462# 						Nf += len(X) - 1
2463# 					else:
2464# 						Nf += len(X)
2465# 			if samples in ['anchors', 'all samples']:
2466# 				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
2467			r = (chisq / Nf)**.5 if Nf > 0 else 0
2468
2469		else: # if key not in ['D47', 'D48']
2470			chisq, Nf = 0, 0
2471			for sample in mysamples :
2472				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
2473				if len(X) > 1 :
2474					Nf += len(X) - 1
2475					chisq += np.sum([ (x-np.mean(X))**2 for x in X ])
2476			r = (chisq / Nf)**.5 if Nf > 0 else 0
2477
2478		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
2479		return r

Compute the repeatability of [r[key] for r in self]

def sample_average(self, samples, weights='equal', normalize=True):
2481	def sample_average(self, samples, weights = 'equal', normalize = True):
2482		'''
2483		Weighted average Δ4x value of a group of samples, accounting for covariance.
2484
2485		Returns the weighted average Δ4x value and associated SE
2486		of a group of samples. Weights are equal by default. If `normalize` is
2487		true, `weights` will be rescaled so that their sum equals 1.
2488
2489		**Examples**
2490
2491		```python
2492		self.sample_average(['X','Y'], [1, 2])
2493		```
2494
2495		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
2496		where Δ4x(X) and Δ4x(Y) are the average Δ4x
2497		values of samples X and Y, respectively.
2498
2499		```python
2500		self.sample_average(['X','Y'], [1, -1], normalize = False)
2501		```
2502
2503		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2504		'''
2505		if weights == 'equal':
2506			weights = [1/len(samples)] * len(samples)
2507
2508		if normalize:
2509			s = sum(weights)
2510			if s:
2511				weights = [w/s for w in weights]
2512
2513		try:
2514# 			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
2515# 			C = self.standardization.covar[indices,:][:,indices]
2516			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
2517			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
2518			return correlated_sum(X, C, weights)
2519		except ValueError:
2520			return (0., 0.)

Weighted average Δ4x value of a group of samples, accounting for covariance.

Returns the weighted average Δ4x value and associated SE of a group of samples. Weights are equal by default. If normalize is true, weights will be rescaled so that their sum equals 1.

Examples

self.sample_average(['X','Y'], [1, 2])

returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3, where Δ4x(X) and Δ4x(Y) are the average Δ4x values of samples X and Y, respectively.

self.sample_average(['X','Y'], [1, -1], normalize = False)

returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
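
With weights w and the covariance matrix C obtained from sample_D4x_covar(), the returned value and SE correspond to:

\[
\bar{\Delta}_{4x} = \sum_i w_i\,\Delta_{4x,i},
\qquad
\mathrm{SE} = \sqrt{ \sum_{i,j} w_i\, C_{ij}\, w_j }
\]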

def sample_D4x_covar(self, sample1, sample2=None):
2523	def sample_D4x_covar(self, sample1, sample2 = None):
2524		'''
2525		Covariance between Δ4x values of samples
2526
2527		Returns the error covariance between the average Δ4x values of two
2528		samples. If only `sample1` is specified, or if `sample1 == sample2`,
2529		returns the Δ4x variance for that sample.
2530		'''
2531		if sample2 is None:
2532			sample2 = sample1
2533		if self.standardization_method == 'pooled':
2534			i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
2535			j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
2536			return self.standardization.covar[i, j]
2537		elif self.standardization_method == 'indep_sessions':
2538			if sample1 == sample2:
2539				return self.samples[sample1][f'SE_D{self._4x}']**2
2540			else:
2541				c = 0
2542				for session in self.sessions:
2543					sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
2544					sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
2545					if sdata1 and sdata2:
2546						a = self.sessions[session]['a']
2547						# !! TODO: CM below does not account for temporal changes in standardization parameters
2548						CM = self.sessions[session]['CM'][:3,:3]
2549						avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
2550						avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
2551						avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
2552						avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
2553						c += (
2554							self.unknowns[sample1][f'session_D{self._4x}'][session][2]
2555							* self.unknowns[sample2][f'session_D{self._4x}'][session][2]
2556							* np.array([[avg_D4x_1, avg_d4x_1, 1]])
2557							@ CM
2558							@ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
2559							) / a**2
2560				return float(c)

Covariance between Δ4x values of samples

Returns the error covariance between the average Δ4x values of two samples. If only sample1 is specified, or if sample1 == sample2, returns the Δ4x variance for that sample.

def sample_D4x_correl(self, sample1, sample2=None):
2562	def sample_D4x_correl(self, sample1, sample2 = None):
2563		'''
2564		Correlation between Δ4x errors of samples
2565
2566		Returns the error correlation between the average Δ4x values of two samples.
2567		'''
2568		if sample2 is None or sample2 == sample1:
2569			return 1.
2570		return (
2571			self.sample_D4x_covar(sample1, sample2)
2572			/ self.unknowns[sample1][f'SE_D{self._4x}']
2573			/ self.unknowns[sample2][f'SE_D{self._4x}']
2574			)

Correlation between Δ4x errors of samples

Returns the error correlation between the average Δ4x values of two samples.
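
The correlation is simply the covariance above rescaled by both standard errors, e.g.:

r = mydata.sample_D4x_correl('MYSAMPLE-1', 'MYSAMPLE-2')
# equivalent to sample_D4x_covar('MYSAMPLE-1', 'MYSAMPLE-2')
# divided by the product of the two SE_D47 values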

def plot_single_session( self, session, kw_plot_anchors={'ls': 'None', 'marker': 'x', 'mec': (0.75, 0, 0), 'mew': 0.75, 'ms': 4}, kw_plot_unknowns={'ls': 'None', 'marker': 'x', 'mec': (0, 0, 0.75), 'mew': 0.75, 'ms': 4}, kw_plot_anchor_avg={'ls': '-', 'marker': 'None', 'color': (0.75, 0, 0), 'lw': 0.75}, kw_plot_unknown_avg={'ls': '-', 'marker': 'None', 'color': (0, 0, 0.75), 'lw': 0.75}, kw_contour_error={'colors': [[0, 0, 0]], 'alpha': 0.5, 'linewidths': 0.75}, xylimits='free', x_label=None, y_label=None, error_contour_interval='auto', fig='new'):
2576	def plot_single_session(self,
2577		session,
2578		kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
2579		kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
2580		kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
2581		kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
2582		kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
2583		xylimits = 'free', # | 'constant'
2584		x_label = None,
2585		y_label = None,
2586		error_contour_interval = 'auto',
2587		fig = 'new',
2588		):
2589		'''
2590		Generate plot for a single session
2591		'''
2592		if x_label is None:
2593			x_label = f'δ$_{{{self._4x}}}$ (‰)'
2594		if y_label is None:
2595			y_label = f'Δ$_{{{self._4x}}}$ (‰)'
2596
2597		out = _SessionPlot()
2598		anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
2599		unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
2600		anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2601		anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
2602		unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2603		unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
2604		anchor_avg = (np.array([ np.array([
2605				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2606				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2607				]) for sample in anchors]).T,
2608			np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
2609		unknown_avg = (np.array([ np.array([
2610				np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
2611				np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
2612				]) for sample in unknowns]).T,
2613			np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)
2614		
2615		
2616		if fig == 'new':
2617			out.fig = ppl.figure(figsize = (6,6))
2618			ppl.subplots_adjust(.1,.1,.9,.9)
2619
2620		out.anchor_analyses, = ppl.plot(
2621			anchors_d,
2622			anchors_D,
2623			**kw_plot_anchors)
2624		out.unknown_analyses, = ppl.plot(
2625			unknowns_d,
2626			unknowns_D,
2627			**kw_plot_unknowns)
2628		out.anchor_avg = ppl.plot(
2629			*anchor_avg,
2630			**kw_plot_anchor_avg)
2631		out.unknown_avg = ppl.plot(
2632			*unknown_avg,
2633			**kw_plot_unknown_avg)
2634		if xylimits == 'constant':
2635			x = [r[f'd{self._4x}'] for r in self]
2636			y = [r[f'D{self._4x}'] for r in self]
2637			x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
2638			w, h = x2-x1, y2-y1
2639			x1 -= w/20
2640			x2 += w/20
2641			y1 -= h/20
2642			y2 += h/20
2643			ppl.axis([x1, x2, y1, y2])
2644		elif xylimits == 'free':
2645			x1, x2, y1, y2 = ppl.axis()
2646		else:
2647			x1, x2, y1, y2 = ppl.axis(xylimits)
2648		contour = None  # ensure 'contour' is defined even when error contours are disabled
2649		if error_contour_interval != 'none':
2650			xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
2651			XI,YI = np.meshgrid(xi, yi)
2652			SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
2653			if error_contour_interval == 'auto':
2654				rng = np.max(SI) - np.min(SI)
2655				if rng <= 0.01:
2656					cinterval = 0.001
2657				elif rng <= 0.03:
2658					cinterval = 0.004
2659				elif rng <= 0.1:
2660					cinterval = 0.01
2661				elif rng <= 0.3:
2662					cinterval = 0.03
2663				elif rng <= 1.:
2664					cinterval = 0.1
2665				else:
2666					cinterval = 0.5
2667			else:
2668				cinterval = error_contour_interval
2669
2670			cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
2671			out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
2672			out.clabel = ppl.clabel(out.contour)
2673			contour = (XI, YI, SI, cval, cinterval)
2674
2675		if fig is None:
2676			return {
2677			'anchors':anchors,
2678			'unknowns':unknowns,
2679			'anchors_d':anchors_d,
2680			'anchors_D':anchors_D,
2681			'unknowns_d':unknowns_d,
2682			'unknowns_D':unknowns_D,
2683			'anchor_avg':anchor_avg,
2684			'unknown_avg':unknown_avg,
2685			'contour':contour,
2686			}
2687
2688		ppl.xlabel(x_label)
2689		ppl.ylabel(y_label)
2690		ppl.title(session, weight = 'bold')
2691		ppl.grid(alpha = .2)
2692		out.ax = ppl.gca()		
2693
2694		return out

Generate plot for a single session
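
A minimal usage sketch ('mySession' below stands for any key of self.sessions; adapt it to your own session names). With the default fig = 'new', the returned object carries the figure:

out = mydata.plot_single_session('mySession')
out.fig.savefig('mySession.pdf')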

def plot_residuals( self, kde=False, hist=False, binwidth=0.6666666666666666, dir='output', filename=None, highlight=[], colors=None, figsize=None, dpi=100, yspan=None):
2696	def plot_residuals(
2697		self,
2698		kde = False,
2699		hist = False,
2700		binwidth = 2/3,
2701		dir = 'output',
2702		filename = None,
2703		highlight = [],
2704		colors = None,
2705		figsize = None,
2706		dpi = 100,
2707		yspan = None,
2708		):
2709		'''
2710		Plot residuals of each analysis as a function of time (actually, as a function of
2711		the order of analyses in the `D4xdata` object)
2712
2713		+ `kde`: whether to add a kernel density estimate of residuals
2714		+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
2715		+ `binwidth`: width of histogram bins, as a multiple of the Δ4x repeatability (SD)
2716		+ `dir`: the directory in which to save the plot
2717		+ `highlight`: a list of samples to highlight
2718		+ `colors`: a dict of `{<sample>: <color>}` for all samples
2719		+ `figsize`: (width, height) of figure
2720		+ `dpi`: resolution for PNG output
2721		+ `yspan`: factor controlling the range of y values shown in plot
2722		  (by default: `yspan = 1.5 if kde else 1.0`)
2723		'''
2724		
2725		from matplotlib import ticker
2726
2727		if yspan is None:
2728			if kde:
2729				yspan = 1.5
2730			else:
2731				yspan = 1.0
2732		
2733		# Layout
2734		fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
2735		if hist or kde:
2736			ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
2737			ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
2738		else:
2739			ppl.subplots_adjust(.08,.05,.78,.8)
2740			ax1 = ppl.subplot(111)
2741		
2742		# Colors
2743		N = len(self.anchors)
2744		if colors is None:
2745			if len(highlight) > 0:
2746				Nh = len(highlight)
2747				if Nh == 1:
2748					colors = {highlight[0]: (0,0,0)}
2749				elif Nh == 3:
2750					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
2751				elif Nh == 4:
2752					colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2753				else:
2754					colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
2755			else:
2756				if N == 3:
2757					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
2758				elif N == 4:
2759					colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
2760				else:
2761					colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}
2762
2763		ppl.sca(ax1)
2764		
2765		ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)
2766
2767		ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))
2768
2769		session = self[0]['Session']
2770		x1 = 0
2771# 		ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self])
2772		x_sessions = {}
2773		one_or_more_singlets = False
2774		one_or_more_multiplets = False
2775		multiplets = set()
2776		for k,r in enumerate(self):
2777			if r['Session'] != session:
2778				x2 = k-1
2779				x_sessions[session] = (x1+x2)/2
2780				ppl.axvline(k - 0.5, color = 'k', lw = .5)
2781				session = r['Session']
2782				x1 = k
2783			singlet = len(self.samples[r['Sample']]['data']) == 1
2784			if not singlet:
2785				multiplets.add(r['Sample'])
2786			if r['Sample'] in self.unknowns:
2787				if singlet:
2788					one_or_more_singlets = True
2789				else:
2790					one_or_more_multiplets = True
2791			kw = dict(
2792				marker = 'x' if singlet else '+',
2793				ms = 4 if singlet else 5,
2794				ls = 'None',
2795				mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
2796				mew = 1,
2797				alpha = 0.2 if singlet else 1,
2798				)
2799			if highlight and r['Sample'] not in highlight:
2800				kw['alpha'] = 0.2
2801			ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
2802		x2 = k
2803		x_sessions[session] = (x1+x2)/2
2804
2805		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
2806		ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
2807		if not (hist or kde):
2808			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
2809			ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')
2810
2811		xmin, xmax, ymin, ymax = ppl.axis()
2812		if yspan != 1:
2813			ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
2814		for s in x_sessions:
2815			ppl.text(
2816				x_sessions[s],
2817				ymax +1,
2818				s,
2819				va = 'bottom',
2820				**(
2821					dict(ha = 'center')
2822					if len(self.sessions[s]['data']) > (0.15 * len(self))
2823					else dict(ha = 'left', rotation = 45)
2824					)
2825				)
2826
2827		if hist or kde:
2828			ppl.sca(ax2)
2829
2830		for s in colors:
2831			kw['marker'] = '+'
2832			kw['ms'] = 5
2833			kw['mec'] = colors[s]
2834			kw['label'] = s
2835			kw['alpha'] = 1
2836			ppl.plot([], [], **kw)
2837
2838		kw['mec'] = (0,0,0)
2839
2840		if one_or_more_singlets:
2841			kw['marker'] = 'x'
2842			kw['ms'] = 4
2843			kw['alpha'] = .2
2844			kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
2845			ppl.plot([], [], **kw)
2846
2847		if one_or_more_multiplets:
2848			kw['marker'] = '+'
2849			kw['ms'] = 4
2850			kw['alpha'] = 1
2851			kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
2852			ppl.plot([], [], **kw)
2853
2854		if hist or kde:
2855			leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9)
2856		else:
2857			leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5)
2858		leg.set_zorder(-1000)
2859
2860		ppl.sca(ax1)
2861
2862		ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
2863		ppl.xticks([])
2864		ppl.axis([-1, len(self), None, None])
2865
2866		if hist or kde:
2867			ppl.sca(ax2)
2868			X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])
2869
2870			if kde:
2871				from scipy.stats import gaussian_kde
2872				yi = np.linspace(ymin, ymax, 201)
2873				xi = gaussian_kde(X).evaluate(yi)
2874				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
2875# 				ppl.plot(xi, yi, 'k-', lw = 1)
2876			elif hist:
2877				ppl.hist(
2878					X,
2879					orientation = 'horizontal',
2880					histtype = 'stepfilled',
2881					ec = [.4]*3,
2882					fc = [.25]*3,
2883					alpha = .25,
2884					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
2885					)
2886			ppl.text(0, 0,
2887				f"   SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n   95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
2888				size = 7.5,
2889				alpha = 1,
2890				va = 'center',
2891				ha = 'left',
2892				)
2893
2894			ppl.axis([0, None, ymin, ymax])
2895			ppl.xticks([])
2896			ppl.yticks([])
2897# 			ax2.spines['left'].set_visible(False)
2898			ax2.spines['right'].set_visible(False)
2899			ax2.spines['top'].set_visible(False)
2900			ax2.spines['bottom'].set_visible(False)
2901
2902		ax1.axis([None, None, ymin, ymax])
2903
2904		if not os.path.exists(dir):
2905			os.makedirs(dir)
2906		if filename is None:
2907			return fig
2908		elif filename == '':
2909			filename = f'D{self._4x}_residuals.pdf'
2910		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2911		ppl.close(fig)

Plot residuals of each analysis as a function of time (actually, as a function of the order of analyses in the D4xdata object)

  • kde: whether to add a kernel density estimate of residuals
  • hist: whether to add a histogram of residuals (incompatible with kde)
  • binwidth: width of histogram bins, as a multiple of the Δ4x repeatability (SD)
  • dir: the directory in which to save the plot
  • highlight: a list of samples to highlight
  • colors: a dict of {<sample>: <color>} for all samples
  • figsize: (width, height) of figure
  • dpi: resolution for PNG output
  • yspan: factor controlling the range of y values shown in plot (by default: yspan = 1.5 if kde else 1.0)
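
For example, to save a residual plot with a kernel density estimate to ./output/D47_residuals.pdf (an empty filename selects the default file name, whereas filename = None returns the figure without saving it):

mydata.plot_residuals(kde = True, filename = '')
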
def simulate(self, *args, **kwargs):
2914	def simulate(self, *args, **kwargs):
2915		'''
2916		Legacy function with warning message pointing to `virtual_data()`
2917		'''
2918		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')

Legacy function with warning message pointing to virtual_data()

def plot_distribution_of_analyses( self, dir='output', filename=None, vs_time=False, figsize=(6, 4), subplots_adjust=(0.02, 0.13, 0.85, 0.8), output=None, dpi=100):
2920	def plot_distribution_of_analyses(
2921		self,
2922		dir = 'output',
2923		filename = None,
2924		vs_time = False,
2925		figsize = (6,4),
2926		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
2927		output = None,
2928		dpi = 100,
2929		):
2930		'''
2931		Plot temporal distribution of all analyses in the data set.
2932		
2933		**Parameters**
2934
2935		+ `dir`: the directory in which to save the plot
2936		+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
2937		+ `figsize`: (width, height) of figure
2938		+ `dpi`: resolution for PNG output
2939		+ `output`: if `'fig'`, return the figure instead of saving it; if `'ax'`, return the axes; by default, save the plot to `dir`/`filename`
2940		'''
2941
2942		asamples = [s for s in self.anchors]
2943		usamples = [s for s in self.unknowns]
2944		if output is None or output == 'fig':
2945			fig = ppl.figure(figsize = figsize)
2946			ppl.subplots_adjust(*subplots_adjust)
2947		Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2948		Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
2949		Xmax += (Xmax-Xmin)/40
2950		Xmin -= (Xmax-Xmin)/41
2951		for k, s in enumerate(asamples + usamples):
2952			if vs_time:
2953				X = [r['TimeTag'] for r in self if r['Sample'] == s]
2954			else:
2955				X = [x for x,r in enumerate(self) if r['Sample'] == s]
2956			Y = [-k for x in X]
2957			ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
2958			ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
2959			ppl.text(Xmax, -k, f'   {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
2960		ppl.axis([Xmin, Xmax, -k-1, 1])
2961		ppl.xlabel('\ntime')
2962		ppl.gca().annotate('',
2963			xy = (0.6, -0.02),
2964			xycoords = 'axes fraction',
2965			xytext = (.4, -0.02), 
2966            arrowprops = dict(arrowstyle = "->", color = 'k'),
2967            )
2968			
2969
2970		x2 = -1
2971		for session in self.sessions:
2972			x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2973			if vs_time:
2974				ppl.axvline(x1, color = 'k', lw = .75)
2975			if x2 > -1:
2976				if not vs_time:
2977					ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
2978			x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
2979# 			from xlrd import xldate_as_datetime
2980# 			print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
2981			if vs_time:
2982				ppl.axvline(x2, color = 'k', lw = .75)
2983				ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
2984			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)
2985
2986		ppl.xticks([])
2987		ppl.yticks([])
2988
2989		if output is None:
2990			if not os.path.exists(dir):
2991				os.makedirs(dir)
2992			if filename is None:
2993				filename = f'D{self._4x}_distribution_of_analyses.pdf'
2994			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
2995			ppl.close(fig)
2996		elif output == 'ax':
2997			return ppl.gca()
2998		elif output == 'fig':
2999			return fig

Plot temporal distribution of all analyses in the data set.

Parameters

  • dir: the directory in which to save the plot
  • vs_time: if True, plot as a function of TimeTag rather than sequentially.
  • figsize: (width, height) of figure
  • dpi: resolution for PNG output
  • output: if 'fig', return the figure instead of saving it; if 'ax', return the axes; by default, save the plot to dir/filename
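
For example:

mydata.plot_distribution_of_analyses()                      # saves ./output/D47_distribution_of_analyses.pdf
fig = mydata.plot_distribution_of_analyses(output = 'fig')  # returns the figure instead of saving it
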
def plot_bulk_compositions( self, samples=None, dir='output/bulk_compositions', figsize=(6, 6), subplots_adjust=(0.15, 0.12, 0.95, 0.92), show=False, sample_color=(0, 0.5, 1), analysis_color=(0.7, 0.7, 0.7), labeldist=0.3, radius=0.05):
3002	def plot_bulk_compositions(
3003		self,
3004		samples = None,
3005		dir = 'output/bulk_compositions',
3006		figsize = (6,6),
3007		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
3008		show = False,
3009		sample_color = (0,.5,1),
3010		analysis_color = (.7,.7,.7),
3011		labeldist = 0.3,
3012		radius = 0.05,
3013		):
3014		'''
3015		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
3016		
3017		By default, creates a directory `./output/bulk_compositions` where plots for
3018		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
3019		
3020		
3021		**Parameters**
3022
3023		+ `samples`: Only these samples are processed (by default: all samples).
3024		+ `dir`: where to save the plots
3025		+ `figsize`: (width, height) of figure
3026		+ `subplots_adjust`: passed to `subplots_adjust()`
3027		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
3028		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
3029		+ `sample_color`: color used for sample (average) markers/labels
3030		+ `analysis_color`: color used for replicate (individual analysis) markers/labels
3031		+ `labeldist`: distance (in inches) from replicate markers to replicate labels
3032		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
3033		'''
3034
3035		from matplotlib.patches import Ellipse
3036
3037		if samples is None:
3038			samples = [_ for _ in self.samples]
3039
3040		saved = {}
3041
3042		for s in samples:
3043
3044			fig = ppl.figure(figsize = figsize)
3045			fig.subplots_adjust(*subplots_adjust)
3046			ax = ppl.subplot(111)
3047			ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3048			ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3049			ppl.title(s)
3050
3051
3052			XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
3053			UID = [_['UID'] for _ in self.samples[s]['data']]
3054			XY0 = XY.mean(0)
3055
3056			for xy in XY:
3057				ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)
3058				
3059			ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
3060			ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
3061			ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3062			saved[s] = [XY, XY0]
3063			
3064			x1, x2, y1, y2 = ppl.axis()
3065			x0, dx = (x1+x2)/2, (x2-x1)/2
3066			y0, dy = (y1+y2)/2, (y2-y1)/2
3067			dx, dy = [max(max(dx, dy), radius)]*2
3068
3069			ppl.axis([
3070				x0 - 1.2*dx,
3071				x0 + 1.2*dx,
3072				y0 - 1.2*dy,
3073				y0 + 1.2*dy,
3074				])			
3075
3076			XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))
3077
3078			for xy, uid in zip(XY, UID):
3079
3080				xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
3081				vector_in_display_space = xy_in_display_space - XY0_in_display_space
3082
3083				if (vector_in_display_space**2).sum() > 0:
3084
3085					unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
3086					label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
3087					label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
3088					label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))
3089
3090					ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)
3091
3092				else:
3093
3094					ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)
3095
3096			if radius:
3097				ax.add_artist(Ellipse(
3098					xy = XY0,
3099					width = radius*2,
3100					height = radius*2,
3101					ls = (0, (2,2)),
3102					lw = .7,
3103					ec = analysis_color,
3104					fc = 'None',
3105					))
3106				ppl.text(
3107					XY0[0],
3108					XY0[1]-radius,
3109					f'\n± {radius*1e3:.0f} ppm',
3110					color = analysis_color,
3111					va = 'top',
3112					ha = 'center',
3113					linespacing = 0.4,
3114					size = 8,
3115					)
3116
3117			if not os.path.exists(dir):
3118				os.makedirs(dir)
3119			fig.savefig(f'{dir}/{s}.pdf')
3120			ppl.close(fig)
3121
3122		fig = ppl.figure(figsize = figsize)
3123		fig.subplots_adjust(*subplots_adjust)
3124		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
3125		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
3126
3127		for s in saved:
3128			for xy in saved[s][0]:
3129				ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
3130			ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
3131			ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
3132			ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3133
3134		x1, x2, y1, y2 = ppl.axis()
3135		ppl.axis([
3136			x1 - (x2-x1)/10,
3137			x2 + (x2-x1)/10,
3138			y1 - (y2-y1)/10,
3139			y2 + (y2-y1)/10,
3140			])			
3141
3142
3143		if not os.path.exists(dir):
3144			os.makedirs(dir)
3145		fig.savefig(f'{dir}/__all__.pdf')
3146		if show:
3147			ppl.show()
3148		ppl.close(fig)

Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

By default, creates a directory ./output/bulk_compositions where plots for each sample are saved. Another plot named __all__.pdf shows all analyses together.

Parameters

  • samples: Only these samples are processed (by default: all samples).
  • dir: where to save the plots
  • figsize: (width, height) of figure
  • subplots_adjust: passed to subplots_adjust()
  • show: whether to call matplotlib.pyplot.show() on the plot with all samples, allowing for interactive visualization/exploration in (δ13C, δ18O) space.
  • sample_color: color used for replicate markers/labels
  • analysis_color: color used for sample markers/labels
  • labeldist: distance (in inches) from replicate markers to replicate labels
  • radius: radius of the dashed circle providing scale. No circle if radius = 0.
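
For example, to generate the per-sample plots plus __all__.pdf and explore the combined plot interactively:

mydata.plot_bulk_compositions(show = True)
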
Inherited Members
builtins.list
clear
copy
append
insert
extend
pop
remove
index
count
reverse
sort
class D47data(D4xdata):
3190class D47data(D4xdata):
3191	'''
3192	Store and process data for a large set of Δ47 analyses,
3193	usually comprising more than one analytical session.
3194	'''
3195
3196	Nominal_D4x = {
3197		'ETH-1':   0.2052,
3198		'ETH-2':   0.2085,
3199		'ETH-3':   0.6132,
3200		'ETH-4':   0.4511,
3201		'IAEA-C1': 0.3018,
3202		'IAEA-C2': 0.6409,
3203		'MERCK':   0.5135,
3204		} # I-CDES (Bernasconi et al., 2021)
3205	'''
3206	Nominal Δ47 values assigned to the Δ47 anchor samples, used by
3207	`D47data.standardize()` to normalize unknown samples to an absolute Δ47
3208	reference frame.
3209
3210	By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
3211	```py
3212	{
3213		'ETH-1'   : 0.2052,
3214		'ETH-2'   : 0.2085,
3215		'ETH-3'   : 0.6132,
3216		'ETH-4'   : 0.4511,
3217		'IAEA-C1' : 0.3018,
3218		'IAEA-C2' : 0.6409,
3219		'MERCK'   : 0.5135,
3220	}
3221	```
3222	'''
3223
3224
3225	@property
3226	def Nominal_D47(self):
3227		return self.Nominal_D4x
3228	
3229
3230	@Nominal_D47.setter
3231	def Nominal_D47(self, new):
3232		self.Nominal_D4x = dict(**new)
3233		self.refresh()
3234
3235
3236	def __init__(self, l = [], **kwargs):
3237		'''
3238		**Parameters:** same as `D4xdata.__init__()`
3239		'''
3240		D4xdata.__init__(self, l = l, mass = '47', **kwargs)
3241
3242
3243	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3244		'''
3245		Find all samples for which `Teq` is specified, compute equilibrium Δ47
3246		value for that temperature, and treat these samples as additional anchors.
3247
3248		**Parameters**
3249
3250		+ `fCo2eqD47`: Which CO2 equilibrium law to use
3251		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3252		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3253		+ `priority`: if `replace`: forget old anchors and only use the new ones;
3254		if `new`: keep pre-existing anchors but update them in case of conflict
3255		between old and new Δ47 values;
3256		if `old`: keep pre-existing anchors but preserve their original Δ47
3257		values in case of conflict.
3258		'''
3259		f = {
3260			'petersen': fCO2eqD47_Petersen,
3261			'wang': fCO2eqD47_Wang,
3262			}[fCo2eqD47]
3263		foo = {}
3264		for r in self:
3265			if 'Teq' in r:
3266				if r['Sample'] in foo:
3267					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3268				else:
3269					foo[r['Sample']] = f(r['Teq'])
3270			else:
3271					assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3272
3273		if priority == 'replace':
3274			self.Nominal_D47 = {}
3275		for s in foo:
3276			if priority != 'old' or s not in self.Nominal_D47:
3277				self.Nominal_D47[s] = foo[s]
3278	
3279	def save_D47_correl(self, *args, **kwargs):
3280		return self._save_D4x_correl(*args, **kwargs)
3281
3282	save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47')

Store and process data for a large set of Δ47 analyses, usually comprising more than one analytical session.

D47data(l=[], **kwargs)
3236	def __init__(self, l = [], **kwargs):
3237		'''
3238		**Parameters:** same as `D4xdata.__init__()`
3239		'''
3240		D4xdata.__init__(self, l = l, mass = '47', **kwargs)

Parameters: same as D4xdata.__init__()

Nominal_D4x = {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6132, 'ETH-4': 0.4511, 'IAEA-C1': 0.3018, 'IAEA-C2': 0.6409, 'MERCK': 0.5135}

Nominal Δ47 values assigned to the Δ47 anchor samples, used by D47data.standardize() to normalize unknown samples to an absolute Δ47 reference frame.

By default equal to (after Bernasconi et al. (2021)):

{
        'ETH-1'   : 0.2052,
        'ETH-2'   : 0.2085,
        'ETH-3'   : 0.6132,
        'ETH-4'   : 0.4511,
        'IAEA-C1' : 0.3018,
        'IAEA-C2' : 0.6409,
        'MERCK'   : 0.5135,
}
Nominal_D47
3225	@property
3226	def Nominal_D47(self):
3227		return self.Nominal_D4x
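
Because assigning to Nominal_D47 replaces Nominal_D4x and refreshes the data set, anchor values may be customized before calling standardize(), e.g. to restrict standardization to the three ETH anchors (values below are the I-CDES values listed above):

mydata.Nominal_D47 = {
        'ETH-1': 0.2052,
        'ETH-2': 0.2085,
        'ETH-3': 0.6132,
}
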
def D47fromTeq(self, fCo2eqD47='petersen', priority='new'):
3243	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
3244		'''
3245		Find all samples for which `Teq` is specified, compute equilibrium Δ47
3246		value for that temperature, and treat these samples as additional anchors.
3247
3248		**Parameters**
3249
3250		+ `fCo2eqD47`: Which CO2 equilibrium law to use
3251		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
3252		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3253		+ `priority`: if `replace`: forget old anchors and only use the new ones;
3254		if `new`: keep pre-existing anchors but update them in case of conflict
3255		between old and new Δ47 values;
3256		if `old`: keep pre-existing anchors but preserve their original Δ47
3257		values in case of conflict.
3258		'''
3259		f = {
3260			'petersen': fCO2eqD47_Petersen,
3261			'wang': fCO2eqD47_Wang,
3262			}[fCo2eqD47]
3263		foo = {}
3264		for r in self:
3265			if 'Teq' in r:
3266				if r['Sample'] in foo:
3267					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
3268				else:
3269					foo[r['Sample']] = f(r['Teq'])
3270			else:
3271					assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3272
3273		if priority == 'replace':
3274			self.Nominal_D47 = {}
3275		for s in foo:
3276			if priority != 'old' or s not in self.Nominal_D47:
3277				self.Nominal_D47[s] = foo[s]

Find all samples for which Teq is specified, compute equilibrium Δ47 value for that temperature, and treat these samples as additional anchors.

Parameters

  • fCo2eqD47: Which CO2 equilibrium law to use (petersen: Petersen et al. (2019); wang: Wang et al. (2004)).
  • priority: if replace: forget old anchors and only use the new ones; if new: keep pre-existing anchors but update them in case of conflict between old and new Δ47 values; if old: keep pre-existing anchors but preserve their original Δ47 values in case of conflict.
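
A sketch of intended use, assuming some analyses carry a Teq field (the sample name 'EQ25' is illustrative; see the fCO2eqD47_Petersen / fCO2eqD47_Wang definitions for the expected temperature units):

for r in mydata:
    if r['Sample'] == 'EQ25':
        r['Teq'] = 25.
mydata.D47fromTeq(fCo2eqD47 = 'petersen', priority = 'new')
mydata.standardize()
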
def save_D47_correl(self, *args, **kwargs):
3279	def save_D47_correl(self, *args, **kwargs):
3280		return self._save_D4x_correl(*args, **kwargs)

Save D47 values along with their SE and correlation matrix.

Parameters

  • samples: Only these samples are output (by default: all samples).
  • dir: the directory in which to save the file (by default: output)
  • filename: the name of the csv file to write to (by default: D47_correl.csv)
  • D47_precision: the precision to use when writing D47 and D47_SE values (by default: 4)
  • correl_precision: the precision to use when writing correlation factor values (by default: 4)
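
For example:

mydata.save_D47_correl()  # writes ./output/D47_correl.csv

The resulting csv provides the values, standard errors and error correlations needed to propagate Δ47 uncertainties through downstream calculations.
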
class D48data(D4xdata):
3285class D48data(D4xdata):
3286	'''
3287	Store and process data for a large set of Δ48 analyses,
3288	usually comprising more than one analytical session.
3289	'''
3290
3291	Nominal_D4x = {
3292		'ETH-1':  0.138,
3293		'ETH-2':  0.138,
3294		'ETH-3':  0.270,
3295		'ETH-4':  0.223,
3296		'GU-1':  -0.419,
3297		} # (Fiebig et al., 2019, 2021)
3298	'''
3299	Nominal Δ48 values assigned to the Δ48 anchor samples, used by
3300	`D48data.standardize()` to normalize unknown samples to an absolute Δ48
3301	reference frame.
3302
3303	By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
3304	[Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)):
3305
3306	```py
3307	{
3308		'ETH-1' :  0.138,
3309		'ETH-2' :  0.138,
3310		'ETH-3' :  0.270,
3311		'ETH-4' :  0.223,
3312		'GU-1'  : -0.419,
3313	}
3314	```
3315	'''
3316
3317
3318	@property
3319	def Nominal_D48(self):
3320		return self.Nominal_D4x
3321
3322	
3323	@Nominal_D48.setter
3324	def Nominal_D48(self, new):
3325		self.Nominal_D4x = dict(**new)
3326		self.refresh()
3327
3328
3329	def __init__(self, l = [], **kwargs):
3330		'''
3331		**Parameters:** same as `D4xdata.__init__()`
3332		'''
3333		D4xdata.__init__(self, l = l, mass = '48', **kwargs)
3334
3335	def save_D48_correl(self, *args, **kwargs):
3336		return self._save_D4x_correl(*args, **kwargs)
3337
3338	save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48')

Store and process data for a large set of Δ48 analyses, usually comprising more than one analytical session.

D48data(l=[], **kwargs)
3329	def __init__(self, l = [], **kwargs):
3330		'''
3331		**Parameters:** same as `D4xdata.__init__()`
3332		'''
3333		D4xdata.__init__(self, l = l, mass = '48', **kwargs)

Parameters: same as D4xdata.__init__()
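
A minimal sketch mirroring the Δ47 workflow (the raw data file must include d48 columns, as in the tutorial example):

mydata48 = D47crunch.D48data()
mydata48.read('rawdata.csv')
mydata48.wg()
mydata48.crunch()
mydata48.standardize()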

Nominal_D4x = {'ETH-1': 0.138, 'ETH-2': 0.138, 'ETH-3': 0.27, 'ETH-4': 0.223, 'GU-1': -0.419}

Nominal Δ48 values assigned to the Δ48 anchor samples, used by D48data.standardize() to normalize unknown samples to an absolute Δ48 reference frame.

By default equal to (after Fiebig et al. (2019), Fiebig et al. (2021)):

{
        'ETH-1' :  0.138,
        'ETH-2' :  0.138,
        'ETH-3' :  0.270,
        'ETH-4' :  0.223,
        'GU-1'  : -0.419,
}
Nominal_D48
3318	@property
3319	def Nominal_D48(self):
3320		return self.Nominal_D4x
def save_D48_correl(self, *args, **kwargs):
3335	def save_D48_correl(self, *args, **kwargs):
3336		return self._save_D4x_correl(*args, **kwargs)

Save D48 values along with their SE and correlation matrix.

Parameters

  • samples: Only these samples are output (by default: all samples).
  • dir: the directory in which to save the file (by default: output)
  • filename: the name of the csv file to write to (by default: D48_correl.csv)
  • D48_precision: the precision to use when writing D48 and D48_SE values (by default: 4)
  • correl_precision: the precision to use when writing correlation factor values (by default: 4)
class D49data(D4xdata):
3341class D49data(D4xdata):
3342	'''
3343	Store and process data for a large set of Δ49 analyses,
3344	usually comprising more than one analytical session.
3345	'''
3346	
3347	Nominal_D4x = {"1000C": 0.0, "25C": 2.228}  # Wang 2004
3348	'''
3349	Nominal Δ49 values assigned to the Δ49 anchor samples, used by
3350	`D49data.standardize()` to normalize unknown samples to an absolute Δ49
3351	reference frame.
3352
3353	By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)):
3354
3355	```py
3356	{
3357		"1000C": 0.0,
3358		"25C": 2.228
3359	}
3360	```
3361	'''
3362	
3363	@property
3364	def Nominal_D49(self):
3365		return self.Nominal_D4x
3366	
3367	@Nominal_D49.setter
3368	def Nominal_D49(self, new):
3369		self.Nominal_D4x = dict(**new)
3370		self.refresh()
3371	
3372	def __init__(self, l=[], **kwargs):
3373		'''
3374		**Parameters:** same as `D4xdata.__init__()`
3375		'''
3376		D4xdata.__init__(self, l=l, mass='49', **kwargs)
3377	
3378	def save_D49_correl(self, *args, **kwargs):
3379		return self._save_D4x_correl(*args, **kwargs)
3380	
3381	save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49')

Store and process data for a large set of Δ49 analyses, usually comprising more than one analytical session.

D49data(l=[], **kwargs)
3372	def __init__(self, l=[], **kwargs):
3373		'''
3374		**Parameters:** same as `D4xdata.__init__()`
3375		'''
3376		D4xdata.__init__(self, l=l, mass='49', **kwargs)

Parameters: same as D4xdata.__init__()
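
A minimal sketch, assuming a raw data file with d49 columns and analyses of CO2 equilibrated at 1000 °C and 25 °C, named to match the Nominal_D4x keys ('rawdata49.csv' is a hypothetical file):

mydata49 = D47crunch.D49data()
mydata49.read('rawdata49.csv')
mydata49.wg()
mydata49.crunch()
mydata49.standardize()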

Nominal_D4x = {'1000C': 0.0, '25C': 2.228}

Nominal Δ49 values assigned to the Δ49 anchor samples, used by D49data.standardize() to normalize unknown samples to an absolute Δ49 reference frame.

By default equal to (after Wang et al. (2004)):

{
        "1000C": 0.0,
        "25C": 2.228
}
Nominal_D49
3363	@property
3364	def Nominal_D49(self):
3365		return self.Nominal_D4x
def save_D49_correl(self, *args, **kwargs):
3378	def save_D49_correl(self, *args, **kwargs):
3379		return self._save_D4x_correl(*args, **kwargs)

Save D49 values along with their SE and correlation matrix.

Parameters

  • samples: Only these samples are output (by default: all samples).
  • dir: the directory in which to save the file (by default: output)
  • filename: the name of the csv file to write to (by default: D49_correl.csv)
  • D49_precision: the precision to use when writing D49 and D49_SE values (by default: 4)
  • correl_precision: the precision to use when writing correlation factor values (by default: 4)