D47crunch
Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements
Process and standardize carbonate and/or CO2 clumped-isotope analyses, from low-level data out of a dual-inlet mass spectrometer to final, “absolute” Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates (Daëron, 2021).
The tutorial section takes you through a series of simple steps to import/process data and print out the results. The how-to section provides instructions applicable to various specific tasks.
1. Tutorial
1.1 Installation
The easy option is to use pip; open a shell terminal and simply type:
python -m pip install D47crunch
For those wishing to experiment with the bleeding-edge development version, this can be done through the following steps:
- Download the dev branch source code here and rename it to D47crunch.py.
- Do any of the following:
  - copy D47crunch.py to somewhere in your Python path
  - copy D47crunch.py to a working directory (import D47crunch will only work if called within that directory)
  - copy D47crunch.py to any other location (e.g., /foo/bar) and then use the following code snippet in your own code to import D47crunch:
import sys
sys.path.append('/foo/bar')
import D47crunch
Documentation for the development version can be downloaded here (save the HTML file and open it locally).
1.2 Usage
Start by creating a file named rawdata.csv with the following contents:
UID, Sample, d45, d46, d47, d48, d49
A01, ETH-1, 5.79502, 11.62767, 16.89351, 24.56708, 0.79486
A02, MYSAMPLE-1, 6.21907, 11.49107, 17.27749, 24.58270, 1.56318
A03, ETH-2, -6.05868, -4.81718, -11.63506, -10.32578, 0.61352
A04, MYSAMPLE-2, -3.86184, 4.94184, 0.60612, 10.52732, 0.57118
A05, ETH-3, 5.54365, 12.05228, 17.40555, 25.96919, 0.74608
A06, ETH-2, -6.06706, -4.87710, -11.69927, -10.64421, 1.61234
A07, ETH-1, 5.78821, 11.55910, 16.80191, 24.56423, 1.47963
A08, MYSAMPLE-2, -3.87692, 4.86889, 0.52185, 10.40390, 1.07032
Then instantiate a D47data object which will store and process this data:
import D47crunch
mydata = D47crunch.D47data()
For now, this object is empty:
>>> print(mydata)
[]
To load the analyses saved in rawdata.csv into our D47data object and process the data:
mydata.read('rawdata.csv')
# compute δ13C, δ18O of working gas:
mydata.wg()
# compute δ13C, δ18O, raw Δ47 values for each analysis:
mydata.crunch()
# compute absolute Δ47 values for each analysis
# as well as average Δ47 values for each sample:
mydata.standardize()
We can now print a summary of the data processing:
>>> mydata.summary(verbose = True, save_to_file = False)
[summary]
––––––––––––––––––––––––––––––– –––––––––
N samples (anchors + unknowns) 5 (3 + 2)
N analyses (anchors + unknowns) 8 (5 + 3)
Repeatability of δ13C_VPDB 4.2 ppm
Repeatability of δ18O_VSMOW 47.5 ppm
Repeatability of Δ47 (anchors) 13.4 ppm
Repeatability of Δ47 (unknowns) 2.5 ppm
Repeatability of Δ47 (all) 9.6 ppm
Model degrees of freedom 3
Student's 95% t-factor 3.18
Standardization method pooled
––––––––––––––––––––––––––––––– –––––––––
This tells us that our data set contains 5 different samples: 3 anchors (ETH-1, ETH-2, ETH-3) and 2 unknowns (MYSAMPLE-1, MYSAMPLE-2). The total number of analyses is 8, with 5 anchor analyses and 3 unknown analyses. We get an estimate of the analytical repeatability (i.e. the overall, pooled standard deviation) for δ13C, δ18O and Δ47, as well as the number of degrees of freedom (here, 3) that these estimated standard deviations are based on, along with the corresponding Student's t-factor (here, 3.18) for 95 % confidence limits. Finally, the summary indicates that we used a “pooled” standardization approach (see Daëron, 2021).
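These summary values can also be retrieved programmatically. The sketch below assumes that standardize() populates the Nf attribute (the model degrees of freedom); the matching t-factor then follows from scipy:
# a minimal sketch, assuming mydata.Nf is set by standardize():
from scipy.stats import t as tstudent
print(mydata.Nf)                             # model degrees of freedom (3 in this example)
print(tstudent.ppf(1 - 0.05/2, mydata.Nf))   # Student's 95 % t-factor (≈ 3.18)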
To see the actual results:
>>> mydata.table_of_samples(verbose = True, save_to_file = False)
[table_of_samples]
–––––––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
Sample N d13C_VPDB d18O_VSMOW D47 SE 95% CL SD p_Levene
–––––––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
ETH-1 2 2.01 37.01 0.2052 0.0131
ETH-2 2 -10.17 19.88 0.2085 0.0026
ETH-3 1 1.73 37.49 0.6132
MYSAMPLE-1 1 2.48 36.90 0.2996 0.0091 ± 0.0291
MYSAMPLE-2 2 -8.17 30.05 0.6600 0.0115 ± 0.0366 0.0025
–––––––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
This table lists, for each sample, the number of analytical replicates, the average δ13C and δ18O values (for the analyte CO2, not for the carbonate itself), the average Δ47 value, and the SD of Δ47 across all replicates of that sample. For unknown samples, the SE and 95 % confidence limits of the mean Δ47 are also listed. These 95 % CL take into account the number of degrees of freedom of the regression model: in large datasets the 95 % CL will tend to 1.96 times the SE, but in this case the applicable t-factor is much larger.
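As a check, the 95 % CL follow directly from the summary above; for MYSAMPLE-1:

0.0091 (SE) × 3.18 (t-factor) ≈ 0.0291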
We can also generate a table of all analyses in the data set (again, note that d18O_VSMOW is the composition of the CO2 analyte):
>>> mydata.table_of_analyses(verbose = True, save_to_file = False)
[table_of_analyses]
––– ––––––––– –––––––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––– –––––––––– –––––––––– ––––––––– ––––––––– –––––––––– ––––––––
UID Session Sample d13Cwg_VPDB d18Owg_VSMOW d45 d46 d47 d48 d49 d13C_VPDB d18O_VSMOW D47raw D48raw D49raw D47
––– ––––––––– –––––––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––– –––––––––– –––––––––– ––––––––– ––––––––– –––––––––– ––––––––
A01 mySession ETH-1 -3.807 24.921 5.795020 11.627670 16.893510 24.567080 0.794860 2.014086 37.041843 -0.574686 1.149684 -27.690250 0.214454
A02 mySession MYSAMPLE-1 -3.807 24.921 6.219070 11.491070 17.277490 24.582700 1.563180 2.476827 36.898281 -0.499264 1.435380 -27.122614 0.299589
A03 mySession ETH-2 -3.807 24.921 -6.058680 -4.817180 -11.635060 -10.325780 0.613520 -10.166796 19.907706 -0.685979 -0.721617 16.716901 0.206693
A04 mySession MYSAMPLE-2 -3.807 24.921 -3.861840 4.941840 0.606120 10.527320 0.571180 -8.159927 30.087230 -0.248531 0.613099 -4.979413 0.658270
A05 mySession ETH-3 -3.807 24.921 5.543650 12.052280 17.405550 25.969190 0.746080 1.727029 37.485567 -0.226150 1.678699 -28.280301 0.613200
A06 mySession ETH-2 -3.807 24.921 -6.067060 -4.877100 -11.699270 -10.644210 1.612340 -10.173599 19.845192 -0.683054 -0.922832 17.861363 0.210328
A07 mySession ETH-1 -3.807 24.921 5.788210 11.559100 16.801910 24.564230 1.479630 2.009281 36.970298 -0.591129 1.282632 -26.888335 0.195926
A08 mySession MYSAMPLE-2 -3.807 24.921 -3.876920 4.868890 0.521850 10.403900 1.070320 -8.173486 30.011134 -0.245768 0.636159 -4.324964 0.661803
––– ––––––––– –––––––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––– –––––––––– –––––––––– ––––––––– ––––––––– –––––––––– ––––––––
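Both tables can also be written to disk. As seen above, these methods accept a save_to_file argument; with save_to_file = True they write CSV files to an output directory (named output by default, judging from the documented defaults of the combined-table functions):
# save the tables as CSV files instead of printing them:
mydata.table_of_samples(verbose = False, save_to_file = True)
mydata.table_of_analyses(verbose = False, save_to_file = True)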
2. How-to
2.1 Simulate a virtual data set to play with
It is sometimes convenient to quickly build a virtual data set of analyses, for instance to assess the final analytical precision achievable for a given combination of anchor and unknown analyses (see also Fig. 6 of Daëron, 2021).
This can be achieved with virtual_data(). The example below creates a dataset with four sessions, each of which comprises three analyses of anchor ETH-1, three of ETH-2, three of ETH-3, and three analyses each of two unknown samples named FOO and BAR with an arbitrarily defined isotopic composition. Analytical repeatabilities for Δ47 and Δ48 are also specified arbitrarily. See the virtual_data() documentation for additional configuration parameters.
from D47crunch import virtual_data, D47data
args = dict(
samples = [
dict(Sample = 'ETH-1', N = 3),
dict(Sample = 'ETH-2', N = 3),
dict(Sample = 'ETH-3', N = 3),
dict(Sample = 'FOO', N = 3,
d13C_VPDB = -5., d18O_VPDB = -10.,
D47 = 0.3, D48 = 0.15),
dict(Sample = 'BAR', N = 3,
d13C_VPDB = -15., d18O_VPDB = -2.,
D47 = 0.6, D48 = 0.2),
], rD47 = 0.010, rD48 = 0.030)
session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)
D = D47data(session1 + session2 + session3 + session4)
D.crunch()
D.standardize()
D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)
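To inspect the simulated precision programmatically rather than visually, the tables can be returned instead of printed. The output argument ('raw' returns a list of lists of strings) is documented for the combined-table functions and is assumed here to behave identically for the corresponding methods:
# return the table as a list of lists of strings instead of printing it:
tbl = D.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
header, rows = tbl[0], tbl[1:]
print(header)  # e.g. ['Sample', 'N', 'd13C_VPDB', ...]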
2.2 Control data quality
D47crunch offers several tools to visualize processed data. The examples below use the same virtual data set, generated with:
from D47crunch import *
from random import shuffle
# generate virtual data:
args = dict(
samples = [
dict(Sample = 'ETH-1', N = 8),
dict(Sample = 'ETH-2', N = 8),
dict(Sample = 'ETH-3', N = 8),
dict(Sample = 'FOO', N = 4,
d13C_VPDB = -5., d18O_VPDB = -10.,
D47 = 0.3, D48 = 0.15),
dict(Sample = 'BAR', N = 4,
d13C_VPDB = -15., d18O_VPDB = -15.,
D47 = 0.5, D48 = 0.2),
])
sessions = [
virtual_data(session = f'Session_{k+1:02.0f}', seed = 123456+k, **args)
for k in range(10)]
# shuffle the data:
data = [r for s in sessions for r in s]
shuffle(data)
data = sorted(data, key = lambda r: r['Session'])
# create D47data instance:
data47 = D47data(data)
# process D47data instance:
data47.crunch()
data47.standardize()
2.2.1 Plotting the distribution of analyses through time
data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf')
(figure: time_distribution.pdf)
The plot above shows the succession of analyses as if they were all distributed at regular time intervals. See D4xdata.plot_distribution_of_analyses() for how to plot analyses as a function of “true” time (based on the TimeTag for each analysis).
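As a sketch of the latter (the TimeTag values below are arbitrary, and the vs_time switch is an assumption to be checked against the method's documentation):
# attach a made-up TimeTag to each analysis, then plot against "true" time;
# vs_time is assumed here, not confirmed from the method signature:
for k, r in enumerate(data47):
    r['TimeTag'] = 0.5 * k  # hypothetical: one analysis every half hour
data47.plot_distribution_of_analyses(filename = 'true_time_distribution.pdf', vs_time = True)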
2.2.2 Generating session plots
data47.plot_sessions()
Below is one of the resulting session plots. Each cross marker is an analysis. Anchors are in red and unknowns in blue. Short horizontal lines show the nominal Δ47 value for each anchor (red) or the average Δ47 value for each unknown (blue; overall average across all sessions). Curved grey contours correspond to Δ47 standardization errors in this session.
(figure: example session plot)
2.2.3 Plotting Δ47 or Δ48 residuals
data47.plot_residuals(filename = 'residuals.pdf', kde = True)
(figure: residuals.pdf)
Again, note that this plot only shows the succession of analyses as if they were all distributed at regular time intervals.
2.2.4 Checking δ13C and δ18O dispersion
mydata = D47data(virtual_data(
session = 'mysession',
samples = [
dict(Sample = 'ETH-1', N = 4),
dict(Sample = 'ETH-2', N = 4),
dict(Sample = 'ETH-3', N = 4),
dict(Sample = 'MYSAMPLE', N = 8, D47 = 0.6, D48 = 0.1, d13C_VPDB = -4.0, d18O_VPDB = -12.0),
], seed = 123))
mydata.refresh()
mydata.wg()
mydata.crunch()
mydata.plot_bulk_compositions()
D4xdata.plot_bulk_compositions() produces a series of plots, one for each sample, and an additional plot with all samples together. For example, here is the plot for sample MYSAMPLE:
(figure: bulk compositions of sample MYSAMPLE)
2.3 Use a different set of anchors, change anchor nominal values, and/or change oxygen-17 correction parameters
Nominal values for various carbonate standards are defined in four places:
- D4xdata.Nominal_d13C_VPDB
- D4xdata.Nominal_d18O_VPDB
- D47data.Nominal_D4x (also accessible through D47data.Nominal_D47)
- D48data.Nominal_D4x (also accessible through D48data.Nominal_D48)
17O correction parameters are defined by:
- D4xdata.R13_VPDB
- D4xdata.R18_VSMOW
- D4xdata.R18_VPDB
- D4xdata.LAMBDA_17
- D4xdata.R17_VSMOW
- D4xdata.R17_VPDB
When creating a new instance of D47data or D48data, the current values of these variables are copied as properties of the new object. Applying custom values for, e.g., R17_VSMOW and Nominal_D47 can thus be done in several ways:
Option 1: by redefining D4xdata.R17_VSMOW and D47data.Nominal_D47 before creating a D47data object:
from D47crunch import D4xdata, D47data
# redefine R17_VSMOW:
D4xdata.R17_VSMOW = 0.00037 # new value
# redefine R17_VPDB for consistency:
D4xdata.R17_VPDB = D4xdata.R17_VSMOW * (D4xdata.R18_VPDB/D4xdata.R18_VSMOW) ** D4xdata.LAMBDA_17
# edit Nominal_D47 to only include ETH-1/2/3:
D47data.Nominal_D4x = {
a: D47data.Nominal_D4x[a]
for a in ['ETH-1', 'ETH-2', 'ETH-3']
}
# redefine ETH-3:
D47data.Nominal_D4x['ETH-3'] = 0.600
# only now create D47data object:
mydata = D47data()
# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)
# NB: mydata.Nominal_D47 is just an alias for mydata.Nominal_D4x
# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}
Option 2: by redefining R17_VSMOW and Nominal_D47 after creating a D47data object:
from D47crunch import D47data
# first create D47data object:
mydata = D47data()
# redefine R17_VSMOW:
mydata.R17_VSMOW = 0.00037 # new value
# redefine R17_VPDB for consistency:
mydata.R17_VPDB = mydata.R17_VSMOW * (mydata.R18_VPDB/mydata.R18_VSMOW) ** mydata.LAMBDA_17
# edit Nominal_D47 to only include ETH-1/2/3:
mydata.Nominal_D47 = {
a: mydata.Nominal_D47[a]
for a in ['ETH-1', 'ETH-2', 'ETH-3']
}
# redefine ETH-3:
mydata.Nominal_D47['ETH-3'] = 0.600
# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)
# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}
The two options above are equivalent, but the latter provides a simple way to compare different data processing choices:
from D47crunch import D47data
# create two D47data objects:
foo = D47data()
bar = D47data()
# modify foo in various ways:
foo.LAMBDA_17 = 0.52
foo.R17_VSMOW = 0.00037 # new value
foo.R17_VPDB = foo.R17_VSMOW * (foo.R18_VPDB/foo.R18_VSMOW) ** foo.LAMBDA_17
foo.Nominal_D47 = {
'ETH-1': foo.Nominal_D47['ETH-1'],
'ETH-2': foo.Nominal_D47['ETH-2'],
'IAEA-C2': foo.Nominal_D47['IAEA-C2'],
'INLAB_REF_MATERIAL': 0.666,
}
# now import the same raw data into foo and bar:
foo.read('rawdata.csv')
foo.wg() # compute δ13C, δ18O of working gas
foo.crunch() # compute all δ13C, δ18O and raw Δ47 values
foo.standardize() # compute absolute Δ47 values
bar.read('rawdata.csv')
bar.wg() # compute δ13C, δ18O of working gas
bar.crunch() # compute all δ13C, δ18O and raw Δ47 values
bar.standardize() # compute absolute Δ47 values
# and compare the final results:
foo.table_of_samples(verbose = True, save_to_file = False)
bar.table_of_samples(verbose = True, save_to_file = False)
2.4 Process paired Δ47 and Δ48 values
Purely in terms of data processing, it is not obvious why Δ47 and Δ48 data should not be handled separately. For now, D47crunch uses two independent classes — D47data and D48data — which crunch numbers and deal with standardization in very similar ways. The following example demonstrates how to print out combined outputs for D47data and D48data.
from D47crunch import *
# generate virtual data:
args = dict(
samples = [
dict(Sample = 'ETH-1', N = 3),
dict(Sample = 'ETH-2', N = 3),
dict(Sample = 'ETH-3', N = 3),
dict(Sample = 'FOO', N = 3,
d13C_VPDB = -5., d18O_VPDB = -10.,
D47 = 0.3, D48 = 0.15),
], rD47 = 0.010, rD48 = 0.030)
session1 = virtual_data(session = 'Session_01', **args)
session2 = virtual_data(session = 'Session_02', **args)
# create D47data instance:
data47 = D47data(session1 + session2)
# process D47data instance:
data47.crunch()
data47.standardize()
# create D48data instance:
data48 = D48data(data47) # alternatively: data48 = D48data(session1 + session2)
# process D48data instance:
data48.crunch()
data48.standardize()
# output combined results:
table_of_sessions(data47, data48)
table_of_samples(data47, data48)
table_of_analyses(data47, data48)
Expected output:
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––––– –––––––––––––– –––––– ––––––––––––– ––––––––––––––– ––––––––––––––
Session Na Nu d13Cwg_VPDB d18Owg_VSMOW r_d13C r_d18O r_D47 a_47 ± SE 1e3 x b_47 ± SE c_47 ± SE r_D48 a_48 ± SE 1e3 x b_48 ± SE c_48 ± SE
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––––– –––––––––––––– –––––– ––––––––––––– ––––––––––––––– ––––––––––––––
Session_01 9 3 -4.000 26.000 0.0000 0.0000 0.0098 1.021 ± 0.019 -0.398 ± 0.260 -0.903 ± 0.006 0.0486 0.540 ± 0.151 1.235 ± 0.607 -0.390 ± 0.025
Session_02 9 3 -4.000 26.000 0.0000 0.0000 0.0090 1.015 ± 0.019 0.376 ± 0.260 -0.905 ± 0.006 0.0186 1.350 ± 0.156 -0.871 ± 0.608 -0.504 ± 0.027
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––––– –––––––––––––– –––––– ––––––––––––– ––––––––––––––– ––––––––––––––
–––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– –––––––– –––––– –––––– –––––––– –––––– ––––––––
Sample N d13C_VPDB d18O_VSMOW D47 SE 95% CL SD p_Levene D48 SE 95% CL SD p_Levene
–––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– –––––––– –––––– –––––– –––––––– –––––– ––––––––
ETH-1 6 2.02 37.02 0.2052 0.0078 0.1380 0.0223
ETH-2 6 -10.17 19.88 0.2085 0.0036 0.1380 0.0482
ETH-3 6 1.71 37.45 0.6132 0.0080 0.2700 0.0176
FOO 6 -5.00 28.91 0.3026 0.0044 ± 0.0093 0.0121 0.164 0.1397 0.0121 ± 0.0255 0.0267 0.127
–––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– –––––––– –––––– –––––– –––––––– –––––– ––––––––
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– –––––––– ––––––––
UID Session Sample d13Cwg_VPDB d18Owg_VSMOW d45 d46 d47 d48 d49 d13C_VPDB d18O_VSMOW D47raw D48raw D49raw D47 D48
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– –––––––– ––––––––
1 Session_01 ETH-1 -4.000 26.000 6.018962 10.747026 16.120787 21.286237 27.780042 2.020000 37.024281 -0.708176 -0.316435 -0.000013 0.197297 0.087763
2 Session_01 ETH-1 -4.000 26.000 6.018962 10.747026 16.132240 21.307795 27.780042 2.020000 37.024281 -0.696913 -0.295333 -0.000013 0.208328 0.126791
3 Session_01 ETH-1 -4.000 26.000 6.018962 10.747026 16.132438 21.313884 27.780042 2.020000 37.024281 -0.696718 -0.289374 -0.000013 0.208519 0.137813
4 Session_01 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.700300 -12.210735 -18.023381 -10.170000 19.875825 -0.683938 -0.297902 -0.000002 0.209785 0.198705
5 Session_01 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.707421 -12.270781 -18.023381 -10.170000 19.875825 -0.691145 -0.358673 -0.000002 0.202726 0.086308
6 Session_01 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.700061 -12.278310 -18.023381 -10.170000 19.875825 -0.683696 -0.366292 -0.000002 0.210022 0.072215
7 Session_01 ETH-3 -4.000 26.000 5.742374 11.161270 16.684379 22.225827 28.306614 1.710000 37.450394 -0.273094 -0.216392 -0.000014 0.623472 0.270873
8 Session_01 ETH-3 -4.000 26.000 5.742374 11.161270 16.660163 22.233729 28.306614 1.710000 37.450394 -0.296906 -0.208664 -0.000014 0.600150 0.285167
9 Session_01 ETH-3 -4.000 26.000 5.742374 11.161270 16.675191 22.215632 28.306614 1.710000 37.450394 -0.282128 -0.226363 -0.000014 0.614623 0.252432
10 Session_01 FOO -4.000 26.000 -0.840413 2.828738 1.328380 5.374933 4.665655 -5.000000 28.907344 -0.582131 -0.288924 -0.000006 0.314928 0.175105
11 Session_01 FOO -4.000 26.000 -0.840413 2.828738 1.302220 5.384454 4.665655 -5.000000 28.907344 -0.608241 -0.279457 -0.000006 0.289356 0.192614
12 Session_01 FOO -4.000 26.000 -0.840413 2.828738 1.322530 5.372841 4.665655 -5.000000 28.907344 -0.587970 -0.291004 -0.000006 0.309209 0.171257
13 Session_02 ETH-1 -4.000 26.000 6.018962 10.747026 16.140853 21.267202 27.780042 2.020000 37.024281 -0.688442 -0.335067 -0.000013 0.207730 0.138730
14 Session_02 ETH-1 -4.000 26.000 6.018962 10.747026 16.127087 21.256983 27.780042 2.020000 37.024281 -0.701980 -0.345071 -0.000013 0.194396 0.131311
15 Session_02 ETH-1 -4.000 26.000 6.018962 10.747026 16.148253 21.287779 27.780042 2.020000 37.024281 -0.681165 -0.314926 -0.000013 0.214898 0.153668
16 Session_02 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.715859 -12.204791 -18.023381 -10.170000 19.875825 -0.699685 -0.291887 -0.000002 0.207349 0.149128
17 Session_02 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.709763 -12.188685 -18.023381 -10.170000 19.875825 -0.693516 -0.275587 -0.000002 0.213426 0.161217
18 Session_02 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.715427 -12.253049 -18.023381 -10.170000 19.875825 -0.699249 -0.340727 -0.000002 0.207780 0.112907
19 Session_02 ETH-3 -4.000 26.000 5.742374 11.161270 16.685994 22.249463 28.306614 1.710000 37.450394 -0.271506 -0.193275 -0.000014 0.618328 0.244431
20 Session_02 ETH-3 -4.000 26.000 5.742374 11.161270 16.681351 22.298166 28.306614 1.710000 37.450394 -0.276071 -0.145641 -0.000014 0.613831 0.279758
21 Session_02 ETH-3 -4.000 26.000 5.742374 11.161270 16.676169 22.306848 28.306614 1.710000 37.450394 -0.281167 -0.137150 -0.000014 0.608813 0.286056
22 Session_02 FOO -4.000 26.000 -0.840413 2.828738 1.324359 5.339497 4.665655 -5.000000 28.907344 -0.586144 -0.324160 -0.000006 0.314015 0.136535
23 Session_02 FOO -4.000 26.000 -0.840413 2.828738 1.297658 5.325854 4.665655 -5.000000 28.907344 -0.612794 -0.337727 -0.000006 0.287767 0.126473
24 Session_02 FOO -4.000 26.000 -0.840413 2.828738 1.310185 5.339898 4.665655 -5.000000 28.907344 -0.600291 -0.323761 -0.000006 0.300082 0.136830
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– –––––––– ––––––––
3. Command-Line Interface (CLI)
Instead of writing Python code, you may use the CLI directly to process raw Δ47 and Δ48 data with reasonable defaults. The simplest way is to call:
D47crunch rawdata.csv
This will create a directory named output and populate it by calling the following methods:
- D47data.wg()
- D47data.crunch()
- D47data.standardize()
- D47data.summary()
- D47data.table_of_samples()
- D47data.table_of_sessions()
- D47data.plot_sessions()
- D47data.plot_residuals()
- D47data.table_of_analyses()
- D47data.plot_distribution_of_analyses()
- D47data.plot_bulk_compositions()
- D47data.save_D47_correl()
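In other words, the default CLI run is roughly equivalent to the following Python session (a sketch only; the CLI may pass additional arguments, e.g. output directories):
import D47crunch
mydata = D47crunch.D47data()
mydata.read('rawdata.csv')
mydata.wg()
mydata.crunch()
mydata.standardize()
mydata.summary()
mydata.table_of_samples()
mydata.table_of_sessions()
mydata.plot_sessions()
mydata.plot_residuals()
mydata.table_of_analyses()
mydata.plot_distribution_of_analyses()
mydata.plot_bulk_compositions()
mydata.save_D47_correl()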
You may specify a custom set of anchors instead of the default ones using the --anchors or -a option:
D47crunch -a anchors.csv rawdata.csv
In this case, the anchors.csv file (you may use any other file name) must have the following format:
Sample, d13C_VPDB, d18O_VPDB, D47
ETH-1, 2.02, -2.19, 0.2052
ETH-2, -10.17, -18.69, 0.2085
ETH-3, 1.71, -1.78, 0.6132
ETH-4, , , 0.4511
The samples with non-empty d13C_VPDB, d18O_VPDB, and D47 fields are used to standardize δ13C, δ18O, and Δ47 values, respectively; in the example above, ETH-4 is thus used only as a Δ47 anchor.
You may also provide a list of analyses and/or samples to exclude from the input. This is done with the --exclude or -e option:
D47crunch -e badbatch.csv rawdata.csv
In this case, the badbatch.csv file (again, you may use a different file name) must have the following format:
UID, Sample
A03
A09
B06
, MYBADSAMPLE-1
, MYBADSAMPLE-2
This will exclude (ignore) the analyses with UIDs A03, A09, and B06, as well as all analyses of samples MYBADSAMPLE-1 and MYBADSAMPLE-2. An exclude file may have only the UID column, only the Sample column, or both, in any order.
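The same filtering can be done in Python before creating the D47data object. The following sketch uses read_csv() from D47crunch and only approximates what the CLI does internally:
import D47crunch
# read the raw data as a list of dictionaries, then filter it by hand:
data = D47crunch.read_csv('rawdata.csv')
bad_uids = {'A03', 'A09', 'B06'}
bad_samples = {'MYBADSAMPLE-1', 'MYBADSAMPLE-2'}
data = [r for r in data if r['UID'] not in bad_uids and r['Sample'] not in bad_samples]
mydata = D47crunch.D47data(data)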
The --output-dir or -o option may be used to specify a custom directory name for the output. For example, in unix-like shells the following command will create a time-stamped output directory:
D47crunch -o `date "+%Y-%m-%d-%Hh%M"` rawdata.csv
To process Δ48 as well as Δ47, just add the --D48 option.
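For example:
D47crunch --D48 rawdata.csv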
API Documentation
1''' 2Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements 3 4Process and standardize carbonate and/or CO2 clumped-isotope analyses, 5from low-level data out of a dual-inlet mass spectrometer to final, “absolute” 6Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates 7([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). 8 9The **tutorial** section takes you through a series of simple steps to import/process data and print out the results. 10The **how-to** section provides instructions applicable to various specific tasks. 11 12.. include:: ../../docpages/tutorial.md 13.. include:: ../../docpages/howto.md 14.. include:: ../../docpages/cli.md 15 16<h1>API Documentation</h1> 17''' 18 19__docformat__ = "restructuredtext" 20__author__ = 'Mathieu Daëron' 21__contact__ = 'daeron@lsce.ipsl.fr' 22__copyright__ = 'Copyright (c) Mathieu Daëron' 23__license__ = 'MIT License - https://opensource.org/licenses/MIT' 24__date__ = '2025-12-15' 25__version__ = '2.5.3' 26 27import os 28import numpy as np 29import typer 30from typing_extensions import Annotated 31from statistics import stdev 32from scipy.stats import t as tstudent 33from scipy.stats import levene 34from scipy.interpolate import interp1d 35from numpy import linalg 36from lmfit import Minimizer, Parameters, report_fit 37from matplotlib import pyplot as ppl 38from datetime import datetime as dt 39from functools import wraps 40from colorsys import hls_to_rgb 41from matplotlib import rcParams 42from typer import rich_utils 43 44rich_utils.STYLE_HELPTEXT = '' 45 46rcParams['font.family'] = 'sans-serif' 47rcParams['font.sans-serif'] = 'Helvetica' 48rcParams['font.size'] = 10 49rcParams['mathtext.fontset'] = 'custom' 50rcParams['mathtext.rm'] = 'sans' 51rcParams['mathtext.bf'] = 'sans:bold' 52rcParams['mathtext.it'] = 'sans:italic' 53rcParams['mathtext.cal'] = 'sans:italic' 54rcParams['mathtext.default'] = 'rm' 55rcParams['xtick.major.size'] = 4 56rcParams['xtick.major.width'] = 1 57rcParams['ytick.major.size'] = 4 58rcParams['ytick.major.width'] = 1 59rcParams['axes.grid'] = False 60rcParams['axes.linewidth'] = 1 61rcParams['grid.linewidth'] = .75 62rcParams['grid.linestyle'] = '-' 63rcParams['grid.alpha'] = .15 64rcParams['savefig.dpi'] = 150 65 66Petersen_etal_CO2eqD47 = np.array([[-12, 1.147113572], [-11, 1.139961218], [-10, 1.132872856], [-9, 1.125847677], [-8, 1.118884889], [-7, 1.111983708], [-6, 1.105143366], [-5, 1.098363105], [-4, 1.091642182], [-3, 1.084979862], [-2, 1.078375423], [-1, 1.071828156], [0, 1.065337360], [1, 1.058902349], [2, 1.052522443], [3, 1.046196976], [4, 1.039925291], [5, 1.033706741], [6, 1.027540690], [7, 1.021426510], [8, 1.015363585], [9, 1.009351306], [10, 1.003389075], [11, 0.997476303], [12, 0.991612409], [13, 0.985796821], [14, 0.980028975], [15, 0.974308318], [16, 0.968634304], [17, 0.963006392], [18, 0.957424055], [19, 0.951886769], [20, 0.946394020], [21, 0.940945302], [22, 0.935540114], [23, 0.930177964], [24, 0.924858369], [25, 0.919580851], [26, 0.914344938], [27, 0.909150167], [28, 0.903996080], [29, 0.898882228], [30, 0.893808167], [31, 0.888773459], [32, 0.883777672], [33, 0.878820382], [34, 0.873901170], [35, 0.869019623], [36, 0.864175334], [37, 0.859367901], [38, 0.854596929], [39, 0.849862028], [40, 0.845162813], [41, 0.840498905], [42, 0.835869931], [43, 0.831275522], [44, 0.826715314], [45, 0.822188950], [46, 0.817696075], [47, 0.813236341], [48, 0.808809404], [49, 0.804414926], [50, 0.800052572], [51, 0.795722012], [52, 0.791422922], 
[53, 0.787154979], [54, 0.782917869], [55, 0.778711277], [56, 0.774534898], [57, 0.770388426], [58, 0.766271562], [59, 0.762184010], [60, 0.758125479], [61, 0.754095680], [62, 0.750094329], [63, 0.746121147], [64, 0.742175856], [65, 0.738258184], [66, 0.734367860], [67, 0.730504620], [68, 0.726668201], [69, 0.722858343], [70, 0.719074792], [71, 0.715317295], [72, 0.711585602], [73, 0.707879469], [74, 0.704198652], [75, 0.700542912], [76, 0.696912012], [77, 0.693305719], [78, 0.689723802], [79, 0.686166034], [80, 0.682632189], [81, 0.679122047], [82, 0.675635387], [83, 0.672171994], [84, 0.668731654], [85, 0.665314156], [86, 0.661919291], [87, 0.658546854], [88, 0.655196641], [89, 0.651868451], [90, 0.648562087], [91, 0.645277352], [92, 0.642014054], [93, 0.638771999], [94, 0.635551001], [95, 0.632350872], [96, 0.629171428], [97, 0.626012487], [98, 0.622873870], [99, 0.619755397], [100, 0.616656895], [102, 0.610519107], [104, 0.604459143], [106, 0.598475670], [108, 0.592567388], [110, 0.586733026], [112, 0.580971342], [114, 0.575281125], [116, 0.569661187], [118, 0.564110371], [120, 0.558627545], [122, 0.553211600], [124, 0.547861454], [126, 0.542576048], [128, 0.537354347], [130, 0.532195337], [132, 0.527098028], [134, 0.522061450], [136, 0.517084654], [138, 0.512166711], [140, 0.507306712], [142, 0.502503768], [144, 0.497757006], [146, 0.493065573], [148, 0.488428634], [150, 0.483845370], [152, 0.479314980], [154, 0.474836677], [156, 0.470409692], [158, 0.466033271], [160, 0.461706674], [162, 0.457429176], [164, 0.453200067], [166, 0.449018650], [168, 0.444884242], [170, 0.440796174], [172, 0.436753787], [174, 0.432756438], [176, 0.428803494], [178, 0.424894334], [180, 0.421028350], [182, 0.417204944], [184, 0.413423530], [186, 0.409683531], [188, 0.405984383], [190, 0.402325531], [192, 0.398706429], [194, 0.395126543], [196, 0.391585347], [198, 0.388082324], [200, 0.384616967], [202, 0.381188778], [204, 0.377797268], [206, 0.374441954], [208, 0.371122364], [210, 0.367838033], [212, 0.364588505], [214, 0.361373329], [216, 0.358192065], [218, 0.355044277], [220, 0.351929540], [222, 0.348847432], [224, 0.345797540], [226, 0.342779460], [228, 0.339792789], [230, 0.336837136], [232, 0.333912113], [234, 0.331017339], [236, 0.328152439], [238, 0.325317046], [240, 0.322510795], [242, 0.319733329], [244, 0.316984297], [246, 0.314263352], [248, 0.311570153], [250, 0.308904364], [252, 0.306265654], [254, 0.303653699], [256, 0.301068176], [258, 0.298508771], [260, 0.295975171], [262, 0.293467070], [264, 0.290984167], [266, 0.288526163], [268, 0.286092765], [270, 0.283683684], [272, 0.281298636], [274, 0.278937339], [276, 0.276599517], [278, 0.274284898], [280, 0.271993211], [282, 0.269724193], [284, 0.267477582], [286, 0.265253121], [288, 0.263050554], [290, 0.260869633], [292, 0.258710110], [294, 0.256571741], [296, 0.254454286], [298, 0.252357508], [300, 0.250281174], [302, 0.248225053], [304, 0.246188917], [306, 0.244172542], [308, 0.242175707], [310, 0.240198194], [312, 0.238239786], [314, 0.236300272], [316, 0.234379441], [318, 0.232477087], [320, 0.230593005], [322, 0.228726993], [324, 0.226878853], [326, 0.225048388], [328, 0.223235405], [330, 0.221439711], [332, 0.219661118], [334, 0.217899439], [336, 0.216154491], [338, 0.214426091], [340, 0.212714060], [342, 0.211018220], [344, 0.209338398], [346, 0.207674420], [348, 0.206026115], [350, 0.204393315], [355, 0.200378063], [360, 0.196456139], [365, 0.192625077], [370, 0.188882487], [375, 0.185226048], [380, 0.181653511], [385, 0.178162694], 
[390, 0.174751478], [395, 0.171417807], [400, 0.168159686], [405, 0.164975177], [410, 0.161862398], [415, 0.158819521], [420, 0.155844772], [425, 0.152936426], [430, 0.150092806], [435, 0.147312286], [440, 0.144593281], [445, 0.141934254], [450, 0.139333710], [455, 0.136790195], [460, 0.134302294], [465, 0.131868634], [470, 0.129487876], [475, 0.127158722], [480, 0.124879906], [485, 0.122650197], [490, 0.120468398], [495, 0.118333345], [500, 0.116243903], [505, 0.114198970], [510, 0.112197471], [515, 0.110238362], [520, 0.108320625], [525, 0.106443271], [530, 0.104605335], [535, 0.102805877], [540, 0.101043985], [545, 0.099318768], [550, 0.097629359], [555, 0.095974915], [560, 0.094354612], [565, 0.092767650], [570, 0.091213248], [575, 0.089690648], [580, 0.088199108], [585, 0.086737906], [590, 0.085306341], [595, 0.083903726], [600, 0.082529395], [605, 0.081182697], [610, 0.079862998], [615, 0.078569680], [620, 0.077302141], [625, 0.076059794], [630, 0.074842066], [635, 0.073648400], [640, 0.072478251], [645, 0.071331090], [650, 0.070206399], [655, 0.069103674], [660, 0.068022424], [665, 0.066962168], [670, 0.065922439], [675, 0.064902780], [680, 0.063902748], [685, 0.062921909], [690, 0.061959837], [695, 0.061016122], [700, 0.060090360], [705, 0.059182157], [710, 0.058291131], [715, 0.057416907], [720, 0.056559120], [725, 0.055717414], [730, 0.054891440], [735, 0.054080860], [740, 0.053285343], [745, 0.052504565], [750, 0.051738210], [755, 0.050985971], [760, 0.050247546], [765, 0.049522643], [770, 0.048810974], [775, 0.048112260], [780, 0.047426227], [785, 0.046752609], [790, 0.046091145], [795, 0.045441581], [800, 0.044803668], [805, 0.044177164], [810, 0.043561831], [815, 0.042957438], [820, 0.042363759], [825, 0.041780573], [830, 0.041207664], [835, 0.040644822], [840, 0.040091839], [845, 0.039548516], [850, 0.039014654], [855, 0.038490063], [860, 0.037974554], [865, 0.037467944], [870, 0.036970054], [875, 0.036480707], [880, 0.035999734], [885, 0.035526965], [890, 0.035062238], [895, 0.034605393], [900, 0.034156272], [905, 0.033714724], [910, 0.033280598], [915, 0.032853749], [920, 0.032434032], [925, 0.032021309], [930, 0.031615443], [935, 0.031216300], [940, 0.030823749], [945, 0.030437663], [950, 0.030057915], [955, 0.029684385], [960, 0.029316951], [965, 0.028955498], [970, 0.028599910], [975, 0.028250075], [980, 0.027905884], [985, 0.027567229], [990, 0.027234006], [995, 0.026906112], [1000, 0.026583445], [1005, 0.026265908], [1010, 0.025953405], [1015, 0.025645841], [1020, 0.025343124], [1025, 0.025045163], [1030, 0.024751871], [1035, 0.024463160], [1040, 0.024178947], [1045, 0.023899147], [1050, 0.023623680], [1055, 0.023352467], [1060, 0.023085429], [1065, 0.022822491], [1070, 0.022563577], [1075, 0.022308615], [1080, 0.022057533], [1085, 0.021810260], [1090, 0.021566729], [1095, 0.021326872], [1100, 0.021090622]]) 67_fCO2eqD47_Petersen = interp1d(Petersen_etal_CO2eqD47[:,0], Petersen_etal_CO2eqD47[:,1]) 68def fCO2eqD47_Petersen(T): 69 ''' 70 CO2 equilibrium Δ47 value as a function of T (in degrees C) 71 according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127). 
72 73 ''' 74 return float(_fCO2eqD47_Petersen(T)) 75 76 77Wang_etal_CO2eqD47 = np.array([[-83., 1.8954], [-73., 1.7530], [-63., 1.6261], [-53., 1.5126], [-43., 1.4104], [-33., 1.3182], [-23., 1.2345], [-13., 1.1584], [-3., 1.0888], [7., 1.0251], [17., 0.9665], [27., 0.9125], [37., 0.8626], [47., 0.8164], [57., 0.7734], [67., 0.7334], [87., 0.6612], [97., 0.6286], [107., 0.5980], [117., 0.5693], [127., 0.5423], [137., 0.5169], [147., 0.4930], [157., 0.4704], [167., 0.4491], [177., 0.4289], [187., 0.4098], [197., 0.3918], [207., 0.3747], [217., 0.3585], [227., 0.3431], [237., 0.3285], [247., 0.3147], [257., 0.3015], [267., 0.2890], [277., 0.2771], [287., 0.2657], [297., 0.2550], [307., 0.2447], [317., 0.2349], [327., 0.2256], [337., 0.2167], [347., 0.2083], [357., 0.2002], [367., 0.1925], [377., 0.1851], [387., 0.1781], [397., 0.1714], [407., 0.1650], [417., 0.1589], [427., 0.1530], [437., 0.1474], [447., 0.1421], [457., 0.1370], [467., 0.1321], [477., 0.1274], [487., 0.1229], [497., 0.1186], [507., 0.1145], [517., 0.1105], [527., 0.1068], [537., 0.1031], [547., 0.0997], [557., 0.0963], [567., 0.0931], [577., 0.0901], [587., 0.0871], [597., 0.0843], [607., 0.0816], [617., 0.0790], [627., 0.0765], [637., 0.0741], [647., 0.0718], [657., 0.0695], [667., 0.0674], [677., 0.0654], [687., 0.0634], [697., 0.0615], [707., 0.0597], [717., 0.0579], [727., 0.0562], [737., 0.0546], [747., 0.0530], [757., 0.0515], [767., 0.0500], [777., 0.0486], [787., 0.0472], [797., 0.0459], [807., 0.0447], [817., 0.0435], [827., 0.0423], [837., 0.0411], [847., 0.0400], [857., 0.0390], [867., 0.0380], [877., 0.0370], [887., 0.0360], [897., 0.0351], [907., 0.0342], [917., 0.0333], [927., 0.0325], [937., 0.0317], [947., 0.0309], [957., 0.0302], [967., 0.0294], [977., 0.0287], [987., 0.0281], [997., 0.0274], [1007., 0.0268], [1017., 0.0261], [1027., 0.0255], [1037., 0.0249], [1047., 0.0244], [1057., 0.0238], [1067., 0.0233], [1077., 0.0228], [1087., 0.0223], [1097., 0.0218]]) 78_fCO2eqD47_Wang = interp1d(Wang_etal_CO2eqD47[:,0] - 0.15, Wang_etal_CO2eqD47[:,1]) 79def fCO2eqD47_Wang(T): 80 ''' 81 CO2 equilibrium Δ47 value as a function of `T` (in degrees C) 82 according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039) 83 (supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)). 84 ''' 85 return float(_fCO2eqD47_Wang(T)) 86 87 88def correlated_sum(X, C, w = None): 89 ''' 90 Compute covariance-aware linear combinations 91 92 **Parameters** 93 94 + `X`: list or 1-D array of values to sum 95 + `C`: covariance matrix for the elements of `X` 96 + `w`: list or 1-D array of weights to apply to the elements of `X` 97 (all equal to 1 by default) 98 99 Return the sum (and its SE) of the elements of `X`, with optional weights equal 100 to the elements of `w`, accounting for covariances between the elements of `X`. 
101 ''' 102 if w is None: 103 w = [1 for x in X] 104 return np.dot(w,X), (np.dot(w,np.dot(C,w)))**.5 105 106 107def make_csv(x, hsep = ',', vsep = '\n'): 108 ''' 109 Formats a list of lists of strings as a CSV 110 111 **Parameters** 112 113 + `x`: the list of lists of strings to format 114 + `hsep`: the field separator (`,` by default) 115 + `vsep`: the line-ending convention to use (`\\n` by default) 116 117 **Example** 118 119 ```py 120 print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']])) 121 ``` 122 123 outputs: 124 125 ```py 126 a,b,c 127 d,e,f 128 ``` 129 ''' 130 return vsep.join([hsep.join(l) for l in x]) 131 132 133def pf(txt): 134 ''' 135 Modify string `txt` to follow `lmfit.Parameter()` naming rules. 136 ''' 137 return txt.replace('-','_').replace('.','_').replace(' ','_') 138 139 140def smart_type(x): 141 ''' 142 Tries to convert string `x` to a float if it includes a decimal point, or 143 to an integer if it does not. If both attempts fail, return the original 144 string unchanged. 145 ''' 146 try: 147 y = float(x) 148 except ValueError: 149 return x 150 if '.' not in x: 151 return int(y) 152 return y 153 154class _Defaults(): 155 def __init__(self): 156 pass 157 158D47crunch_defaults = _Defaults() 159D47crunch_defaults.PRETTY_TABLE_VSEP = '—' 160 161def pretty_table(x, header = 1, hsep = ' ', vsep = None, align = '<'): 162 ''' 163 Reads a list of lists of strings and outputs an ascii table 164 165 **Parameters** 166 167 + `x`: a list of lists of strings 168 + `header`: the number of lines to treat as header lines 169 + `hsep`: the horizontal separator between columns 170 + `vsep`: the character to use as vertical separator 171 + `align`: string of left (`<`) or right (`>`) alignment characters. 172 173 **Example** 174 175 ```py 176 print(pretty_table([ 177 ['A', 'B', 'C'], 178 ['1', '1.9999', 'foo'], 179 ['10', 'x', 'bar'], 180 ])) 181 ``` 182 yields: 183 ``` 184 —— —————— ——— 185 A B C 186 —— —————— ——— 187 1 1.9999 foo 188 10 x bar 189 —— —————— ——— 190 ``` 191 192 To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`: 193 194 ```py 195 D47crunch_defaults.PRETTY_TABLE_VSEP = '=' 196 print(pretty_table([ 197 ['A', 'B', 'C'], 198 ['1', '1.9999', 'foo'], 199 ['10', 'x', 'bar'], 200 ])) 201 ``` 202 yields: 203 ``` 204 == ====== === 205 A B C 206 == ====== === 207 1 1.9999 foo 208 10 x bar 209 == ====== === 210 ``` 211 ''' 212 213 if vsep is None: 214 vsep = D47crunch_defaults.PRETTY_TABLE_VSEP 215 216 txt = [] 217 widths = [np.max([len(e) for e in c]) for c in zip(*x)] 218 219 if len(widths) > len(align): 220 align += '>' * (len(widths)-len(align)) 221 sepline = hsep.join([vsep*w for w in widths]) 222 txt += [sepline] 223 for k,l in enumerate(x): 224 if k and k == header: 225 txt += [sepline] 226 txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])] 227 txt += [sepline] 228 txt += [''] 229 return '\n'.join(txt) 230 231 232def transpose_table(x): 233 ''' 234 Transpose a list if lists 235 236 **Parameters** 237 238 + `x`: a list of lists 239 240 **Example** 241 242 ```py 243 x = [[1, 2], [3, 4]] 244 print(transpose_table(x)) # yields: [[1, 3], [2, 4]] 245 ``` 246 ''' 247 return [[e for e in c] for c in zip(*x)] 248 249 250def w_avg(X, sX) : 251 ''' 252 Compute variance-weighted average 253 254 Returns the value and SE of the weighted average of the elements of `X`, 255 with relative weights equal to their inverse variances (`1/sX**2`). 
256 257 **Parameters** 258 259 + `X`: array-like of elements to average 260 + `sX`: array-like of the corresponding SE values 261 262 **Tip** 263 264 If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets, 265 they may be rearranged using `zip()`: 266 267 ```python 268 foo = [(0, 1), (1, 0.5), (2, 0.5)] 269 print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333) 270 ``` 271 ''' 272 X = [ x for x in X ] 273 sX = [ sx for sx in sX ] 274 W = [ sx**-2 for sx in sX ] 275 W = [ w/sum(W) for w in W ] 276 Xavg = sum([ w*x for w,x in zip(W,X) ]) 277 sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5 278 return Xavg, sXavg 279 280 281def read_csv(filename, sep = ''): 282 ''' 283 Read contents of `filename` in csv format and return a list of dictionaries. 284 285 In the csv string, spaces before and after field separators (`','` by default) 286 are optional. 287 288 **Parameters** 289 290 + `filename`: the csv file to read 291 + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, 292 whichever appers most often in the contents of `filename`. 293 ''' 294 with open(filename) as fid: 295 txt = fid.read() 296 297 if sep == '': 298 sep = sorted(',;\t', key = lambda x: - txt.count(x))[0] 299 txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()] 300 return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]] 301 302 303def simulate_single_analysis( 304 sample = 'MYSAMPLE', 305 d13Cwg_VPDB = -4., d18Owg_VSMOW = 26., 306 d13C_VPDB = None, d18O_VPDB = None, 307 D47 = None, D48 = None, D49 = 0., D17O = 0., 308 a47 = 1., b47 = 0., c47 = -0.9, 309 a48 = 1., b48 = 0., c48 = -0.45, 310 Nominal_D47 = None, 311 Nominal_D48 = None, 312 Nominal_d13C_VPDB = None, 313 Nominal_d18O_VPDB = None, 314 ALPHA_18O_ACID_REACTION = None, 315 R13_VPDB = None, 316 R17_VSMOW = None, 317 R18_VSMOW = None, 318 LAMBDA_17 = None, 319 R18_VPDB = None, 320 ): 321 ''' 322 Compute working-gas delta values for a single analysis, assuming a stochastic working 323 gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values). 324 325 **Parameters** 326 327 + `sample`: sample name 328 + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas 329 (respectively –4 and +26 ‰ by default) 330 + `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample 331 + `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies 332 of the carbonate sample 333 + `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and 334 Δ48 values if `D47` or `D48` are not specified 335 + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and 336 δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified 337 + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor 338 + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 339 correction parameters (by default equal to the `D4xdata` default values) 340 341 Returns a dictionary with fields 342 `['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`. 
343 ''' 344 345 if Nominal_d13C_VPDB is None: 346 Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB 347 348 if Nominal_d18O_VPDB is None: 349 Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB 350 351 if ALPHA_18O_ACID_REACTION is None: 352 ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION 353 354 if R13_VPDB is None: 355 R13_VPDB = D4xdata().R13_VPDB 356 357 if R17_VSMOW is None: 358 R17_VSMOW = D4xdata().R17_VSMOW 359 360 if R18_VSMOW is None: 361 R18_VSMOW = D4xdata().R18_VSMOW 362 363 if LAMBDA_17 is None: 364 LAMBDA_17 = D4xdata().LAMBDA_17 365 366 if R18_VPDB is None: 367 R18_VPDB = D4xdata().R18_VPDB 368 369 R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17 370 371 if Nominal_D47 is None: 372 Nominal_D47 = D47data().Nominal_D47 373 374 if Nominal_D48 is None: 375 Nominal_D48 = D48data().Nominal_D48 376 377 if d13C_VPDB is None: 378 if sample in Nominal_d13C_VPDB: 379 d13C_VPDB = Nominal_d13C_VPDB[sample] 380 else: 381 raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.") 382 383 if d18O_VPDB is None: 384 if sample in Nominal_d18O_VPDB: 385 d18O_VPDB = Nominal_d18O_VPDB[sample] 386 else: 387 raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.") 388 389 if D47 is None: 390 if sample in Nominal_D47: 391 D47 = Nominal_D47[sample] 392 else: 393 raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.") 394 395 if D48 is None: 396 if sample in Nominal_D48: 397 D48 = Nominal_D48[sample] 398 else: 399 raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.") 400 401 X = D4xdata() 402 X.R13_VPDB = R13_VPDB 403 X.R17_VSMOW = R17_VSMOW 404 X.R18_VSMOW = R18_VSMOW 405 X.LAMBDA_17 = LAMBDA_17 406 X.R18_VPDB = R18_VPDB 407 X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17 408 409 R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios( 410 R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000), 411 R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000), 412 ) 413 R45, R46, R47, R48, R49 = X.compute_isobar_ratios( 414 R13 = R13_VPDB * (1 + d13C_VPDB/1000), 415 R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION, 416 D17O=D17O, D47=D47, D48=D48, D49=D49, 417 ) 418 R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios( 419 R13 = R13_VPDB * (1 + d13C_VPDB/1000), 420 R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION, 421 D17O=D17O, 422 ) 423 424 d45 = 1000 * (R45/R45wg - 1) 425 d46 = 1000 * (R46/R46wg - 1) 426 d47 = 1000 * (R47/R47wg - 1) 427 d48 = 1000 * (R48/R48wg - 1) 428 d49 = 1000 * (R49/R49wg - 1) 429 430 for k in range(3): # dumb iteration to adjust for small changes in d47 431 R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch 432 R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch 433 d47 = 1000 * (R47raw/R47wg - 1) 434 d48 = 1000 * (R48raw/R48wg - 1) 435 436 return dict( 437 Sample = sample, 438 D17O = D17O, 439 d13Cwg_VPDB = d13Cwg_VPDB, 440 d18Owg_VSMOW = d18Owg_VSMOW, 441 d45 = d45, 442 d46 = d46, 443 d47 = d47, 444 d48 = d48, 445 d49 = d49, 446 ) 447 448 449def virtual_data( 450 samples = [], 451 a47 = 1., b47 = 0., c47 = -0.9, 452 a48 = 1., b48 = 0., c48 = -0.45, 453 rd45 = 0.020, rd46 = 0.060, 454 rD47 = 0.015, rD48 = 0.045, 455 d13Cwg_VPDB = None, d18Owg_VSMOW = None, 456 session = None, 457 Nominal_D47 = None, Nominal_D48 = None, 458 Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None, 459 ALPHA_18O_ACID_REACTION = None, 460 R13_VPDB = None, 461 
R17_VSMOW = None, 462 R18_VSMOW = None, 463 LAMBDA_17 = None, 464 R18_VPDB = None, 465 seed = 0, 466 shuffle = True, 467 ): 468 ''' 469 Return list with simulated analyses from a single session. 470 471 **Parameters** 472 473 + `samples`: a list of entries; each entry is a dictionary with the following fields: 474 * `Sample`: the name of the sample 475 * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample 476 * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample 477 * `N`: how many analyses to generate for this sample 478 + `a47`: scrambling factor for Δ47 479 + `b47`: compositional nonlinearity for Δ47 480 + `c47`: working gas offset for Δ47 481 + `a48`: scrambling factor for Δ48 482 + `b48`: compositional nonlinearity for Δ48 483 + `c48`: working gas offset for Δ48 484 + `rd45`: analytical repeatability of δ45 485 + `rd46`: analytical repeatability of δ46 486 + `rD47`: analytical repeatability of Δ47 487 + `rD48`: analytical repeatability of Δ48 488 + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas 489 (by default equal to the `simulate_single_analysis` default values) 490 + `session`: name of the session (no name by default) 491 + `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and Δ48 values 492 if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults) 493 + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and 494 δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified 495 (by default equal to the `simulate_single_analysis` defaults) 496 + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor 497 (by default equal to the `simulate_single_analysis` defaults) 498 + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 499 correction parameters (by default equal to the `simulate_single_analysis` default) 500 + `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations 501 + `shuffle`: randomly reorder the sequence of analyses 502 503 504 Here is an example of using this method to generate an arbitrary combination of 505 anchors and unknowns for a bunch of sessions: 506 507 ```py 508 .. include:: ../../code_examples/virtual_data/example.py 509 ``` 510 511 This should output something like: 512 513 ``` 514 .. 
include:: ../../code_examples/virtual_data/output.txt 515 ``` 516 ''' 517 518 kwargs = locals().copy() 519 520 from numpy import random as nprandom 521 if seed: 522 nprandom.seed(seed) 523 rng = nprandom.default_rng(seed) 524 else: 525 rng = nprandom.default_rng() 526 527 N = sum([s['N'] for s in samples]) 528 errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 529 errors45 *= rd45 / stdev(errors45) # scale errors to rd45 530 errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 531 errors46 *= rd46 / stdev(errors46) # scale errors to rd46 532 errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 533 errors47 *= rD47 / stdev(errors47) # scale errors to rD47 534 errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 535 errors48 *= rD48 / stdev(errors48) # scale errors to rD48 536 537 k = 0 538 out = [] 539 for s in samples: 540 kw = {} 541 kw['sample'] = s['Sample'] 542 kw = { 543 **kw, 544 **{var: kwargs[var] 545 for var in [ 546 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION', 547 'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB', 548 'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB', 549 'a47', 'b47', 'c47', 'a48', 'b48', 'c48', 550 ] 551 if kwargs[var] is not None}, 552 **{var: s[var] 553 for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O'] 554 if var in s}, 555 } 556 557 sN = s['N'] 558 while sN: 559 out.append(simulate_single_analysis(**kw)) 560 out[-1]['d45'] += errors45[k] 561 out[-1]['d46'] += errors46[k] 562 out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47 563 out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48 564 sN -= 1 565 k += 1 566 567 if session is not None: 568 for r in out: 569 r['Session'] = session 570 571 if shuffle: 572 nprandom.shuffle(out) 573 574 return out 575 576def table_of_samples( 577 data47 = None, 578 data48 = None, 579 dir = 'output', 580 filename = None, 581 save_to_file = True, 582 print_out = True, 583 output = None, 584 ): 585 ''' 586 Print out, save to disk and/or return a combined table of samples 587 for a pair of `D47data` and `D48data` objects. 
588 589 **Parameters** 590 591 + `data47`: `D47data` instance 592 + `data48`: `D48data` instance 593 + `dir`: the directory in which to save the table 594 + `filename`: the name to the csv file to write to 595 + `save_to_file`: whether to save the table to disk 596 + `print_out`: whether to print out the table 597 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 598 if set to `'raw'`: return a list of list of strings 599 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 600 ''' 601 if data47 is None: 602 if data48 is None: 603 raise TypeError("Arguments must include at least one D47data() or D48data() instance.") 604 else: 605 return data48.table_of_samples( 606 dir = dir, 607 filename = filename, 608 save_to_file = save_to_file, 609 print_out = print_out, 610 output = output 611 ) 612 else: 613 if data48 is None: 614 return data47.table_of_samples( 615 dir = dir, 616 filename = filename, 617 save_to_file = save_to_file, 618 print_out = print_out, 619 output = output 620 ) 621 else: 622 samples = ( 623 sorted([a for a in data47.anchors if a in data48.anchors]) 624 + sorted([a for a in data47.anchors if a not in data48.anchors]) 625 + sorted([a for a in data48.anchors if a not in data47.anchors]) 626 + sorted([a for a in data47.unknowns if a in data48.unknowns]) 627 ) 628 629 out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw') 630 out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw') 631 632 out47 = {l[0]: l for l in out47} 633 out48 = {l[0]: l for l in out48} 634 635 out = [out47['Sample'] + out48['Sample'][4:]] 636 for s in samples: 637 out.append(out47[s] + out48[s][4:]) 638 639 if save_to_file: 640 if not os.path.exists(dir): 641 os.makedirs(dir) 642 if filename is None: 643 filename = f'D47D48_samples.csv' 644 with open(f'{dir}/{filename}', 'w') as fid: 645 fid.write(make_csv(out)) 646 if print_out: 647 print('\n'+pretty_table(out)) 648 if output == 'raw': 649 return out 650 elif output == 'pretty': 651 return pretty_table(out) 652 653 654def table_of_sessions( 655 data47 = None, 656 data48 = None, 657 dir = 'output', 658 filename = None, 659 save_to_file = True, 660 print_out = True, 661 output = None, 662 ): 663 ''' 664 Print out, save to disk and/or return a combined table of sessions 665 for a pair of `D47data` and `D48data` objects. 
666 ***Only applicable if the sessions in `data47` and those in `data48` 667 consist of the exact same sets of analyses.*** 668 669 **Parameters** 670 671 + `data47`: `D47data` instance 672 + `data48`: `D48data` instance 673 + `dir`: the directory in which to save the table 674 + `filename`: the name to the csv file to write to 675 + `save_to_file`: whether to save the table to disk 676 + `print_out`: whether to print out the table 677 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 678 if set to `'raw'`: return a list of list of strings 679 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 680 ''' 681 if data47 is None: 682 if data48 is None: 683 raise TypeError("Arguments must include at least one D47data() or D48data() instance.") 684 else: 685 return data48.table_of_sessions( 686 dir = dir, 687 filename = filename, 688 save_to_file = save_to_file, 689 print_out = print_out, 690 output = output 691 ) 692 else: 693 if data48 is None: 694 return data47.table_of_sessions( 695 dir = dir, 696 filename = filename, 697 save_to_file = save_to_file, 698 print_out = print_out, 699 output = output 700 ) 701 else: 702 out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw') 703 out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw') 704 for k,x in enumerate(out47[0]): 705 if k>7: 706 out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47') 707 out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48') 708 out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:]) 709 710 if save_to_file: 711 if not os.path.exists(dir): 712 os.makedirs(dir) 713 if filename is None: 714 filename = f'D47D48_sessions.csv' 715 with open(f'{dir}/{filename}', 'w') as fid: 716 fid.write(make_csv(out)) 717 if print_out: 718 print('\n'+pretty_table(out)) 719 if output == 'raw': 720 return out 721 elif output == 'pretty': 722 return pretty_table(out) 723 724 725def table_of_analyses( 726 data47 = None, 727 data48 = None, 728 dir = 'output', 729 filename = None, 730 save_to_file = True, 731 print_out = True, 732 output = None, 733 ): 734 ''' 735 Print out, save to disk and/or return a combined table of analyses 736 for a pair of `D47data` and `D48data` objects. 737 738 If the sessions in `data47` and those in `data48` do not consist of 739 the exact same sets of analyses, the table will have two columns 740 `Session_47` and `Session_48` instead of a single `Session` column. 
741 742 **Parameters** 743 744 + `data47`: `D47data` instance 745 + `data48`: `D48data` instance 746 + `dir`: the directory in which to save the table 747 + `filename`: the name to the csv file to write to 748 + `save_to_file`: whether to save the table to disk 749 + `print_out`: whether to print out the table 750 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 751 if set to `'raw'`: return a list of list of strings 752 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 753 ''' 754 if data47 is None: 755 if data48 is None: 756 raise TypeError("Arguments must include at least one D47data() or D48data() instance.") 757 else: 758 return data48.table_of_analyses( 759 dir = dir, 760 filename = filename, 761 save_to_file = save_to_file, 762 print_out = print_out, 763 output = output 764 ) 765 else: 766 if data48 is None: 767 return data47.table_of_analyses( 768 dir = dir, 769 filename = filename, 770 save_to_file = save_to_file, 771 print_out = print_out, 772 output = output 773 ) 774 else: 775 out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw') 776 out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw') 777 778 if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical 779 out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:]) 780 else: 781 out47[0][1] = 'Session_47' 782 out48[0][1] = 'Session_48' 783 out47 = transpose_table(out47) 784 out48 = transpose_table(out48) 785 out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:]) 786 787 if save_to_file: 788 if not os.path.exists(dir): 789 os.makedirs(dir) 790 if filename is None: 791 filename = f'D47D48_analyses.csv' 792 with open(f'{dir}/{filename}', 'w') as fid: 793 fid.write(make_csv(out)) 794 if print_out: 795 print('\n'+pretty_table(out)) 796 if output == 'raw': 797 return out 798 elif output == 'pretty': 799 return pretty_table(out) 800 801 802def _fullcovar(minresult, epsilon = 0.01, named = False): 803 ''' 804 Construct full covariance matrix in the case of constrained parameters 805 ''' 806 807 import asteval 808 809 def f(values): 810 interp = asteval.Interpreter() 811 for n,v in zip(minresult.var_names, values): 812 interp(f'{n} = {v}') 813 for q in minresult.params: 814 if minresult.params[q].expr: 815 interp(f'{q} = {minresult.params[q].expr}') 816 return np.array([interp.symtable[q] for q in minresult.params]) 817 818 # construct Jacobian 819 J = np.zeros((minresult.nvarys, len(minresult.params))) 820 X = np.array([minresult.params[p].value for p in minresult.var_names]) 821 sX = np.array([minresult.params[p].stderr for p in minresult.var_names]) 822 823 for j in range(minresult.nvarys): 824 x1 = [_ for _ in X] 825 x1[j] += epsilon * sX[j] 826 x2 = [_ for _ in X] 827 x2[j] -= epsilon * sX[j] 828 J[j,:] = (f(x1) - f(x2)) / (2 * epsilon * sX[j]) 829 830 _names = [q for q in minresult.params] 831 _covar = J.T @ minresult.covar @ J 832 _se = np.diag(_covar)**.5 833 _correl = _covar.copy() 834 for k,s in enumerate(_se): 835 if s: 836 _correl[k,:] /= s 837 _correl[:,k] /= s 838 839 if named: 840 _covar = {i: {j:_covar[i,j] for j in minresult.params} for i in minresult.params} 841 _se = {i: _se[i] for i in minresult.params} 842 _correl = {i: {j:_correl[i,j] for j in minresult.params} for i in minresult.params} 843 844 return _names, _covar, _se, _correl 845 846 847class D4xdata(list): 848 ''' 849 Store and process data for a large set of Δ47 and/or Δ48 850 analyses, 
class D4xdata(list):
    '''
    Store and process data for a large set of Δ47 and/or Δ48
    analyses, usually comprising more than one analytical session.
    '''

    ### 17O CORRECTION PARAMETERS
    R13_VPDB = 0.01118 # (Chang & Li, 1990)
    '''
    Absolute (13C/12C) ratio of VPDB.
    By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
    '''

    R18_VSMOW = 0.0020052 # (Baertschi, 1976)
    '''
    Absolute (18O/16O) ratio of VSMOW.
    By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
    '''

    LAMBDA_17 = 0.528 # (Barkan & Luz, 2005)
    '''
    Mass-dependent exponent for triple oxygen isotopes.
    By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
    '''

    R17_VSMOW = 0.00038475 # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
    '''
    Absolute (17O/16O) ratio of VSMOW.
    By default equal to 0.00038475
    ([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
    rescaled to `R13_VPDB`)
    '''

    R18_VPDB = R18_VSMOW * 1.03092
    '''
    Absolute (18O/16O) ratio of VPDB.
    By definition equal to `R18_VSMOW * 1.03092`.
    '''

    R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
    '''
    Absolute (17O/16O) ratio of VPDB.
    By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
    '''

    LEVENE_REF_SAMPLE = 'ETH-3'
    '''
    After the Δ4x standardization step, each sample is tested to
    assess whether the Δ4x variance within all analyses for that
    sample differs significantly from that observed for a given reference
    sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
    which yields a p-value corresponding to the null hypothesis that the
    underlying variances are equal).

    `LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
    sample should be used as a reference for this test.
    '''
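    # Hedged illustration of the underlying test, with made-up Δ47 values
    # (scipy.stats.levene is the function used by `consolidate_samples()`):
    #
    #     from scipy.stats import levene
    #     p = levene([0.61, 0.59, 0.62], [0.60, 0.66, 0.55], center = 'median')[1]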
    ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6) # (Kim et al., 2007, calcite)
    '''
    Specifies the 18O/16O fractionation factor generally applicable
    to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
    `D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`.

    By default equal to 1.008129 (calcite reacted at 90 °C,
    [Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
    '''

    Nominal_d13C_VPDB = {
        'ETH-1': 2.02,
        'ETH-2': -10.17,
        'ETH-3': 1.71,
        } # (Bernasconi et al., 2018)
    '''
    Nominal δ13C_VPDB values assigned to carbonate standards, used by
    `D4xdata.standardize_d13C()`.

    By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
    [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
    '''

    Nominal_d18O_VPDB = {
        'ETH-1': -2.19,
        'ETH-2': -18.69,
        'ETH-3': -1.78,
        } # (Bernasconi et al., 2018)
    '''
    Nominal δ18O_VPDB values assigned to carbonate standards, used by
    `D4xdata.standardize_d18O()`.

    By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
    [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
    '''

    d13C_STANDARDIZATION_METHOD = '2pt'
    '''
    Method by which to standardize δ13C values:

    + `'none'`: do not apply any δ13C standardization.
    + `'1pt'`: within each session, offset all initial δ13C values so as to
      minimize the difference between final δ13C_VPDB values and
      `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
    + `'2pt'`: within each session, apply an affine transformation to all δ13C
      values so as to minimize the difference between final δ13C_VPDB
      values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
      is defined).
    '''

    d18O_STANDARDIZATION_METHOD = '2pt'
    '''
    Method by which to standardize δ18O values:

    + `'none'`: do not apply any δ18O standardization.
    + `'1pt'`: within each session, offset all initial δ18O values so as to
      minimize the difference between final δ18O_VPDB values and
      `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
    + `'2pt'`: within each session, apply an affine transformation to all δ18O
      values so as to minimize the difference between final δ18O_VPDB
      values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
      is defined).
    '''
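    # All of the class-level parameters above may be overridden, either for a
    # whole class or for a single instance, before reading in any data. A
    # hedged sketch, using a hypothetical acid fractionation value:
    #
    #     D47data.LAMBDA_17 = 0.528                 # applies to all D47data instances
    #     mydata = D47data()
    #     mydata.ALPHA_18O_ACID_REACTION = 1.00813  # hypothetical; this instance only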
    def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
        '''
        **Parameters**

        + `l`: a list of dictionaries, with each dictionary including at least the keys
          `Sample`, `d45`, `d46`, and `d47` or `d48`.
        + `mass`: `'47'` or `'48'`
        + `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
        + `session`: define session name for analyses without a `Session` key
        + `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.

        Returns a `D4xdata` object derived from `list`.
        '''
        self._4x = mass
        self.verbose = verbose
        self.prefix = 'D4xdata'
        self.logfile = logfile
        list.__init__(self, l)
        self.Nf = None
        self.repeatability = {}
        self.refresh(session = session)


    def make_verbal(oldfun):
        '''
        Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
        '''
        @wraps(oldfun)
        def newfun(*args, verbose = '', **kwargs):
            myself = args[0]
            oldprefix = myself.prefix
            myself.prefix = oldfun.__name__
            if verbose != '':
                oldverbose = myself.verbose
                myself.verbose = verbose
            out = oldfun(*args, **kwargs)
            myself.prefix = oldprefix
            if verbose != '':
                myself.verbose = oldverbose
            return out
        return newfun


    def msg(self, txt):
        '''
        Log a message to `self.logfile`, and print it out if `verbose = True`
        '''
        self.log(txt)
        if self.verbose:
            print(f'{f"[{self.prefix}]":<16} {txt}')


    def vmsg(self, txt):
        '''
        Log a message to `self.logfile` and print it out
        '''
        self.log(txt)
        print(txt)


    def log(self, *txts):
        '''
        Log a message to `self.logfile`
        '''
        if self.logfile:
            with open(self.logfile, 'a') as fid:
                for txt in txts:
                    fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')


    def refresh(self, session = 'mySession'):
        '''
        Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
        '''
        self.fill_in_missing_info(session = session)
        self.refresh_sessions()
        self.refresh_samples()


    def refresh_sessions(self):
        '''
        Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
        to `False` for all sessions.
        '''
        self.sessions = {
            s: {'data': [r for r in self if r['Session'] == s]}
            for s in sorted({r['Session'] for r in self})
            }
        for s in self.sessions:
            self.sessions[s]['scrambling_drift'] = False
            self.sessions[s]['slope_drift'] = False
            self.sessions[s]['wg_drift'] = False
            self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
            self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD


    def refresh_samples(self):
        '''
        Define `self.samples`, `self.anchors`, and `self.unknowns`.
        '''
        self.samples = {
            s: {'data': [r for r in self if r['Sample'] == s]}
            for s in sorted({r['Sample'] for r in self})
            }
        self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
        self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}


    def read(self, filename, sep = '', session = ''):
        '''
        Read file in csv format to load data into a `D47data` object.

        In the csv file, spaces before and after field separators (`','` by default)
        are optional. Each line corresponds to a single analysis.

        The required fields are:

        + `UID`: a unique identifier
        + `Session`: an identifier for the analytical session
        + `Sample`: a sample identifier
        + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

        Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
        VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
        and `d49` are optional, and set to NaN by default.

        **Parameters**

        + `filename`: the path of the file to read
        + `sep`: csv separator delimiting the fields
        + `session`: set `Session` field to this string for all analyses
        '''
        with open(filename) as fid:
            self.input(fid.read(), sep = sep, session = session)
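    # Hedged example: reading a csv file whose analyses all belong to one
    # session, with an explicit separator (file and session names are
    # hypothetical):
    #
    #     mydata = D47data()
    #     mydata.read('rawdata.csv', sep = ',', session = 'Session_01')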
    def input(self, txt, sep = '', session = ''):
        '''
        Read `txt` string in csv format to load analysis data into a `D47data` object.

        In the csv string, spaces before and after field separators (`','` by default)
        are optional. Each line corresponds to a single analysis.

        The required fields are:

        + `UID`: a unique identifier
        + `Session`: an identifier for the analytical session
        + `Sample`: a sample identifier
        + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

        Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
        VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
        and `d49` are optional, and set to NaN by default.

        **Parameters**

        + `txt`: the csv string to read
        + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
          whichever appears most often in `txt`.
        + `session`: set `Session` field to this string for all analyses
        '''
        if sep == '':
            sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
        txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
        data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]

        if session != '':
            for r in data:
                r['Session'] = session

        self += data
        self.refresh()


    @make_verbal
    def wg(self,
        samples = None,
        session_groups = None,
        ):
        '''
        Compute bulk composition of the working gas for each session based (by default)
        on the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
        `self.Nominal_d18O_VPDB`.

        **Parameters**

        + `samples`: a list of samples specifying the subset of samples (defined in both
          `self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`) which will be considered
          when computing the working gas. By default, use all samples defined both in
          `self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.
        + `session_groups`: a list of lists of sessions
          (e.g., `[['session1', 'session2'], ['session3', 'session4', 'session5']]`)
          specifying which session groups, if any, have the exact same WG composition.
          If set to `'all'`, force all sessions to have the same WG composition (use with
          caution and on short time scales, since the WG may drift slowly over long time scales).
        '''

        self.msg('Computing WG composition:')

        a18_acid = self.ALPHA_18O_ACID_REACTION

        if samples is None:
            samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
        if session_groups is None:
            session_groups = [[s] for s in self.sessions]
        elif session_groups == 'all':
            session_groups = [[s for s in self.sessions]]

        samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
        R45R46_standards = {}
        for sample in samples:
            d13C_vpdb = self.Nominal_d13C_VPDB[sample]
            d18O_vpdb = self.Nominal_d18O_VPDB[sample]
            R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
            R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
            R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid

            C12_s = 1 / (1 + R13_s)
            C13_s = R13_s / (1 + R13_s)
            C16_s = 1 / (1 + R17_s + R18_s)
            C17_s = R17_s / (1 + R17_s + R18_s)
            C18_s = R18_s / (1 + R17_s + R18_s)

            C626_s = C12_s * C16_s ** 2
            C627_s = 2 * C12_s * C16_s * C17_s
            C628_s = 2 * C12_s * C16_s * C18_s
            C636_s = C13_s * C16_s ** 2
            C637_s = 2 * C13_s * C16_s * C17_s
            C727_s = C12_s * C17_s ** 2

            R45_s = (C627_s + C636_s) / C626_s
            R46_s = (C628_s + C637_s + C727_s) / C626_s
            R45R46_standards[sample] = (R45_s, R46_s)

        for sg in session_groups:
            db = [r for s in sg for r in self.sessions[s]['data'] if r['Sample'] in samples]
            assert db, f'No sample from {samples} found in session group {sg}.'

            X = [r['d45'] for r in db]
            Y = [R45R46_standards[r['Sample']][0] for r in db]
            x1, x2 = np.min(X), np.max(X)

            if x1 < x2:
                wgcoord = x1/(x1-x2)
            else:
                wgcoord = 999

            if wgcoord < -.5 or wgcoord > 1.5:
                # unreasonable to extrapolate to d45 = 0
                R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
            else:
                # d45 = 0 is reasonably well bracketed
                R45_wg = np.polyfit(X, Y, 1)[1]

            X = [r['d46'] for r in db]
            Y = [R45R46_standards[r['Sample']][1] for r in db]
            x1, x2 = np.min(X), np.max(X)

            if x1 < x2:
                wgcoord = x1/(x1-x2)
            else:
                wgcoord = 999

            if wgcoord < -.5 or wgcoord > 1.5:
                # unreasonable to extrapolate to d46 = 0
                R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
            else:
                # d46 = 0 is reasonably well bracketed
                R46_wg = np.polyfit(X, Y, 1)[1]

            d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)

            for s in sg:
                self.msg(f'Session {s} WG: δ13C_VPDB = {d13Cwg_VPDB:.3f}  δ18O_VSMOW = {d18Owg_VSMOW:.3f}')

                self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
                self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
                for r in self.sessions[s]['data']:
                    r['d13Cwg_VPDB'] = d13Cwg_VPDB
                    r['d18Owg_VSMOW'] = d18Owg_VSMOW
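    # Hedged usage sketch, with hypothetical session names: forcing two
    # sessions measured back-to-back to share a single WG composition:
    #
    #     mydata.wg(session_groups = [['Session_01', 'Session_02']])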
    def compute_bulk_delta(self, R45, R46, D17O = 0):
        '''
        Compute δ13C_VPDB and δ18O_VSMOW,
        by solving the generalized form of equation (17) from
        [Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
        assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
        solving the corresponding second-order Taylor polynomial.
        (Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
        '''

        K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17

        A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
        B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
        C = 2 * self.R18_VSMOW
        D = -R46

        aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
        bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
        cc = A + B + C + D

        d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)

        R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
        R17 = K * R18 ** self.LAMBDA_17
        R13 = R45 - 2 * R17

        d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)

        return d13C_VPDB, d18O_VSMOW
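    # The "second-order Taylor polynomial" mentioned above is solved in closed
    # form: with x = d18O_VSMOW / 1000, the residual of the generalized
    # equation (17) is approximated as aa * x**2 + bb * x + cc = 0, and the
    # relevant root, x = (-bb + (bb**2 - 4*aa*cc)**.5) / (2*aa), is exactly the
    # expression implemented above.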
    @make_verbal
    def crunch(self, verbose = ''):
        '''
        Compute bulk composition and raw clumped isotope anomalies for all analyses.
        '''
        for r in self:
            self.compute_bulk_and_clumping_deltas(r)
        self.standardize_d13C()
        self.standardize_d18O()
        self.msg(f"Crunched {len(self)} analyses.")


    def fill_in_missing_info(self, session = 'mySession'):
        '''
        Fill in optional fields with default values
        '''
        for i,r in enumerate(self):
            if 'D17O' not in r:
                r['D17O'] = 0.
            if 'UID' not in r:
                r['UID'] = f'{i+1}'
            if 'Session' not in r:
                r['Session'] = session
            for k in ['d47', 'd48', 'd49']:
                if k not in r:
                    r[k] = np.nan


    def standardize_d13C(self):
        '''
        Perform δ13C standardization within each session `s` according to
        `self.sessions[s]['d13C_standardization_method']`, which is defined by default
        by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
        may be redefined arbitrarily at a later stage.
        '''
        for s in self.sessions:
            if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
                XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
                X,Y = zip(*XY)
                if self.sessions[s]['d13C_standardization_method'] == '1pt':
                    offset = np.mean(Y) - np.mean(X)
                    for r in self.sessions[s]['data']:
                        r['d13C_VPDB'] += offset
                elif self.sessions[s]['d13C_standardization_method'] == '2pt':
                    a,b = np.polyfit(X,Y,1)
                    for r in self.sessions[s]['data']:
                        r['d13C_VPDB'] = a * r['d13C_VPDB'] + b


    def standardize_d18O(self):
        '''
        Perform δ18O standardization within each session `s` according to
        `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
        which is defined by default by `D47data.refresh_sessions()` as equal to
        `self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
        '''
        for s in self.sessions:
            if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
                XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
                X,Y = zip(*XY)
                Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
                if self.sessions[s]['d18O_standardization_method'] == '1pt':
                    offset = np.mean(Y) - np.mean(X)
                    for r in self.sessions[s]['data']:
                        r['d18O_VSMOW'] += offset
                elif self.sessions[s]['d18O_standardization_method'] == '2pt':
                    a,b = np.polyfit(X,Y,1)
                    for r in self.sessions[s]['data']:
                        r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
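    # Hedged example: switching one hypothetical session to one-point δ13C
    # standardization. This must be done after read() and before crunch(),
    # because read() resets the per-session settings:
    #
    #     mydata.sessions['Session_01']['d13C_standardization_method'] = '1pt'
    #     mydata.crunch()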
    def compute_bulk_and_clumping_deltas(self, r):
        '''
        Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
        '''

        # Compute working gas R13, R18, and isobar ratios
        R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
        R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
        R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)

        # Compute analyte isobar ratios
        R45 = (1 + r['d45'] / 1000) * R45_wg
        R46 = (1 + r['d46'] / 1000) * R46_wg
        R47 = (1 + r['d47'] / 1000) * R47_wg
        R48 = (1 + r['d48'] / 1000) * R48_wg
        R49 = (1 + r['d49'] / 1000) * R49_wg

        r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
        R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
        R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW

        # Compute stochastic isobar ratios of the analyte
        R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
            R13, R18, D17O = r['D17O']
            )

        # Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
        # and raise a warning if the corresponding anomalies exceed 0.02 ppm.
        if (R45 / R45stoch - 1) > 5e-8:
            self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
        if (R46 / R46stoch - 1) > 5e-8:
            self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')

        # Compute raw clumped isotope anomalies
        r['D47raw'] = 1000 * (R47 / R47stoch - 1)
        r['D48raw'] = 1000 * (R48 / R48stoch - 1)
        r['D49raw'] = 1000 * (R49 / R49stoch - 1)


    def compute_isobar_ratios(self, R13, R18, D17O = 0, D47 = 0, D48 = 0, D49 = 0):
        '''
        Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
        optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
        anomalies (`D47`, `D48`, `D49`), all expressed in permil.
        '''

        # Compute R17
        R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17

        # Compute isotope concentrations
        C12 = (1 + R13) ** -1
        C13 = C12 * R13
        C16 = (1 + R17 + R18) ** -1
        C17 = C16 * R17
        C18 = C16 * R18

        # Compute stochastic isotopologue concentrations
        C626 = C16 * C12 * C16
        C627 = C16 * C12 * C17 * 2
        C628 = C16 * C12 * C18 * 2
        C636 = C16 * C13 * C16
        C637 = C16 * C13 * C17 * 2
        C638 = C16 * C13 * C18 * 2
        C727 = C17 * C12 * C17
        C728 = C17 * C12 * C18 * 2
        C737 = C17 * C13 * C17
        C738 = C17 * C13 * C18 * 2
        C828 = C18 * C12 * C18
        C838 = C18 * C13 * C18

        # Compute stochastic isobar ratios
        R45 = (C636 + C627) / C626
        R46 = (C628 + C637 + C727) / C626
        R47 = (C638 + C728 + C737) / C626
        R48 = (C738 + C828) / C626
        R49 = C838 / C626

        # Account for clumping anomalies (departures from the stochastic distribution)
        R47 *= 1 + D47 / 1000
        R48 *= 1 + D48 / 1000
        R49 *= 1 + D49 / 1000

        # Return isobar ratios
        return R45, R46, R47, R48, R49
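    # Hedged sketch: the isobar ratios of a stochastic CO2 with δ13C_VPDB = 0
    # and δ18O_VSMOW = 0 follow directly from the default isotopic parameters:
    #
    #     R45, R46, R47, R48, R49 = mydata.compute_isobar_ratios(
    #         mydata.R13_VPDB, mydata.R18_VSMOW)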
    def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
        '''
        Split unknown samples by UID (treat all analyses as different samples)
        or by session (treat analyses of a given sample in different sessions as
        different samples).

        **Parameters**

        + `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
        + `grouping`: `by_uid` | `by_session`
        '''
        if samples_to_split == 'all':
            samples_to_split = [s for s in self.unknowns]
        gkeys = {'by_uid': 'UID', 'by_session': 'Session'}
        self.grouping = grouping.lower()
        if self.grouping in gkeys:
            gkey = gkeys[self.grouping]
            for r in self:
                if r['Sample'] in samples_to_split:
                    r['Sample_original'] = r['Sample']
                    r['Sample'] = f"{r['Sample']}__{r[gkey]}"
                elif r['Sample'] in self.unknowns:
                    r['Sample_original'] = r['Sample']
            self.refresh_samples()


    def unsplit_samples(self, tables = False):
        '''
        Reverse the effects of `D47data.split_samples()`.

        This should only be used after `D4xdata.standardize()` with `method='pooled'`.

        After `D4xdata.standardize()` with `method='indep_sessions'`, one should
        probably use `D4xdata.combine_samples()` instead to reverse the effects of
        `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
        effects of `D47data.split_samples()` with `grouping='by_session'` (because in
        that case session-averaged Δ4x values are statistically independent).
        '''
        unknowns_old = sorted({s for s in self.unknowns})
        CM_old = self.standardization.covar[:,:]
        VD_old = self.standardization.params.valuesdict().copy()
        vars_old = self.standardization.var_names

        unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})

        Ns = len(vars_old) - len(unknowns_old)
        vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
        VD_new = {k: VD_old[k] for k in vars_old[:Ns]}

        W = np.zeros((len(vars_new), len(vars_old)))
        W[:Ns,:Ns] = np.eye(Ns)
        for u in unknowns_new:
            splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
            if self.grouping == 'by_session':
                weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
            elif self.grouping == 'by_uid':
                weights = [1 for s in splits]
            sw = sum(weights)
            weights = [w/sw for w in weights]
            W[vars_new.index(f'D{self._4x}_{pf(u)}'), [vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]

        CM_new = W @ CM_old @ W.T
        V = W @ np.array([[VD_old[k]] for k in vars_old])
        VD_new = {k: v[0] for k,v in zip(vars_new, V)}

        self.standardization.covar = CM_new
        self.standardization.params.valuesdict = lambda : VD_new
        self.standardization.var_names = vars_new

        for r in self:
            if r['Sample'] in self.unknowns:
                r['Sample_split'] = r['Sample']
                r['Sample'] = r['Sample_original']

        self.refresh_samples()
        self.consolidate_samples()
        self.repeatabilities()

        if tables:
            self.table_of_analyses()
            self.table_of_samples()
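    # Hedged usage sketch: checking the internal consistency of a hypothetical
    # unknown by treating each of its analyses as a separate sample, then
    # recombining them after the pooled standardization:
    #
    #     mydata.split_samples(['MYSAMPLE-1'], grouping = 'by_uid')
    #     mydata.standardize()
    #     mydata.unsplit_samples()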
    def assign_timestamps(self):
        '''
        Assign a time field `t` of type `float` to each analysis.

        If `TimeTag` is one of the data fields, `t` is equal within a given session
        to `TimeTag` minus the mean value of `TimeTag` for that session.
        Otherwise, each analysis is assigned an implicit `TimeTag` equal to its index
        within the session, and `t` is defined as above.
        '''
        for session in self.sessions:
            sdata = self.sessions[session]['data']
            try:
                t0 = np.mean([r['TimeTag'] for r in sdata])
                for r in sdata:
                    r['t'] = r['TimeTag'] - t0
            except KeyError:
                t0 = (len(sdata)-1)/2
                for t,r in enumerate(sdata):
                    r['t'] = t - t0


    def report(self):
        '''
        Print a report on the standardization fit.
        Only applicable after `D4xdata.standardize(method='pooled')`.
        '''
        report_fit(self.standardization)


    def combine_samples(self, sample_groups):
        '''
        Combine analyses of different samples to compute weighted average Δ4x
        and new error (co)variances corresponding to the groups defined by the `sample_groups`
        dictionary.

        Caution: samples are weighted by number of replicate analyses, which is a
        reasonable default behavior but is not always optimal (e.g., in the case of strongly
        correlated analytical errors for one or more samples).

        Returns a tuple of:

        + the list of group names
        + an array of the corresponding Δ4x values
        + the corresponding (co)variance matrix

        **Parameters**

        + `sample_groups`: a dictionary of the form:
        ```py
        {'group1': ['sample_1', 'sample_2'],
         'group2': ['sample_3', 'sample_4', 'sample_5']}
        ```
        '''

        samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
        groups = sorted(sample_groups.keys())
        group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
        D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
        CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
        W = np.array([
            [self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
            for j in groups])
        D4x_new = W @ D4x_old
        CM_new = W @ CM_old @ W.T

        return groups, D4x_new[:,0], CM_new
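    # Hedged example, using hypothetical sample names as two single-sample
    # groups:
    #
    #     groups, D47_combined, CM_combined = mydata.combine_samples(
    #         {'group1': ['MYSAMPLE-1'], 'group2': ['MYSAMPLE-2']})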
    @make_verbal
    def standardize(self,
        method = 'pooled',
        weighted_sessions = [],
        consolidate = True,
        consolidate_tables = False,
        consolidate_plots = False,
        constraints = {},
        ):
        '''
        Compute absolute Δ4x values for all replicate analyses and for sample averages.
        If the `method` argument is set to `'pooled'`, the standardization processes all sessions
        in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
        i.e. that their true Δ4x value does not change between sessions
        ([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If the `method` argument is set to
        `'indep_sessions'`, the standardization processes each session independently, based only
        on anchor analyses.
        '''

        self.standardization_method = method
        self.assign_timestamps()

        if method == 'pooled':
            if weighted_sessions:
                for session_group in weighted_sessions:
                    if self._4x == '47':
                        X = D47data([r for r in self if r['Session'] in session_group])
                    elif self._4x == '48':
                        X = D48data([r for r in self if r['Session'] in session_group])
                    X.Nominal_D4x = self.Nominal_D4x.copy()
                    X.refresh()
                    result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
                    w = np.sqrt(result.redchi)
                    self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
                    for r in X:
                        r[f'wD{self._4x}raw'] *= w
            else:
                self.msg(f'All D{self._4x}raw weights set to 1 ‰')
                for r in self:
                    r[f'wD{self._4x}raw'] = 1.

            params = Parameters()
            for k,session in enumerate(self.sessions):
                self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
                self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
                self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
                s = pf(session)
                params.add(f'a_{s}', value = 0.9)
                params.add(f'b_{s}', value = 0.)
                params.add(f'c_{s}', value = -0.9)
                params.add(f'a2_{s}', value = 0.,
#                   vary = self.sessions[session]['scrambling_drift'],
                    )
                params.add(f'b2_{s}', value = 0.,
#                   vary = self.sessions[session]['slope_drift'],
                    )
                params.add(f'c2_{s}', value = 0.,
#                   vary = self.sessions[session]['wg_drift'],
                    )
                if not self.sessions[session]['scrambling_drift']:
                    params[f'a2_{s}'].expr = '0'
                if not self.sessions[session]['slope_drift']:
                    params[f'b2_{s}'].expr = '0'
                if not self.sessions[session]['wg_drift']:
                    params[f'c2_{s}'].expr = '0'

            for sample in self.unknowns:
                params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)

            for k in constraints:
                params[k].expr = constraints[k]

            def residuals(p):
                R = []
                for r in self:
                    session = pf(r['Session'])
                    sample = pf(r['Sample'])
                    if r['Sample'] in self.Nominal_D4x:
                        R += [ (
                            r[f'D{self._4x}raw'] - (
                                p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
                                + p[f'b_{session}'] * r[f'd{self._4x}']
                                + p[f'c_{session}']
                                + r['t'] * (
                                    p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
                                    + p[f'b2_{session}'] * r[f'd{self._4x}']
                                    + p[f'c2_{session}']
                                    )
                                )
                            ) / r[f'wD{self._4x}raw'] ]
                    else:
                        R += [ (
                            r[f'D{self._4x}raw'] - (
                                p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
                                + p[f'b_{session}'] * r[f'd{self._4x}']
                                + p[f'c_{session}']
                                + r['t'] * (
                                    p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
                                    + p[f'b2_{session}'] * r[f'd{self._4x}']
                                    + p[f'c2_{session}']
                                    )
                                )
                            ) / r[f'wD{self._4x}raw'] ]
                return R

            M = Minimizer(residuals, params)
            result = M.least_squares()
            self.Nf = result.nfree
            self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
            new_names, new_covar, new_se = _fullcovar(result)[:3]
            result.var_names = new_names
            result.covar = new_covar

            for r in self:
                s = pf(r["Session"])
                a = result.params.valuesdict()[f'a_{s}']
                b = result.params.valuesdict()[f'b_{s}']
                c = result.params.valuesdict()[f'c_{s}']
                a2 = result.params.valuesdict()[f'a2_{s}']
                b2 = result.params.valuesdict()[f'b2_{s}']
                c2 = result.params.valuesdict()[f'c2_{s}']
                r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])

            self.standardization = result

            for session in self.sessions:
                self.sessions[session]['Np'] = 3
                for k in ['scrambling', 'slope', 'wg']:
                    if self.sessions[session][f'{k}_drift']:
                        self.sessions[session]['Np'] += 1

            if consolidate:
                self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
            return result
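        # Hedged example of the `constraints` argument above, with hypothetical
        # session names: forcing two sessions to share the same scrambling
        # factor in the pooled model:
        #
        #     mydata.standardize(constraints = {'a_Session_02': 'a_Session_01'})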
        elif method == 'indep_sessions':

            if weighted_sessions:
                for session_group in weighted_sessions:
                    X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
                    X.Nominal_D4x = self.Nominal_D4x.copy()
                    X.refresh()
                    # This is only done to assign r['wD47raw'] for r in X:
                    X.standardize(method = method, weighted_sessions = [], consolidate = False)
                    self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
            else:
                self.msg('All weights set to 1 ‰')
                for r in self:
                    r[f'wD{self._4x}raw'] = 1

            for session in self.sessions:
                s = self.sessions[session]
                p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
                p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
                s['Np'] = sum(p_active)
                sdata = s['data']

                A = np.array([
                    [
                        self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
                        r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
                        1 / r[f'wD{self._4x}raw'],
                        self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
                        r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
                        r['t'] / r[f'wD{self._4x}raw']
                        ]
                    for r in sdata if r['Sample'] in self.anchors
                    ])[:,p_active] # only keep columns for the active parameters
                Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
                s['Na'] = Y.size
                CM = linalg.inv(A.T @ A)
                bf = (CM @ A.T @ Y).T[0,:]
                k = 0
                for n,a in zip(p_names, p_active):
                    if a:
                        s[n] = bf[k]
#                       self.msg(f'{n} = {bf[k]}')
                        k += 1
                    else:
                        s[n] = 0.
#                       self.msg(f'{n} = 0.0')

                for r in sdata:
                    a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
                    r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
                    r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])

                s['CM'] = np.zeros((6,6))
                i = 0
                k_active = [j for j,a in enumerate(p_active) if a]
                for j,a in enumerate(p_active):
                    if a:
                        s['CM'][j,k_active] = CM[i,:]
                        i += 1

            if not weighted_sessions:
                w = self.rmswd()['rmswd']
                for r in self:
                    r[f'wD{self._4x}'] *= w
                    r[f'wD{self._4x}raw'] *= w
                for session in self.sessions:
                    self.sessions[session]['CM'] *= w**2

            for session in self.sessions:
                s = self.sessions[session]
                s['SE_a'] = s['CM'][0,0]**.5
                s['SE_b'] = s['CM'][1,1]**.5
                s['SE_c'] = s['CM'][2,2]**.5
                s['SE_a2'] = s['CM'][3,3]**.5
                s['SE_b2'] = s['CM'][4,4]**.5
                s['SE_c2'] = s['CM'][5,5]**.5

            if not weighted_sessions:
                self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
            else:
                self.Nf = 0
                for sg in weighted_sessions:
                    self.Nf += self.rmswd(sessions = sg)['Nf']

            self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)

            avgD4x = {
                sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
                for sample in self.samples
                }
            chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
            rD4x = (chi2/self.Nf)**.5
            self.repeatability[f'sigma_{self._4x}'] = rD4x

            if consolidate:
                self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
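    # In the `indep_sessions` branch above, each session is fit by ordinary
    # weighted least squares: with the design matrix A and observations Y both
    # pre-divided by the analysis weights, the best-fit parameters are
    # inv(A.T @ A) @ A.T @ Y and their covariance matrix is inv(A.T @ A),
    # which is exactly what the `bf` and `CM` lines compute.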
    def standardization_error(self, session, d4x, D4x, t = 0):
        '''
        Compute standardization error for a given session and
        (δ47, Δ47) composition.
        '''
        a = self.sessions[session]['a']
        b = self.sessions[session]['b']
        c = self.sessions[session]['c']
        a2 = self.sessions[session]['a2']
        b2 = self.sessions[session]['b2']
        c2 = self.sessions[session]['c2']
        CM = self.sessions[session]['CM']

        x, y = D4x, d4x
        z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
#       x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
        dxdy = -(b+b2*t) / (a+a2*t)
        dxdz = 1. / (a+a2*t)
        dxda = -x / (a+a2*t)
        dxdb = -y / (a+a2*t)
        dxdc = -1. / (a+a2*t)
        dxda2 = -x * t / (a+a2*t)
        dxdb2 = -y * t / (a+a2*t)
        dxdc2 = -t / (a+a2*t)
        V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
        sx = (V @ CM @ V.T) ** .5
        return sx
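    # Hedged usage sketch (only meaningful after standardization with
    # method = 'indep_sessions', which populates each session's covariance
    # matrix `CM`); session name and composition below are hypothetical:
    #
    #     sx = mydata.standardization_error('Session_01', d4x = 0., D4x = 0.6)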
    @make_verbal
    def summary(self,
        dir = 'output',
        filename = None,
        save_to_file = True,
        print_out = True,
        ):
        '''
        Print out and/or save to disk a summary of the standardization results.

        **Parameters**

        + `dir`: the directory in which to save the table
        + `filename`: the name of the csv file to write to
        + `save_to_file`: whether to save the table to disk
        + `print_out`: whether to print out the table
        '''

        out = []
        out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
        out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
        out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
        out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
        out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
        out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
        out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
        out += [['Model degrees of freedom', f"{self.Nf}"]]
        out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
        out += [['Standardization method', self.standardization_method]]

        if save_to_file:
            if not os.path.exists(dir):
                os.makedirs(dir)
            if filename is None:
                filename = f'D{self._4x}_summary.csv'
            with open(f'{dir}/{filename}', 'w') as fid:
                fid.write(make_csv(out))
        if print_out:
            self.msg('\n' + pretty_table(out, header = 0))


    @make_verbal
    def table_of_sessions(self,
        dir = 'output',
        filename = None,
        save_to_file = True,
        print_out = True,
        output = None,
        ):
        '''
        Print out and/or save to disk a table of sessions.

        **Parameters**

        + `dir`: the directory in which to save the table
        + `filename`: the name of the csv file to write to
        + `save_to_file`: whether to save the table to disk
        + `print_out`: whether to print out the table
        + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
          if set to `'raw'`: return a list of lists of strings
          (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
        '''
        include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
        include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
        include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])

        out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
        if include_a2:
            out[-1] += ['a2 ± SE']
        if include_b2:
            out[-1] += ['b2 ± SE']
        if include_c2:
            out[-1] += ['c2 ± SE']
        for session in self.sessions:
            out += [[
                session,
                f"{self.sessions[session]['Na']}",
                f"{self.sessions[session]['Nu']}",
                f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
                f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
                f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
                f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
                f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
                f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
                f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
                f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
                ]]
            if include_a2:
                if self.sessions[session]['scrambling_drift']:
                    out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
                else:
                    out[-1] += ['']
            if include_b2:
                if self.sessions[session]['slope_drift']:
                    out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
                else:
                    out[-1] += ['']
            if include_c2:
                if self.sessions[session]['wg_drift']:
                    out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
                else:
                    out[-1] += ['']

        if save_to_file:
            if not os.path.exists(dir):
                os.makedirs(dir)
            if filename is None:
                filename = f'D{self._4x}_sessions.csv'
            with open(f'{dir}/{filename}', 'w') as fid:
                fid.write(make_csv(out))
        if print_out:
            self.msg('\n' + pretty_table(out))
        if output == 'raw':
            return out
        elif output == 'pretty':
            return pretty_table(out)
    @make_verbal
    def table_of_analyses(
        self,
        dir = 'output',
        filename = None,
        save_to_file = True,
        print_out = True,
        output = None,
        ):
        '''
        Print out and/or save to disk a table of analyses.

        **Parameters**

        + `dir`: the directory in which to save the table
        + `filename`: the name of the csv file to write to
        + `save_to_file`: whether to save the table to disk
        + `print_out`: whether to print out the table
        + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
          if set to `'raw'`: return a list of lists of strings
          (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
        '''

        out = [['UID','Session','Sample']]
        extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
        for f in extra_fields:
            out[-1] += [f[0]]
        out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
        for r in self:
            out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
            for f in extra_fields:
                out[-1] += [f"{r[f[0]]:{f[1]}}"]
            out[-1] += [
                f"{r['d13Cwg_VPDB']:.3f}",
                f"{r['d18Owg_VSMOW']:.3f}",
                f"{r['d45']:.6f}",
                f"{r['d46']:.6f}",
                f"{r['d47']:.6f}",
                f"{r['d48']:.6f}",
                f"{r['d49']:.6f}",
                f"{r['d13C_VPDB']:.6f}",
                f"{r['d18O_VSMOW']:.6f}",
                f"{r['D47raw']:.6f}",
                f"{r['D48raw']:.6f}",
                f"{r['D49raw']:.6f}",
                f"{r[f'D{self._4x}']:.6f}"
                ]
        if save_to_file:
            if not os.path.exists(dir):
                os.makedirs(dir)
            if filename is None:
                filename = f'D{self._4x}_analyses.csv'
            with open(f'{dir}/{filename}', 'w') as fid:
                fid.write(make_csv(out))
        if print_out:
            self.msg('\n' + pretty_table(out))
        if output == 'raw':
            return out
        elif output == 'pretty':
            return pretty_table(out)


    @make_verbal
    def covar_table(
        self,
        correl = False,
        dir = 'output',
        filename = None,
        save_to_file = True,
        print_out = True,
        output = None,
        ):
        '''
        Print out, save to disk and/or return the variance-covariance matrix of Δ4x
        for all unknown samples.

        **Parameters**

        + `correl`: if `True`, tabulate correlation coefficients instead of (co)variances
        + `dir`: the directory in which to save the csv
        + `filename`: the name of the csv file to write to
        + `save_to_file`: whether to save the csv
        + `print_out`: whether to print out the matrix
        + `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
          if set to `'raw'`: return a list of lists of strings
          (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
        '''
        samples = sorted([u for u in self.unknowns])
        out = [[''] + samples]
        for s1 in samples:
            out.append([s1])
            for s2 in samples:
                if correl:
                    out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
                else:
                    out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')

        if save_to_file:
            if not os.path.exists(dir):
                os.makedirs(dir)
            if filename is None:
                if correl:
                    filename = f'D{self._4x}_correl.csv'
                else:
                    filename = f'D{self._4x}_covar.csv'
            with open(f'{dir}/{filename}', 'w') as fid:
                fid.write(make_csv(out))
        if print_out:
            self.msg('\n' + pretty_table(out))
        if output == 'raw':
            return out
        elif output == 'pretty':
            return pretty_table(out)
    @make_verbal
    def table_of_samples(
        self,
        dir = 'output',
        filename = None,
        save_to_file = True,
        print_out = True,
        output = None,
        ):
        '''
        Print out, save to disk and/or return a table of samples.

        **Parameters**

        + `dir`: the directory in which to save the csv
        + `filename`: the name of the csv file to write to
        + `save_to_file`: whether to save the csv
        + `print_out`: whether to print out the table
        + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
          if set to `'raw'`: return a list of lists of strings
          (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
        '''

        out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
        for sample in self.anchors:
            out += [[
                f"{sample}",
                f"{self.samples[sample]['N']}",
                f"{self.samples[sample]['d13C_VPDB']:.2f}",
                f"{self.samples[sample]['d18O_VSMOW']:.2f}",
                f"{self.samples[sample][f'D{self._4x}']:.4f}", '', '',
                f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
                ]]
        for sample in self.unknowns:
            out += [[
                f"{sample}",
                f"{self.samples[sample]['N']}",
                f"{self.samples[sample]['d13C_VPDB']:.2f}",
                f"{self.samples[sample]['d18O_VSMOW']:.2f}",
                f"{self.samples[sample][f'D{self._4x}']:.4f}",
                f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
                f"± {self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
                f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
                f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
                ]]
        if save_to_file:
            if not os.path.exists(dir):
                os.makedirs(dir)
            if filename is None:
                filename = f'D{self._4x}_samples.csv'
            with open(f'{dir}/{filename}', 'w') as fid:
                fid.write(make_csv(out))
        if print_out:
            self.msg('\n' + pretty_table(out))
        if output == 'raw':
            return out
        elif output == 'pretty':
            return pretty_table(out)


    def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
        '''
        Generate session plots and save them to disk.

        **Parameters**

        + `dir`: the directory in which to save the plots
        + `figsize`: the width and height (in inches) of each plot
        + `filetype`: `'pdf'` or `'png'`
        + `dpi`: resolution for PNG output
        '''
        if not os.path.exists(dir):
            os.makedirs(dir)

        for session in self.sessions:
            sp = self.plot_single_session(session, xylimits = 'constant')
            ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
            ppl.close(sp.fig)
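    # Hedged example: writing PNG session plots at a higher resolution than
    # the default (the `filetype` and `dpi` arguments are described above):
    #
    #     mydata.plot_sessions(filetype = 'png', dpi = 300)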
    @make_verbal
    def consolidate_samples(self):
        '''
        Compile various statistics for each sample.

        For each anchor sample:

        + `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
        + `SE_D47` or `SE_D48`: set to zero by definition

        For each unknown sample:

        + `D47` or `D48`: the standardized Δ4x value for this unknown
        + `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown

        For each anchor and unknown:

        + `N`: the total number of analyses of this sample
        + `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
        + `d13C_VPDB`: the average δ13C_VPDB value for this sample
        + `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
        + `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
          variance, indicating whether the Δ4x repeatability of this sample differs significantly from
          that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
        '''
        D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
        for sample in self.samples:
            self.samples[sample]['N'] = len(self.samples[sample]['data'])
            if self.samples[sample]['N'] > 1:
                self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])

            self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
            self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])

            D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
            if len(D4x_pop) > 2:
                self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]

        if self.standardization_method == 'pooled':
            for sample in self.anchors:
                self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
                self.samples[sample][f'SE_D{self._4x}'] = 0.
            for sample in self.unknowns:
                self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
                try:
                    self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
                except ValueError:
                    # when `sample` is constrained by self.standardize(constraints = {...}),
                    # it is no longer listed in self.standardization.var_names.
                    # Temporary fix: define SE as zero for now
                    self.samples[sample][f'SE_D{self._4x}'] = 0.

        elif self.standardization_method == 'indep_sessions':
            for sample in self.anchors:
                self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
                self.samples[sample][f'SE_D{self._4x}'] = 0.
            for sample in self.unknowns:
                self.msg(f'Consolidating sample {sample}')
                self.unknowns[sample][f'session_D{self._4x}'] = {}
                session_avg = []
                for session in self.sessions:
                    sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
                    if sdata:
                        self.msg(f'{sample} found in session {session}')
                        avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
                        avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
                        # !! TODO: sigma_s below does not account for temporal changes in standardization error
                        sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
                        sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
                        session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
                        self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
                self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
                weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
                wsum = sum([weights[s] for s in weights])
                for s in weights:
                    self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]

        for r in self:
            r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
    def consolidate_sessions(self):
        '''
        Compute various statistics for each session.

        + `Na`: number of anchor analyses in the session
        + `Nu`: number of unknown analyses in the session
        + `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
        + `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
        + `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
        + `a`: scrambling factor
        + `b`: compositional slope
        + `c`: WG offset
        + `SE_a`: model standard error of `a`
        + `SE_b`: model standard error of `b`
        + `SE_c`: model standard error of `c`
        + `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
        + `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
        + `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
        + `a2`: scrambling factor drift
        + `b2`: compositional slope drift
        + `c2`: WG offset drift
        + `Np`: number of standardization parameters to fit
        + `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
        + `d13Cwg_VPDB`: δ13C_VPDB of WG
        + `d18Owg_VSMOW`: δ18O_VSMOW of WG
        '''
        for session in self.sessions:
            if 'd13Cwg_VPDB' not in self.sessions[session]:
                self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
            if 'd18Owg_VSMOW' not in self.sessions[session]:
                self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
            self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
            self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])

            self.msg(f'Computing repeatabilities for session {session}')
            self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
            self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
            self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])
        if self.standardization_method == 'pooled':
            for session in self.sessions:

                # different (better?) computation of D4x repeatability for each session:
                sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
                self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5

                self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
                i = self.standardization.var_names.index(f'a_{pf(session)}')
                self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5

                self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
                i = self.standardization.var_names.index(f'b_{pf(session)}')
                self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5

                self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
                i = self.standardization.var_names.index(f'c_{pf(session)}')
                self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5

                self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
                if self.sessions[session]['scrambling_drift']:
                    i = self.standardization.var_names.index(f'a2_{pf(session)}')
                    self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
                else:
                    self.sessions[session]['SE_a2'] = 0.

                self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
                if self.sessions[session]['slope_drift']:
                    i = self.standardization.var_names.index(f'b2_{pf(session)}')
                    self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
                else:
                    self.sessions[session]['SE_b2'] = 0.

                self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
                if self.sessions[session]['wg_drift']:
                    i = self.standardization.var_names.index(f'c2_{pf(session)}')
                    self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
                else:
                    self.sessions[session]['SE_c2'] = 0.
                i = self.standardization.var_names.index(f'a_{pf(session)}')
                j = self.standardization.var_names.index(f'b_{pf(session)}')
                k = self.standardization.var_names.index(f'c_{pf(session)}')
                CM = np.zeros((6,6))
                CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
                try:
                    i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
                    CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
                    CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
                    try:
                        j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
                        CM[3,4] = self.standardization.covar[i2,j2]
                        CM[4,3] = self.standardization.covar[j2,i2]
                    except ValueError:
                        pass
                    try:
                        k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                        CM[3,5] = self.standardization.covar[i2,k2]
                        CM[5,3] = self.standardization.covar[k2,i2]
                    except ValueError:
                        pass
                except ValueError:
                    pass
                try:
                    j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
                    CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
                    CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
                    try:
                        k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                        CM[4,5] = self.standardization.covar[j2,k2]
                        CM[5,4] = self.standardization.covar[k2,j2]
                    except ValueError:
                        pass
                except ValueError:
                    pass
                try:
                    k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                    CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
                    CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
                except ValueError:
                    pass

                self.sessions[session]['CM'] = CM

        elif self.standardization_method == 'indep_sessions':
            pass # Not implemented yet


    @make_verbal
    def repeatabilities(self):
        '''
        Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
        (for all samples, for anchors, and for unknowns).
        '''
        self.msg('Computing repeatabilities for all sessions')

        self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
        self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
        self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
        self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
        self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')


    @make_verbal
    def consolidate(self, tables = True, plots = True):
        '''
        Collect information about samples, sessions and repeatabilities.
        '''
        self.consolidate_samples()
        self.consolidate_sessions()
        self.repeatabilities()

        if tables:
            self.summary()
            self.table_of_sessions()
            self.table_of_analyses()
            self.table_of_samples()

        if plots:
            self.plot_sessions()


    @make_verbal
    def rmswd(self,
        samples = 'all samples',
        sessions = 'all sessions',
        ):
        '''
        Compute the χ2, the root mean squared weighted deviation
        (i.e. the square root of the reduced χ2), and the corresponding
        degrees of freedom of the Δ4x values for samples in `samples`
        and sessions in `sessions`.

        Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
        '''
2415 ''' 2416 if samples == 'all samples': 2417 mysamples = [k for k in self.samples] 2418 elif samples == 'anchors': 2419 mysamples = [k for k in self.anchors] 2420 elif samples == 'unknowns': 2421 mysamples = [k for k in self.unknowns] 2422 else: 2423 mysamples = samples 2424 2425 if sessions == 'all sessions': 2426 sessions = [k for k in self.sessions] 2427 2428 chisq, Nf = 0, 0 2429 for sample in mysamples : 2430 G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2431 if len(G) > 1 : 2432 X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G]) 2433 Nf += (len(G) - 1) 2434 chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G]) 2435 r = (chisq / Nf)**.5 if Nf > 0 else 0 2436 self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.') 2437 return {'rmswd': r, 'chisq': chisq, 'Nf': Nf} 2438 2439 2440 @make_verbal 2441 def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'): 2442 ''' 2443 Compute the repeatability of `[r[key] for r in self]` 2444 ''' 2445 2446 if samples == 'all samples': 2447 mysamples = [k for k in self.samples] 2448 elif samples == 'anchors': 2449 mysamples = [k for k in self.anchors] 2450 elif samples == 'unknowns': 2451 mysamples = [k for k in self.unknowns] 2452 else: 2453 mysamples = samples 2454 2455 if sessions == 'all sessions': 2456 sessions = [k for k in self.sessions] 2457 2458 if key in ['D47', 'D48']: 2459 # Full disclosure: the definition of Nf is tricky/debatable 2460 G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions] 2461 chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum() 2462 Nf = len(G) 2463# print(f'len(G) = {Nf}') 2464 Nf -= len([s for s in mysamples if s in self.unknowns]) 2465# print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider') 2466 for session in sessions: 2467 Np = len([ 2468 _ for _ in self.standardization.params 2469 if ( 2470 self.standardization.params[_].expr is not None 2471 and ( 2472 (_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session)) 2473 or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session)) 2474 ) 2475 ) 2476 ]) 2477# print(f'session {session}: {Np} parameters to consider') 2478 Na = len({ 2479 r['Sample'] for r in self.sessions[session]['data'] 2480 if r['Sample'] in self.anchors and r['Sample'] in mysamples 2481 }) 2482# print(f'session {session}: {Na} different anchors in that session') 2483 Nf -= min(Np, Na) 2484# print(f'Nf = {Nf}') 2485 2486# for sample in mysamples : 2487# X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2488# if len(X) > 1 : 2489# chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ]) 2490# if sample in self.unknowns: 2491# Nf += len(X) - 1 2492# else: 2493# Nf += len(X) 2494# if samples in ['anchors', 'all samples']: 2495# Nf -= sum([self.sessions[s]['Np'] for s in sessions]) 2496 r = (chisq / Nf)**.5 if Nf > 0 else 0 2497 2498 else: # if key not in ['D47', 'D48'] 2499 chisq, Nf = 0, 0 2500 for sample in mysamples : 2501 X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2502 if len(X) > 1 : 2503 Nf += len(X) - 1 2504 chisq += np.sum([ (x-np.mean(X))**2 for x in X ]) 2505 r = (chisq / Nf)**.5 if Nf > 0 else 0 2506 2507 self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.') 2508 return r 2509 2510 def sample_average(self, samples, weights = 'equal', normalize = True): 2511 ''' 2512 Weighted average Δ4x value of a group of samples, 
accounting for covariance. 2513 2514 Returns the weighed average Δ4x value and associated SE 2515 of a group of samples. Weights are equal by default. If `normalize` is 2516 true, `weights` will be rescaled so that their sum equals 1. 2517 2518 **Examples** 2519 2520 ```python 2521 self.sample_average(['X','Y'], [1, 2]) 2522 ``` 2523 2524 returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3, 2525 where Δ4x(X) and Δ4x(Y) are the average Δ4x 2526 values of samples X and Y, respectively. 2527 2528 ```python 2529 self.sample_average(['X','Y'], [1, -1], normalize = False) 2530 ``` 2531 2532 returns the value and SE of the difference Δ4x(X) - Δ4x(Y). 2533 ''' 2534 if weights == 'equal': 2535 weights = [1/len(samples)] * len(samples) 2536 2537 if normalize: 2538 s = sum(weights) 2539 if s: 2540 weights = [w/s for w in weights] 2541 2542 try: 2543# indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples] 2544# C = self.standardization.covar[indices,:][:,indices] 2545 C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples]) 2546 X = [self.samples[sample][f'D{self._4x}'] for sample in samples] 2547 return correlated_sum(X, C, weights) 2548 except ValueError: 2549 return (0., 0.) 2550 2551 2552 def sample_D4x_covar(self, sample1, sample2 = None): 2553 ''' 2554 Covariance between Δ4x values of samples 2555 2556 Returns the error covariance between the average Δ4x values of two 2557 samples. If if only `sample_1` is specified, or if `sample_1 == sample_2`), 2558 returns the Δ4x variance for that sample. 2559 ''' 2560 if sample2 is None: 2561 sample2 = sample1 2562 if self.standardization_method == 'pooled': 2563 i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}') 2564 j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}') 2565 return self.standardization.covar[i, j] 2566 elif self.standardization_method == 'indep_sessions': 2567 if sample1 == sample2: 2568 return self.samples[sample1][f'SE_D{self._4x}']**2 2569 else: 2570 c = 0 2571 for session in self.sessions: 2572 sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1] 2573 sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2] 2574 if sdata1 and sdata2: 2575 a = self.sessions[session]['a'] 2576 # !! TODO: CM below does not account for temporal changes in standardization parameters 2577 CM = self.sessions[session]['CM'][:3,:3] 2578 avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1]) 2579 avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1]) 2580 avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2]) 2581 avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2]) 2582 c += ( 2583 self.unknowns[sample1][f'session_D{self._4x}'][session][2] 2584 * self.unknowns[sample2][f'session_D{self._4x}'][session][2] 2585 * np.array([[avg_D4x_1, avg_d4x_1, 1]]) 2586 @ CM 2587 @ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T 2588 ) / a**2 2589 return float(c) 2590 2591 def sample_D4x_correl(self, sample1, sample2 = None): 2592 ''' 2593 Correlation between Δ4x errors of samples 2594 2595 Returns the error correlation between the average Δ4x values of two samples. 2596 ''' 2597 if sample2 is None or sample2 == sample1: 2598 return 1. 
2599 return ( 2600 self.sample_D4x_covar(sample1, sample2) 2601 / self.unknowns[sample1][f'SE_D{self._4x}'] 2602 / self.unknowns[sample2][f'SE_D{self._4x}'] 2603 ) 2604 2605 def plot_single_session(self, 2606 session, 2607 kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4), 2608 kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4), 2609 kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75), 2610 kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75), 2611 kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75), 2612 xylimits = 'free', # | 'constant' 2613 x_label = None, 2614 y_label = None, 2615 error_contour_interval = 'auto', 2616 fig = 'new', 2617 ): 2618 ''' 2619 Generate plot for a single session 2620 ''' 2621 if x_label is None: 2622 x_label = f'δ$_{{{self._4x}}}$ (‰)' 2623 if y_label is None: 2624 y_label = f'Δ$_{{{self._4x}}}$ (‰)' 2625 2626 out = _SessionPlot() 2627 anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]] 2628 unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]] 2629 anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors] 2630 anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors] 2631 unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns] 2632 unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns] 2633 anchor_avg = (np.array([ np.array([ 2634 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2635 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2636 ]) for sample in anchors]).T, 2637 np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T) 2638 unknown_avg = (np.array([ np.array([ 2639 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2640 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2641 ]) for sample in unknowns]).T, 2642 np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T) 2643 2644 2645 if fig == 'new': 2646 out.fig = ppl.figure(figsize = (6,6)) 2647 ppl.subplots_adjust(.1,.1,.9,.9) 2648 2649 out.anchor_analyses, = ppl.plot( 2650 anchors_d, 2651 anchors_D, 2652 **kw_plot_anchors) 2653 out.unknown_analyses, = ppl.plot( 2654 unknowns_d, 2655 unknowns_D, 2656 **kw_plot_unknowns) 2657 out.anchor_avg = ppl.plot( 2658 *anchor_avg, 2659 **kw_plot_anchor_avg) 2660 out.unknown_avg = ppl.plot( 2661 *unknown_avg, 2662 **kw_plot_unknown_avg) 2663 if xylimits == 'constant': 2664 x = [r[f'd{self._4x}'] for r in self] 2665 y = [r[f'D{self._4x}'] for r in self] 2666 x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y) 2667 w, h = x2-x1, y2-y1 2668 x1 -= w/20 2669 x2 += w/20 2670 y1 -= h/20 2671 y2 += h/20 2672 ppl.axis([x1, x2, y1, y2]) 2673 elif xylimits == 'free': 2674 x1, x2, y1, y2 = ppl.axis() 2675 else: 2676 x1, x2, y1, y2 = ppl.axis(xylimits) 2677 2678 if error_contour_interval != 'none': 2679 xi, yi = np.linspace(x1, x2), np.linspace(y1, y2) 2680 XI,YI = np.meshgrid(xi, yi) 2681 SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi]) 2682 if 
error_contour_interval == 'auto': 2683 rng = np.max(SI) - np.min(SI) 2684 if rng <= 0.01: 2685 cinterval = 0.001 2686 elif rng <= 0.03: 2687 cinterval = 0.004 2688 elif rng <= 0.1: 2689 cinterval = 0.01 2690 elif rng <= 0.3: 2691 cinterval = 0.03 2692 elif rng <= 1.: 2693 cinterval = 0.1 2694 else: 2695 cinterval = 0.5 2696 else: 2697 cinterval = error_contour_interval 2698 2699 cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval) 2700 out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error) 2701 out.clabel = ppl.clabel(out.contour) 2702 contour = (XI, YI, SI, cval, cinterval) 2703 2704 if fig == None: 2705 return { 2706 'anchors':anchors, 2707 'unknowns':unknowns, 2708 'anchors_d':anchors_d, 2709 'anchors_D':anchors_D, 2710 'unknowns_d':unknowns_d, 2711 'unknowns_D':unknowns_D, 2712 'anchor_avg':anchor_avg, 2713 'unknown_avg':unknown_avg, 2714 'contour':contour, 2715 } 2716 2717 ppl.xlabel(x_label) 2718 ppl.ylabel(y_label) 2719 ppl.title(session, weight = 'bold') 2720 ppl.grid(alpha = .2) 2721 out.ax = ppl.gca() 2722 2723 return out 2724 2725 def plot_residuals( 2726 self, 2727 kde = False, 2728 hist = False, 2729 binwidth = 2/3, 2730 dir = 'output', 2731 filename = None, 2732 highlight = [], 2733 colors = None, 2734 figsize = None, 2735 dpi = 100, 2736 yspan = None, 2737 ): 2738 ''' 2739 Plot residuals of each analysis as a function of time (actually, as a function of 2740 the order of analyses in the `D4xdata` object) 2741 2742 + `kde`: whether to add a kernel density estimate of residuals 2743 + `hist`: whether to add a histogram of residuals (incompatible with `kde`) 2744 + `histbins`: specify bin edges for the histogram 2745 + `dir`: the directory in which to save the plot 2746 + `highlight`: a list of samples to highlight 2747 + `colors`: a dict of `{<sample>: (r, g, b)}` for all samples 2748 + `figsize`: (width, height) of figure 2749 + `dpi`: resolution for PNG output 2750 + `yspan`: factor controlling the range of y values shown in plot 2751 (by default: `yspan = 1.5 if kde else 1.0`) 2752 ''' 2753 2754 from matplotlib import ticker 2755 2756 if yspan is None: 2757 if kde: 2758 yspan = 1.5 2759 else: 2760 yspan = 1.0 2761 2762 # Layout 2763 fig = ppl.figure(figsize = (8,4) if figsize is None else figsize) 2764 if hist or kde: 2765 ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72) 2766 ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15) 2767 else: 2768 ppl.subplots_adjust(.08,.05,.78,.8) 2769 ax1 = ppl.subplot(111) 2770 2771 # Colors 2772 N = len(self.anchors) 2773 if colors is None: 2774 if len(highlight) > 0: 2775 Nh = len(highlight) 2776 if Nh == 1: 2777 colors = {highlight[0]: (0,0,0)} 2778 elif Nh == 3: 2779 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])} 2780 elif Nh == 4: 2781 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2782 else: 2783 colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)} 2784 else: 2785 if N == 3: 2786 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])} 2787 elif N == 4: 2788 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2789 else: 2790 colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)} 2791 2792 ppl.sca(ax1) 2793 2794 ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75) 2795 2796 ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$')) 2797 2798 session = 
self[0]['Session'] 2799 x1 = 0 2800# ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self]) 2801 x_sessions = {} 2802 one_or_more_singlets = False 2803 one_or_more_multiplets = False 2804 multiplets = set() 2805 for k,r in enumerate(self): 2806 if r['Session'] != session: 2807 x2 = k-1 2808 x_sessions[session] = (x1+x2)/2 2809 ppl.axvline(k - 0.5, color = 'k', lw = .5) 2810 session = r['Session'] 2811 x1 = k 2812 singlet = len(self.samples[r['Sample']]['data']) == 1 2813 if not singlet: 2814 multiplets.add(r['Sample']) 2815 if r['Sample'] in self.unknowns: 2816 if singlet: 2817 one_or_more_singlets = True 2818 else: 2819 one_or_more_multiplets = True 2820 kw = dict( 2821 marker = 'x' if singlet else '+', 2822 ms = 4 if singlet else 5, 2823 ls = 'None', 2824 mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0), 2825 mew = 1, 2826 alpha = 0.2 if singlet else 1, 2827 ) 2828 if highlight and r['Sample'] not in highlight: 2829 kw['alpha'] = 0.2 2830 ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw) 2831 x2 = k 2832 x_sessions[session] = (x1+x2)/2 2833 2834 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1) 2835 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1) 2836 if not (hist or kde): 2837 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center') 2838 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f" 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center') 2839 2840 xmin, xmax, ymin, ymax = ppl.axis() 2841 if yspan != 1: 2842 ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2 2843 for s in x_sessions: 2844 ppl.text( 2845 x_sessions[s], 2846 ymax +1, 2847 s, 2848 va = 'bottom', 2849 **( 2850 dict(ha = 'center') 2851 if len(self.sessions[s]['data']) > (0.15 * len(self)) 2852 else dict(ha = 'left', rotation = 45) 2853 ) 2854 ) 2855 2856 if hist or kde: 2857 ppl.sca(ax2) 2858 2859 for s in colors: 2860 kw['marker'] = '+' 2861 kw['ms'] = 5 2862 kw['mec'] = colors[s] 2863 kw['label'] = s 2864 kw['alpha'] = 1 2865 ppl.plot([], [], **kw) 2866 2867 kw['mec'] = (0,0,0) 2868 2869 if one_or_more_singlets: 2870 kw['marker'] = 'x' 2871 kw['ms'] = 4 2872 kw['alpha'] = .2 2873 kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other' 2874 ppl.plot([], [], **kw) 2875 2876 if one_or_more_multiplets: 2877 kw['marker'] = '+' 2878 kw['ms'] = 4 2879 kw['alpha'] = 1 2880 kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other' 2881 ppl.plot([], [], **kw) 2882 2883 if hist or kde: 2884 leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9) 2885 else: 2886 leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5) 2887 leg.set_zorder(-1000) 2888 2889 ppl.sca(ax1) 2890 2891 ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)') 2892 ppl.xticks([]) 2893 ppl.axis([-1, len(self), None, None]) 2894 2895 if hist or kde: 2896 ppl.sca(ax2) 2897 X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors]) 2898 2899 if kde: 2900 from scipy.stats import 
gaussian_kde 2901 yi = np.linspace(ymin, ymax, 201) 2902 xi = gaussian_kde(X).evaluate(yi) 2903 ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1)) 2904# ppl.plot(xi, yi, 'k-', lw = 1) 2905 elif hist: 2906 ppl.hist( 2907 X, 2908 orientation = 'horizontal', 2909 histtype = 'stepfilled', 2910 ec = [.4]*3, 2911 fc = [.25]*3, 2912 alpha = .25, 2913 bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)), 2914 ) 2915 ppl.text(0, 0, 2916 f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", 2917 size = 7.5, 2918 alpha = 1, 2919 va = 'center', 2920 ha = 'left', 2921 ) 2922 2923 ppl.axis([0, None, ymin, ymax]) 2924 ppl.xticks([]) 2925 ppl.yticks([]) 2926# ax2.spines['left'].set_visible(False) 2927 ax2.spines['right'].set_visible(False) 2928 ax2.spines['top'].set_visible(False) 2929 ax2.spines['bottom'].set_visible(False) 2930 2931 ax1.axis([None, None, ymin, ymax]) 2932 2933 if not os.path.exists(dir): 2934 os.makedirs(dir) 2935 if filename is None: 2936 return fig 2937 elif filename == '': 2938 filename = f'D{self._4x}_residuals.pdf' 2939 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 2940 ppl.close(fig) 2941 2942 2943 def simulate(self, *args, **kwargs): 2944 ''' 2945 Legacy function with warning message pointing to `virtual_data()` 2946 ''' 2947 raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()') 2948 2949 def plot_anchor_residuals( 2950 self, 2951 dir = 'output', 2952 filename = '', 2953 figsize = None, 2954 subplots_adjust = (0.05, 0.1, 0.95, 0.98, .25, .25), 2955 dpi = 100, 2956 colors = None, 2957 ): 2958 ''' 2959 Plot a summary of the residuals for all anchors, intended to help detect systematic bias. 2960 2961 **Parameters** 2962 2963 + `dir`: the directory in which to save the plot 2964 + `filename`: the file name to save to. 
2965 + `dpi`: resolution for PNG output 2966 + `figsize`: (width, height) of figure 2967 + `subplots_adjust`: passed to the figure 2968 + `dpi`: resolution for PNG output 2969 + `colors`: a dict of `{<sample>: (r, g, b)}` for all samples 2970 ''' 2971 2972 # Colors 2973 N = len(self.anchors) 2974 if colors is None: 2975 if N == 3: 2976 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])} 2977 elif N == 4: 2978 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2979 else: 2980 colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)} 2981 2982 if figsize is None: 2983 figsize = (4, 1.5*N+1) 2984 fig = ppl.figure(figsize = figsize) 2985 ppl.subplots_adjust(*subplots_adjust) 2986 axs = {} 2987 X = np.array([r[f'D{self._4x}_residual'] for a in self.anchors for r in self.anchors[a]['data']])*1000 2988 sigma = self.repeatability['r_D47a'] * 1000 2989 D = max(np.abs(X)) 2990 2991 for k,a in enumerate(self.anchors): 2992 color = colors[a] 2993 axs[a] = ppl.subplot(N, 1, 1+k) 2994 axs[a].text( 2995 0.02, 1-0.05, a, 2996 va = 'top', 2997 ha = 'left', 2998 weight = 'bold', 2999 size = 9, 3000 color = [_*0.75 for _ in color], 3001 transform = axs[a].transAxes, 3002 ) 3003 X = np.array([r[f'D{self._4x}_residual'] for r in self.anchors[a]['data']])*1000 3004 axs[a].axvline(0, lw = 0.5, color = color) 3005 axs[a].plot(X, X*0, 'o', mew = 0.7, mec = (*color,.5), mfc = (*color, 0), ms = 7, clip_on = False) 3006 3007 xi = np.linspace(-3*D, 3*D, 601) 3008 yi = np.array([np.exp(-0.5 * ((xi - x)/sigma)**2) for x in X]).sum(0) 3009 ppl.fill_between(xi, yi, yi*0, fc = (*color, .15), lw = 1, ec = color) 3010 3011 axs[a].errorbar( 3012 X.mean(), yi.max()*.2, None, 1.96*sigma/len(X)**0.5, 3013 ecolor = color, 3014 marker = 's', 3015 ls = 'None', 3016 mec = color, 3017 mew = 1, 3018 mfc = 'w', 3019 ms = 8, 3020 elinewidth = 1, 3021 capsize = 4, 3022 capthick = 1, 3023 ) 3024 3025 axs[a].axis([xi[0], xi[-1], 0, yi.max()*1.05]) 3026 ppl.yticks([]) 3027 3028 ppl.xlabel(f'$Δ_{{{self._4x}}}$ residuals (ppm)') 3029 3030 if not os.path.exists(dir): 3031 os.makedirs(dir) 3032 if filename is None: 3033 return fig 3034 elif filename == '': 3035 filename = f'D{self._4x}_anchor_residuals.pdf' 3036 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 3037 ppl.close(fig) 3038 3039 3040 def plot_distribution_of_analyses( 3041 self, 3042 dir = 'output', 3043 filename = None, 3044 vs_time = False, 3045 figsize = (6,4), 3046 subplots_adjust = (0.02, 0.13, 0.85, 0.8), 3047 output = None, 3048 dpi = 100, 3049 ): 3050 ''' 3051 Plot temporal distribution of all analyses in the data set. 3052 3053 **Parameters** 3054 3055 + `dir`: the directory in which to save the plot 3056 + `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially. 
3057 + `dpi`: resolution for PNG output 3058 + `figsize`: (width, height) of figure 3059 + `dpi`: resolution for PNG output 3060 ''' 3061 3062 asamples = [s for s in self.anchors] 3063 usamples = [s for s in self.unknowns] 3064 if output is None or output == 'fig': 3065 fig = ppl.figure(figsize = figsize) 3066 ppl.subplots_adjust(*subplots_adjust) 3067 Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)]) 3068 Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)]) 3069 Xmax += (Xmax-Xmin)/40 3070 Xmin -= (Xmax-Xmin)/41 3071 for k, s in enumerate(asamples + usamples): 3072 if vs_time: 3073 X = [r['TimeTag'] for r in self if r['Sample'] == s] 3074 else: 3075 X = [x for x,r in enumerate(self) if r['Sample'] == s] 3076 Y = [-k for x in X] 3077 ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75) 3078 ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25) 3079 ppl.text(Xmax, -k, f' {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r') 3080 ppl.axis([Xmin, Xmax, -k-1, 1]) 3081 ppl.xlabel('\ntime') 3082 ppl.gca().annotate('', 3083 xy = (0.6, -0.02), 3084 xycoords = 'axes fraction', 3085 xytext = (.4, -0.02), 3086 arrowprops = dict(arrowstyle = "->", color = 'k'), 3087 ) 3088 3089 3090 x2 = -1 3091 for session in self.sessions: 3092 x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session]) 3093 if vs_time: 3094 ppl.axvline(x1, color = 'k', lw = .75) 3095 if x2 > -1: 3096 if not vs_time: 3097 ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5) 3098 x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session]) 3099# from xlrd import xldate_as_datetime 3100# print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0)) 3101 if vs_time: 3102 ppl.axvline(x2, color = 'k', lw = .75) 3103 ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15) 3104 ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8) 3105 3106 ppl.xticks([]) 3107 ppl.yticks([]) 3108 3109 if output is None: 3110 if not os.path.exists(dir): 3111 os.makedirs(dir) 3112 if filename == None: 3113 filename = f'D{self._4x}_distribution_of_analyses.pdf' 3114 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 3115 ppl.close(fig) 3116 elif output == 'ax': 3117 return ppl.gca() 3118 elif output == 'fig': 3119 return fig 3120 3121 3122 def plot_bulk_compositions( 3123 self, 3124 samples = None, 3125 dir = 'output/bulk_compositions', 3126 figsize = (6,6), 3127 subplots_adjust = (0.15, 0.12, 0.95, 0.92), 3128 show = False, 3129 sample_color = (0,.5,1), 3130 analysis_color = (.7,.7,.7), 3131 labeldist = 0.3, 3132 radius = 0.05, 3133 ): 3134 ''' 3135 Plot δ13C_VBDP vs δ18O_VSMOW (of CO2) for all analyses. 3136 3137 By default, creates a directory `./output/bulk_compositions` where plots for 3138 each sample are saved. Another plot named `__all__.pdf` shows all analyses together. 3139 3140 3141 **Parameters** 3142 3143 + `samples`: Only these samples are processed (by default: all samples). 3144 + `dir`: where to save the plots 3145 + `figsize`: (width, height) of figure 3146 + `subplots_adjust`: passed to `subplots_adjust()` 3147 + `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples, 3148 allowing for interactive visualization/exploration in (δ13C, δ18O) space. 
3149 + `sample_color`: color used for replicate markers/labels 3150 + `analysis_color`: color used for sample markers/labels 3151 + `labeldist`: distance (in inches) from replicate markers to replicate labels 3152 + `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`. 3153 ''' 3154 3155 from matplotlib.patches import Ellipse 3156 3157 if samples is None: 3158 samples = [_ for _ in self.samples] 3159 3160 saved = {} 3161 3162 for s in samples: 3163 3164 fig = ppl.figure(figsize = figsize) 3165 fig.subplots_adjust(*subplots_adjust) 3166 ax = ppl.subplot(111) 3167 ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)') 3168 ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)') 3169 ppl.title(s) 3170 3171 3172 XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']]) 3173 UID = [_['UID'] for _ in self.samples[s]['data']] 3174 XY0 = XY.mean(0) 3175 3176 for xy in XY: 3177 ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color) 3178 3179 ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color) 3180 ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color) 3181 ppl.text(*XY0, f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold') 3182 saved[s] = [XY, XY0] 3183 3184 x1, x2, y1, y2 = ppl.axis() 3185 x0, dx = (x1+x2)/2, (x2-x1)/2 3186 y0, dy = (y1+y2)/2, (y2-y1)/2 3187 dx, dy = [max(max(dx, dy), radius)]*2 3188 3189 ppl.axis([ 3190 x0 - 1.2*dx, 3191 x0 + 1.2*dx, 3192 y0 - 1.2*dy, 3193 y0 + 1.2*dy, 3194 ]) 3195 3196 XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0)) 3197 3198 for xy, uid in zip(XY, UID): 3199 3200 xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy)) 3201 vector_in_display_space = xy_in_display_space - XY0_in_display_space 3202 3203 if (vector_in_display_space**2).sum() > 0: 3204 3205 unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5 3206 label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist 3207 label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space 3208 label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space)) 3209 3210 ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color) 3211 3212 else: 3213 3214 ppl.text(*xy, f'{uid} ', va = 'center', ha = 'right', color = analysis_color) 3215 3216 if radius: 3217 ax.add_artist(Ellipse( 3218 xy = XY0, 3219 width = radius*2, 3220 height = radius*2, 3221 ls = (0, (2,2)), 3222 lw = .7, 3223 ec = analysis_color, 3224 fc = 'None', 3225 )) 3226 ppl.text( 3227 XY0[0], 3228 XY0[1]-radius, 3229 f'\n± {radius*1e3:.0f} ppm', 3230 color = analysis_color, 3231 va = 'top', 3232 ha = 'center', 3233 linespacing = 0.4, 3234 size = 8, 3235 ) 3236 3237 if not os.path.exists(dir): 3238 os.makedirs(dir) 3239 fig.savefig(f'{dir}/{s}.pdf') 3240 ppl.close(fig) 3241 3242 fig = ppl.figure(figsize = figsize) 3243 fig.subplots_adjust(*subplots_adjust) 3244 ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)') 3245 ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)') 3246 3247 for s in saved: 3248 for xy in saved[s][0]: 3249 ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color) 3250 ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color) 3251 ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color) 3252 ppl.text(*saved[s][1], f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold') 
3253 3254 x1, x2, y1, y2 = ppl.axis() 3255 ppl.axis([ 3256 x1 - (x2-x1)/10, 3257 x2 + (x2-x1)/10, 3258 y1 - (y2-y1)/10, 3259 y2 + (y2-y1)/10, 3260 ]) 3261 3262 3263 if not os.path.exists(dir): 3264 os.makedirs(dir) 3265 fig.savefig(f'{dir}/__all__.pdf') 3266 if show: 3267 ppl.show() 3268 ppl.close(fig) 3269 3270 3271 def _save_D4x_correl( 3272 self, 3273 samples = None, 3274 dir = 'output', 3275 filename = None, 3276 D4x_precision = 4, 3277 correl_precision = 4, 3278 save_to_file = True, 3279 ): 3280 ''' 3281 Save D4x values along with their SE and correlation matrix. 3282 3283 **Parameters** 3284 3285 + `samples`: Only these samples are output (by default: all samples). 3286 + `dir`: the directory in which to save the faile (by defaut: `output`) 3287 + `filename`: the name to the csv file to write to (by default: `D4x_correl.csv`) 3288 + `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4) 3289 + `correl_precision`: the precision to use when writing correlation factor values (by default: 4) 3290 + `save_to_file`: whether to write the output to a file factor values (by default: True). If `False`, 3291 returns the output as a string 3292 ''' 3293 if samples is None: 3294 samples = sorted([s for s in self.unknowns]) 3295 3296 out = [['Sample']] + [[s] for s in samples] 3297 out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl'] 3298 for k,s in enumerate(samples): 3299 out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.4f}', f'{self.samples[s][f"SE_D{self._4x}"]:.4f}'] 3300 for s2 in samples: 3301 out[k+1] += [f'{self.sample_D4x_correl(s,s2):.4f}'] 3302 3303 if save_to_file: 3304 if not os.path.exists(dir): 3305 os.makedirs(dir) 3306 if filename is None: 3307 filename = f'D{self._4x}_correl.csv' 3308 with open(f'{dir}/{filename}', 'w') as fid: 3309 fid.write(make_csv(out)) 3310 else: 3311 return make_csv(out) 3312 3313 3314class D47data(D4xdata): 3315 ''' 3316 Store and process data for a large set of Δ47 analyses, 3317 usually comprising more than one analytical session. 3318 ''' 3319 3320 Nominal_D4x = { 3321 'ETH-1': 0.2052, 3322 'ETH-2': 0.2085, 3323 'ETH-3': 0.6132, 3324 'ETH-4': 0.4511, 3325 'IAEA-C1': 0.3018, 3326 'IAEA-C2': 0.6409, 3327 'MERCK': 0.5135, 3328 } # I-CDES (Bernasconi et al., 2021) 3329 ''' 3330 Nominal Δ47 values assigned to the Δ47 anchor samples, used by 3331 `D47data.standardize()` to normalize unknown samples to an absolute Δ47 3332 reference frame. 3333 3334 By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)): 3335 ```py 3336 { 3337 'ETH-1' : 0.2052, 3338 'ETH-2' : 0.2085, 3339 'ETH-3' : 0.6132, 3340 'ETH-4' : 0.4511, 3341 'IAEA-C1' : 0.3018, 3342 'IAEA-C2' : 0.6409, 3343 'MERCK' : 0.5135, 3344 } 3345 ``` 3346 ''' 3347 3348 3349 @property 3350 def Nominal_D47(self): 3351 return self.Nominal_D4x 3352 3353 3354 @Nominal_D47.setter 3355 def Nominal_D47(self, new): 3356 self.Nominal_D4x = dict(**new) 3357 self.refresh() 3358 3359 3360 def __init__(self, l = [], **kwargs): 3361 ''' 3362 **Parameters:** same as `D4xdata.__init__()` 3363 ''' 3364 D4xdata.__init__(self, l = l, mass = '47', **kwargs) 3365 3366 3367 def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'): 3368 ''' 3369 Find all samples for which `Teq` is specified, compute equilibrium Δ47 3370 value for that temperature, and add treat these samples as additional anchors. 3371 3372 **Parameters** 3373 3374 + `fCo2eqD47`: Which CO2 equilibrium law to use 3375 (`petersen`: [Petersen et al. 
(2019)](https://doi.org/10.1029/2018GC008127); 3376 `wang`: [Wang et al. (2019)](https://doi.org/10.1016/j.gca.2004.05.039)). 3377 + `priority`: if `replace`: forget old anchors and only use the new ones; 3378 if `new`: keep pre-existing anchors but update them in case of conflict 3379 between old and new Δ47 values; 3380 if `old`: keep pre-existing anchors but preserve their original Δ47 3381 values in case of conflict. 3382 ''' 3383 f = { 3384 'petersen': fCO2eqD47_Petersen, 3385 'wang': fCO2eqD47_Wang, 3386 }[fCo2eqD47] 3387 foo = {} 3388 for r in self: 3389 if 'Teq' in r: 3390 if r['Sample'] in foo: 3391 assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.' 3392 else: 3393 foo[r['Sample']] = f(r['Teq']) 3394 else: 3395 assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.' 3396 3397 if priority == 'replace': 3398 self.Nominal_D47 = {} 3399 for s in foo: 3400 if priority != 'old' or s not in self.Nominal_D47: 3401 self.Nominal_D47[s] = foo[s] 3402 3403 def save_D47_correl(self, *args, **kwargs): 3404 return self._save_D4x_correl(*args, **kwargs) 3405 3406 save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47') 3407 3408 3409class D48data(D4xdata): 3410 ''' 3411 Store and process data for a large set of Δ48 analyses, 3412 usually comprising more than one analytical session. 3413 ''' 3414 3415 Nominal_D4x = { 3416 'ETH-1': 0.138, 3417 'ETH-2': 0.138, 3418 'ETH-3': 0.270, 3419 'ETH-4': 0.223, 3420 'GU-1': -0.419, 3421 } # (Fiebig et al., 2019, 2021) 3422 ''' 3423 Nominal Δ48 values assigned to the Δ48 anchor samples, used by 3424 `D48data.standardize()` to normalize unknown samples to an absolute Δ48 3425 reference frame. 3426 3427 By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019), 3428 [Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)): 3429 3430 ```py 3431 { 3432 'ETH-1' : 0.138, 3433 'ETH-2' : 0.138, 3434 'ETH-3' : 0.270, 3435 'ETH-4' : 0.223, 3436 'GU-1' : -0.419, 3437 } 3438 ``` 3439 ''' 3440 3441 3442 @property 3443 def Nominal_D48(self): 3444 return self.Nominal_D4x 3445 3446 3447 @Nominal_D48.setter 3448 def Nominal_D48(self, new): 3449 self.Nominal_D4x = dict(**new) 3450 self.refresh() 3451 3452 3453 def __init__(self, l = [], **kwargs): 3454 ''' 3455 **Parameters:** same as `D4xdata.__init__()` 3456 ''' 3457 D4xdata.__init__(self, l = l, mass = '48', **kwargs) 3458 3459 def save_D48_correl(self, *args, **kwargs): 3460 return self._save_D4x_correl(*args, **kwargs) 3461 3462 save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48') 3463 3464 3465class D49data(D4xdata): 3466 ''' 3467 Store and process data for a large set of Δ49 analyses, 3468 usually comprising more than one analytical session. 3469 ''' 3470 3471 Nominal_D4x = {"1000C": 0.0, "25C": 2.228} # Wang 2004 3472 ''' 3473 Nominal Δ49 values assigned to the Δ49 anchor samples, used by 3474 `D49data.standardize()` to normalize unknown samples to an absolute Δ49 3475 reference frame. 3476 3477 By default equal to (after [Wang et al. 
(2004)](https://doi.org/10.1016/j.gca.2004.05.039)): 3478 3479 ```py 3480 { 3481 "1000C": 0.0, 3482 "25C": 2.228 3483 } 3484 ``` 3485 ''' 3486 3487 @property 3488 def Nominal_D49(self): 3489 return self.Nominal_D4x 3490 3491 @Nominal_D49.setter 3492 def Nominal_D49(self, new): 3493 self.Nominal_D4x = dict(**new) 3494 self.refresh() 3495 3496 def __init__(self, l=[], **kwargs): 3497 ''' 3498 **Parameters:** same as `D4xdata.__init__()` 3499 ''' 3500 D4xdata.__init__(self, l=l, mass='49', **kwargs) 3501 3502 def save_D49_correl(self, *args, **kwargs): 3503 return self._save_D4x_correl(*args, **kwargs) 3504 3505 save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49') 3506 3507class _SessionPlot(): 3508 ''' 3509 Simple placeholder class 3510 ''' 3511 def __init__(self): 3512 pass 3513 3514_app = typer.Typer( 3515 add_completion = False, 3516 context_settings={'help_option_names': ['-h', '--help']}, 3517 rich_markup_mode = 'rich', 3518 ) 3519 3520@_app.command() 3521def _cli( 3522 rawdata: Annotated[str, typer.Argument(help = "Specify the path of a rawdata input file")], 3523 exclude: Annotated[str, typer.Option('--exclude', '-e', help = 'The path of a file specifying UIDs and/or Samples to exclude')] = 'none', 3524 anchors: Annotated[str, typer.Option('--anchors', '-a', help = 'The path of a file specifying custom anchors')] = 'none', 3525 output_dir: Annotated[str, typer.Option('--output-dir', '-o', help = 'Specify the output directory')] = 'output', 3526 run_D48: Annotated[bool, typer.Option('--D48', help = 'Also standardize D48')] = False, 3527 ): 3528 """ 3529 Process raw D47 data and return standardized results. 3530 3531 See [b]https://mdaeron.github.io/D47crunch/#3-command-line-interface-cli[/b] for more details. 3532 3533 Reads raw data from an input file, optionally excluding some samples and/or analyses, thean standardizes 3534 the data based either on the default [b]d13C_VPDB[/b], [b]d18O_VPDB[/b], [b]D47[/b], and [b]D48[/b] anchors or on different 3535 user-specified anchors. A new directory (named `output` by default) is created to store the results and 3536 the following sequence is applied: 3537 3538 * [b]D47data.wg()[/b] 3539 * [b]D47data.crunch()[/b] 3540 * [b]D47data.standardize()[/b] 3541 * [b]D47data.summary()[/b] 3542 * [b]D47data.table_of_samples()[/b] 3543 * [b]D47data.table_of_sessions()[/b] 3544 * [b]D47data.plot_sessions()[/b] 3545 * [b]D47data.plot_residuals()[/b] 3546 * [b]D47data.table_of_analyses()[/b] 3547 * [b]D47data.plot_distribution_of_analyses()[/b] 3548 * [b]D47data.plot_bulk_compositions()[/b] 3549 * [b]D47data.save_D47_correl()[/b] 3550 3551 Optionally, also apply similar methods for [b]]D48[/b]. 3552 3553 [b]Example CSV file for --anchors option:[/b] 3554 [i] 3555 Sample, d13C_VPDB, d18O_VPDB, D47, D48 3556 ETH-1, 2.02, -2.19, 0.2052, 0.138 3557 ETH-2, -10.17, -18.69, 0.2085, 0.138 3558 ETH-3, 1.71, -1.78, 0.6132, 0.270 3559 ETH-4, , , 0.4511, 0.223 3560 [/i] 3561 Except for [i]Sample[/i], none of the columns above are mandatory. 3562 3563 [b]Example CSV file for --exclude option:[/b] 3564 [i] 3565 Sample, UID 3566 FOO-1, 3567 BAR-2, 3568 , A04 3569 , A17 3570 , A88 3571 [/i] 3572 This will exclude all analyses of samples [i]FOO-1[/i] and [i]BAR-2[/i], 3573 and the analyses with UIDs [i]A04[/i], [i]A17[/i], and [i]A88[/i]. 3574 Neither column is mandatory. 
3575 """ 3576 3577 data = D47data() 3578 data.read(rawdata) 3579 3580 if exclude != 'none': 3581 exclude = read_csv(exclude) 3582 exclude_uid = {r['UID'] for r in exclude if 'UID' in r} 3583 exclude_sample = {r['Sample'] for r in exclude if 'Sample' in r} 3584 else: 3585 exclude_uid = [] 3586 exclude_sample = [] 3587 3588 data = D47data([r for r in data if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample]) 3589 3590 if anchors != 'none': 3591 anchors = read_csv(anchors) 3592 if len([_ for _ in anchors if 'd13C_VPDB' in _]): 3593 data.Nominal_d13C_VPDB = { 3594 _['Sample']: _['d13C_VPDB'] 3595 for _ in anchors 3596 if 'd13C_VPDB' in _ 3597 } 3598 if len([_ for _ in anchors if 'd18O_VPDB' in _]): 3599 data.Nominal_d18O_VPDB = { 3600 _['Sample']: _['d18O_VPDB'] 3601 for _ in anchors 3602 if 'd18O_VPDB' in _ 3603 } 3604 if len([_ for _ in anchors if 'D47' in _]): 3605 data.Nominal_D4x = { 3606 _['Sample']: _['D47'] 3607 for _ in anchors 3608 if 'D47' in _ 3609 } 3610 3611 data.refresh() 3612 data.wg() 3613 data.crunch() 3614 data.standardize() 3615 data.summary(dir = output_dir) 3616 data.plot_residuals(dir = output_dir, filename = 'D47_residuals.pdf', kde = True) 3617 data.plot_bulk_compositions(dir = output_dir + '/bulk_compositions') 3618 data.plot_sessions(dir = output_dir) 3619 data.save_D47_correl(dir = output_dir) 3620 3621 if not run_D48: 3622 data.table_of_samples(dir = output_dir) 3623 data.table_of_analyses(dir = output_dir) 3624 data.table_of_sessions(dir = output_dir) 3625 3626 3627 if run_D48: 3628 data2 = D48data() 3629 print(rawdata) 3630 data2.read(rawdata) 3631 3632 data2 = D48data([r for r in data2 if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample]) 3633 3634 if anchors != 'none': 3635 if len([_ for _ in anchors if 'd13C_VPDB' in _]): 3636 data2.Nominal_d13C_VPDB = { 3637 _['Sample']: _['d13C_VPDB'] 3638 for _ in anchors 3639 if 'd13C_VPDB' in _ 3640 } 3641 if len([_ for _ in anchors if 'd18O_VPDB' in _]): 3642 data2.Nominal_d18O_VPDB = { 3643 _['Sample']: _['d18O_VPDB'] 3644 for _ in anchors 3645 if 'd18O_VPDB' in _ 3646 } 3647 if len([_ for _ in anchors if 'D48' in _]): 3648 data2.Nominal_D4x = { 3649 _['Sample']: _['D48'] 3650 for _ in anchors 3651 if 'D48' in _ 3652 } 3653 3654 data2.refresh() 3655 data2.wg() 3656 data2.crunch() 3657 data2.standardize() 3658 data2.summary(dir = output_dir) 3659 data2.plot_sessions(dir = output_dir) 3660 data2.plot_residuals(dir = output_dir, filename = 'D48_residuals.pdf', kde = True) 3661 data2.plot_distribution_of_analyses(dir = output_dir) 3662 data2.save_D48_correl(dir = output_dir) 3663 3664 table_of_analyses(data, data2, dir = output_dir) 3665 table_of_samples(data, data2, dir = output_dir) 3666 table_of_sessions(data, data2, dir = output_dir) 3667 3668def __cli(): 3669 _app()
```py
def fCO2eqD47_Petersen(T):
    '''
    CO2 equilibrium Δ47 value as a function of T (in degrees C)
    according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
    '''
    return float(_fCO2eqD47_Petersen(T))
```
CO2 equilibrium Δ47 value as a function of T (in degrees C) according to Petersen et al. (2019).
```py
def fCO2eqD47_Wang(T):
    '''
    CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
    according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
    (supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
    '''
    return float(_fCO2eqD47_Wang(T))
```
CO2 equilibrium Δ47 value as a function of `T` (in degrees C) according to Wang et al. (2004) (supplementary data of Dennis et al., 2011).
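Both equilibrium laws can be compared directly; a minimal sketch, assuming `D47crunch` is installed and these functions are exported at module level as in the listings above (the two calibrations yield slightly different values):

```py
import D47crunch

# Equilibrium Δ47 of CO2 at 25 °C under each law:
print(D47crunch.fCO2eqD47_Petersen(25))
print(D47crunch.fCO2eqD47_Wang(25))
```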
````py
def make_csv(x, hsep = ',', vsep = '\n'):
    '''
    Formats a list of lists of strings as a CSV

    **Parameters**

    + `x`: the list of lists of strings to format
    + `hsep`: the field separator (`,` by default)
    + `vsep`: the line-ending convention to use (`\\n` by default)

    **Example**

    ```py
    print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
    ```

    outputs:

    ```py
    a,b,c
    d,e,f
    ```
    '''
    return vsep.join([hsep.join(l) for l in x])
````
Formats a list of lists of strings as a CSV

**Parameters**

+ `x`: the list of lists of strings to format
+ `hsep`: the field separator (`,` by default)
+ `vsep`: the line-ending convention to use (`\n` by default)

**Example**

```py
print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
```

outputs:

```
a,b,c
d,e,f
```
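For non-default separators, a minimal sketch (hypothetical values):

```py
from D47crunch import make_csv

# Semicolon-separated fields with CRLF line endings:
print(make_csv([['a', 'b'], ['c', 'd']], hsep = ';', vsep = '\r\n'))
# a;b
# c;d
```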
```py
def pf(txt):
    '''
    Modify string `txt` to follow `lmfit.Parameter()` naming rules.
    '''
    return txt.replace('-','_').replace('.','_').replace(' ','_')
```
Modify string `txt` to follow `lmfit.Parameter()` naming rules.
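A minimal sketch: dashes, dots, and spaces all become underscores, so the result is safe to use as an `lmfit` parameter name.

```py
from D47crunch import pf

print(pf('Session 01.a'))  # 'Session_01_a'
```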
```py
def smart_type(x):
    '''
    Tries to convert string `x` to a float if it includes a decimal point, or
    to an integer if it does not. If both attempts fail, return the original
    string unchanged.
    '''
    try:
        y = float(x)
    except ValueError:
        return x
    if '.' not in x:
        return int(y)
    return y
```
Tries to convert string `x` to a float if it includes a decimal point, or to an integer if it does not. If both attempts fail, return the original string unchanged.
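This is how CSV fields end up typed when read in; a minimal sketch:

```py
from D47crunch import smart_type

print(smart_type('5'))      # 5 (int: no decimal point)
print(smart_type('5.0'))    # 5.0 (float: decimal point present)
print(smart_type('ETH-1'))  # 'ETH-1' (not a number: returned unchanged)
```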
````py
def pretty_table(x, header = 1, hsep = ' ', vsep = None, align = '<'):
    '''
    Reads a list of lists of strings and outputs an ascii table

    **Parameters**

    + `x`: a list of lists of strings
    + `header`: the number of lines to treat as header lines
    + `hsep`: the horizontal separator between columns
    + `vsep`: the character to use as vertical separator
    + `align`: string of left (`<`) or right (`>`) alignment characters.

    **Example**

    ```py
    print(pretty_table([
        ['A', 'B', 'C'],
        ['1', '1.9999', 'foo'],
        ['10', 'x', 'bar'],
    ]))
    ```
    yields:
    ```
    —— —————— ———
    A  B      C
    —— —————— ———
    1  1.9999 foo
    10 x      bar
    —— —————— ———
    ```

    To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`:

    ```py
    D47crunch_defaults.PRETTY_TABLE_VSEP = '='
    print(pretty_table([
        ['A', 'B', 'C'],
        ['1', '1.9999', 'foo'],
        ['10', 'x', 'bar'],
    ]))
    ```
    yields:
    ```
    == ====== ===
    A  B      C
    == ====== ===
    1  1.9999 foo
    10 x      bar
    == ====== ===
    ```
    '''

    if vsep is None:
        vsep = D47crunch_defaults.PRETTY_TABLE_VSEP

    txt = []
    widths = [np.max([len(e) for e in c]) for c in zip(*x)]

    if len(widths) > len(align):
        align += '>' * (len(widths)-len(align))
    sepline = hsep.join([vsep*w for w in widths])
    txt += [sepline]
    for k,l in enumerate(x):
        if k and k == header:
            txt += [sepline]
        txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
    txt += [sepline]
    txt += ['']
    return '\n'.join(txt)
````
Reads a list of lists of strings and outputs an ascii table

**Parameters**

+ `x`: a list of lists of strings
+ `header`: the number of lines to treat as header lines
+ `hsep`: the horizontal separator between columns
+ `vsep`: the character to use as vertical separator
+ `align`: string of left (`<`) or right (`>`) alignment characters.

**Example**

```py
print(pretty_table([
    ['A', 'B', 'C'],
    ['1', '1.9999', 'foo'],
    ['10', 'x', 'bar'],
]))
```

yields:

```
—— —————— ———
A  B      C
—— —————— ———
1  1.9999 foo
10 x      bar
—— —————— ———
```

To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`:

```py
D47crunch_defaults.PRETTY_TABLE_VSEP = '='
print(pretty_table([
    ['A', 'B', 'C'],
    ['1', '1.9999', 'foo'],
    ['10', 'x', 'bar'],
]))
```

yields:

```
== ====== ===
A  B      C
== ====== ===
1  1.9999 foo
10 x      bar
== ====== ===
```
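Alignment characters map to columns in order, and (per the source above) any columns beyond the length of `align` default to right alignment (`>`). A minimal sketch:

```py
from D47crunch import pretty_table

# Right-align the middle column ('>'), keep the outer two left-aligned ('<'):
print(pretty_table([
    ['A', 'B', 'C'],
    ['1', '1.9999', 'foo'],
    ['10', 'x', 'bar'],
], align = '<><'))
# —— —————— ———
# A       B C
# —— —————— ———
# 1  1.9999 foo
# 10      x bar
# —— —————— ———
```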
````py
def transpose_table(x):
    '''
    Transpose a list of lists

    **Parameters**

    + `x`: a list of lists

    **Example**

    ```py
    x = [[1, 2], [3, 4]]
    print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
    ```
    '''
    return [[e for e in c] for c in zip(*x)]
````
Transpose a list of lists

**Parameters**

+ `x`: a list of lists

**Example**

```py
x = [[1, 2], [3, 4]]
print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
```
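Since `zip(*x)` pairs up elements column by column, transposition is its own inverse for rectangular tables; a quick check:

```py
from D47crunch import transpose_table

x = [[1, 2, 3], [4, 5, 6]]
assert transpose_table(transpose_table(x)) == x  # round trip recovers the original
```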
````py
def w_avg(X, sX) :
    '''
    Compute variance-weighted average

    Returns the value and SE of the weighted average of the elements of `X`,
    with relative weights equal to their inverse variances (`1/sX**2`).

    **Parameters**

    + `X`: array-like of elements to average
    + `sX`: array-like of the corresponding SE values

    **Tip**

    If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
    they may be rearranged using `zip()`:

    ```python
    foo = [(0, 1), (1, 0.5), (2, 0.5)]
    print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
    ```
    '''
    X = [ x for x in X ]
    sX = [ sx for sx in sX ]
    W = [ sx**-2 for sx in sX ]
    W = [ w/sum(W) for w in W ]
    Xavg = sum([ w*x for w,x in zip(W,X) ])
    sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5
    return Xavg, sXavg
````
Compute variance-weighted average

Returns the value and SE of the weighted average of the elements of `X`, with relative weights equal to their inverse variances (`1/sX**2`).

**Parameters**

+ `X`: array-like of elements to average
+ `sX`: array-like of the corresponding SE values

**Tip**

If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets, they may be rearranged using `zip()`:

```python
foo = [(0, 1), (1, 0.5), (2, 0.5)]
print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
```
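As an additional worked check: with equal uncertainties the weighted average reduces to the plain mean, and the SE shrinks by √2 (a minimal sketch):

```py
from D47crunch import w_avg

# Two values with identical SE of 0.010:
print(w_avg([0.300, 0.310], [0.010, 0.010])) # (0.305, 0.00707...) i.e. 0.010/sqrt(2)
```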
```py
def read_csv(filename, sep = ''):
    '''
    Read contents of `filename` in csv format and return a list of dictionaries.

    In the csv string, spaces before and after field separators (`','` by default)
    are optional.

    **Parameters**

    + `filename`: the csv file to read
    + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
    whichever appears most often in the contents of `filename`.
    '''
    with open(filename) as fid:
        txt = fid.read()

    if sep == '':
        sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
    txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
    return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]]
```
Read contents of `filename` in csv format and return a list of dictionaries.

In the csv string, spaces before and after field separators (`','` by default) are optional.

**Parameters**

+ `filename`: the csv file to read
+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, whichever appears most often in the contents of `filename`.
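A round-trip sketch using a small hypothetical file (`example.csv` is created on the fly; the separator is detected automatically):

```py
from D47crunch import read_csv

with open('example.csv', 'w') as f:
    f.write('Sample, d13C_VPDB\nFOO, 2.02\n')

print(read_csv('example.csv')) # [{'Sample': 'FOO', 'd13C_VPDB': 2.02}]
```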
```py
def simulate_single_analysis(
    sample = 'MYSAMPLE',
    d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
    d13C_VPDB = None, d18O_VPDB = None,
    D47 = None, D48 = None, D49 = 0., D17O = 0.,
    a47 = 1., b47 = 0., c47 = -0.9,
    a48 = 1., b48 = 0., c48 = -0.45,
    Nominal_D47 = None,
    Nominal_D48 = None,
    Nominal_d13C_VPDB = None,
    Nominal_d18O_VPDB = None,
    ALPHA_18O_ACID_REACTION = None,
    R13_VPDB = None,
    R17_VSMOW = None,
    R18_VSMOW = None,
    LAMBDA_17 = None,
    R18_VPDB = None,
    ):
    '''
    Compute working-gas delta values for a single analysis, assuming a stochastic working
    gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

    **Parameters**

    + `sample`: sample name
    + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
        (respectively –4 and +26 ‰ by default)
    + `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
    + `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
        of the carbonate sample
    + `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and
        Δ48 values if `D47` or `D48` are not specified
    + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
        δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
    + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
    + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
        correction parameters (by default equal to the `D4xdata` default values)

    Returns a dictionary with fields
    `['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
    '''

    if Nominal_d13C_VPDB is None:
        Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB

    if Nominal_d18O_VPDB is None:
        Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB

    if ALPHA_18O_ACID_REACTION is None:
        ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION

    if R13_VPDB is None:
        R13_VPDB = D4xdata().R13_VPDB

    if R17_VSMOW is None:
        R17_VSMOW = D4xdata().R17_VSMOW

    if R18_VSMOW is None:
        R18_VSMOW = D4xdata().R18_VSMOW

    if LAMBDA_17 is None:
        LAMBDA_17 = D4xdata().LAMBDA_17

    if R18_VPDB is None:
        R18_VPDB = D4xdata().R18_VPDB

    R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17

    if Nominal_D47 is None:
        Nominal_D47 = D47data().Nominal_D47

    if Nominal_D48 is None:
        Nominal_D48 = D48data().Nominal_D48

    if d13C_VPDB is None:
        if sample in Nominal_d13C_VPDB:
            d13C_VPDB = Nominal_d13C_VPDB[sample]
        else:
            raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")

    if d18O_VPDB is None:
        if sample in Nominal_d18O_VPDB:
            d18O_VPDB = Nominal_d18O_VPDB[sample]
        else:
            raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")

    if D47 is None:
        if sample in Nominal_D47:
            D47 = Nominal_D47[sample]
        else:
            raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")

    if D48 is None:
        if sample in Nominal_D48:
            D48 = Nominal_D48[sample]
        else:
            raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")

    X = D4xdata()
    X.R13_VPDB = R13_VPDB
    X.R17_VSMOW = R17_VSMOW
    X.R18_VSMOW = R18_VSMOW
    X.LAMBDA_17 = LAMBDA_17
    X.R18_VPDB = R18_VPDB
    X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17

    R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
        R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000),
        R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
        )
    R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
        R13 = R13_VPDB * (1 + d13C_VPDB/1000),
        R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
        D17O=D17O, D47=D47, D48=D48, D49=D49,
        )
    R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
        R13 = R13_VPDB * (1 + d13C_VPDB/1000),
        R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
        D17O=D17O,
        )

    d45 = 1000 * (R45/R45wg - 1)
    d46 = 1000 * (R46/R46wg - 1)
    d47 = 1000 * (R47/R47wg - 1)
    d48 = 1000 * (R48/R48wg - 1)
    d49 = 1000 * (R49/R49wg - 1)

    for k in range(3): # dumb iteration to adjust for small changes in d47
        R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
        R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch
        d47 = 1000 * (R47raw/R47wg - 1)
        d48 = 1000 * (R48raw/R48wg - 1)

    return dict(
        Sample = sample,
        D17O = D17O,
        d13Cwg_VPDB = d13Cwg_VPDB,
        d18Owg_VSMOW = d18Owg_VSMOW,
        d45 = d45,
        d46 = d46,
        d47 = d47,
        d48 = d48,
        d49 = d49,
        )
```
Compute working-gas delta values for a single analysis, assuming a stochastic working gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

**Parameters**

+ `sample`: sample name
+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas (respectively –4 and +26 ‰ by default)
+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies of the carbonate sample
+ `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and Δ48 values if `D47` or `D48` are not specified
+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 correction parameters (by default equal to the `D4xdata` default values)

Returns a dictionary with fields `['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
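A minimal sketch, assuming `ETH-3` is defined in the default `Nominal_d13C_VPDB` and `Nominal_d18O_VPDB` dictionaries (otherwise pass `d13C_VPDB` and `d18O_VPDB` explicitly):

```py
from D47crunch import simulate_single_analysis

# One noise-free analysis of ETH-3 against the default working gas:
r = simulate_single_analysis(sample = 'ETH-3')
print(r['Sample'], round(r['d45'], 3), round(r['d47'], 3))
```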
````py
def virtual_data(
    samples = [],
    a47 = 1., b47 = 0., c47 = -0.9,
    a48 = 1., b48 = 0., c48 = -0.45,
    rd45 = 0.020, rd46 = 0.060,
    rD47 = 0.015, rD48 = 0.045,
    d13Cwg_VPDB = None, d18Owg_VSMOW = None,
    session = None,
    Nominal_D47 = None, Nominal_D48 = None,
    Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
    ALPHA_18O_ACID_REACTION = None,
    R13_VPDB = None,
    R17_VSMOW = None,
    R18_VSMOW = None,
    LAMBDA_17 = None,
    R18_VPDB = None,
    seed = 0,
    shuffle = True,
    ):
    '''
    Return a list of simulated analyses from a single session.

    **Parameters**

    + `samples`: a list of entries; each entry is a dictionary with the following fields:
        * `Sample`: the name of the sample
        * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
        * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
        * `N`: how many analyses to generate for this sample
    + `a47`: scrambling factor for Δ47
    + `b47`: compositional nonlinearity for Δ47
    + `c47`: working gas offset for Δ47
    + `a48`: scrambling factor for Δ48
    + `b48`: compositional nonlinearity for Δ48
    + `c48`: working gas offset for Δ48
    + `rd45`: analytical repeatability of δ45
    + `rd46`: analytical repeatability of δ46
    + `rD47`: analytical repeatability of Δ47
    + `rD48`: analytical repeatability of Δ48
    + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
        (by default equal to the `simulate_single_analysis` default values)
    + `session`: name of the session (no name by default)
    + `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and Δ48 values
        if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
    + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
        δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
        (by default equal to the `simulate_single_analysis` defaults)
    + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
        (by default equal to the `simulate_single_analysis` defaults)
    + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
        correction parameters (by default equal to the `simulate_single_analysis` default)
    + `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
    + `shuffle`: randomly reorder the sequence of analyses

    Here is an example of using this method to generate an arbitrary combination of
    anchors and unknowns for a bunch of sessions:

    ```py
    .. include:: ../../code_examples/virtual_data/example.py
    ```

    This should output something like:

    ```
    .. include:: ../../code_examples/virtual_data/output.txt
    ```
    '''

    kwargs = locals().copy()

    from numpy import random as nprandom
    if seed:
        nprandom.seed(seed)
        rng = nprandom.default_rng(seed)
    else:
        rng = nprandom.default_rng()

    N = sum([s['N'] for s in samples])
    errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
    errors45 *= rd45 / stdev(errors45) # scale errors to rd45
    errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
    errors46 *= rd46 / stdev(errors46) # scale errors to rd46
    errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
    errors47 *= rD47 / stdev(errors47) # scale errors to rD47
    errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
    errors48 *= rD48 / stdev(errors48) # scale errors to rD48

    k = 0
    out = []
    for s in samples:
        kw = {}
        kw['sample'] = s['Sample']
        kw = {
            **kw,
            **{var: kwargs[var]
                for var in [
                    'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
                    'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
                    'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
                    'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
                    ]
                if kwargs[var] is not None},
            **{var: s[var]
                for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
                if var in s},
            }

        sN = s['N']
        while sN:
            out.append(simulate_single_analysis(**kw))
            out[-1]['d45'] += errors45[k]
            out[-1]['d46'] += errors46[k]
            out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
            out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
            sN -= 1
            k += 1

    if session is not None:
        for r in out:
            r['Session'] = session

    if shuffle:
        nprandom.shuffle(out)

    return out
````
Return a list of simulated analyses from a single session.
**Parameters**

+ `samples`: a list of entries; each entry is a dictionary with the following fields:
    * `Sample`: the name of the sample
    * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
    * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
    * `N`: how many analyses to generate for this sample
+ `a47`: scrambling factor for Δ47
+ `b47`: compositional nonlinearity for Δ47
+ `c47`: working gas offset for Δ47
+ `a48`: scrambling factor for Δ48
+ `b48`: compositional nonlinearity for Δ48
+ `c48`: working gas offset for Δ48
+ `rd45`: analytical repeatability of δ45
+ `rd46`: analytical repeatability of δ46
+ `rD47`: analytical repeatability of Δ47
+ `rD48`: analytical repeatability of Δ48
+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas (by default equal to the `simulate_single_analysis` default values)
+ `session`: name of the session (no name by default)
+ `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and Δ48 values if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified (by default equal to the `simulate_single_analysis` defaults)
+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor (by default equal to the `simulate_single_analysis` defaults)
+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 correction parameters (by default equal to the `simulate_single_analysis` defaults)
+ `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
+ `shuffle`: randomly reorder the sequence of analyses
Here is an example of using this function to generate an arbitrary combination of anchors and unknowns for a bunch of sessions:
```py
from D47crunch import virtual_data, D47data
args = dict(
samples = [
dict(Sample = 'ETH-1', N = 3),
dict(Sample = 'ETH-2', N = 3),
dict(Sample = 'ETH-3', N = 3),
dict(Sample = 'FOO', N = 3,
d13C_VPDB = -5., d18O_VPDB = -10.,
D47 = 0.3, D48 = 0.15),
dict(Sample = 'BAR', N = 3,
d13C_VPDB = -15., d18O_VPDB = -2.,
D47 = 0.6, D48 = 0.2),
], rD47 = 0.010, rD48 = 0.030)
session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)
D = D47data(session1 + session2 + session3 + session4)
D.crunch()
D.standardize()
D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)
```
This should output something like:
```
[table_of_sessions]
——————————  ——  ——  ———————————  ————————————  ——————  ——————  ——————  —————————————  —————————————  ——————————————
Session     Na  Nu  d13Cwg_VPDB  d18Owg_VSMOW  r_d13C  r_d18O   r_D47         a ± SE   1e3 x b ± SE          c ± SE
——————————  ——  ——  ———————————  ————————————  ——————  ——————  ——————  —————————————  —————————————  ——————————————
Session_01   9   6       -4.000        26.000  0.0205  0.0633  0.0075  1.015 ± 0.015  0.427 ± 0.232  -0.909 ± 0.006
Session_02   9   6       -4.000        26.000  0.0210  0.0882  0.0082  0.990 ± 0.015  0.484 ± 0.232  -0.905 ± 0.006
Session_03   9   6       -4.000        26.000  0.0186  0.0505  0.0091  0.997 ± 0.015  0.167 ± 0.233  -0.901 ± 0.006
Session_04   9   6       -4.000        26.000  0.0192  0.0467  0.0070  1.017 ± 0.015  0.229 ± 0.232  -0.910 ± 0.006
——————————  ——  ——  ———————————  ————————————  ——————  ——————  ——————  —————————————  —————————————  ——————————————
[table_of_samples]
——————  ——  —————————  ——————————  ——————  ——————  ————————  ——————  ————————
Sample   N  d13C_VPDB  d18O_VSMOW     D47      SE    95% CL      SD  p_Levene
——————  ——  —————————  ——————————  ——————  ——————  ————————  ——————  ————————
ETH-1   12       2.02       37.01  0.2052                    0.0083
ETH-2   12     -10.17       19.88  0.2085                    0.0090
ETH-3   12       1.71       37.46  0.6132                    0.0083
BAR     12     -15.02       37.22  0.6057  0.0042  ± 0.0085  0.0088     0.753
FOO     12      -5.00       28.89  0.3024  0.0031  ± 0.0062  0.0070     0.497
——————  ——  —————————  ——————————  ——————  ——————  ————————  ——————  ————————
[table_of_analyses]
——— —————————— —————— ——————————— ———————————— ————————— ————————— —————————— —————————— —————————— —————————— —————————— ————————— ————————— ————————— ————————
UID Session Sample d13Cwg_VPDB d18Owg_VSMOW d45 d46 d47 d48 d49 d13C_VPDB d18O_VSMOW D47raw D48raw D49raw D47
——— —————————— —————— ——————————— ———————————— ————————— ————————— —————————— —————————— —————————— —————————— —————————— ————————— ————————— ————————— ————————
1 Session_01 ETH-1 -4.000 26.000 5.995601 10.755323 16.116087 21.285428 27.780042 1.998631 36.986704 -0.696924 -0.333640 0.008600 0.201787
2 Session_01 FOO -4.000 26.000 -0.838118 2.819853 1.310384 5.326005 4.665655 -5.004629 28.895933 -0.593755 -0.319861 0.014956 0.309692
3 Session_01 ETH-3 -4.000 26.000 5.727341 11.211663 16.713472 22.364770 28.306614 1.695479 37.453503 -0.278056 -0.180158 -0.082015 0.614365
4 Session_01 BAR -4.000 26.000 -9.959983 10.926995 0.053806 21.724901 10.707292 -15.041279 37.199026 -0.300066 -0.243252 -0.029371 0.599675
5 Session_01 ETH-1 -4.000 26.000 6.010276 10.840276 16.207960 21.475150 27.780042 2.011176 37.073454 -0.704188 -0.315986 -0.172089 0.194589
6 Session_01 ETH-1 -4.000 26.000 6.049381 10.706856 16.135579 21.196941 27.780042 2.057827 36.937067 -0.685751 -0.324384 0.045870 0.212791
7 Session_01 ETH-2 -4.000 26.000 -5.974124 -5.955517 -12.668784 -12.208184 -18.023381 -10.163274 19.943159 -0.694902 -0.336672 -0.063946 0.215880
8 Session_01 ETH-3 -4.000 26.000 5.755174 11.255104 16.792797 22.451660 28.306614 1.723596 37.497816 -0.270825 -0.181089 -0.195908 0.621458
9 Session_01 FOO -4.000 26.000 -0.848028 2.874679 1.346196 5.439150 4.665655 -5.017230 28.951964 -0.601502 -0.316664 -0.081898 0.302042
10 Session_01 BAR -4.000 26.000 -9.915975 10.968470 0.153453 21.749385 10.707292 -14.995822 37.241294 -0.286638 -0.301325 -0.157376 0.612868
11 Session_01 BAR -4.000 26.000 -9.920507 10.903408 0.065076 21.704075 10.707292 -14.998270 37.174839 -0.307018 -0.216978 -0.026076 0.592818
12 Session_01 FOO -4.000 26.000 -0.876454 2.906764 1.341194 5.490264 4.665655 -5.048760 28.984806 -0.608593 -0.329808 -0.114437 0.295055
13 Session_01 ETH-2 -4.000 26.000 -5.982229 -6.110437 -12.827036 -12.492272 -18.023381 -10.166188 19.784916 -0.693555 -0.312598 0.251040 0.217274
14 Session_01 ETH-2 -4.000 26.000 -5.991278 -5.995054 -12.741562 -12.184075 -18.023381 -10.180122 19.902809 -0.711697 -0.232746 0.032602 0.199357
15 Session_01 ETH-3 -4.000 26.000 5.734896 11.229855 16.740410 22.402091 28.306614 1.702875 37.472070 -0.276998 -0.179635 -0.125368 0.615396
16 Session_02 ETH-3 -4.000 26.000 5.716356 11.091821 16.582487 22.123857 28.306614 1.692901 37.370126 -0.279100 -0.178789 0.162540 0.624067
17 Session_02 ETH-2 -4.000 26.000 -5.950370 -5.959974 -12.650784 -12.197864 -18.023381 -10.143809 19.897777 -0.696916 -0.317263 -0.080604 0.216441
18 Session_02 BAR -4.000 26.000 -9.957566 10.903888 0.031785 21.739434 10.707292 -15.048386 37.213724 -0.302139 -0.183327 0.012926 0.608897
19 Session_02 ETH-1 -4.000 26.000 6.030532 10.851030 16.245571 21.457100 27.780042 2.037466 37.122284 -0.698413 -0.354920 -0.214443 0.200795
20 Session_02 FOO -4.000 26.000 -0.819742 2.826793 1.317044 5.330616 4.665655 -4.986618 28.903335 -0.612871 -0.329113 -0.018244 0.294481
21 Session_02 BAR -4.000 26.000 -9.936020 10.862339 0.024660 21.563307 10.707292 -15.023836 37.171034 -0.291333 -0.273498 0.070452 0.619812
22 Session_02 ETH-3 -4.000 26.000 5.719281 11.207303 16.681693 22.370886 28.306614 1.691780 37.488633 -0.296801 -0.165556 -0.065004 0.606143
23 Session_02 ETH-1 -4.000 26.000 5.993918 10.617469 15.991900 21.070358 27.780042 2.006934 36.882679 -0.683329 -0.271476 0.278458 0.216152
24 Session_02 ETH-2 -4.000 26.000 -5.982371 -6.036210 -12.762399 -12.309944 -18.023381 -10.175178 19.819614 -0.701348 -0.277354 0.104418 0.212021
25 Session_02 ETH-1 -4.000 26.000 6.019963 10.773112 16.163825 21.331060 27.780042 2.029040 37.042346 -0.692234 -0.324161 -0.051788 0.207075
26 Session_02 BAR -4.000 26.000 -9.963888 10.865863 -0.023549 21.615868 10.707292 -15.053743 37.174715 -0.313906 -0.229031 0.093637 0.597041
27 Session_02 FOO -4.000 26.000 -0.835046 2.870518 1.355370 5.487896 4.665655 -5.004585 28.948243 -0.601666 -0.259900 -0.087592 0.305777
28 Session_02 FOO -4.000 26.000 -0.848415 2.849823 1.308081 5.427767 4.665655 -5.018107 28.927036 -0.614791 -0.278426 -0.032784 0.292547
29 Session_02 ETH-3 -4.000 26.000 5.757137 11.232751 16.744567 22.398244 28.306614 1.731295 37.514660 -0.298533 -0.189123 -0.154557 0.604363
30 Session_02 ETH-2 -4.000 26.000 -5.993476 -5.944866 -12.696865 -12.149754 -18.023381 -10.190430 19.913381 -0.713779 -0.298963 -0.064251 0.199436
31 Session_03 ETH-3 -4.000 26.000 5.718991 11.146227 16.640814 22.243185 28.306614 1.689442 37.449023 -0.277332 -0.169668 0.053997 0.623187
32 Session_03 ETH-2 -4.000 26.000 -5.997147 -5.905858 -12.655382 -12.081612 -18.023381 -10.165400 19.891551 -0.706536 -0.308464 -0.137414 0.197550
33 Session_03 ETH-1 -4.000 26.000 6.040566 10.786620 16.205283 21.374963 27.780042 2.045244 37.077432 -0.685706 -0.307909 -0.099869 0.213609
34 Session_03 ETH-1 -4.000 26.000 5.994622 10.743980 16.116098 21.243734 27.780042 1.997857 37.033567 -0.684883 -0.352014 0.031692 0.214449
35 Session_03 ETH-3 -4.000 26.000 5.748546 11.079879 16.580826 22.120063 28.306614 1.723364 37.380534 -0.302133 -0.158882 0.151641 0.598318
36 Session_03 ETH-2 -4.000 26.000 -6.000290 -5.947172 -12.697463 -12.164602 -18.023381 -10.167221 19.848953 -0.705037 -0.309350 -0.052386 0.199061
37 Session_03 FOO -4.000 26.000 -0.800284 2.851299 1.376828 5.379547 4.665655 -4.951581 28.910199 -0.597293 -0.329315 -0.087015 0.304784
38 Session_03 FOO -4.000 26.000 -0.873798 2.820799 1.272165 5.370745 4.665655 -5.028782 28.878917 -0.596008 -0.277258 0.051165 0.306090
39 Session_03 ETH-2 -4.000 26.000 -6.008525 -5.909707 -12.647727 -12.075913 -18.023381 -10.177379 19.887608 -0.683183 -0.294956 -0.117608 0.220975
40 Session_03 BAR -4.000 26.000 -9.928709 10.989665 0.148059 21.852677 10.707292 -14.976237 37.324152 -0.299358 -0.242185 -0.184835 0.603855
41 Session_03 ETH-1 -4.000 26.000 6.004078 10.683951 16.045192 21.214355 27.780042 2.010134 36.971642 -0.705956 -0.262026 0.138399 0.193323
42 Session_03 BAR -4.000 26.000 -9.957114 10.898997 0.044946 21.602296 10.707292 -15.003175 37.230716 -0.284699 -0.307849 0.021944 0.618578
43 Session_03 BAR -4.000 26.000 -9.952115 11.034508 0.169809 21.885915 10.707292 -15.002819 37.370451 -0.296804 -0.298351 -0.246731 0.606414
44 Session_03 FOO -4.000 26.000 -0.823857 2.761300 1.258060 5.239992 4.665655 -4.973383 28.817444 -0.603327 -0.288652 0.114488 0.298751
45 Session_03 ETH-3 -4.000 26.000 5.753467 11.206589 16.719131 22.373244 28.306614 1.723960 37.511190 -0.294350 -0.161838 -0.099835 0.606103
46 Session_04 FOO -4.000 26.000 -0.791191 2.708220 1.256167 5.145784 4.665655 -4.960004 28.750896 -0.586913 -0.276505 0.183674 0.317065
47 Session_04 ETH-1 -4.000 26.000 6.017312 10.735930 16.123043 21.270597 27.780042 2.005824 36.995214 -0.693479 -0.309795 0.023309 0.208980
48 Session_04 ETH-2 -4.000 26.000 -5.986501 -5.915157 -12.656583 -12.060382 -18.023381 -10.182247 19.889836 -0.709603 -0.268277 -0.130450 0.199604
49 Session_04 BAR -4.000 26.000 -9.951025 10.951923 0.089386 21.738926 10.707292 -15.031949 37.254709 -0.298065 -0.278834 -0.087463 0.601230
50 Session_04 ETH-2 -4.000 26.000 -5.966627 -5.893789 -12.597717 -12.120719 -18.023381 -10.161842 19.911776 -0.691757 -0.372308 -0.193986 0.217132
51 Session_04 ETH-1 -4.000 26.000 6.029937 10.766997 16.151273 21.345479 27.780042 2.018148 37.027152 -0.708855 -0.297953 -0.050465 0.193862
52 Session_04 FOO -4.000 26.000 -0.853969 2.805035 1.267571 5.353907 4.665655 -5.030523 28.850660 -0.605611 -0.262571 0.060903 0.298685
53 Session_04 ETH-3 -4.000 26.000 5.798016 11.254135 16.832228 22.432473 28.306614 1.752928 37.528936 -0.275047 -0.197935 -0.239408 0.620088
54 Session_04 ETH-1 -4.000 26.000 6.023822 10.730714 16.121184 21.235757 27.780042 2.012958 36.989833 -0.696908 -0.333582 0.026555 0.205610
55 Session_04 ETH-2 -4.000 26.000 -5.973623 -5.975018 -12.694278 -12.194472 -18.023381 -10.166297 19.828211 -0.701951 -0.283570 -0.025935 0.207135
56 Session_04 ETH-3 -4.000 26.000 5.739420 11.128582 16.641344 22.166106 28.306614 1.695046 37.399884 -0.280608 -0.210162 0.066645 0.614665
57 Session_04 BAR -4.000 26.000 -9.931741 10.819830 -0.023748 21.529372 10.707292 -15.006533 37.118743 -0.302866 -0.222623 0.148462 0.596536
58 Session_04 FOO -4.000 26.000 -0.848192 2.777763 1.251297 5.280272 4.665655 -5.023358 28.822585 -0.601094 -0.281419 0.108186 0.303128
59 Session_04 ETH-3 -4.000 26.000 5.751908 11.207110 16.726741 22.380392 28.306614 1.705481 37.480657 -0.285776 -0.155878 -0.099197 0.609567
60 Session_04 BAR -4.000 26.000 -9.926078 10.884823 0.060864 21.650722 10.707292 -15.002880 37.185606 -0.287358 -0.232425 0.016044 0.611760
——— —————————— —————— ——————————— ———————————— ————————— ————————— —————————— —————————— —————————— —————————— —————————— ————————— ————————— ————————— ————————
```
```py
def table_of_samples(
    data47 = None,
    data48 = None,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a combined table of samples
    for a pair of `D47data` and `D48data` objects.

    **Parameters**

    + `data47`: `D47data` instance
    + `data48`: `D48data` instance
    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
        if set to `'raw'`: return a list of lists of strings
        (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    if data47 is None:
        if data48 is None:
            raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
        else:
            return data48.table_of_samples(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
    else:
        if data48 is None:
            return data47.table_of_samples(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
        else:
            samples = (
                sorted([a for a in data47.anchors if a in data48.anchors])
                + sorted([a for a in data47.anchors if a not in data48.anchors])
                + sorted([a for a in data48.anchors if a not in data47.anchors])
                + sorted([a for a in data47.unknowns if a in data48.unknowns])
                )

            out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
            out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')

            out47 = {l[0]: l for l in out47}
            out48 = {l[0]: l for l in out48}

            out = [out47['Sample'] + out48['Sample'][4:]]
            for s in samples:
                out.append(out47[s] + out48[s][4:])

            if save_to_file:
                if not os.path.exists(dir):
                    os.makedirs(dir)
                if filename is None:
                    filename = f'D47D48_samples.csv'
                with open(f'{dir}/{filename}', 'w') as fid:
                    fid.write(make_csv(out))
            if print_out:
                print('\n'+pretty_table(out))
            if output == 'raw':
                return out
            elif output == 'pretty':
                return pretty_table(out)
```
Print out, save to disk and/or return a combined table of samples for a pair of `D47data` and `D48data` objects.
**Parameters**

+ `data47`: `D47data` instance
+ `data48`: `D48data` instance
+ `dir`: the directory in which to save the table
+ `filename`: the name of the csv file to write to
+ `save_to_file`: whether to save the table to disk
+ `print_out`: whether to print out the table
+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
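For instance, here is a minimal sketch pairing Δ47 and Δ48 results, reusing `session1` through `session4` from the `virtual_data()` example above (which simulates both the δ47 and δ48 data needed here):

```py
from D47crunch import D47data, D48data, table_of_samples

data47 = D47data(session1 + session2 + session3 + session4)
data48 = D48data(session1 + session2 + session3 + session4)
for D in (data47, data48):
    D.crunch()
    D.standardize()

# combined table with the Δ47 and Δ48 columns side by side:
table_of_samples(data47, data48, save_to_file = False)
```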
```py
def table_of_sessions(
    data47 = None,
    data48 = None,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a combined table of sessions
    for a pair of `D47data` and `D48data` objects.
    ***Only applicable if the sessions in `data47` and those in `data48`
    consist of the exact same sets of analyses.***

    **Parameters**

    + `data47`: `D47data` instance
    + `data48`: `D48data` instance
    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
        if set to `'raw'`: return a list of lists of strings
        (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    if data47 is None:
        if data48 is None:
            raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
        else:
            return data48.table_of_sessions(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
    else:
        if data48 is None:
            return data47.table_of_sessions(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
        else:
            out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
            out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
            for k,x in enumerate(out47[0]):
                if k>7:
                    out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
                    out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
            out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])

            if save_to_file:
                if not os.path.exists(dir):
                    os.makedirs(dir)
                if filename is None:
                    filename = f'D47D48_sessions.csv'
                with open(f'{dir}/{filename}', 'w') as fid:
                    fid.write(make_csv(out))
            if print_out:
                print('\n'+pretty_table(out))
            if output == 'raw':
                return out
            elif output == 'pretty':
                return pretty_table(out)
```
Print out, save to disk and/or return a combined table of sessions for a pair of `D47data` and `D48data` objects. **Only applicable if the sessions in `data47` and those in `data48` consist of the exact same sets of analyses.**
**Parameters**

+ `data47`: `D47data` instance
+ `data48`: `D48data` instance
+ `dir`: the directory in which to save the table
+ `filename`: the name of the csv file to write to
+ `save_to_file`: whether to save the table to disk
+ `print_out`: whether to print out the table
+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
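Continuing the sketch from the `table_of_samples()` section above (applicable here because `data47` and `data48` were built from the exact same analyses):

```py
from D47crunch import table_of_sessions

# one row per session, with separate a_47/b_47/c_47 and a_48/b_48/c_48 columns:
table_of_sessions(data47, data48, save_to_file = False)
```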
```py
def table_of_analyses(
    data47 = None,
    data48 = None,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a combined table of analyses
    for a pair of `D47data` and `D48data` objects.

    If the sessions in `data47` and those in `data48` do not consist of
    the exact same sets of analyses, the table will have two columns
    `Session_47` and `Session_48` instead of a single `Session` column.

    **Parameters**

    + `data47`: `D47data` instance
    + `data48`: `D48data` instance
    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
        if set to `'raw'`: return a list of lists of strings
        (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    if data47 is None:
        if data48 is None:
            raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
        else:
            return data48.table_of_analyses(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
    else:
        if data48 is None:
            return data47.table_of_analyses(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
        else:
            out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
            out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')

            if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
                out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
            else:
                out47[0][1] = 'Session_47'
                out48[0][1] = 'Session_48'
                out47 = transpose_table(out47)
                out48 = transpose_table(out48)
                out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])

            if save_to_file:
                if not os.path.exists(dir):
                    os.makedirs(dir)
                if filename is None:
                    filename = f'D47D48_analyses.csv'
                with open(f'{dir}/{filename}', 'w') as fid:
                    fid.write(make_csv(out))
            if print_out:
                print('\n'+pretty_table(out))
            if output == 'raw':
                return out
            elif output == 'pretty':
                return pretty_table(out)
```
Print out, save to disk and/or return a combined table of analyses for a pair of `D47data` and `D48data` objects.

If the sessions in `data47` and those in `data48` do not consist of the exact same sets of analyses, the table will have two columns `Session_47` and `Session_48` instead of a single `Session` column.
**Parameters**

+ `data47`: `D47data` instance
+ `data48`: `D48data` instance
+ `dir`: the directory in which to save the table
+ `filename`: the name of the csv file to write to
+ `save_to_file`: whether to save the table to disk
+ `print_out`: whether to print out the table
+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
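Again continuing the same sketch; since `data47` and `data48` here hold the same analyses, the combined table keeps a single `Session` column:

```py
from D47crunch import table_of_analyses

table_of_analyses(data47, data48, save_to_file = False)
```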
````py
class D4xdata(list):
    '''
    Store and process data for a large set of Δ47 and/or Δ48
    analyses, usually comprising more than one analytical session.
    '''

    ### 17O CORRECTION PARAMETERS
    R13_VPDB = 0.01118 # (Chang & Li, 1990)
    '''
    Absolute (13C/12C) ratio of VPDB.
    By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
    '''

    R18_VSMOW = 0.0020052 # (Baertschi, 1976)
    '''
    Absolute (18O/16O) ratio of VSMOW.
    By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
    '''

    LAMBDA_17 = 0.528 # (Barkan & Luz, 2005)
    '''
    Mass-dependent exponent for triple oxygen isotopes.
    By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
    '''

    R17_VSMOW = 0.00038475 # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
    '''
    Absolute (17O/16O) ratio of VSMOW.
    By default equal to 0.00038475
    ([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
    rescaled to `R13_VPDB`)
    '''

    R18_VPDB = R18_VSMOW * 1.03092
    '''
    Absolute (18O/16O) ratio of VPDB.
    By definition equal to `R18_VSMOW * 1.03092`.
    '''

    R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
    '''
    Absolute (17O/16O) ratio of VPDB.
    By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
    '''

    LEVENE_REF_SAMPLE = 'ETH-3'
    '''
    After the Δ4x standardization step, each sample is tested to
    assess whether the Δ4x variance within all analyses for that
    sample differs significantly from that observed for a given reference
    sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
    which yields a p-value corresponding to the null hypothesis that the
    underlying variances are equal).

    `LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
    sample should be used as a reference for this test.
    '''

    ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6) # (Kim et al., 2007, calcite)
    '''
    Specifies the 18O/16O fractionation factor generally applicable
    to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
    `D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`.

    By default equal to 1.008129 (calcite reacted at 90 °C,
    [Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
    '''

    Nominal_d13C_VPDB = {
        'ETH-1': 2.02,
        'ETH-2': -10.17,
        'ETH-3': 1.71,
        } # (Bernasconi et al., 2018)
    '''
    Nominal δ13C_VPDB values assigned to carbonate standards, used by
    `D4xdata.standardize_d13C()`.

    By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
    [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
    '''

    Nominal_d18O_VPDB = {
        'ETH-1': -2.19,
        'ETH-2': -18.69,
        'ETH-3': -1.78,
        } # (Bernasconi et al., 2018)
    '''
    Nominal δ18O_VPDB values assigned to carbonate standards, used by
    `D4xdata.standardize_d18O()`.

    By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
    [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
    '''

    d13C_STANDARDIZATION_METHOD = '2pt'
    '''
    Method by which to standardize δ13C values:

    + `none`: do not apply any δ13C standardization.
    + `'1pt'`: within each session, offset all initial δ13C values so as to
    minimize the difference between final δ13C_VPDB values and
    `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
    + `'2pt'`: within each session, apply an affine transformation to all δ13C
    values so as to minimize the difference between final δ13C_VPDB
    values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
    is defined).
    '''

    d18O_STANDARDIZATION_METHOD = '2pt'
    '''
    Method by which to standardize δ18O values:

    + `none`: do not apply any δ18O standardization.
    + `'1pt'`: within each session, offset all initial δ18O values so as to
    minimize the difference between final δ18O_VPDB values and
    `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
    + `'2pt'`: within each session, apply an affine transformation to all δ18O
    values so as to minimize the difference between final δ18O_VPDB
    values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
    is defined).
    '''

    def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
        '''
        **Parameters**

        + `l`: a list of dictionaries, with each dictionary including at least the keys
        `Sample`, `d45`, `d46`, and `d47` or `d48`.
        + `mass`: `'47'` or `'48'`
        + `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
        + `session`: define session name for analyses without a `Session` key
        + `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.

        Returns a `D4xdata` object derived from `list`.
        '''
        self._4x = mass
        self.verbose = verbose
        self.prefix = 'D4xdata'
        self.logfile = logfile
        list.__init__(self, l)
        self.Nf = None
        self.repeatability = {}
        self.refresh(session = session)


    def make_verbal(oldfun):
        '''
        Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
        '''
        @wraps(oldfun)
        def newfun(*args, verbose = '', **kwargs):
            myself = args[0]
            oldprefix = myself.prefix
            myself.prefix = oldfun.__name__
            if verbose != '':
                oldverbose = myself.verbose
                myself.verbose = verbose
            out = oldfun(*args, **kwargs)
            myself.prefix = oldprefix
            if verbose != '':
                myself.verbose = oldverbose
            return out
        return newfun


    def msg(self, txt):
        '''
        Log a message to `self.logfile`, and print it out if `verbose = True`
        '''
        self.log(txt)
        if self.verbose:
            print(f'{f"[{self.prefix}]":<16} {txt}')


    def vmsg(self, txt):
        '''
        Log a message to `self.logfile` and print it out
        '''
        self.log(txt)
        print(txt)


    def log(self, *txts):
        '''
        Log a message to `self.logfile`
        '''
        if self.logfile:
            with open(self.logfile, 'a') as fid:
                for txt in txts:
                    fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')


    def refresh(self, session = 'mySession'):
        '''
        Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
        '''
        self.fill_in_missing_info(session = session)
        self.refresh_sessions()
        self.refresh_samples()


    def refresh_sessions(self):
        '''
        Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
        to `False` for all sessions.
        '''
        self.sessions = {
            s: {'data': [r for r in self if r['Session'] == s]}
            for s in sorted({r['Session'] for r in self})
            }
        for s in self.sessions:
            self.sessions[s]['scrambling_drift'] = False
            self.sessions[s]['slope_drift'] = False
            self.sessions[s]['wg_drift'] = False
            self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
            self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD


    def refresh_samples(self):
        '''
        Define `self.samples`, `self.anchors`, and `self.unknowns`.
        '''
        self.samples = {
            s: {'data': [r for r in self if r['Sample'] == s]}
            for s in sorted({r['Sample'] for r in self})
            }
        self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
        self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}


    def read(self, filename, sep = '', session = ''):
        '''
        Read file in csv format to load data into a `D47data` object.

        In the csv file, spaces before and after field separators (`','` by default)
        are optional. Each line corresponds to a single analysis.

        The required fields are:

        + `UID`: a unique identifier
        + `Session`: an identifier for the analytical session
        + `Sample`: a sample identifier
        + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

        Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
        VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
        and `d49` are optional, and set to NaN by default.

        **Parameters**

        + `filename`: the path of the file to read
        + `sep`: csv separator delimiting the fields
        + `session`: set `Session` field to this string for all analyses
        '''
        with open(filename) as fid:
            self.input(fid.read(), sep = sep, session = session)


    def input(self, txt, sep = '', session = ''):
        '''
        Read `txt` string in csv format to load analysis data into a `D47data` object.

        In the csv string, spaces before and after field separators (`','` by default)
        are optional. Each line corresponds to a single analysis.

        The required fields are:

        + `UID`: a unique identifier
        + `Session`: an identifier for the analytical session
        + `Sample`: a sample identifier
        + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

        Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
        VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
        and `d49` are optional, and set to NaN by default.

        **Parameters**

        + `txt`: the csv string to read
        + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
        whichever appears most often in `txt`.
        + `session`: set `Session` field to this string for all analyses
        '''
        if sep == '':
            sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
        txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
        data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]

        if session != '':
            for r in data:
                r['Session'] = session

        self += data
        self.refresh()


    @make_verbal
    def wg(self,
        samples = None,
        session_groups = None,
        ):
        '''
        Compute bulk composition of the working gas for each session based (by default)
        on the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
        `self.Nominal_d18O_VPDB`.

        **Parameters**

        + `samples`: A list of samples specifying the subset of samples (defined in both
        `self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`) which will be considered
        when computing the working gas. By default, use all samples defined both in
        `self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.
        + `session_groups`: a list of lists of sessions
        (e.g., `[['session1', 'session2'], ['session3', 'session4', 'session5']]`)
        specifying which session groups, if any, have the exact same WG composition.
        If set to `'all'`, force all sessions to have the same WG composition (use with
        caution and on short time scales, since the WG may drift slowly on long time scales).
        '''

        self.msg('Computing WG composition:')

        a18_acid = self.ALPHA_18O_ACID_REACTION

        if samples is None:
            samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
        if session_groups is None:
            session_groups = [[s] for s in self.sessions]
        elif session_groups == 'all':
            session_groups = [[s for s in self.sessions]]

        samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
        R45R46_standards = {}
        for sample in samples:
            d13C_vpdb = self.Nominal_d13C_VPDB[sample]
            d18O_vpdb = self.Nominal_d18O_VPDB[sample]
            R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
            R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
            R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid

            C12_s = 1 / (1 + R13_s)
            C13_s = R13_s / (1 + R13_s)
            C16_s = 1 / (1 + R17_s + R18_s)
            C17_s = R17_s / (1 + R17_s + R18_s)
            C18_s = R18_s / (1 + R17_s + R18_s)

            C626_s = C12_s * C16_s ** 2
            C627_s = 2 * C12_s * C16_s * C17_s
            C628_s = 2 * C12_s * C16_s * C18_s
            C636_s = C13_s * C16_s ** 2
            C637_s = 2 * C13_s * C16_s * C17_s
            C727_s = C12_s * C17_s ** 2

            R45_s = (C627_s + C636_s) / C626_s
            R46_s = (C628_s + C637_s + C727_s) / C626_s
            R45R46_standards[sample] = (R45_s, R46_s)

        for sg in session_groups:
            db = [r for s in sg for r in self.sessions[s]['data'] if r['Sample'] in samples]
            assert db, f'No sample from {samples} found in session group {sg}.'

            X = [r['d45'] for r in db]
            Y = [R45R46_standards[r['Sample']][0] for r in db]
            x1, x2 = np.min(X), np.max(X)

            if x1 < x2:
                wgcoord = x1/(x1-x2)
            else:
                wgcoord = 999

            if wgcoord < -.5 or wgcoord > 1.5:
                # unreasonable to extrapolate to d45 = 0
                R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
            else:
                # d45 = 0 is reasonably well bracketed
                R45_wg = np.polyfit(X, Y, 1)[1]

            X = [r['d46'] for r in db]
            Y = [R45R46_standards[r['Sample']][1] for r in db]
            x1, x2 = np.min(X), np.max(X)

            if x1 < x2:
                wgcoord = x1/(x1-x2)
            else:
                wgcoord = 999

            if wgcoord < -.5 or wgcoord > 1.5:
                # unreasonable to extrapolate to d46 = 0
                R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
            else:
                # d46 = 0 is reasonably well bracketed
                R46_wg = np.polyfit(X, Y, 1)[1]

            d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)

            for s in sg:
                self.msg(f'Session {s} WG: δ13C_VPDB = {d13Cwg_VPDB:.3f} δ18O_VSMOW = {d18Owg_VSMOW:.3f}')

                self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
                self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
                for r in self.sessions[s]['data']:
                    r['d13Cwg_VPDB'] = d13Cwg_VPDB
                    r['d18Owg_VSMOW'] = d18Owg_VSMOW


    def compute_bulk_delta(self, R45, R46, D17O = 0):
        '''
        Compute δ13C_VPDB and δ18O_VSMOW,
        by solving the generalized form of equation (17) from
        [Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
        assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
        solving the corresponding second-order Taylor polynomial.
        (Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
        '''

        K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17

        A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
        B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
        C = 2 * self.R18_VSMOW
        D = -R46

        aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
        bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
        cc = A + B + C + D

        d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)

        R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
        R17 = K * R18 ** self.LAMBDA_17
        R13 = R45 - 2 * R17

        d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)

        return d13C_VPDB, d18O_VSMOW


    @make_verbal
    def crunch(self, verbose = ''):
        '''
        Compute bulk composition and raw clumped isotope anomalies for all analyses.
        '''
        for r in self:
            self.compute_bulk_and_clumping_deltas(r)
        self.standardize_d13C()
        self.standardize_d18O()
        self.msg(f"Crunched {len(self)} analyses.")


    def fill_in_missing_info(self, session = 'mySession'):
        '''
        Fill in optional fields with default values
        '''
        for i,r in enumerate(self):
            if 'D17O' not in r:
                r['D17O'] = 0.
            if 'UID' not in r:
                r['UID'] = f'{i+1}'
            if 'Session' not in r:
                r['Session'] = session
            for k in ['d47', 'd48', 'd49']:
                if k not in r:
                    r[k] = np.nan


    def standardize_d13C(self):
        '''
        Perform δ13C standardization within each session `s` according to
        `self.sessions[s]['d13C_standardization_method']`, which is defined by default
        by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
        may be redefined arbitrarily at a later stage.
        '''
        for s in self.sessions:
            if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
                XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
                X,Y = zip(*XY)
                if self.sessions[s]['d13C_standardization_method'] == '1pt':
                    offset = np.mean(Y) - np.mean(X)
                    for r in self.sessions[s]['data']:
                        r['d13C_VPDB'] += offset
                elif self.sessions[s]['d13C_standardization_method'] == '2pt':
                    a,b = np.polyfit(X,Y,1)
                    for r in self.sessions[s]['data']:
                        r['d13C_VPDB'] = a * r['d13C_VPDB'] + b

    def standardize_d18O(self):
        '''
        Perform δ18O standardization within each session `s` according to
        `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
        which is defined by default by `D47data.refresh_sessions()` as equal to
        `self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
        '''
        for s in self.sessions:
            if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
                XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
                X,Y = zip(*XY)
                Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
                if self.sessions[s]['d18O_standardization_method'] == '1pt':
                    offset = np.mean(Y) - np.mean(X)
                    for r in self.sessions[s]['data']:
                        r['d18O_VSMOW'] += offset
                elif self.sessions[s]['d18O_standardization_method'] == '2pt':
                    a,b = np.polyfit(X,Y,1)
                    for r in self.sessions[s]['data']:
                        r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b


    def compute_bulk_and_clumping_deltas(self, r):
        '''
        Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
        '''

        # Compute working gas R13, R18, and isobar ratios
        R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
        R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
        R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)

        # Compute analyte isobar ratios
        R45 = (1 + r['d45'] / 1000) * R45_wg
        R46 = (1 + r['d46'] / 1000) * R46_wg
        R47 = (1 + r['d47'] / 1000) * R47_wg
        R48 = (1 + r['d48'] / 1000) * R48_wg
        R49 = (1 + r['d49'] / 1000) * R49_wg

        r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
        R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
        R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW

        # Compute stochastic isobar ratios of the analyte
        R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
            R13, R18, D17O = r['D17O']
        )

        # Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
        # and raise a warning if the corresponding anomalies exceed 0.05 ppm.
        if (R45 / R45stoch - 1) > 5e-8:
            self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
        if (R46 / R46stoch - 1) > 5e-8:
            self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')

        # Compute raw clumped isotope anomalies
        r['D47raw'] = 1000 * (R47 / R47stoch - 1)
        r['D48raw'] = 1000 * (R48 / R48stoch - 1)
        r['D49raw'] = 1000 * (R49 / R49stoch - 1)


    def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
        '''
        Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
        optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
        anomalies (`D47`, `D48`, `D49`), all expressed in permil.
        '''

        # Compute R17
        R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17

        # Compute isotope concentrations
        C12 = (1 + R13) ** -1
        C13 = C12 * R13
        C16 = (1 + R17 + R18) ** -1
        C17 = C16 * R17
        C18 = C16 * R18

        # Compute stochastic isotopologue concentrations
        C626 = C16 * C12 * C16
        C627 = C16 * C12 * C17 * 2
        C628 = C16 * C12 * C18 * 2
        C636 = C16 * C13 * C16
        C637 = C16 * C13 * C17 * 2
        C638 = C16 * C13 * C18 * 2
        C727 = C17 * C12 * C17
        C728 = C17 * C12 * C18 * 2
        C737 = C17 * C13 * C17
        C738 = C17 * C13 * C18 * 2
        C828 = C18 * C12 * C18
        C838 = C18 * C13 * C18

        # Compute stochastic isobar ratios
        R45 = (C636 + C627) / C626
        R46 = (C628 + C637 + C727) / C626
        R47 = (C638 + C728 + C737) / C626
        R48 = (C738 + C828) / C626
        R49 = C838 / C626

        # Account for stochastic anomalies
        R47 *= 1 + D47 / 1000
        R48 *= 1 + D48 / 1000
        R49 *= 1 + D49 / 1000

        # Return isobar ratios
        return R45, R46, R47, R48, R49


    def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
        '''
        Split unknown samples by UID (treat all analyses as different samples)
        or by session (treat analyses of a given sample in different sessions as
        different samples).

        **Parameters**

        + `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
        + `grouping`: `by_uid` | `by_session`
        '''
        if samples_to_split == 'all':
            samples_to_split = [s for s in self.unknowns]
        gkeys = {'by_uid':'UID', 'by_session':'Session'}
        self.grouping = grouping.lower()
        if self.grouping in gkeys:
            gkey = gkeys[self.grouping]
            for r in self:
                if r['Sample'] in samples_to_split:
                    r['Sample_original'] = r['Sample']
                    r['Sample'] = f"{r['Sample']}__{r[gkey]}"
                elif r['Sample'] in self.unknowns:
                    r['Sample_original'] = r['Sample']
            self.refresh_samples()


    def unsplit_samples(self, tables = False):
        '''
        Reverse the effects of `D47data.split_samples()`.

        This should only be used after `D4xdata.standardize()` with `method='pooled'`.

        After `D4xdata.standardize()` with `method='indep_sessions'`, one should
        probably use `D4xdata.combine_samples()` instead to reverse the effects of
        `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
        effects of `D47data.split_samples()` with `grouping='by_sessions'` (because in
        that case session-averaged Δ4x values are statistically independent).
        '''
        unknowns_old = sorted({s for s in self.unknowns})
        CM_old = self.standardization.covar[:,:]
        VD_old = self.standardization.params.valuesdict().copy()
        vars_old = self.standardization.var_names

        unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})

        Ns = len(vars_old) - len(unknowns_old)
        vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
        VD_new = {k: VD_old[k] for k in vars_old[:Ns]}

        W = np.zeros((len(vars_new), len(vars_old)))
        W[:Ns,:Ns] = np.eye(Ns)
        for u in unknowns_new:
            splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
            if self.grouping == 'by_session':
                weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
            elif self.grouping == 'by_uid':
                weights = [1 for s in splits]
            sw = sum(weights)
            weights = [w/sw for w in weights]
            W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]

        CM_new = W @ CM_old @ W.T
        V = W @ np.array([[VD_old[k]] for k in vars_old])
        VD_new = {k:v[0] for k,v in zip(vars_new, V)}

        self.standardization.covar = CM_new
        self.standardization.params.valuesdict = lambda : VD_new
        self.standardization.var_names = vars_new

        for r in self:
            if r['Sample'] in self.unknowns:
                r['Sample_split'] = r['Sample']
                r['Sample'] = r['Sample_original']

        self.refresh_samples()
        self.consolidate_samples()
        self.repeatabilities()

        if tables:
            self.table_of_analyses()
            self.table_of_samples()

    def assign_timestamps(self):
        '''
        Assign a time field `t` of type `float` to each analysis.

        If `TimeTag` is one of the data fields, `t` is equal within a given session
        to `TimeTag` minus the mean value of `TimeTag` for that session.
        Otherwise, `TimeTag` is by default equal to the index of each analysis
        in the dataset and `t` is defined as above.
        '''
        for session in self.sessions:
            sdata = self.sessions[session]['data']
            try:
                t0 = np.mean([r['TimeTag'] for r in sdata])
                for r in sdata:
                    r['t'] = r['TimeTag'] - t0
            except KeyError:
                t0 = (len(sdata)-1)/2
                for t,r in enumerate(sdata):
                    r['t'] = t - t0


    def report(self):
        '''
        Prints a report on the standardization fit.
        Only applicable after `D4xdata.standardize(method='pooled')`.
        '''
        report_fit(self.standardization)


    def combine_samples(self, sample_groups):
        '''
        Combine analyses of different samples to compute weighted average Δ4x
        and new error (co)variances corresponding to the groups defined by the `sample_groups`
        dictionary.

        Caution: samples are weighted by number of replicate analyses, which is a
        reasonable default behavior but is not always optimal (e.g., in the case of strongly
        correlated analytical errors for one or more samples).

        Returns a tuple of:

        + the list of group names
        + an array of the corresponding Δ4x values
        + the corresponding (co)variance matrix

        **Parameters**

        + `sample_groups`: a dictionary of the form:
        ```py
        {'group1': ['sample_1', 'sample_2'],
         'group2': ['sample_3', 'sample_4', 'sample_5']}
        ```
        '''

        samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
        groups = sorted(sample_groups.keys())
        group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
        D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
        CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
        W = np.array([
            [self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
            for j in groups])
        D4x_new = W @ D4x_old
        CM_new = W @ CM_old @ W.T

        return groups, D4x_new[:,0], CM_new


    @make_verbal
    def standardize(self,
        method = 'pooled',
        weighted_sessions = [],
        consolidate = True,
        consolidate_tables = False,
        consolidate_plots = False,
        constraints = {},
        ):
        '''
        Compute absolute Δ4x values for all replicate analyses and for sample averages.
        If `method` argument is set to `'pooled'`, the standardization processes all sessions
        in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
        i.e. that their true Δ4x value does not change between sessions
        ([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
        `'indep_sessions'`, the standardization processes each session independently, based only
        on anchor analyses.
        '''

        self.standardization_method = method
        self.assign_timestamps()

        if method == 'pooled':
            if weighted_sessions:
                for session_group in weighted_sessions:
                    if self._4x == '47':
                        X = D47data([r for r in self if r['Session'] in session_group])
                    elif self._4x == '48':
                        X = D48data([r for r in self if r['Session'] in session_group])
                    X.Nominal_D4x = self.Nominal_D4x.copy()
                    X.refresh()
                    result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
                    w = np.sqrt(result.redchi)
                    self.msg(f'Session group {session_group} MRSWD = {w:.4f}')
                    for r in X:
                        r[f'wD{self._4x}raw'] *= w
            else:
                self.msg(f'All D{self._4x}raw weights set to 1 ‰')
                for r in self:
                    r[f'wD{self._4x}raw'] = 1.

            params = Parameters()
            for k,session in enumerate(self.sessions):
                self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
                self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
                self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
                s = pf(session)
                params.add(f'a_{s}', value = 0.9)
                params.add(f'b_{s}', value = 0.)
                params.add(f'c_{s}', value = -0.9)
                params.add(f'a2_{s}', value = 0.,
#                   vary = self.sessions[session]['scrambling_drift'],
                    )
                params.add(f'b2_{s}', value = 0.,
#                   vary = self.sessions[session]['slope_drift'],
                    )
                params.add(f'c2_{s}', value = 0.,
#                   vary = self.sessions[session]['wg_drift'],
                    )
                if not self.sessions[session]['scrambling_drift']:
                    params[f'a2_{s}'].expr = '0'
                if not self.sessions[session]['slope_drift']:
                    params[f'b2_{s}'].expr = '0'
                if not self.sessions[session]['wg_drift']:
                    params[f'c2_{s}'].expr = '0'

            for sample in self.unknowns:
                params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)

            for k in constraints:
                params[k].expr = constraints[k]

            def residuals(p):
                R = []
                for r in self:
                    session = pf(r['Session'])
                    sample = pf(r['Sample'])
                    if r['Sample'] in self.Nominal_D4x:
                        R += [ (
                            r[f'D{self._4x}raw'] - (
                                p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
                                + p[f'b_{session}'] * r[f'd{self._4x}']
                                + p[f'c_{session}']
                                + r['t'] * (
                                    p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
                                    + p[f'b2_{session}'] * r[f'd{self._4x}']
                                    + p[f'c2_{session}']
                                    )
                                )
                            ) / r[f'wD{self._4x}raw'] ]
                    else:
                        R += [ (
                            r[f'D{self._4x}raw'] - (
                                p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
                                + p[f'b_{session}'] * r[f'd{self._4x}']
                                + p[f'c_{session}']
                                + r['t'] * (
                                    p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
                                    + p[f'b2_{session}'] * r[f'd{self._4x}']
                                    + p[f'c2_{session}']
                                    )
                                )
                            ) / r[f'wD{self._4x}raw'] ]
                return R

            M = Minimizer(residuals, params)
            result = M.least_squares()
            self.Nf = result.nfree
            self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
            new_names, new_covar, new_se = _fullcovar(result)[:3]
            result.var_names = new_names
            result.covar = new_covar

            for r in self:
                s = pf(r["Session"])
                a = result.params.valuesdict()[f'a_{s}']
                b = result.params.valuesdict()[f'b_{s}']
                c = result.params.valuesdict()[f'c_{s}']
                a2 = result.params.valuesdict()[f'a2_{s}']
                b2 = result.params.valuesdict()[f'b2_{s}']
                c2 = result.params.valuesdict()[f'c2_{s}']
                r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])

            self.standardization = result

            for session in self.sessions:
                self.sessions[session]['Np'] = 3
                for k in ['scrambling', 'slope', 'wg']:
                    if self.sessions[session][f'{k}_drift']:
                        self.sessions[session]['Np'] += 1

            if consolidate:
                self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
            return result


        elif method == 'indep_sessions':

            if weighted_sessions:
                for session_group in weighted_sessions:
                    X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
                    X.Nominal_D4x = self.Nominal_D4x.copy()
                    X.refresh()
                    # This is only done to assign r['wD47raw'] for r in X:
                    X.standardize(method = method, weighted_sessions = [], consolidate = False)
                    self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
            else:
                self.msg('All weights set to 1 ‰')
                for r in self:
                    r[f'wD{self._4x}raw'] = 1

            for session in self.sessions:
                s = self.sessions[session]
                p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
                p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
                s['Np'] = sum(p_active)
                sdata = s['data']

                A = np.array([
                    [
                        self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
                        r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
                        1 / r[f'wD{self._4x}raw'],
                        self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
                        r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
                        r['t'] / r[f'wD{self._4x}raw']
                        ]
                    for r in sdata if r['Sample'] in self.anchors
                    ])[:,p_active] # only keep columns for the active parameters
                Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
                s['Na'] = Y.size
                CM = linalg.inv(A.T @ A)
                bf = (CM @ A.T @ Y).T[0,:]
                k = 0
                for n,a in zip(p_names, p_active):
                    if a:
                        s[n] = bf[k]
#                       self.msg(f'{n} = {bf[k]}')
                        k += 1
                    else:
                        s[n] = 0.
#                       self.msg(f'{n} = 0.0')

                for r in sdata:
                    a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
                    r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
                    r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])

                s['CM'] = np.zeros((6,6))
                i = 0
                k_active = [j for j,a in enumerate(p_active) if a]
                for j,a in enumerate(p_active):
                    if a:
                        s['CM'][j,k_active] = CM[i,:]
                        i += 1

            if not weighted_sessions:
                w = self.rmswd()['rmswd']
                for r in self:
                    r[f'wD{self._4x}'] *= w
                    r[f'wD{self._4x}raw'] *= w
                for session in self.sessions:
                    self.sessions[session]['CM'] *= w**2

            for session in self.sessions:
                s = self.sessions[session]
                s['SE_a'] = s['CM'][0,0]**.5
                s['SE_b'] = s['CM'][1,1]**.5
                s['SE_c'] = s['CM'][2,2]**.5
                s['SE_a2'] = s['CM'][3,3]**.5
                s['SE_b2'] = s['CM'][4,4]**.5
                s['SE_c2'] = s['CM'][5,5]**.5

            if not weighted_sessions:
                self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
            else:
                self.Nf = 0
                for sg in weighted_sessions:
                    self.Nf += self.rmswd(sessions = sg)['Nf']

            self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)

            avgD4x = {
                sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
                for sample in self.samples
                }
            chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
            rD4x = (chi2/self.Nf)**.5
            self.repeatability[f'sigma_{self._4x}'] = rD4x

            if consolidate:
                self.consolidate(tables = consolidate_tables, plots = consolidate_plots)


    def standardization_error(self, session, d4x, D4x, t = 0):
        '''
        Compute standardization error for a given session and
        (δ47, Δ47) composition.
        '''
        a = self.sessions[session]['a']
        b = self.sessions[session]['b']
        c = self.sessions[session]['c']
        a2 = self.sessions[session]['a2']
        b2 = self.sessions[session]['b2']
        c2 = self.sessions[session]['c2']
        CM = self.sessions[session]['CM']

        x, y = D4x, d4x
        z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
#       x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
        dxdy = -(b+b2*t) / (a+a2*t)
        dxdz = 1. / (a+a2*t)
        dxda = -x / (a+a2*t)
        dxdb = -y / (a+a2*t)
        dxdc = -1. / (a+a2*t)
        dxda2 = -x * t / (a+a2*t)
        dxdb2 = -y * t / (a+a2*t)
        dxdc2 = -t / (a+a2*t)
        V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
        sx = (V @ CM @ V.T) ** .5
        return sx


    @make_verbal
    def summary(self,
        dir = 'output',
        filename = None,
        save_to_file = True,
        print_out = True,
        ):
        '''
        Print out and/or save to disk a summary of the standardization results.

        **Parameters**

        + `dir`: the directory in which to save the table
        + `filename`: the name of the csv file to write to
        + `save_to_file`: whether to save the table to disk
        + `print_out`: whether to print out the table
        '''

        out = []
        out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
        out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
        out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
        out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
        out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
        out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
        out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
        out += [['Model degrees of freedom', f"{self.Nf}"]]
        out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
        out += [['Standardization method', self.standardization_method]]

        if save_to_file:
            if not os.path.exists(dir):
                os.makedirs(dir)
            if filename is None:
                filename = f'D{self._4x}_summary.csv'
            with open(f'{dir}/{filename}', 'w') as fid:
                fid.write(make_csv(out))
        if print_out:
            self.msg('\n' + pretty_table(out, header = 0))


    @make_verbal
    def table_of_sessions(self,
        dir = 'output',
        filename = None,
        save_to_file = True,
        print_out = True,
        output = None,
        ):
        '''
        Print out and/or save to disk a table of sessions.
````
1905 1906 **Parameters** 1907 1908 + `dir`: the directory in which to save the table 1909 + `filename`: the name to the csv file to write to 1910 + `save_to_file`: whether to save the table to disk 1911 + `print_out`: whether to print out the table 1912 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 1913 if set to `'raw'`: return a list of list of strings 1914 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 1915 ''' 1916 include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions]) 1917 include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions]) 1918 include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions]) 1919 1920 out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']] 1921 if include_a2: 1922 out[-1] += ['a2 ± SE'] 1923 if include_b2: 1924 out[-1] += ['b2 ± SE'] 1925 if include_c2: 1926 out[-1] += ['c2 ± SE'] 1927 for session in self.sessions: 1928 out += [[ 1929 session, 1930 f"{self.sessions[session]['Na']}", 1931 f"{self.sessions[session]['Nu']}", 1932 f"{self.sessions[session]['d13Cwg_VPDB']:.3f}", 1933 f"{self.sessions[session]['d18Owg_VSMOW']:.3f}", 1934 f"{self.sessions[session]['r_d13C_VPDB']:.4f}", 1935 f"{self.sessions[session]['r_d18O_VSMOW']:.4f}", 1936 f"{self.sessions[session][f'r_D{self._4x}']:.4f}", 1937 f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}", 1938 f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}", 1939 f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}", 1940 ]] 1941 if include_a2: 1942 if self.sessions[session]['scrambling_drift']: 1943 out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"] 1944 else: 1945 out[-1] += [''] 1946 if include_b2: 1947 if self.sessions[session]['slope_drift']: 1948 out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"] 1949 else: 1950 out[-1] += [''] 1951 if include_c2: 1952 if self.sessions[session]['wg_drift']: 1953 out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"] 1954 else: 1955 out[-1] += [''] 1956 1957 if save_to_file: 1958 if not os.path.exists(dir): 1959 os.makedirs(dir) 1960 if filename is None: 1961 filename = f'D{self._4x}_sessions.csv' 1962 with open(f'{dir}/{filename}', 'w') as fid: 1963 fid.write(make_csv(out)) 1964 if print_out: 1965 self.msg('\n' + pretty_table(out)) 1966 if output == 'raw': 1967 return out 1968 elif output == 'pretty': 1969 return pretty_table(out) 1970 1971 1972 @make_verbal 1973 def table_of_analyses( 1974 self, 1975 dir = 'output', 1976 filename = None, 1977 save_to_file = True, 1978 print_out = True, 1979 output = None, 1980 ): 1981 ''' 1982 Print out an/or save to disk a table of analyses. 
1983 1984 **Parameters** 1985 1986 + `dir`: the directory in which to save the table 1987 + `filename`: the name to the csv file to write to 1988 + `save_to_file`: whether to save the table to disk 1989 + `print_out`: whether to print out the table 1990 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 1991 if set to `'raw'`: return a list of list of strings 1992 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 1993 ''' 1994 1995 out = [['UID','Session','Sample']] 1996 extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}] 1997 for f in extra_fields: 1998 out[-1] += [f[0]] 1999 out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}'] 2000 for r in self: 2001 out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]] 2002 for f in extra_fields: 2003 out[-1] += [f"{r[f[0]]:{f[1]}}"] 2004 out[-1] += [ 2005 f"{r['d13Cwg_VPDB']:.3f}", 2006 f"{r['d18Owg_VSMOW']:.3f}", 2007 f"{r['d45']:.6f}", 2008 f"{r['d46']:.6f}", 2009 f"{r['d47']:.6f}", 2010 f"{r['d48']:.6f}", 2011 f"{r['d49']:.6f}", 2012 f"{r['d13C_VPDB']:.6f}", 2013 f"{r['d18O_VSMOW']:.6f}", 2014 f"{r['D47raw']:.6f}", 2015 f"{r['D48raw']:.6f}", 2016 f"{r['D49raw']:.6f}", 2017 f"{r[f'D{self._4x}']:.6f}" 2018 ] 2019 if save_to_file: 2020 if not os.path.exists(dir): 2021 os.makedirs(dir) 2022 if filename is None: 2023 filename = f'D{self._4x}_analyses.csv' 2024 with open(f'{dir}/{filename}', 'w') as fid: 2025 fid.write(make_csv(out)) 2026 if print_out: 2027 self.msg('\n' + pretty_table(out)) 2028 return out 2029 2030 @make_verbal 2031 def covar_table( 2032 self, 2033 correl = False, 2034 dir = 'output', 2035 filename = None, 2036 save_to_file = True, 2037 print_out = True, 2038 output = None, 2039 ): 2040 ''' 2041 Print out, save to disk and/or return the variance-covariance matrix of D4x 2042 for all unknown samples. 2043 2044 **Parameters** 2045 2046 + `dir`: the directory in which to save the csv 2047 + `filename`: the name of the csv file to write to 2048 + `save_to_file`: whether to save the csv 2049 + `print_out`: whether to print out the matrix 2050 + `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`); 2051 if set to `'raw'`: return a list of list of strings 2052 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 2053 ''' 2054 samples = sorted([u for u in self.unknowns]) 2055 out = [[''] + samples] 2056 for s1 in samples: 2057 out.append([s1]) 2058 for s2 in samples: 2059 if correl: 2060 out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}') 2061 else: 2062 out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}') 2063 2064 if save_to_file: 2065 if not os.path.exists(dir): 2066 os.makedirs(dir) 2067 if filename is None: 2068 if correl: 2069 filename = f'D{self._4x}_correl.csv' 2070 else: 2071 filename = f'D{self._4x}_covar.csv' 2072 with open(f'{dir}/{filename}', 'w') as fid: 2073 fid.write(make_csv(out)) 2074 if print_out: 2075 self.msg('\n'+pretty_table(out)) 2076 if output == 'raw': 2077 return out 2078 elif output == 'pretty': 2079 return pretty_table(out) 2080 2081 @make_verbal 2082 def table_of_samples( 2083 self, 2084 dir = 'output', 2085 filename = None, 2086 save_to_file = True, 2087 print_out = True, 2088 output = None, 2089 ): 2090 ''' 2091 Print out, save to disk and/or return a table of samples. 
2092 2093 **Parameters** 2094 2095 + `dir`: the directory in which to save the csv 2096 + `filename`: the name of the csv file to write to 2097 + `save_to_file`: whether to save the csv 2098 + `print_out`: whether to print out the table 2099 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 2100 if set to `'raw'`: return a list of list of strings 2101 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 2102 ''' 2103 2104 out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']] 2105 for sample in self.anchors: 2106 out += [[ 2107 f"{sample}", 2108 f"{self.samples[sample]['N']}", 2109 f"{self.samples[sample]['d13C_VPDB']:.2f}", 2110 f"{self.samples[sample]['d18O_VSMOW']:.2f}", 2111 f"{self.samples[sample][f'D{self._4x}']:.4f}",'','', 2112 f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', '' 2113 ]] 2114 for sample in self.unknowns: 2115 out += [[ 2116 f"{sample}", 2117 f"{self.samples[sample]['N']}", 2118 f"{self.samples[sample]['d13C_VPDB']:.2f}", 2119 f"{self.samples[sample]['d18O_VSMOW']:.2f}", 2120 f"{self.samples[sample][f'D{self._4x}']:.4f}", 2121 f"{self.samples[sample][f'SE_D{self._4x}']:.4f}", 2122 f"± {self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}", 2123 f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', 2124 f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else '' 2125 ]] 2126 if save_to_file: 2127 if not os.path.exists(dir): 2128 os.makedirs(dir) 2129 if filename is None: 2130 filename = f'D{self._4x}_samples.csv' 2131 with open(f'{dir}/{filename}', 'w') as fid: 2132 fid.write(make_csv(out)) 2133 if print_out: 2134 self.msg('\n'+pretty_table(out)) 2135 if output == 'raw': 2136 return out 2137 elif output == 'pretty': 2138 return pretty_table(out) 2139 2140 2141 def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100): 2142 ''' 2143 Generate session plots and save them to disk. 2144 2145 **Parameters** 2146 2147 + `dir`: the directory in which to save the plots 2148 + `figsize`: the width and height (in inches) of each plot 2149 + `filetype`: 'pdf' or 'png' 2150 + `dpi`: resolution for PNG output 2151 ''' 2152 if not os.path.exists(dir): 2153 os.makedirs(dir) 2154 2155 for session in self.sessions: 2156 sp = self.plot_single_session(session, xylimits = 'constant') 2157 ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {})) 2158 ppl.close(sp.fig) 2159 2160 2161 2162 @make_verbal 2163 def consolidate_samples(self): 2164 ''' 2165 Compile various statistics for each sample. 
For each anchor sample:

+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
+ `SE_D47` or `SE_D48`: set to zero by definition

For each unknown sample:

+ `D47` or `D48`: the standardized Δ4x value for this unknown
+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown

For each anchor and unknown:

+ `N`: the total number of analyses of this sample
+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
variance, indicating whether the Δ4x repeatability of this sample differs significantly from
that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
'''
D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
for sample in self.samples:
    self.samples[sample]['N'] = len(self.samples[sample]['data'])
    if self.samples[sample]['N'] > 1:
        self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])

    self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
    self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])

    D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
    if len(D4x_pop) > 2:
        self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]

if self.standardization_method == 'pooled':
    for sample in self.anchors:
        self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
        self.samples[sample][f'SE_D{self._4x}'] = 0.
    for sample in self.unknowns:
        self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
        try:
            self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
        except ValueError:
            # when `sample` is constrained by self.standardize(constraints = {...}),
            # it is no longer listed in self.standardization.var_names.
            # Temporary fix: define SE as zero for now
            self.samples[sample][f'SE_D{self._4x}'] = 0.

elif self.standardization_method == 'indep_sessions':
    for sample in self.anchors:
        self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
        self.samples[sample][f'SE_D{self._4x}'] = 0.
    for sample in self.unknowns:
        self.msg(f'Consolidating sample {sample}')
        self.unknowns[sample][f'session_D{self._4x}'] = {}
        session_avg = []
        for session in self.sessions:
            sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
            if sdata:
                self.msg(f'{sample} found in session {session}')
                avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
                avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
                # !! TODO: sigma_s below does not account for temporal changes in standardization error
                sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
                sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
                session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
                self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
        self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
        weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
        wsum = sum([weights[s] for s in weights])
        for s in weights:
            self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]

for r in self:
    r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']


def consolidate_sessions(self):
    '''
    Compute various statistics for each session.

    + `Na`: Number of anchor analyses in the session
    + `Nu`: Number of unknown analyses in the session
    + `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
    + `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
    + `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
    + `a`: scrambling factor
    + `b`: compositional slope
    + `c`: WG offset
    + `SE_a`: Model standard error of `a`
    + `SE_b`: Model standard error of `b`
    + `SE_c`: Model standard error of `c`
    + `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
    + `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
    + `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
    + `a2`: scrambling factor drift
    + `b2`: compositional slope drift
    + `c2`: WG offset drift
    + `Np`: Number of standardization parameters to fit
    + `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
    + `d13Cwg_VPDB`: δ13C_VPDB of WG
    + `d18Owg_VSMOW`: δ18O_VSMOW of WG
    '''
    for session in self.sessions:
        if 'd13Cwg_VPDB' not in self.sessions[session]:
            self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
        if 'd18Owg_VSMOW' not in self.sessions[session]:
            self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
        self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
        self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])

        self.msg(f'Computing repeatabilities for session {session}')
        self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
        self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
        self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])

    if self.standardization_method == 'pooled':
        for session in self.sessions:

            # different (better?)
computation of D4x repeatability for each session: 2287 sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']] 2288 self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5 2289 2290 self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}'] 2291 i = self.standardization.var_names.index(f'a_{pf(session)}') 2292 self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5 2293 2294 self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}'] 2295 i = self.standardization.var_names.index(f'b_{pf(session)}') 2296 self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5 2297 2298 self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}'] 2299 i = self.standardization.var_names.index(f'c_{pf(session)}') 2300 self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5 2301 2302 self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}'] 2303 if self.sessions[session]['scrambling_drift']: 2304 i = self.standardization.var_names.index(f'a2_{pf(session)}') 2305 self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5 2306 else: 2307 self.sessions[session]['SE_a2'] = 0. 2308 2309 self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}'] 2310 if self.sessions[session]['slope_drift']: 2311 i = self.standardization.var_names.index(f'b2_{pf(session)}') 2312 self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5 2313 else: 2314 self.sessions[session]['SE_b2'] = 0. 2315 2316 self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}'] 2317 if self.sessions[session]['wg_drift']: 2318 i = self.standardization.var_names.index(f'c2_{pf(session)}') 2319 self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5 2320 else: 2321 self.sessions[session]['SE_c2'] = 0. 
2322 2323 i = self.standardization.var_names.index(f'a_{pf(session)}') 2324 j = self.standardization.var_names.index(f'b_{pf(session)}') 2325 k = self.standardization.var_names.index(f'c_{pf(session)}') 2326 CM = np.zeros((6,6)) 2327 CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]] 2328 try: 2329 i2 = self.standardization.var_names.index(f'a2_{pf(session)}') 2330 CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]] 2331 CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2] 2332 try: 2333 j2 = self.standardization.var_names.index(f'b2_{pf(session)}') 2334 CM[3,4] = self.standardization.covar[i2,j2] 2335 CM[4,3] = self.standardization.covar[j2,i2] 2336 except ValueError: 2337 pass 2338 try: 2339 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2340 CM[3,5] = self.standardization.covar[i2,k2] 2341 CM[5,3] = self.standardization.covar[k2,i2] 2342 except ValueError: 2343 pass 2344 except ValueError: 2345 pass 2346 try: 2347 j2 = self.standardization.var_names.index(f'b2_{pf(session)}') 2348 CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]] 2349 CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2] 2350 try: 2351 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2352 CM[4,5] = self.standardization.covar[j2,k2] 2353 CM[5,4] = self.standardization.covar[k2,j2] 2354 except ValueError: 2355 pass 2356 except ValueError: 2357 pass 2358 try: 2359 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2360 CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]] 2361 CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2] 2362 except ValueError: 2363 pass 2364 2365 self.sessions[session]['CM'] = CM 2366 2367 elif self.standardization_method == 'indep_sessions': 2368 pass # Not implemented yet 2369 2370 2371 @make_verbal 2372 def repeatabilities(self): 2373 ''' 2374 Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x 2375 (for all samples, for anchors, and for unknowns). 2376 ''' 2377 self.msg('Computing reproducibilities for all sessions') 2378 2379 self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors') 2380 self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors') 2381 self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors') 2382 self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns') 2383 self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples') 2384 2385 2386 @make_verbal 2387 def consolidate(self, tables = True, plots = True): 2388 ''' 2389 Collect information about samples, sessions and repeatabilities. 2390 ''' 2391 self.consolidate_samples() 2392 self.consolidate_sessions() 2393 self.repeatabilities() 2394 2395 if tables: 2396 self.summary() 2397 self.table_of_sessions() 2398 self.table_of_analyses() 2399 self.table_of_samples() 2400 2401 if plots: 2402 self.plot_sessions() 2403 2404 2405 @make_verbal 2406 def rmswd(self, 2407 samples = 'all samples', 2408 sessions = 'all sessions', 2409 ): 2410 ''' 2411 Compute the χ2, root mean squared weighted deviation 2412 (i.e. reduced χ2), and corresponding degrees of freedom of the 2413 Δ4x values for samples in `samples` and sessions in `sessions`. 2414 2415 Only used in `D4xdata.standardize()` with `method='indep_sessions'`. 
2416 ''' 2417 if samples == 'all samples': 2418 mysamples = [k for k in self.samples] 2419 elif samples == 'anchors': 2420 mysamples = [k for k in self.anchors] 2421 elif samples == 'unknowns': 2422 mysamples = [k for k in self.unknowns] 2423 else: 2424 mysamples = samples 2425 2426 if sessions == 'all sessions': 2427 sessions = [k for k in self.sessions] 2428 2429 chisq, Nf = 0, 0 2430 for sample in mysamples : 2431 G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2432 if len(G) > 1 : 2433 X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G]) 2434 Nf += (len(G) - 1) 2435 chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G]) 2436 r = (chisq / Nf)**.5 if Nf > 0 else 0 2437 self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.') 2438 return {'rmswd': r, 'chisq': chisq, 'Nf': Nf} 2439 2440 2441 @make_verbal 2442 def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'): 2443 ''' 2444 Compute the repeatability of `[r[key] for r in self]` 2445 ''' 2446 2447 if samples == 'all samples': 2448 mysamples = [k for k in self.samples] 2449 elif samples == 'anchors': 2450 mysamples = [k for k in self.anchors] 2451 elif samples == 'unknowns': 2452 mysamples = [k for k in self.unknowns] 2453 else: 2454 mysamples = samples 2455 2456 if sessions == 'all sessions': 2457 sessions = [k for k in self.sessions] 2458 2459 if key in ['D47', 'D48']: 2460 # Full disclosure: the definition of Nf is tricky/debatable 2461 G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions] 2462 chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum() 2463 Nf = len(G) 2464# print(f'len(G) = {Nf}') 2465 Nf -= len([s for s in mysamples if s in self.unknowns]) 2466# print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider') 2467 for session in sessions: 2468 Np = len([ 2469 _ for _ in self.standardization.params 2470 if ( 2471 self.standardization.params[_].expr is not None 2472 and ( 2473 (_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session)) 2474 or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session)) 2475 ) 2476 ) 2477 ]) 2478# print(f'session {session}: {Np} parameters to consider') 2479 Na = len({ 2480 r['Sample'] for r in self.sessions[session]['data'] 2481 if r['Sample'] in self.anchors and r['Sample'] in mysamples 2482 }) 2483# print(f'session {session}: {Na} different anchors in that session') 2484 Nf -= min(Np, Na) 2485# print(f'Nf = {Nf}') 2486 2487# for sample in mysamples : 2488# X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2489# if len(X) > 1 : 2490# chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ]) 2491# if sample in self.unknowns: 2492# Nf += len(X) - 1 2493# else: 2494# Nf += len(X) 2495# if samples in ['anchors', 'all samples']: 2496# Nf -= sum([self.sessions[s]['Np'] for s in sessions]) 2497 r = (chisq / Nf)**.5 if Nf > 0 else 0 2498 2499 else: # if key not in ['D47', 'D48'] 2500 chisq, Nf = 0, 0 2501 for sample in mysamples : 2502 X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2503 if len(X) > 1 : 2504 Nf += len(X) - 1 2505 chisq += np.sum([ (x-np.mean(X))**2 for x in X ]) 2506 r = (chisq / Nf)**.5 if Nf > 0 else 0 2507 2508 self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.') 2509 return r 2510 2511 def sample_average(self, samples, weights = 'equal', normalize = True): 2512 ''' 2513 Weighted average Δ4x value of a group of samples, 
accounting for covariance.

Returns the weighted average Δ4x value and associated SE
of a group of samples. Weights are equal by default. If `normalize` is
true, `weights` will be rescaled so that their sum equals 1.

**Examples**

```python
self.sample_average(['X','Y'], [1, 2])
```

returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
where Δ4x(X) and Δ4x(Y) are the average Δ4x
values of samples X and Y, respectively.

```python
self.sample_average(['X','Y'], [1, -1], normalize = False)
```

returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
'''
if weights == 'equal':
    weights = [1/len(samples)] * len(samples)

if normalize:
    s = sum(weights)
    if s:
        weights = [w/s for w in weights]

try:
    # full error covariance matrix between the samples' Δ4x values:
    C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
    X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
    return correlated_sum(X, C, weights)
except ValueError:
    return (0., 0.)


def sample_D4x_covar(self, sample1, sample2 = None):
    '''
    Covariance between Δ4x values of samples

    Returns the error covariance between the average Δ4x values of two
    samples. If only `sample1` is specified, or if `sample1 == sample2`,
    returns the Δ4x variance for that sample.
    '''
    if sample2 is None:
        sample2 = sample1
    if self.standardization_method == 'pooled':
        i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
        j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
        return self.standardization.covar[i, j]
    elif self.standardization_method == 'indep_sessions':
        if sample1 == sample2:
            return self.samples[sample1][f'SE_D{self._4x}']**2
        else:
            c = 0
            for session in self.sessions:
                sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
                sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
                if sdata1 and sdata2:
                    a = self.sessions[session]['a']
                    # !! TODO: CM below does not account for temporal changes in standardization parameters
                    CM = self.sessions[session]['CM'][:3,:3]
                    avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
                    avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
                    avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
                    avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
                    c += (
                        self.unknowns[sample1][f'session_D{self._4x}'][session][2]
                        * self.unknowns[sample2][f'session_D{self._4x}'][session][2]
                        * np.array([[avg_D4x_1, avg_d4x_1, 1]])
                        @ CM
                        @ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
                        ) / a**2
            return float(c)

def sample_D4x_correl(self, sample1, sample2 = None):
    '''
    Correlation between Δ4x errors of samples

    Returns the error correlation between the average Δ4x values of two samples.
    '''
    if sample2 is None or sample2 == sample1:
        return 1.
2600 return ( 2601 self.sample_D4x_covar(sample1, sample2) 2602 / self.unknowns[sample1][f'SE_D{self._4x}'] 2603 / self.unknowns[sample2][f'SE_D{self._4x}'] 2604 ) 2605 2606 def plot_single_session(self, 2607 session, 2608 kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4), 2609 kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4), 2610 kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75), 2611 kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75), 2612 kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75), 2613 xylimits = 'free', # | 'constant' 2614 x_label = None, 2615 y_label = None, 2616 error_contour_interval = 'auto', 2617 fig = 'new', 2618 ): 2619 ''' 2620 Generate plot for a single session 2621 ''' 2622 if x_label is None: 2623 x_label = f'δ$_{{{self._4x}}}$ (‰)' 2624 if y_label is None: 2625 y_label = f'Δ$_{{{self._4x}}}$ (‰)' 2626 2627 out = _SessionPlot() 2628 anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]] 2629 unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]] 2630 anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors] 2631 anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors] 2632 unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns] 2633 unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns] 2634 anchor_avg = (np.array([ np.array([ 2635 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2636 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2637 ]) for sample in anchors]).T, 2638 np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T) 2639 unknown_avg = (np.array([ np.array([ 2640 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2641 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2642 ]) for sample in unknowns]).T, 2643 np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T) 2644 2645 2646 if fig == 'new': 2647 out.fig = ppl.figure(figsize = (6,6)) 2648 ppl.subplots_adjust(.1,.1,.9,.9) 2649 2650 out.anchor_analyses, = ppl.plot( 2651 anchors_d, 2652 anchors_D, 2653 **kw_plot_anchors) 2654 out.unknown_analyses, = ppl.plot( 2655 unknowns_d, 2656 unknowns_D, 2657 **kw_plot_unknowns) 2658 out.anchor_avg = ppl.plot( 2659 *anchor_avg, 2660 **kw_plot_anchor_avg) 2661 out.unknown_avg = ppl.plot( 2662 *unknown_avg, 2663 **kw_plot_unknown_avg) 2664 if xylimits == 'constant': 2665 x = [r[f'd{self._4x}'] for r in self] 2666 y = [r[f'D{self._4x}'] for r in self] 2667 x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y) 2668 w, h = x2-x1, y2-y1 2669 x1 -= w/20 2670 x2 += w/20 2671 y1 -= h/20 2672 y2 += h/20 2673 ppl.axis([x1, x2, y1, y2]) 2674 elif xylimits == 'free': 2675 x1, x2, y1, y2 = ppl.axis() 2676 else: 2677 x1, x2, y1, y2 = ppl.axis(xylimits) 2678 2679 if error_contour_interval != 'none': 2680 xi, yi = np.linspace(x1, x2), np.linspace(y1, y2) 2681 XI,YI = np.meshgrid(xi, yi) 2682 SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi]) 2683 if 
error_contour_interval == 'auto': 2684 rng = np.max(SI) - np.min(SI) 2685 if rng <= 0.01: 2686 cinterval = 0.001 2687 elif rng <= 0.03: 2688 cinterval = 0.004 2689 elif rng <= 0.1: 2690 cinterval = 0.01 2691 elif rng <= 0.3: 2692 cinterval = 0.03 2693 elif rng <= 1.: 2694 cinterval = 0.1 2695 else: 2696 cinterval = 0.5 2697 else: 2698 cinterval = error_contour_interval 2699 2700 cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval) 2701 out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error) 2702 out.clabel = ppl.clabel(out.contour) 2703 contour = (XI, YI, SI, cval, cinterval) 2704 2705 if fig == None: 2706 return { 2707 'anchors':anchors, 2708 'unknowns':unknowns, 2709 'anchors_d':anchors_d, 2710 'anchors_D':anchors_D, 2711 'unknowns_d':unknowns_d, 2712 'unknowns_D':unknowns_D, 2713 'anchor_avg':anchor_avg, 2714 'unknown_avg':unknown_avg, 2715 'contour':contour, 2716 } 2717 2718 ppl.xlabel(x_label) 2719 ppl.ylabel(y_label) 2720 ppl.title(session, weight = 'bold') 2721 ppl.grid(alpha = .2) 2722 out.ax = ppl.gca() 2723 2724 return out 2725 2726 def plot_residuals( 2727 self, 2728 kde = False, 2729 hist = False, 2730 binwidth = 2/3, 2731 dir = 'output', 2732 filename = None, 2733 highlight = [], 2734 colors = None, 2735 figsize = None, 2736 dpi = 100, 2737 yspan = None, 2738 ): 2739 ''' 2740 Plot residuals of each analysis as a function of time (actually, as a function of 2741 the order of analyses in the `D4xdata` object) 2742 2743 + `kde`: whether to add a kernel density estimate of residuals 2744 + `hist`: whether to add a histogram of residuals (incompatible with `kde`) 2745 + `histbins`: specify bin edges for the histogram 2746 + `dir`: the directory in which to save the plot 2747 + `highlight`: a list of samples to highlight 2748 + `colors`: a dict of `{<sample>: (r, g, b)}` for all samples 2749 + `figsize`: (width, height) of figure 2750 + `dpi`: resolution for PNG output 2751 + `yspan`: factor controlling the range of y values shown in plot 2752 (by default: `yspan = 1.5 if kde else 1.0`) 2753 ''' 2754 2755 from matplotlib import ticker 2756 2757 if yspan is None: 2758 if kde: 2759 yspan = 1.5 2760 else: 2761 yspan = 1.0 2762 2763 # Layout 2764 fig = ppl.figure(figsize = (8,4) if figsize is None else figsize) 2765 if hist or kde: 2766 ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72) 2767 ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15) 2768 else: 2769 ppl.subplots_adjust(.08,.05,.78,.8) 2770 ax1 = ppl.subplot(111) 2771 2772 # Colors 2773 N = len(self.anchors) 2774 if colors is None: 2775 if len(highlight) > 0: 2776 Nh = len(highlight) 2777 if Nh == 1: 2778 colors = {highlight[0]: (0,0,0)} 2779 elif Nh == 3: 2780 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])} 2781 elif Nh == 4: 2782 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2783 else: 2784 colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)} 2785 else: 2786 if N == 3: 2787 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])} 2788 elif N == 4: 2789 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2790 else: 2791 colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)} 2792 2793 ppl.sca(ax1) 2794 2795 ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75) 2796 2797 ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$')) 2798 2799 session = 
self[0]['Session'] 2800 x1 = 0 2801# ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self]) 2802 x_sessions = {} 2803 one_or_more_singlets = False 2804 one_or_more_multiplets = False 2805 multiplets = set() 2806 for k,r in enumerate(self): 2807 if r['Session'] != session: 2808 x2 = k-1 2809 x_sessions[session] = (x1+x2)/2 2810 ppl.axvline(k - 0.5, color = 'k', lw = .5) 2811 session = r['Session'] 2812 x1 = k 2813 singlet = len(self.samples[r['Sample']]['data']) == 1 2814 if not singlet: 2815 multiplets.add(r['Sample']) 2816 if r['Sample'] in self.unknowns: 2817 if singlet: 2818 one_or_more_singlets = True 2819 else: 2820 one_or_more_multiplets = True 2821 kw = dict( 2822 marker = 'x' if singlet else '+', 2823 ms = 4 if singlet else 5, 2824 ls = 'None', 2825 mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0), 2826 mew = 1, 2827 alpha = 0.2 if singlet else 1, 2828 ) 2829 if highlight and r['Sample'] not in highlight: 2830 kw['alpha'] = 0.2 2831 ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw) 2832 x2 = k 2833 x_sessions[session] = (x1+x2)/2 2834 2835 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1) 2836 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1) 2837 if not (hist or kde): 2838 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center') 2839 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f" 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center') 2840 2841 xmin, xmax, ymin, ymax = ppl.axis() 2842 if yspan != 1: 2843 ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2 2844 for s in x_sessions: 2845 ppl.text( 2846 x_sessions[s], 2847 ymax +1, 2848 s, 2849 va = 'bottom', 2850 **( 2851 dict(ha = 'center') 2852 if len(self.sessions[s]['data']) > (0.15 * len(self)) 2853 else dict(ha = 'left', rotation = 45) 2854 ) 2855 ) 2856 2857 if hist or kde: 2858 ppl.sca(ax2) 2859 2860 for s in colors: 2861 kw['marker'] = '+' 2862 kw['ms'] = 5 2863 kw['mec'] = colors[s] 2864 kw['label'] = s 2865 kw['alpha'] = 1 2866 ppl.plot([], [], **kw) 2867 2868 kw['mec'] = (0,0,0) 2869 2870 if one_or_more_singlets: 2871 kw['marker'] = 'x' 2872 kw['ms'] = 4 2873 kw['alpha'] = .2 2874 kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other' 2875 ppl.plot([], [], **kw) 2876 2877 if one_or_more_multiplets: 2878 kw['marker'] = '+' 2879 kw['ms'] = 4 2880 kw['alpha'] = 1 2881 kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other' 2882 ppl.plot([], [], **kw) 2883 2884 if hist or kde: 2885 leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9) 2886 else: 2887 leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5) 2888 leg.set_zorder(-1000) 2889 2890 ppl.sca(ax1) 2891 2892 ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)') 2893 ppl.xticks([]) 2894 ppl.axis([-1, len(self), None, None]) 2895 2896 if hist or kde: 2897 ppl.sca(ax2) 2898 X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors]) 2899 2900 if kde: 2901 from scipy.stats import 
gaussian_kde 2902 yi = np.linspace(ymin, ymax, 201) 2903 xi = gaussian_kde(X).evaluate(yi) 2904 ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1)) 2905# ppl.plot(xi, yi, 'k-', lw = 1) 2906 elif hist: 2907 ppl.hist( 2908 X, 2909 orientation = 'horizontal', 2910 histtype = 'stepfilled', 2911 ec = [.4]*3, 2912 fc = [.25]*3, 2913 alpha = .25, 2914 bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)), 2915 ) 2916 ppl.text(0, 0, 2917 f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", 2918 size = 7.5, 2919 alpha = 1, 2920 va = 'center', 2921 ha = 'left', 2922 ) 2923 2924 ppl.axis([0, None, ymin, ymax]) 2925 ppl.xticks([]) 2926 ppl.yticks([]) 2927# ax2.spines['left'].set_visible(False) 2928 ax2.spines['right'].set_visible(False) 2929 ax2.spines['top'].set_visible(False) 2930 ax2.spines['bottom'].set_visible(False) 2931 2932 ax1.axis([None, None, ymin, ymax]) 2933 2934 if not os.path.exists(dir): 2935 os.makedirs(dir) 2936 if filename is None: 2937 return fig 2938 elif filename == '': 2939 filename = f'D{self._4x}_residuals.pdf' 2940 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 2941 ppl.close(fig) 2942 2943 2944 def simulate(self, *args, **kwargs): 2945 ''' 2946 Legacy function with warning message pointing to `virtual_data()` 2947 ''' 2948 raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()') 2949 2950 def plot_anchor_residuals( 2951 self, 2952 dir = 'output', 2953 filename = '', 2954 figsize = None, 2955 subplots_adjust = (0.05, 0.1, 0.95, 0.98, .25, .25), 2956 dpi = 100, 2957 colors = None, 2958 ): 2959 ''' 2960 Plot a summary of the residuals for all anchors, intended to help detect systematic bias. 2961 2962 **Parameters** 2963 2964 + `dir`: the directory in which to save the plot 2965 + `filename`: the file name to save to. 
2966 + `dpi`: resolution for PNG output 2967 + `figsize`: (width, height) of figure 2968 + `subplots_adjust`: passed to the figure 2969 + `dpi`: resolution for PNG output 2970 + `colors`: a dict of `{<sample>: (r, g, b)}` for all samples 2971 ''' 2972 2973 # Colors 2974 N = len(self.anchors) 2975 if colors is None: 2976 if N == 3: 2977 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])} 2978 elif N == 4: 2979 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2980 else: 2981 colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)} 2982 2983 if figsize is None: 2984 figsize = (4, 1.5*N+1) 2985 fig = ppl.figure(figsize = figsize) 2986 ppl.subplots_adjust(*subplots_adjust) 2987 axs = {} 2988 X = np.array([r[f'D{self._4x}_residual'] for a in self.anchors for r in self.anchors[a]['data']])*1000 2989 sigma = self.repeatability['r_D47a'] * 1000 2990 D = max(np.abs(X)) 2991 2992 for k,a in enumerate(self.anchors): 2993 color = colors[a] 2994 axs[a] = ppl.subplot(N, 1, 1+k) 2995 axs[a].text( 2996 0.02, 1-0.05, a, 2997 va = 'top', 2998 ha = 'left', 2999 weight = 'bold', 3000 size = 9, 3001 color = [_*0.75 for _ in color], 3002 transform = axs[a].transAxes, 3003 ) 3004 X = np.array([r[f'D{self._4x}_residual'] for r in self.anchors[a]['data']])*1000 3005 axs[a].axvline(0, lw = 0.5, color = color) 3006 axs[a].plot(X, X*0, 'o', mew = 0.7, mec = (*color,.5), mfc = (*color, 0), ms = 7, clip_on = False) 3007 3008 xi = np.linspace(-3*D, 3*D, 601) 3009 yi = np.array([np.exp(-0.5 * ((xi - x)/sigma)**2) for x in X]).sum(0) 3010 ppl.fill_between(xi, yi, yi*0, fc = (*color, .15), lw = 1, ec = color) 3011 3012 axs[a].errorbar( 3013 X.mean(), yi.max()*.2, None, 1.96*sigma/len(X)**0.5, 3014 ecolor = color, 3015 marker = 's', 3016 ls = 'None', 3017 mec = color, 3018 mew = 1, 3019 mfc = 'w', 3020 ms = 8, 3021 elinewidth = 1, 3022 capsize = 4, 3023 capthick = 1, 3024 ) 3025 3026 axs[a].axis([xi[0], xi[-1], 0, yi.max()*1.05]) 3027 ppl.yticks([]) 3028 3029 ppl.xlabel(f'$Δ_{{{self._4x}}}$ residuals (ppm)') 3030 3031 if not os.path.exists(dir): 3032 os.makedirs(dir) 3033 if filename is None: 3034 return fig 3035 elif filename == '': 3036 filename = f'D{self._4x}_anchor_residuals.pdf' 3037 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 3038 ppl.close(fig) 3039 3040 3041 def plot_distribution_of_analyses( 3042 self, 3043 dir = 'output', 3044 filename = None, 3045 vs_time = False, 3046 figsize = (6,4), 3047 subplots_adjust = (0.02, 0.13, 0.85, 0.8), 3048 output = None, 3049 dpi = 100, 3050 ): 3051 ''' 3052 Plot temporal distribution of all analyses in the data set. 3053 3054 **Parameters** 3055 3056 + `dir`: the directory in which to save the plot 3057 + `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially. 
+ `dpi`: resolution for PNG output
+ `figsize`: (width, height) of figure
'''

asamples = [s for s in self.anchors]
usamples = [s for s in self.unknowns]
if output is None or output == 'fig':
    fig = ppl.figure(figsize = figsize)
    ppl.subplots_adjust(*subplots_adjust)
Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
Xmax += (Xmax-Xmin)/40
Xmin -= (Xmax-Xmin)/41
for k, s in enumerate(asamples + usamples):
    if vs_time:
        X = [r['TimeTag'] for r in self if r['Sample'] == s]
    else:
        X = [x for x,r in enumerate(self) if r['Sample'] == s]
    Y = [-k for x in X]
    ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
    ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
    ppl.text(Xmax, -k, f'  {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
ppl.axis([Xmin, Xmax, -k-1, 1])
ppl.xlabel('\ntime')
ppl.gca().annotate('',
    xy = (0.6, -0.02),
    xycoords = 'axes fraction',
    xytext = (.4, -0.02),
    arrowprops = dict(arrowstyle = "->", color = 'k'),
    )

x2 = -1
for session in self.sessions:
    x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
    if vs_time:
        ppl.axvline(x1, color = 'k', lw = .75)
    if x2 > -1:
        if not vs_time:
            ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
    x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
    if vs_time:
        ppl.axvline(x2, color = 'k', lw = .75)
        ppl.axvspan(x1, x2, color = 'k', zorder = -100, alpha = .15)
    ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)

ppl.xticks([])
ppl.yticks([])

if output is None:
    if not os.path.exists(dir):
        os.makedirs(dir)
    if filename is None:
        filename = f'D{self._4x}_distribution_of_analyses.pdf'
    ppl.savefig(f'{dir}/{filename}', dpi = dpi)
    ppl.close(fig)
elif output == 'ax':
    return ppl.gca()
elif output == 'fig':
    return fig


def plot_bulk_compositions(
    self,
    samples = None,
    dir = 'output/bulk_compositions',
    figsize = (6,6),
    subplots_adjust = (0.15, 0.12, 0.95, 0.92),
    show = False,
    sample_color = (0,.5,1),
    analysis_color = (.7,.7,.7),
    labeldist = 0.3,
    radius = 0.05,
    ):
    '''
    Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

    By default, creates a directory `./output/bulk_compositions` where plots for
    each sample are saved. Another plot named `__all__.pdf` shows all analyses together.

    **Parameters**

    + `samples`: Only these samples are processed (by default: all samples).
    + `dir`: where to save the plots
    + `figsize`: (width, height) of figure
    + `subplots_adjust`: passed to `subplots_adjust()`
    + `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
    allowing for interactive visualization/exploration in (δ13C, δ18O) space.
    + `sample_color`: color used for sample markers/labels
    + `analysis_color`: color used for replicate (analysis) markers/labels
    + `labeldist`: distance (in inches) from replicate markers to replicate labels
    + `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
    '''

    from matplotlib.patches import Ellipse

    if samples is None:
        samples = [_ for _ in self.samples]

    saved = {}

    for s in samples:

        fig = ppl.figure(figsize = figsize)
        fig.subplots_adjust(*subplots_adjust)
        ax = ppl.subplot(111)
        ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
        ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
        ppl.title(s)

        XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
        UID = [_['UID'] for _ in self.samples[s]['data']]
        XY0 = XY.mean(0)

        for xy in XY:
            ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)

        ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
        ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
        ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
        saved[s] = [XY, XY0]

        x1, x2, y1, y2 = ppl.axis()
        x0, dx = (x1+x2)/2, (x2-x1)/2
        y0, dy = (y1+y2)/2, (y2-y1)/2
        dx, dy = [max(max(dx, dy), radius)]*2

        ppl.axis([
            x0 - 1.2*dx,
            x0 + 1.2*dx,
            y0 - 1.2*dy,
            y0 + 1.2*dy,
            ])

        XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))

        for xy, uid in zip(XY, UID):

            xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
            vector_in_display_space = xy_in_display_space - XY0_in_display_space

            if (vector_in_display_space**2).sum() > 0:

                unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
                label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
                label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
                label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))

                ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)

            else:

                ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)

        if radius:
            ax.add_artist(Ellipse(
                xy = XY0,
                width = radius*2,
                height = radius*2,
                ls = (0, (2,2)),
                lw = .7,
                ec = analysis_color,
                fc = 'None',
                ))
            ppl.text(
                XY0[0],
                XY0[1]-radius,
                f'\n± {radius*1e3:.0f} ppm',
                color = analysis_color,
                va = 'top',
                ha = 'center',
                linespacing = 0.4,
                size = 8,
                )

        if not os.path.exists(dir):
            os.makedirs(dir)
        fig.savefig(f'{dir}/{s}.pdf')
        ppl.close(fig)

    fig = ppl.figure(figsize = figsize)
    fig.subplots_adjust(*subplots_adjust)
    ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
    ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')

    for s in saved:
        for xy in saved[s][0]:
            ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
        ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
        ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
        ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
    x1, x2, y1, y2 = ppl.axis()
    ppl.axis([
        x1 - (x2-x1)/10,
        x2 + (x2-x1)/10,
        y1 - (y2-y1)/10,
        y2 + (y2-y1)/10,
        ])

    if not os.path.exists(dir):
        os.makedirs(dir)
    fig.savefig(f'{dir}/__all__.pdf')
    if show:
        ppl.show()
    ppl.close(fig)


def _save_D4x_correl(
    self,
    samples = None,
    dir = 'output',
    filename = None,
    D4x_precision = 4,
    correl_precision = 4,
    save_to_file = True,
    ):
    '''
    Save D4x values along with their SE and correlation matrix.

    **Parameters**

    + `samples`: Only these samples are output (by default: all samples).
    + `dir`: the directory in which to save the file (by default: `output`)
    + `filename`: the name of the csv file to write to (by default: `D4x_correl.csv`)
    + `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4)
    + `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
    + `save_to_file`: whether to write the output to a file (by default: `True`).
    If `False`, returns the output as a string.
    '''
    if samples is None:
        samples = sorted([s for s in self.unknowns])

    out = [['Sample']] + [[s] for s in samples]
    out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl']
    for k,s in enumerate(samples):
        out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.{D4x_precision}f}', f'{self.samples[s][f"SE_D{self._4x}"]:.{D4x_precision}f}']
        for s2 in samples:
            out[k+1] += [f'{self.sample_D4x_correl(s,s2):.{correl_precision}f}']

    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_correl.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    else:
        return make_csv(out)
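Taken together, the methods above chain into a short post-processing workflow. The sketch below uses the tutorial's `rawdata.csv` and sample names, and only calls documented above; it is illustrative rather than prescriptive:

```python
import D47crunch

mydata = D47crunch.D47data()
mydata.read('rawdata.csv')
mydata.wg()
mydata.crunch()

# Standardize each session independently, using only anchor analyses:
mydata.standardize(method = 'indep_sessions')

# Error covariance and correlation between the D47 values of two unknowns:
cov = mydata.sample_D4x_covar('MYSAMPLE-1', 'MYSAMPLE-2')
rho = mydata.sample_D4x_correl('MYSAMPLE-1', 'MYSAMPLE-2')

# Weighted average of the two unknowns, accounting for that covariance:
avg, se_avg = mydata.sample_average(['MYSAMPLE-1', 'MYSAMPLE-2'])

# Difference between the two unknowns, with fully propagated SE:
diff, se_diff = mydata.sample_average(['MYSAMPLE-1', 'MYSAMPLE-2'], [1, -1], normalize = False)

# Correlation matrix of all unknowns, printed but not saved to disk:
mydata.covar_table(correl = True, save_to_file = False)
```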
Store and process data for a large set of Δ47 and/or Δ48 analyses, usually comprising more than one analytical session.
```py
def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
	'''
	**Parameters**

	+ `l`: a list of dictionaries, with each dictionary including at least the keys
	`Sample`, `d45`, `d46`, and `d47` or `d48`.
	+ `mass`: `'47'` or `'48'`
	+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
	+ `session`: define session name for analyses without a `Session` key
	+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.

	Returns a `D4xdata` object derived from `list`.
	'''
	self._4x = mass
	self.verbose = verbose
	self.prefix = 'D4xdata'
	self.logfile = logfile
	list.__init__(self, l)
	self.Nf = None
	self.repeatability = {}
	self.refresh(session = session)
```

**Parameters**

+ `l`: a list of dictionaries, with each dictionary including at least the keys `Sample`, `d45`, `d46`, and `d47` or `d48`.
+ `mass`: `'47'` or `'48'`
+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
+ `session`: define session name for analyses without a `Session` key
+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.

Returns a `D4xdata` object derived from `list`.
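This means analyses can also be fed to the constructor directly, bypassing `read()`. A minimal sketch (`D47data` derives from `D4xdata` and accepts the same constructor arguments; the delta values below are illustrative only):

```py
import D47crunch

analyses = [
    {'UID': 'X01', 'Session': 'Session01', 'Sample': 'ETH-1', 'd45': 5.795, 'd46': 11.628, 'd47': 16.894},
    {'UID': 'X02', 'Session': 'Session01', 'Sample': 'ETH-2', 'd45': -6.059, 'd46': -4.817, 'd47': -11.635},
]
# construct directly from a list of dicts, logging verbosely to a file:
mydata = D47crunch.D47data(analyses, logfile = 'D47.log', verbose = True)
```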
`R18_VSMOW`: Absolute (18O/16O) ratio of VSMOW. By default equal to 0.0020052 (Baertschi, 1976).

`LAMBDA_17`: Mass-dependent exponent for triple oxygen isotopes. By default equal to 0.528 (Barkan & Luz, 2005).

`R17_VSMOW`: Absolute (17O/16O) ratio of VSMOW. By default equal to 0.00038475 (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB).

`R18_VPDB`: Absolute (18O/16O) ratio of VPDB. By definition equal to `R18_VSMOW * 1.03092`.

`R17_VPDB`: Absolute (17O/16O) ratio of VPDB. By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
After the Δ4x standardization step, each sample is tested to assess whether the Δ4x variance within all analyses for that sample differs significantly from that observed for a given reference sample (using Levene's test, which yields a p-value corresponding to the null hypothesis that the underlying variances are equal). `LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which sample should be used as a reference for this test.
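For instance (a one-line sketch; any sample with enough replicates in the dataset could serve as the reference):

```py
mydata = D47crunch.D47data()
mydata.LEVENE_REF_SAMPLE = 'ETH-1'  # instead of the default 'ETH-3'
```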
`ALPHA_18O_ACID_REACTION`: Specifies the 18O/16O fractionation factor generally applicable to acid reactions in the dataset. Currently used by `D4xdata.wg()`, `D4xdata.standardize_d13C()`, and `D4xdata.standardize_d18O()`. By default equal to 1.008129 (calcite reacted at 90 °C, Kim et al., 2007).
`Nominal_d13C_VPDB`: Nominal δ13C_VPDB values assigned to carbonate standards, used by `D4xdata.standardize_d13C()`. By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after Bernasconi et al. (2018).
`Nominal_d18O_VPDB`: Nominal δ18O_VPDB values assigned to carbonate standards, used by `D4xdata.standardize_d18O()`. By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after Bernasconi et al. (2018).
`d13C_STANDARDIZATION_METHOD`: Method by which to standardize δ13C values:

+ `'none'`: do not apply any δ13C standardization.
+ `'1pt'`: within each session, offset all initial δ13C values so as to minimize the difference between final δ13C_VPDB values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
+ `'2pt'`: within each session, apply an affine transformation to all δ13C values so as to minimize the difference between final δ13C_VPDB values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
`d18O_STANDARDIZATION_METHOD`: Method by which to standardize δ18O values (see the sketch after this list):

+ `'none'`: do not apply any δ18O standardization.
+ `'1pt'`: within each session, offset all initial δ18O values so as to minimize the difference between final δ18O_VPDB values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
+ `'2pt'`: within each session, apply an affine transformation to all δ18O values so as to minimize the difference between final δ18O_VPDB values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
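As a sketch, both attributes may be overridden on an instance before reading the data, e.g. to apply only a per-session offset to δ13C and skip δ18O standardization entirely (the settings shown are illustrative, not recommendations):

```py
mydata = D47crunch.D47data()
mydata.d13C_STANDARDIZATION_METHOD = '1pt'   # offset only
mydata.d18O_STANDARDIZATION_METHOD = 'none'  # no δ18O standardization
mydata.read('rawdata.csv')
mydata.wg()
mydata.crunch()
```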
```py
def make_verbal(oldfun):
	'''
	Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
	'''
	@wraps(oldfun)
	def newfun(*args, verbose = '', **kwargs):
		myself = args[0]
		oldprefix = myself.prefix
		myself.prefix = oldfun.__name__
		if verbose != '':
			oldverbose = myself.verbose
			myself.verbose = verbose
		out = oldfun(*args, **kwargs)
		myself.prefix = oldprefix
		if verbose != '':
			myself.verbose = oldverbose
		return out
	return newfun
```

Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
```py
def msg(self, txt):
	'''
	Log a message to `self.logfile`, and print it out if `verbose = True`
	'''
	self.log(txt)
	if self.verbose:
		print(f'{f"[{self.prefix}]":<16} {txt}')
```

Log a message to `self.logfile`, and print it out if `verbose = True`.
```py
def vmsg(self, txt):
	'''
	Log a message to `self.logfile` and print it out
	'''
	self.log(txt)
	print(txt)
```

Log a message to `self.logfile` and print it out.
```py
def log(self, *txts):
	'''
	Log a message to `self.logfile`
	'''
	if self.logfile:
		with open(self.logfile, 'a') as fid:
			for txt in txts:
				fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
```

Log a message to `self.logfile`.
```py
def refresh(self, session = 'mySession'):
	'''
	Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
	'''
	self.fill_in_missing_info(session = session)
	self.refresh_sessions()
	self.refresh_samples()
```

Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
```py
def refresh_sessions(self):
	'''
	Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
	to `False` for all sessions.
	'''
	self.sessions = {
		s: {'data': [r for r in self if r['Session'] == s]}
		for s in sorted({r['Session'] for r in self})
		}
	for s in self.sessions:
		self.sessions[s]['scrambling_drift'] = False
		self.sessions[s]['slope_drift'] = False
		self.sessions[s]['wg_drift'] = False
		self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
		self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
```

Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift` to `False` for all sessions.
```py
def refresh_samples(self):
	'''
	Define `self.samples`, `self.anchors`, and `self.unknowns`.
	'''
	self.samples = {
		s: {'data': [r for r in self if r['Sample'] == s]}
		for s in sorted({r['Sample'] for r in self})
		}
	self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
	self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
```

Define `self.samples`, `self.anchors`, and `self.unknowns`.
```py
def read(self, filename, sep = '', session = ''):
	'''
	Read file in csv format to load data into a `D47data` object.

	In the csv file, spaces before and after field separators (`','` by default)
	are optional. Each line corresponds to a single analysis.

	The required fields are:

	+ `UID`: a unique identifier
	+ `Session`: an identifier for the analytical session
	+ `Sample`: a sample identifier
	+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

	Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
	VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
	and `d49` are optional, and set to NaN by default.

	**Parameters**

	+ `filename`: the path of the file to read
	+ `sep`: csv separator delimiting the fields
	+ `session`: set `Session` field to this string for all analyses
	'''
	with open(filename) as fid:
		self.input(fid.read(), sep = sep, session = session)
```

Read file in csv format to load data into a `D47data` object.

In the csv file, spaces before and after field separators (`','` by default) are optional. Each line corresponds to a single analysis.

The required fields are:

+ `UID`: a unique identifier
+ `Session`: an identifier for the analytical session
+ `Sample`: a sample identifier
+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` and `d49` are optional, and set to NaN by default.

**Parameters**

+ `filename`: the path of the file to read
+ `sep`: csv separator delimiting the fields
+ `session`: set `Session` field to this string for all analyses
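For example, to read a semicolon-separated file and assign all of its analyses to a single session (the file and session names are illustrative):

```py
mydata = D47crunch.D47data()
mydata.read('otherdata.csv', sep = ';', session = 'Session01')
```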
```py
def input(self, txt, sep = '', session = ''):
	'''
	Read `txt` string in csv format to load analysis data into a `D47data` object.

	In the csv string, spaces before and after field separators (`','` by default)
	are optional. Each line corresponds to a single analysis.

	The required fields are:

	+ `UID`: a unique identifier
	+ `Session`: an identifier for the analytical session
	+ `Sample`: a sample identifier
	+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

	Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
	VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
	and `d49` are optional, and set to NaN by default.

	**Parameters**

	+ `txt`: the csv string to read
	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
	whichever appears most often in `txt`.
	+ `session`: set `Session` field to this string for all analyses
	'''
	if sep == '':
		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
	data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]

	if session != '':
		for r in data:
			r['Session'] = session

	self += data
	self.refresh()
```

Read `txt` string in csv format to load analysis data into a `D47data` object.

In the csv string, spaces before and after field separators (`','` by default) are optional. Each line corresponds to a single analysis.

The required fields are:

+ `UID`: a unique identifier
+ `Session`: an identifier for the analytical session
+ `Sample`: a sample identifier
+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` and `d49` are optional, and set to NaN by default.

**Parameters**

+ `txt`: the csv string to read
+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, whichever appears most often in `txt`.
+ `session`: set `Session` field to this string for all analyses
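This makes it convenient to load analyses from an in-memory string, e.g. in tests or small scripts. A sketch (delta values are illustrative only):

```py
mydata = D47crunch.D47data()
mydata.input('''UID, Sample, d45, d46, d47
X01, ETH-1, 5.795, 11.628, 16.894
X02, ETH-2, -6.059, -4.817, -11.635''', session = 'Session01')
```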
```py
@make_verbal
def wg(self,
	samples = None,
	session_groups = None,
	):
	'''
	Compute bulk composition of the working gas for each session based (by default)
	on the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
	`self.Nominal_d18O_VPDB`.

	**Parameters**

	+ `samples`: A list of samples specifying the subset of samples (defined in both
	`self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`) which will be considered
	when computing the working gas. By default, use all samples defined both in
	`self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.
	+ `session_groups`: a list of lists of sessions
	(e.g., `[['session1', 'session2'], ['session3', 'session4', 'session5']]`)
	specifying which session groups, if any, have the exact same WG composition.
	If set to `'all'`, force all sessions to have the same WG composition (use with
	caution and on short time scales, since the WG may drift slowly over long time scales).
	'''

	self.msg('Computing WG composition:')

	a18_acid = self.ALPHA_18O_ACID_REACTION

	if samples is None:
		samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
	if session_groups is None:
		session_groups = [[s] for s in self.sessions]
	elif session_groups == 'all':
		session_groups = [[s for s in self.sessions]]

	samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
	R45R46_standards = {}
	for sample in samples:
		d13C_vpdb = self.Nominal_d13C_VPDB[sample]
		d18O_vpdb = self.Nominal_d18O_VPDB[sample]
		R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
		R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
		R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid

		C12_s = 1 / (1 + R13_s)
		C13_s = R13_s / (1 + R13_s)
		C16_s = 1 / (1 + R17_s + R18_s)
		C17_s = R17_s / (1 + R17_s + R18_s)
		C18_s = R18_s / (1 + R17_s + R18_s)

		C626_s = C12_s * C16_s ** 2
		C627_s = 2 * C12_s * C16_s * C17_s
		C628_s = 2 * C12_s * C16_s * C18_s
		C636_s = C13_s * C16_s ** 2
		C637_s = 2 * C13_s * C16_s * C17_s
		C727_s = C12_s * C17_s ** 2

		R45_s = (C627_s + C636_s) / C626_s
		R46_s = (C628_s + C637_s + C727_s) / C626_s
		R45R46_standards[sample] = (R45_s, R46_s)

	for sg in session_groups:
		db = [r for s in sg for r in self.sessions[s]['data'] if r['Sample'] in samples]
		assert db, f'No sample from {samples} found in session group {sg}.'

		X = [r['d45'] for r in db]
		Y = [R45R46_standards[r['Sample']][0] for r in db]
		x1, x2 = np.min(X), np.max(X)

		if x1 < x2:
			wgcoord = x1/(x1-x2)
		else:
			wgcoord = 999

		if wgcoord < -.5 or wgcoord > 1.5:
			# unreasonable to extrapolate to d45 = 0
			R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
		else:
			# d45 = 0 is reasonably well bracketed
			R45_wg = np.polyfit(X, Y, 1)[1]

		X = [r['d46'] for r in db]
		Y = [R45R46_standards[r['Sample']][1] for r in db]
		x1, x2 = np.min(X), np.max(X)

		if x1 < x2:
			wgcoord = x1/(x1-x2)
		else:
			wgcoord = 999

		if wgcoord < -.5 or wgcoord > 1.5:
			# unreasonable to extrapolate to d46 = 0
			R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
		else:
			# d46 = 0 is reasonably well bracketed
			R46_wg = np.polyfit(X, Y, 1)[1]

		d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)

		for s in sg:
			self.msg(f'Session {s} WG: δ13C_VPDB = {d13Cwg_VPDB:.3f}  δ18O_VSMOW = {d18Owg_VSMOW:.3f}')

			self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
			self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
			for r in self.sessions[s]['data']:
				r['d13Cwg_VPDB'] = d13Cwg_VPDB
				r['d18Owg_VSMOW'] = d18Owg_VSMOW
```

Compute bulk composition of the working gas for each session based (by default) on the carbonate standards defined in both `self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.

**Parameters**

+ `samples`: A list of samples specifying the subset of samples (defined in both `self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`) which will be considered when computing the working gas. By default, use all samples defined both in `self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.
+ `session_groups`: a list of lists of sessions (e.g., `[['session1', 'session2'], ['session3', 'session4', 'session5']]`) specifying which session groups, if any, have the exact same WG composition. If set to `'all'`, force all sessions to have the same WG composition (use with caution and on short time scales, since the WG may drift slowly over long time scales). See the sketch after this list.
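For instance, to force two sessions measured against the same gas bottle to share one WG composition (the session names are illustrative):

```py
mydata.wg(session_groups = [['Session01', 'Session02']])
```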
```py
def compute_bulk_delta(self, R45, R46, D17O = 0):
	'''
	Compute δ13C_VPDB and δ18O_VSMOW,
	by solving the generalized form of equation (17) from
	[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
	assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
	solving the corresponding second-order Taylor polynomial.
	(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
	'''

	K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17

	A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
	B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
	C = 2 * self.R18_VSMOW
	D = -R46

	aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
	bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
	cc = A + B + C + D

	d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)

	R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
	R17 = K * R18 ** self.LAMBDA_17
	R13 = R45 - 2 * R17

	d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)

	return d13C_VPDB, d18O_VSMOW
```

Compute δ13C_VPDB and δ18O_VSMOW, by solving the generalized form of equation (17) from [Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05), assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and solving the corresponding second-order Taylor polynomial (Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014)).
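A sketch of a direct call, with made-up but physically plausible isobar ratios:

```py
R45, R46 = 0.0120, 0.00418  # illustrative values only
d13C_VPDB, d18O_VSMOW = mydata.compute_bulk_delta(R45, R46, D17O = 0)
```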
```py
@make_verbal
def crunch(self, verbose = ''):
	'''
	Compute bulk composition and raw clumped isotope anomalies for all analyses.
	'''
	for r in self:
		self.compute_bulk_and_clumping_deltas(r)
	self.standardize_d13C()
	self.standardize_d18O()
	self.msg(f"Crunched {len(self)} analyses.")
```

Compute bulk composition and raw clumped isotope anomalies for all analyses.
```py
def fill_in_missing_info(self, session = 'mySession'):
	'''
	Fill in optional fields with default values
	'''
	for i,r in enumerate(self):
		if 'D17O' not in r:
			r['D17O'] = 0.
		if 'UID' not in r:
			r['UID'] = f'{i+1}'
		if 'Session' not in r:
			r['Session'] = session
		for k in ['d47', 'd48', 'd49']:
			if k not in r:
				r[k] = np.nan
```

Fill in optional fields with default values.
```py
def standardize_d13C(self):
	'''
	Perform δ13C standardization within each session `s` according to
	`self.sessions[s]['d13C_standardization_method']`, which is defined by default
	by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
	may be redefined arbitrarily at a later stage.
	'''
	for s in self.sessions:
		if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
			XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
			X,Y = zip(*XY)
			if self.sessions[s]['d13C_standardization_method'] == '1pt':
				offset = np.mean(Y) - np.mean(X)
				for r in self.sessions[s]['data']:
					r['d13C_VPDB'] += offset
			elif self.sessions[s]['d13C_standardization_method'] == '2pt':
				a,b = np.polyfit(X,Y,1)
				for r in self.sessions[s]['data']:
					r['d13C_VPDB'] = a * r['d13C_VPDB'] + b
```

Perform δ13C standardization within each session `s` according to `self.sessions[s]['d13C_standardization_method']`, which is defined by default by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
```py
def standardize_d18O(self):
	'''
	Perform δ18O standardization within each session `s` according to
	`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
	which is defined by default by `D47data.refresh_sessions()` as equal to
	`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
	'''
	for s in self.sessions:
		if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
			XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
			X,Y = zip(*XY)
			Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
			if self.sessions[s]['d18O_standardization_method'] == '1pt':
				offset = np.mean(Y) - np.mean(X)
				for r in self.sessions[s]['data']:
					r['d18O_VSMOW'] += offset
			elif self.sessions[s]['d18O_standardization_method'] == '2pt':
				a,b = np.polyfit(X,Y,1)
				for r in self.sessions[s]['data']:
					r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
```

Perform δ18O standardization within each session `s` according to `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`, which is defined by default by `D47data.refresh_sessions()` as equal to `self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
```py
def compute_bulk_and_clumping_deltas(self, r):
	'''
	Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
	'''

	# Compute working gas R13, R18, and isobar ratios
	R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
	R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
	R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)

	# Compute analyte isobar ratios
	R45 = (1 + r['d45'] / 1000) * R45_wg
	R46 = (1 + r['d46'] / 1000) * R46_wg
	R47 = (1 + r['d47'] / 1000) * R47_wg
	R48 = (1 + r['d48'] / 1000) * R48_wg
	R49 = (1 + r['d49'] / 1000) * R49_wg

	r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
	R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
	R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW

	# Compute stochastic isobar ratios of the analyte
	R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
		R13, R18, D17O = r['D17O']
		)

	# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
	# and raise a warning if the corresponding anomalies exceed 0.02 ppm.
	if (R45 / R45stoch - 1) > 5e-8:
		self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
	if (R46 / R46stoch - 1) > 5e-8:
		self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')

	# Compute raw clumped isotope anomalies
	r['D47raw'] = 1000 * (R47 / R47stoch - 1)
	r['D48raw'] = 1000 * (R48 / R48stoch - 1)
	r['D49raw'] = 1000 * (R49 / R49stoch - 1)
```

Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
```py
def compute_isobar_ratios(self, R13, R18, D17O = 0, D47 = 0, D48 = 0, D49 = 0):
	'''
	Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
	optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
	anomalies (`D47`, `D48`, `D49`), all expressed in permil.
	'''

	# Compute R17
	R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17

	# Compute isotope concentrations
	C12 = (1 + R13) ** -1
	C13 = C12 * R13
	C16 = (1 + R17 + R18) ** -1
	C17 = C16 * R17
	C18 = C16 * R18

	# Compute stochastic isotopologue concentrations
	C626 = C16 * C12 * C16
	C627 = C16 * C12 * C17 * 2
	C628 = C16 * C12 * C18 * 2
	C636 = C16 * C13 * C16
	C637 = C16 * C13 * C17 * 2
	C638 = C16 * C13 * C18 * 2
	C727 = C17 * C12 * C17
	C728 = C17 * C12 * C18 * 2
	C737 = C17 * C13 * C17
	C738 = C17 * C13 * C18 * 2
	C828 = C18 * C12 * C18
	C838 = C18 * C13 * C18

	# Compute stochastic isobar ratios
	R45 = (C636 + C627) / C626
	R46 = (C628 + C637 + C727) / C626
	R47 = (C638 + C728 + C737) / C626
	R48 = (C738 + C828) / C626
	R49 = C838 / C626

	# Account for stochastic anomalies
	R47 *= 1 + D47 / 1000
	R48 *= 1 + D48 / 1000
	R49 *= 1 + D49 / 1000

	# Return isobar ratios
	return R45, R46, R47, R48, R49
```

Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`, optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope anomalies (`D47`, `D48`, `D49`), all expressed in permil.
```py
def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
	'''
	Split unknown samples by UID (treat all analyses as different samples)
	or by session (treat analyses of a given sample in different sessions as
	different samples).

	**Parameters**

	+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
	+ `grouping`: `by_uid` | `by_session`
	'''
	if samples_to_split == 'all':
		samples_to_split = [s for s in self.unknowns]
	gkeys = {'by_uid':'UID', 'by_session':'Session'}
	self.grouping = grouping.lower()
	if self.grouping in gkeys:
		gkey = gkeys[self.grouping]
	for r in self:
		if r['Sample'] in samples_to_split:
			r['Sample_original'] = r['Sample']
			r['Sample'] = f"{r['Sample']}__{r[gkey]}"
		elif r['Sample'] in self.unknowns:
			r['Sample_original'] = r['Sample']
	self.refresh_samples()
```

Split unknown samples by UID (treat all analyses as different samples) or by session (treat analyses of a given sample in different sessions as different samples).

**Parameters**

+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']` (see the sketch after this list)
+ `grouping`: `by_uid` | `by_session`
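A sketch of the typical workflow, using an unknown from the tutorial data to check whether it is homogeneous across sessions (see also `unsplit_samples()` below):

```py
mydata.split_samples(samples_to_split = ['MYSAMPLE-1'], grouping = 'by_session')
mydata.standardize()      # each session's MYSAMPLE-1 is now treated as a separate sample
mydata.unsplit_samples()  # merge the splits back (pooled standardization only)
```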
```py
def unsplit_samples(self, tables = False):
	'''
	Reverse the effects of `D47data.split_samples()`.

	This should only be used after `D4xdata.standardize()` with `method='pooled'`.

	After `D4xdata.standardize()` with `method='indep_sessions'`, one should
	probably use `D4xdata.combine_samples()` instead to reverse the effects of
	`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
	effects of `D47data.split_samples()` with `grouping='by_session'` (because in
	that case session-averaged Δ4x values are statistically independent).
	'''
	unknowns_old = sorted({s for s in self.unknowns})
	CM_old = self.standardization.covar[:,:]
	VD_old = self.standardization.params.valuesdict().copy()
	vars_old = self.standardization.var_names

	unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})

	Ns = len(vars_old) - len(unknowns_old)
	vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
	VD_new = {k: VD_old[k] for k in vars_old[:Ns]}

	W = np.zeros((len(vars_new), len(vars_old)))
	W[:Ns,:Ns] = np.eye(Ns)
	for u in unknowns_new:
		splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
		if self.grouping == 'by_session':
			weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
		elif self.grouping == 'by_uid':
			weights = [1 for s in splits]
		sw = sum(weights)
		weights = [w/sw for w in weights]
		W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]

	CM_new = W @ CM_old @ W.T
	V = W @ np.array([[VD_old[k]] for k in vars_old])
	VD_new = {k:v[0] for k,v in zip(vars_new, V)}

	self.standardization.covar = CM_new
	self.standardization.params.valuesdict = lambda : VD_new
	self.standardization.var_names = vars_new

	for r in self:
		if r['Sample'] in self.unknowns:
			r['Sample_split'] = r['Sample']
			r['Sample'] = r['Sample_original']

	self.refresh_samples()
	self.consolidate_samples()
	self.repeatabilities()

	if tables:
		self.table_of_analyses()
		self.table_of_samples()
```

Reverse the effects of `D47data.split_samples()`.

This should only be used after `D4xdata.standardize()` with `method='pooled'`.

After `D4xdata.standardize()` with `method='indep_sessions'`, one should probably use `D4xdata.combine_samples()` instead to reverse the effects of `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the effects of `D47data.split_samples()` with `grouping='by_session'` (because in that case session-averaged Δ4x values are statistically independent).
```py
def assign_timestamps(self):
	'''
	Assign a time field `t` of type `float` to each analysis.

	If `TimeTag` is one of the data fields, `t` is equal within a given session
	to `TimeTag` minus the mean value of `TimeTag` for that session.
	Otherwise, `TimeTag` is by default equal to the index of each analysis
	in the dataset and `t` is defined as above.
	'''
	for session in self.sessions:
		sdata = self.sessions[session]['data']
		try:
			t0 = np.mean([r['TimeTag'] for r in sdata])
			for r in sdata:
				r['t'] = r['TimeTag'] - t0
		except KeyError:
			t0 = (len(sdata)-1)/2
			for t,r in enumerate(sdata):
				r['t'] = t - t0
```

Assign a time field `t` of type `float` to each analysis.

If `TimeTag` is one of the data fields, `t` is equal within a given session to `TimeTag` minus the mean value of `TimeTag` for that session. Otherwise, `TimeTag` is by default equal to the index of each analysis in the dataset and `t` is defined as above.
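Because `t` feeds the drift terms of the standardization model, analyses lacking an instrumental time stamp may be given an explicit `TimeTag` before standardizing. A sketch (not part of the library API; here the analysis order is used as a crude proxy for time):

```py
for k, r in enumerate(mydata):
    r['TimeTag'] = float(k)
mydata.assign_timestamps()
```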
```py
def report(self):
	'''
	Prints a report on the standardization fit.
	Only applicable after `D4xdata.standardize(method='pooled')`.
	'''
	report_fit(self.standardization)
```

Prints a report on the standardization fit. Only applicable after `D4xdata.standardize(method='pooled')`.
````py
def combine_samples(self, sample_groups):
	'''
	Combine analyses of different samples to compute weighted average Δ4x
	and new error (co)variances corresponding to the groups defined by the `sample_groups`
	dictionary.

	Caution: samples are weighted by number of replicate analyses, which is a
	reasonable default behavior but is not always optimal (e.g., in the case of strongly
	correlated analytical errors for one or more samples).

	Returns a tuple of:

	+ the list of group names
	+ an array of the corresponding Δ4x values
	+ the corresponding (co)variance matrix

	**Parameters**

	+ `sample_groups`: a dictionary of the form:
	```py
	{'group1': ['sample_1', 'sample_2'],
	 'group2': ['sample_3', 'sample_4', 'sample_5']}
	```
	'''

	samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
	groups = sorted(sample_groups.keys())
	group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
	D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
	CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
	W = np.array([
		[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
		for j in groups])
	D4x_new = W @ D4x_old
	CM_new = W @ CM_old @ W.T

	return groups, D4x_new[:,0], CM_new
````

Combine analyses of different samples to compute weighted average Δ4x and new error (co)variances corresponding to the groups defined by the `sample_groups` dictionary.

Caution: samples are weighted by number of replicate analyses, which is a reasonable default behavior but is not always optimal (e.g., in the case of strongly correlated analytical errors for one or more samples).

Returns a tuple of:

+ the list of group names
+ an array of the corresponding Δ4x values
+ the corresponding (co)variance matrix

**Parameters**

+ `sample_groups`: a dictionary of the form:

```py
{'group1': ['sample_1', 'sample_2'],
 'group2': ['sample_3', 'sample_4', 'sample_5']}
```
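A usage sketch, grouping the two unknowns from the tutorial data (the grouping itself is illustrative):

```py
groups, D47_grouped, CM_grouped = mydata.combine_samples(
    {'MYSAMPLES': ['MYSAMPLE-1', 'MYSAMPLE-2']}
)
```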
```py
@make_verbal
def standardize(self,
	method = 'pooled',
	weighted_sessions = [],
	consolidate = True,
	consolidate_tables = False,
	consolidate_plots = False,
	constraints = {},
	):
	'''
	Compute absolute Δ4x values for all replicate analyses and for sample averages.
	If `method` argument is set to `'pooled'`, the standardization processes all sessions
	in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
	i.e. that their true Δ4x value does not change between sessions
	([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
	`'indep_sessions'`, the standardization processes each session independently, based only
	on anchor analyses.
	'''

	self.standardization_method = method
	self.assign_timestamps()

	if method == 'pooled':
		if weighted_sessions:
			for session_group in weighted_sessions:
				if self._4x == '47':
					X = D47data([r for r in self if r['Session'] in session_group])
				elif self._4x == '48':
					X = D48data([r for r in self if r['Session'] in session_group])
				X.Nominal_D4x = self.Nominal_D4x.copy()
				X.refresh()
				result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
				w = np.sqrt(result.redchi)
				self.msg(f'Session group {session_group} MRSWD = {w:.4f}')
				for r in X:
					r[f'wD{self._4x}raw'] *= w
		else:
			self.msg(f'All D{self._4x}raw weights set to 1 ‰')
			for r in self:
				r[f'wD{self._4x}raw'] = 1.

		params = Parameters()
		for k,session in enumerate(self.sessions):
			self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
			self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
			self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
			s = pf(session)
			params.add(f'a_{s}', value = 0.9)
			params.add(f'b_{s}', value = 0.)
			params.add(f'c_{s}', value = -0.9)
			params.add(f'a2_{s}', value = 0.,
#				vary = self.sessions[session]['scrambling_drift'],
				)
			params.add(f'b2_{s}', value = 0.,
#				vary = self.sessions[session]['slope_drift'],
				)
			params.add(f'c2_{s}', value = 0.,
#				vary = self.sessions[session]['wg_drift'],
				)
			if not self.sessions[session]['scrambling_drift']:
				params[f'a2_{s}'].expr = '0'
			if not self.sessions[session]['slope_drift']:
				params[f'b2_{s}'].expr = '0'
			if not self.sessions[session]['wg_drift']:
				params[f'c2_{s}'].expr = '0'

		for sample in self.unknowns:
			params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)

		for k in constraints:
			params[k].expr = constraints[k]

		def residuals(p):
			R = []
			for r in self:
				session = pf(r['Session'])
				sample = pf(r['Sample'])
				if r['Sample'] in self.Nominal_D4x:
					R += [ (
						r[f'D{self._4x}raw'] - (
							p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
							+ p[f'b_{session}'] * r[f'd{self._4x}']
							+ p[f'c_{session}']
							+ r['t'] * (
								p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
								+ p[f'b2_{session}'] * r[f'd{self._4x}']
								+ p[f'c2_{session}']
								)
							)
						) / r[f'wD{self._4x}raw'] ]
				else:
					R += [ (
						r[f'D{self._4x}raw'] - (
							p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
							+ p[f'b_{session}'] * r[f'd{self._4x}']
							+ p[f'c_{session}']
							+ r['t'] * (
								p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
								+ p[f'b2_{session}'] * r[f'd{self._4x}']
								+ p[f'c2_{session}']
								)
							)
						) / r[f'wD{self._4x}raw'] ]
			return R

		M = Minimizer(residuals, params)
		result = M.least_squares()
		self.Nf = result.nfree
		self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
		new_names, new_covar, new_se = _fullcovar(result)[:3]
		result.var_names = new_names
		result.covar = new_covar

		for r in self:
			s = pf(r["Session"])
			a = result.params.valuesdict()[f'a_{s}']
			b = result.params.valuesdict()[f'b_{s}']
			c = result.params.valuesdict()[f'c_{s}']
			a2 = result.params.valuesdict()[f'a2_{s}']
			b2 = result.params.valuesdict()[f'b2_{s}']
			c2 = result.params.valuesdict()[f'c2_{s}']
			r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])

		self.standardization = result

		for session in self.sessions:
			self.sessions[session]['Np'] = 3
			for k in ['scrambling', 'slope', 'wg']:
				if self.sessions[session][f'{k}_drift']:
					self.sessions[session]['Np'] += 1

		if consolidate:
			self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
		return result

	elif method == 'indep_sessions':

		if weighted_sessions:
			for session_group in weighted_sessions:
				X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
				X.Nominal_D4x = self.Nominal_D4x.copy()
				X.refresh()
				# This is only done to assign r['wD47raw'] for r in X:
				X.standardize(method = method, weighted_sessions = [], consolidate = False)
				self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
		else:
			self.msg('All weights set to 1 ‰')
			for r in self:
				r[f'wD{self._4x}raw'] = 1

		for session in self.sessions:
			s = self.sessions[session]
			p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
			p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
			s['Np'] = sum(p_active)
			sdata = s['data']

			A = np.array([
				[
					self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
					r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
					1 / r[f'wD{self._4x}raw'],
					self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
					r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
					r['t'] / r[f'wD{self._4x}raw']
					]
				for r in sdata if r['Sample'] in self.anchors
				])[:,p_active] # only keep columns for the active parameters
			Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
			s['Na'] = Y.size
			CM = linalg.inv(A.T @ A)
			bf = (CM @ A.T @ Y).T[0,:]
			k = 0
			for n,a in zip(p_names, p_active):
				if a:
					s[n] = bf[k]
#					self.msg(f'{n} = {bf[k]}')
					k += 1
				else:
					s[n] = 0.
#					self.msg(f'{n} = 0.0')

			for r in sdata:
				a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
				r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])

			s['CM'] = np.zeros((6,6))
			i = 0
			k_active = [j for j,a in enumerate(p_active) if a]
			for j,a in enumerate(p_active):
				if a:
					s['CM'][j,k_active] = CM[i,:]
					i += 1

		if not weighted_sessions:
			w = self.rmswd()['rmswd']
			for r in self:
				r[f'wD{self._4x}'] *= w
				r[f'wD{self._4x}raw'] *= w
			for session in self.sessions:
				self.sessions[session]['CM'] *= w**2

		for session in self.sessions:
			s = self.sessions[session]
			s['SE_a'] = s['CM'][0,0]**.5
			s['SE_b'] = s['CM'][1,1]**.5
			s['SE_c'] = s['CM'][2,2]**.5
			s['SE_a2'] = s['CM'][3,3]**.5
			s['SE_b2'] = s['CM'][4,4]**.5
			s['SE_c2'] = s['CM'][5,5]**.5

		if not weighted_sessions:
			self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
		else:
			self.Nf = 0
			for sg in weighted_sessions:
				self.Nf += self.rmswd(sessions = sg)['Nf']

		self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)

		avgD4x = {
			sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
			for sample in self.samples
			}
		chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
		rD4x = (chi2/self.Nf)**.5
		self.repeatability[f'sigma_{self._4x}'] = rD4x

		if consolidate:
			self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
```

Compute absolute Δ4x values for all replicate analyses and for sample averages. If `method` argument is set to `'pooled'`, the standardization processes all sessions in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous, i.e. that their true Δ4x value does not change between sessions ([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to `'indep_sessions'`, the standardization processes each session independently, based only on anchor analyses.
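A sketch combining both knobs (the session name is illustrative): allow a temporal drift of the WG offset in one session, then run a pooled standardization:

```py
mydata.sessions['Session01']['wg_drift'] = True
mydata.standardize(method = 'pooled', consolidate_tables = True)
```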
```py
def standardization_error(self, session, d4x, D4x, t = 0):
	'''
	Compute standardization error for a given session and
	(δ47, Δ47) composition.
	'''
	a = self.sessions[session]['a']
	b = self.sessions[session]['b']
	c = self.sessions[session]['c']
	a2 = self.sessions[session]['a2']
	b2 = self.sessions[session]['b2']
	c2 = self.sessions[session]['c2']
	CM = self.sessions[session]['CM']

	x, y = D4x, d4x
	z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
#	x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
	dxdy = -(b+b2*t) / (a+a2*t)
	dxdz = 1. / (a+a2*t)
	dxda = -x / (a+a2*t)
	dxdb = -y / (a+a2*t)
	dxdc = -1. / (a+a2*t)
	dxda2 = -x * a2 / (a+a2*t)
	dxdb2 = -y * t / (a+a2*t)
	dxdc2 = -t / (a+a2*t)
	V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
	sx = (V @ CM @ V.T) ** .5
	return sx
```

Compute standardization error for a given session and (δ47, Δ47) composition.
```py
@make_verbal
def summary(self,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	):
	'''
	Print out and/or save to disk a summary of the standardization results.

	**Parameters**

	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	'''

	out = []
	out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
	out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
	out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
	out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
	out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
	out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
	out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
	out += [['Model degrees of freedom', f"{self.Nf}"]]
	out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
	out += [['Standardization method', self.standardization_method]]

	if save_to_file:
		if not os.path.exists(dir):
			os.makedirs(dir)
		if filename is None:
			filename = f'D{self._4x}_summary.csv'
		with open(f'{dir}/{filename}', 'w') as fid:
			fid.write(make_csv(out))
	if print_out:
		self.msg('\n' + pretty_table(out, header = 0))
```

Print out and/or save to disk a summary of the standardization results.

**Parameters**

+ `dir`: the directory in which to save the table
+ `filename`: the name of the csv file to write to
+ `save_to_file`: whether to save the table to disk
+ `print_out`: whether to print out the table
```py
@make_verbal
def table_of_sessions(self,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out and/or save to disk a table of sessions.

	**Parameters**

	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
	if set to `'raw'`: return a list of list of strings
	(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
	include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
	include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])

	out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
	if include_a2:
		out[-1] += ['a2 ± SE']
	if include_b2:
		out[-1] += ['b2 ± SE']
	if include_c2:
		out[-1] += ['c2 ± SE']
	for session in self.sessions:
		out += [[
			session,
			f"{self.sessions[session]['Na']}",
			f"{self.sessions[session]['Nu']}",
			f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
			f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
			f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
			f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
			f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
			f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
			f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
			f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
			]]
		if include_a2:
			if self.sessions[session]['scrambling_drift']:
				out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
			else:
				out[-1] += ['']
		if include_b2:
			if self.sessions[session]['slope_drift']:
				out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
			else:
				out[-1] += ['']
		if include_c2:
			if self.sessions[session]['wg_drift']:
				out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
			else:
				out[-1] += ['']

	if save_to_file:
		if not os.path.exists(dir):
			os.makedirs(dir)
		if filename is None:
			filename = f'D{self._4x}_sessions.csv'
		with open(f'{dir}/{filename}', 'w') as fid:
			fid.write(make_csv(out))
	if print_out:
		self.msg('\n' + pretty_table(out))
	if output == 'raw':
		return out
	elif output == 'pretty':
		return pretty_table(out)
```

Print out and/or save to disk a table of sessions.

**Parameters**

+ `dir`: the directory in which to save the table
+ `filename`: the name of the csv file to write to
+ `save_to_file`: whether to save the table to disk
+ `print_out`: whether to print out the table
+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of list of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
```py
@make_verbal
def table_of_analyses(
	self,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out and/or save to disk a table of analyses.

	**Parameters**

	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
	if set to `'raw'`: return a list of list of strings
	(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''

	out = [['UID','Session','Sample']]
	extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
	for f in extra_fields:
		out[-1] += [f[0]]
	out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
	for r in self:
		out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
		for f in extra_fields:
			out[-1] += [f"{r[f[0]]:{f[1]}}"]
		out[-1] += [
			f"{r['d13Cwg_VPDB']:.3f}",
			f"{r['d18Owg_VSMOW']:.3f}",
			f"{r['d45']:.6f}",
			f"{r['d46']:.6f}",
			f"{r['d47']:.6f}",
			f"{r['d48']:.6f}",
			f"{r['d49']:.6f}",
			f"{r['d13C_VPDB']:.6f}",
			f"{r['d18O_VSMOW']:.6f}",
			f"{r['D47raw']:.6f}",
			f"{r['D48raw']:.6f}",
			f"{r['D49raw']:.6f}",
			f"{r[f'D{self._4x}']:.6f}"
			]
	if save_to_file:
		if not os.path.exists(dir):
			os.makedirs(dir)
		if filename is None:
			filename = f'D{self._4x}_analyses.csv'
		with open(f'{dir}/{filename}', 'w') as fid:
			fid.write(make_csv(out))
	if print_out:
		self.msg('\n' + pretty_table(out))
	return out
```

Print out and/or save to disk a table of analyses.

**Parameters**

+ `dir`: the directory in which to save the table
+ `filename`: the name of the csv file to write to
+ `save_to_file`: whether to save the table to disk
+ `print_out`: whether to print out the table
+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of list of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
```py
@make_verbal
def covar_table(
	self,
	correl = False,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return the variance-covariance matrix of D4x
	for all unknown samples.

	**Parameters**

	+ `dir`: the directory in which to save the csv
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the csv
	+ `print_out`: whether to print out the matrix
	+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
	if set to `'raw'`: return a list of list of strings
	(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	samples = sorted([u for u in self.unknowns])
	out = [[''] + samples]
	for s1 in samples:
		out.append([s1])
		for s2 in samples:
			if correl:
				out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
			else:
				out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')

	if save_to_file:
		if not os.path.exists(dir):
			os.makedirs(dir)
		if filename is None:
			if correl:
				filename = f'D{self._4x}_correl.csv'
			else:
				filename = f'D{self._4x}_covar.csv'
		with open(f'{dir}/{filename}', 'w') as fid:
			fid.write(make_csv(out))
	if print_out:
		self.msg('\n'+pretty_table(out))
	if output == 'raw':
		return out
	elif output == 'pretty':
		return pretty_table(out)
```

Print out, save to disk and/or return the variance-covariance matrix of D4x for all unknown samples.

**Parameters**

+ `dir`: the directory in which to save the csv
+ `filename`: the name of the csv file to write to
+ `save_to_file`: whether to save the csv
+ `print_out`: whether to print out the matrix
+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`); if set to `'raw'`: return a list of list of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
```py
@make_verbal
def table_of_samples(
	self,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a table of samples.

	**Parameters**

	+ `dir`: the directory in which to save the csv
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the csv
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
	if set to `'raw'`: return a list of list of strings
	(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''

	out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
	for sample in self.anchors:
		out += [[
			f"{sample}",
			f"{self.samples[sample]['N']}",
			f"{self.samples[sample]['d13C_VPDB']:.2f}",
			f"{self.samples[sample]['d18O_VSMOW']:.2f}",
			f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
			f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
			]]
	for sample in self.unknowns:
		out += [[
			f"{sample}",
			f"{self.samples[sample]['N']}",
			f"{self.samples[sample]['d13C_VPDB']:.2f}",
			f"{self.samples[sample]['d18O_VSMOW']:.2f}",
			f"{self.samples[sample][f'D{self._4x}']:.4f}",
			f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
			f"± {self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
			f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
			f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
			]]
	if save_to_file:
		if not os.path.exists(dir):
			os.makedirs(dir)
		if filename is None:
			filename = f'D{self._4x}_samples.csv'
		with open(f'{dir}/{filename}', 'w') as fid:
			fid.write(make_csv(out))
	if print_out:
		self.msg('\n'+pretty_table(out))
	if output == 'raw':
		return out
	elif output == 'pretty':
		return pretty_table(out)
```

Print out, save to disk and/or return a table of samples.

**Parameters**

+ `dir`: the directory in which to save the csv
+ `filename`: the name of the csv file to write to
+ `save_to_file`: whether to save the csv
+ `print_out`: whether to print out the table
+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of list of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
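For instance, these tables may be printed without touching the disk, or captured as strings (a short sketch):

```py
mydata.table_of_samples(save_to_file = False, print_out = True)
txt = mydata.covar_table(correl = True, save_to_file = False, print_out = False, output = 'pretty')
print(txt)
```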
```py
def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
	'''
	Generate session plots and save them to disk.

	**Parameters**

	+ `dir`: the directory in which to save the plots
	+ `figsize`: the width and height (in inches) of each plot
	+ `filetype`: 'pdf' or 'png'
	+ `dpi`: resolution for PNG output
	'''
	if not os.path.exists(dir):
		os.makedirs(dir)

	for session in self.sessions:
		sp = self.plot_single_session(session, xylimits = 'constant')
		ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
		ppl.close(sp.fig)
```

Generate session plots and save them to disk.

**Parameters**

+ `dir`: the directory in which to save the plots
+ `figsize`: the width and height (in inches) of each plot
+ `filetype`: 'pdf' or 'png'
+ `dpi`: resolution for PNG output
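A one-line sketch producing one PNG per session at higher resolution (the directory name is illustrative):

```py
mydata.plot_sessions(dir = 'sessionplots', filetype = 'png', dpi = 200)
```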
```python
@make_verbal
def consolidate_samples(self):
    '''
    Compile various statistics for each sample.

    For each anchor sample:

    + `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
    + `SE_D47` or `SE_D48`: set to zero by definition

    For each unknown sample:

    + `D47` or `D48`: the standardized Δ4x value for this unknown
    + `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown

    For each anchor and unknown:

    + `N`: the total number of analyses of this sample
    + `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
    + `d13C_VPDB`: the average δ13C_VPDB value for this sample
    + `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
    + `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
    variance, indicating whether the Δ4x repeatability of this sample differs significantly from
    that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
    '''
    D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
    for sample in self.samples:
        self.samples[sample]['N'] = len(self.samples[sample]['data'])
        if self.samples[sample]['N'] > 1:
            self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])

        self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
        self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])

        D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
        if len(D4x_pop) > 2:
            self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]

    if self.standardization_method == 'pooled':
        for sample in self.anchors:
            self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
            self.samples[sample][f'SE_D{self._4x}'] = 0.
        for sample in self.unknowns:
            self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
            try:
                self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
            except ValueError:
                # when `sample` is constrained by self.standardize(constraints = {...}),
                # it is no longer listed in self.standardization.var_names.
                # Temporary fix: define SE as zero for now
                self.samples[sample][f'SE_D{self._4x}'] = 0.

    elif self.standardization_method == 'indep_sessions':
        for sample in self.anchors:
            self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
            self.samples[sample][f'SE_D{self._4x}'] = 0.
        for sample in self.unknowns:
            self.msg(f'Consolidating sample {sample}')
            self.unknowns[sample][f'session_D{self._4x}'] = {}
            session_avg = []
            for session in self.sessions:
                sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
                if sdata:
                    self.msg(f'{sample} found in session {session}')
                    avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
                    avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
                    # !! TODO: sigma_s below does not account for temporal changes in standardization error
                    sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
                    sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
                    session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
                    self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
            self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
            weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
            wsum = sum([weights[s] for s in weights])
            for s in weights:
                self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]

    for r in self:
        r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
```
Compile various statistics for each sample.

For each anchor sample:

+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
+ `SE_D47` or `SE_D48`: set to zero by definition

For each unknown sample:

+ `D47` or `D48`: the standardized Δ4x value for this unknown
+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown

For each anchor and unknown:

+ `N`: the total number of analyses of this sample
+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal variance, indicating whether the Δ4x repeatability of this sample differs significantly from that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
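As an illustration, once standardization has run (`consolidate_samples()` is called for you), these per-sample statistics can be read directly from the `samples` dict; the sample name below is the one from the tutorial:

```python
s = mydata.samples['MYSAMPLE-1']
print(s['N'])       # number of analyses of this sample
print(s['D47'])     # standardized Δ47 value
print(s['SE_D47'])  # standard error of the Δ47 value
```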
```python
def consolidate_sessions(self):
    '''
    Compute various statistics for each session.

    + `Na`: Number of anchor analyses in the session
    + `Nu`: Number of unknown analyses in the session
    + `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
    + `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
    + `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
    + `a`: scrambling factor
    + `b`: compositional slope
    + `c`: WG offset
    + `SE_a`: Model standard error of `a`
    + `SE_b`: Model standard error of `b`
    + `SE_c`: Model standard error of `c`
    + `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
    + `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
    + `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
    + `a2`: scrambling factor drift
    + `b2`: compositional slope drift
    + `c2`: WG offset drift
    + `Np`: Number of standardization parameters to fit
    + `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
    + `d13Cwg_VPDB`: δ13C_VPDB of WG
    + `d18Owg_VSMOW`: δ18O_VSMOW of WG
    '''
    for session in self.sessions:
        if 'd13Cwg_VPDB' not in self.sessions[session]:
            self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
        if 'd18Owg_VSMOW' not in self.sessions[session]:
            self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
        self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
        self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])

        self.msg(f'Computing repeatabilities for session {session}')
        self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
        self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
        self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])

    if self.standardization_method == 'pooled':
        for session in self.sessions:

            # different (better?) computation of D4x repeatability for each session:
            sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
            self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5

            self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
            i = self.standardization.var_names.index(f'a_{pf(session)}')
            self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5

            self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
            i = self.standardization.var_names.index(f'b_{pf(session)}')
            self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5

            self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
            i = self.standardization.var_names.index(f'c_{pf(session)}')
            self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5

            self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
            if self.sessions[session]['scrambling_drift']:
                i = self.standardization.var_names.index(f'a2_{pf(session)}')
                self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
            else:
                self.sessions[session]['SE_a2'] = 0.

            self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
            if self.sessions[session]['slope_drift']:
                i = self.standardization.var_names.index(f'b2_{pf(session)}')
                self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
            else:
                self.sessions[session]['SE_b2'] = 0.

            self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
            if self.sessions[session]['wg_drift']:
                i = self.standardization.var_names.index(f'c2_{pf(session)}')
                self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
            else:
                self.sessions[session]['SE_c2'] = 0.

            i = self.standardization.var_names.index(f'a_{pf(session)}')
            j = self.standardization.var_names.index(f'b_{pf(session)}')
            k = self.standardization.var_names.index(f'c_{pf(session)}')
            CM = np.zeros((6,6))
            CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
            try:
                i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
                CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
                CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
                try:
                    j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
                    CM[3,4] = self.standardization.covar[i2,j2]
                    CM[4,3] = self.standardization.covar[j2,i2]
                except ValueError:
                    pass
                try:
                    k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                    CM[3,5] = self.standardization.covar[i2,k2]
                    CM[5,3] = self.standardization.covar[k2,i2]
                except ValueError:
                    pass
            except ValueError:
                pass
            try:
                j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
                CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
                CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
                try:
                    k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                    CM[4,5] = self.standardization.covar[j2,k2]
                    CM[5,4] = self.standardization.covar[k2,j2]
                except ValueError:
                    pass
            except ValueError:
                pass
            try:
                k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
                CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
            except ValueError:
                pass

            self.sessions[session]['CM'] = CM

    elif self.standardization_method == 'indep_sessions':
        pass # Not implemented yet
```
Compute various statistics for each session.

+ `Na`: Number of anchor analyses in the session
+ `Nu`: Number of unknown analyses in the session
+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
+ `a`: scrambling factor
+ `b`: compositional slope
+ `c`: WG offset
+ `SE_a`: Model standard error of `a`
+ `SE_b`: Model standard error of `b`
+ `SE_c`: Model standard error of `c`
+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
+ `a2`: scrambling factor drift
+ `b2`: compositional slope drift
+ `c2`: WG offset drift
+ `Np`: Number of standardization parameters to fit
+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
+ `d13Cwg_VPDB`: δ13C_VPDB of WG
+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
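Similarly, after standardization the session-level statistics are available from the `sessions` dict; a minimal sketch for a `D47data` object:

```python
for session in mydata.sessions:
    s = mydata.sessions[session]
    print(session, s['Na'], s['Nu'])  # numbers of anchor and unknown analyses
    print(s['a'], s['SE_a'])          # scrambling factor and its model SE
    print(s['r_D47'])                 # Δ47 repeatability within this session
```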
```python
@make_verbal
def repeatabilities(self):
    '''
    Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
    (for all samples, for anchors, and for unknowns).
    '''
    self.msg('Computing repeatabilities for all sessions')

    self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
    self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
    self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
    self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
    self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
```
Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x (for all samples, for anchors, and for unknowns).
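After a typical `standardize()` call these values are stored in the `repeatability` dict; for a `D47data` object the relevant keys are:

```python
print(mydata.repeatability['r_D47'])   # Δ47 repeatability, all samples
print(mydata.repeatability['r_D47a'])  # Δ47 repeatability, anchors only
print(mydata.repeatability['r_D47u'])  # Δ47 repeatability, unknowns only
```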
```python
@make_verbal
def consolidate(self, tables = True, plots = True):
    '''
    Collect information about samples, sessions and repeatabilities.
    '''
    self.consolidate_samples()
    self.consolidate_sessions()
    self.repeatabilities()

    if tables:
        self.summary()
        self.table_of_sessions()
        self.table_of_analyses()
        self.table_of_samples()

    if plots:
        self.plot_sessions()
```
Collect information about samples, sessions and repeatabilities.
```python
@make_verbal
def rmswd(self,
    samples = 'all samples',
    sessions = 'all sessions',
    ):
    '''
    Compute the χ2, root mean squared weighted deviation
    (i.e. reduced χ2), and corresponding degrees of freedom of the
    Δ4x values for samples in `samples` and sessions in `sessions`.

    Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
    '''
    if samples == 'all samples':
        mysamples = [k for k in self.samples]
    elif samples == 'anchors':
        mysamples = [k for k in self.anchors]
    elif samples == 'unknowns':
        mysamples = [k for k in self.unknowns]
    else:
        mysamples = samples

    if sessions == 'all sessions':
        sessions = [k for k in self.sessions]

    chisq, Nf = 0, 0
    for sample in mysamples:
        G = [r for r in self if r['Sample'] == sample and r['Session'] in sessions]
        if len(G) > 1:
            X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
            Nf += (len(G) - 1)
            chisq += np.sum([((r[f'D{self._4x}'] - X) / r[f'wD{self._4x}'])**2 for r in G])
    r = (chisq / Nf)**.5 if Nf > 0 else 0
    self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
    return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
```
Compute the χ2, root mean squared weighted deviation (i.e. reduced χ2), and corresponding degrees of freedom of the Δ4x values for samples in `samples` and sessions in `sessions`.

Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
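The method can also be called directly; a minimal sketch, assuming `mydata` was standardized with `method = 'indep_sessions'`:

```python
out = mydata.rmswd(samples = 'anchors')
print(out['rmswd'], out['chisq'], out['Nf'])
```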
```python
@make_verbal
def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
    '''
    Compute the repeatability of `[r[key] for r in self]`
    '''

    if samples == 'all samples':
        mysamples = [k for k in self.samples]
    elif samples == 'anchors':
        mysamples = [k for k in self.anchors]
    elif samples == 'unknowns':
        mysamples = [k for k in self.unknowns]
    else:
        mysamples = samples

    if sessions == 'all sessions':
        sessions = [k for k in self.sessions]

    if key in ['D47', 'D48']:
        # Full disclosure: the definition of Nf is tricky/debatable
        G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
        chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
        Nf = len(G)
        Nf -= len([s for s in mysamples if s in self.unknowns])
        for session in sessions:
            Np = len([
                _ for _ in self.standardization.params
                if (
                    self.standardization.params[_].expr is not None
                    and (
                        (_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
                        or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
                    )
                )
            ])
            Na = len({
                r['Sample'] for r in self.sessions[session]['data']
                if r['Sample'] in self.anchors and r['Sample'] in mysamples
            })
            Nf -= min(Np, Na)
        r = (chisq / Nf)**.5 if Nf > 0 else 0

    else: # if key not in ['D47', 'D48']
        chisq, Nf = 0, 0
        for sample in mysamples:
            X = [r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions]
            if len(X) > 1:
                Nf += len(X) - 1
                chisq += np.sum([(x - np.mean(X))**2 for x in X])
        r = (chisq / Nf)**.5 if Nf > 0 else 0

    self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
    return r
```
Compute the repeatability of `[r[key] for r in self]`.
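For instance, to get the Δ47 repeatability of the anchor analyses only, expressed in ppm:

```python
r = mydata.compute_r('D47', samples = 'anchors')
print(f'{1000 * r:.1f} ppm')
```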
````python
def sample_average(self, samples, weights = 'equal', normalize = True):
    '''
    Weighted average Δ4x value of a group of samples, accounting for covariance.

    Returns the weighted average Δ4x value and associated SE
    of a group of samples. Weights are equal by default. If `normalize` is
    true, `weights` will be rescaled so that their sum equals 1.

    **Examples**

    ```python
    self.sample_average(['X','Y'], [1, 2])
    ```

    returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
    where Δ4x(X) and Δ4x(Y) are the average Δ4x
    values of samples X and Y, respectively.

    ```python
    self.sample_average(['X','Y'], [1, -1], normalize = False)
    ```

    returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
    '''
    if weights == 'equal':
        weights = [1/len(samples)] * len(samples)

    if normalize:
        s = sum(weights)
        if s:
            weights = [w/s for w in weights]

    try:
        C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
        X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
        return correlated_sum(X, C, weights)
    except ValueError:
        return (0., 0.)
````
Weighted average Δ4x value of a group of samples, accounting for covariance.

Returns the weighted average Δ4x value and associated SE of a group of samples. Weights are equal by default. If `normalize` is true, `weights` will be rescaled so that their sum equals 1.

**Examples**

```python
self.sample_average(['X','Y'], [1, 2])
```

returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3, where Δ4x(X) and Δ4x(Y) are the average Δ4x values of samples X and Y, respectively.

```python
self.sample_average(['X','Y'], [1, -1], normalize = False)
```

returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
```python
def sample_D4x_covar(self, sample1, sample2 = None):
    '''
    Covariance between Δ4x values of samples

    Returns the error covariance between the average Δ4x values of two
    samples. If only `sample1` is specified, or if `sample1 == sample2`,
    returns the Δ4x variance for that sample.
    '''
    if sample2 is None:
        sample2 = sample1
    if self.standardization_method == 'pooled':
        i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
        j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
        return self.standardization.covar[i, j]
    elif self.standardization_method == 'indep_sessions':
        if sample1 == sample2:
            return self.samples[sample1][f'SE_D{self._4x}']**2
        else:
            c = 0
            for session in self.sessions:
                sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
                sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
                if sdata1 and sdata2:
                    a = self.sessions[session]['a']
                    # !! TODO: CM below does not account for temporal changes in standardization parameters
                    CM = self.sessions[session]['CM'][:3,:3]
                    avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
                    avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
                    avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
                    avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
                    c += (
                        self.unknowns[sample1][f'session_D{self._4x}'][session][2]
                        * self.unknowns[sample2][f'session_D{self._4x}'][session][2]
                        * np.array([[avg_D4x_1, avg_d4x_1, 1]])
                        @ CM
                        @ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
                        ) / a**2
            return float(c)
```
Covariance between Δ4x values of samples

Returns the error covariance between the average Δ4x values of two samples. If only `sample1` is specified, or if `sample1 == sample2`, returns the Δ4x variance for that sample.
```python
def sample_D4x_correl(self, sample1, sample2 = None):
    '''
    Correlation between Δ4x errors of samples

    Returns the error correlation between the average Δ4x values of two samples.
    '''
    if sample2 is None or sample2 == sample1:
        return 1.
    return (
        self.sample_D4x_covar(sample1, sample2)
        / self.unknowns[sample1][f'SE_D{self._4x}']
        / self.unknowns[sample2][f'SE_D{self._4x}']
        )
```
Correlation between Δ4x errors of samples
Returns the error correlation between the average Δ4x values of two samples.
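A minimal sketch combining both methods, using the two unknown sample names from the tutorial:

```python
cov = mydata.sample_D4x_covar('MYSAMPLE-1', 'MYSAMPLE-2')
rho = mydata.sample_D4x_correl('MYSAMPLE-1', 'MYSAMPLE-2')
print(f'covariance = {cov:.2e}, correlation = {rho:.3f}')
```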
```python
def plot_single_session(self,
    session,
    kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
    kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
    kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
    kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
    kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
    xylimits = 'free', # | 'constant'
    x_label = None,
    y_label = None,
    error_contour_interval = 'auto',
    fig = 'new',
    ):
    '''
    Generate plot for a single session
    '''
    if x_label is None:
        x_label = f'δ$_{{{self._4x}}}$ (‰)'
    if y_label is None:
        y_label = f'Δ$_{{{self._4x}}}$ (‰)'

    out = _SessionPlot()
    anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
    unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
    anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
    anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
    unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
    unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
    anchor_avg = (np.array([np.array([
        np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
        np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
        ]) for sample in anchors]).T,
        np.array([np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
    unknown_avg = (np.array([np.array([
        np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
        np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
        ]) for sample in unknowns]).T,
        np.array([np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)

    if fig == 'new':
        out.fig = ppl.figure(figsize = (6,6))
        ppl.subplots_adjust(.1,.1,.9,.9)

    out.anchor_analyses, = ppl.plot(anchors_d, anchors_D, **kw_plot_anchors)
    out.unknown_analyses, = ppl.plot(unknowns_d, unknowns_D, **kw_plot_unknowns)
    out.anchor_avg = ppl.plot(*anchor_avg, **kw_plot_anchor_avg)
    out.unknown_avg = ppl.plot(*unknown_avg, **kw_plot_unknown_avg)
    if xylimits == 'constant':
        x = [r[f'd{self._4x}'] for r in self]
        y = [r[f'D{self._4x}'] for r in self]
        x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
        w, h = x2-x1, y2-y1
        x1 -= w/20
        x2 += w/20
        y1 -= h/20
        y2 += h/20
        ppl.axis([x1, x2, y1, y2])
    elif xylimits == 'free':
        x1, x2, y1, y2 = ppl.axis()
    else:
        x1, x2, y1, y2 = ppl.axis(xylimits)

    if error_contour_interval != 'none':
        xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
        XI, YI = np.meshgrid(xi, yi)
        SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
        if error_contour_interval == 'auto':
            rng = np.max(SI) - np.min(SI)
            if rng <= 0.01:
                cinterval = 0.001
            elif rng <= 0.03:
                cinterval = 0.004
            elif rng <= 0.1:
                cinterval = 0.01
            elif rng <= 0.3:
                cinterval = 0.03
            elif rng <= 1.:
                cinterval = 0.1
            else:
                cinterval = 0.5
        else:
            cinterval = error_contour_interval

        cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
        out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
        out.clabel = ppl.clabel(out.contour)
        contour = (XI, YI, SI, cval, cinterval)

    if fig is None:
        return {
            'anchors': anchors,
            'unknowns': unknowns,
            'anchors_d': anchors_d,
            'anchors_D': anchors_D,
            'unknowns_d': unknowns_d,
            'unknowns_D': unknowns_D,
            'anchor_avg': anchor_avg,
            'unknown_avg': unknown_avg,
            'contour': contour,
            }

    ppl.xlabel(x_label)
    ppl.ylabel(y_label)
    ppl.title(session, weight = 'bold')
    ppl.grid(alpha = .2)
    out.ax = ppl.gca()

    return out
```
Generate plot for a single session
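A minimal usage sketch, taking whatever session comes first in the data set rather than assuming a particular session name:

```python
import matplotlib.pyplot as ppl

session = list(mydata.sessions)[0]
sp = mydata.plot_single_session(session, xylimits = 'constant')
ppl.savefig(f'{session}.pdf')
ppl.close(sp.fig)
```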
```python
def plot_residuals(
    self,
    kde = False,
    hist = False,
    binwidth = 2/3,
    dir = 'output',
    filename = None,
    highlight = [],
    colors = None,
    figsize = None,
    dpi = 100,
    yspan = None,
    ):
    '''
    Plot residuals of each analysis as a function of time (actually, as a function of
    the order of analyses in the `D4xdata` object)

    + `kde`: whether to add a kernel density estimate of residuals
    + `hist`: whether to add a histogram of residuals (incompatible with `kde`)
    + `binwidth`: the width of the histogram bins, in units of the Δ4x repeatability
    + `dir`: the directory in which to save the plot
    + `filename`: the file name to save to; if `None` (default), return the figure
    instead of saving it; if `''`, save to `D{4x}_residuals.pdf`
    + `highlight`: a list of samples to highlight
    + `colors`: a dict of `{<sample>: (r, g, b)}` for all samples
    + `figsize`: (width, height) of figure
    + `dpi`: resolution for PNG output
    + `yspan`: factor controlling the range of y values shown in plot
    (by default: `yspan = 1.5 if kde else 1.0`)
    '''

    from matplotlib import ticker

    if yspan is None:
        if kde:
            yspan = 1.5
        else:
            yspan = 1.0

    # Layout
    fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
    if hist or kde:
        ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
        ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
    else:
        ppl.subplots_adjust(.08,.05,.78,.8)
        ax1 = ppl.subplot(111)

    # Colors
    N = len(self.anchors)
    if colors is None:
        if len(highlight) > 0:
            Nh = len(highlight)
            if Nh == 1:
                colors = {highlight[0]: (0,0,0)}
            elif Nh == 3:
                colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
            elif Nh == 4:
                colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
            else:
                colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
        else:
            if N == 3:
                colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
            elif N == 4:
                colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
            else:
                colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}

    ppl.sca(ax1)

    ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)

    ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))

    session = self[0]['Session']
    x1 = 0
    x_sessions = {}
    one_or_more_singlets = False
    one_or_more_multiplets = False
    multiplets = set()
    for k,r in enumerate(self):
        if r['Session'] != session:
            x2 = k-1
            x_sessions[session] = (x1+x2)/2
            ppl.axvline(k - 0.5, color = 'k', lw = .5)
            session = r['Session']
            x1 = k
        singlet = len(self.samples[r['Sample']]['data']) == 1
        if not singlet:
            multiplets.add(r['Sample'])
        if r['Sample'] in self.unknowns:
            if singlet:
                one_or_more_singlets = True
            else:
                one_or_more_multiplets = True
        kw = dict(
            marker = 'x' if singlet else '+',
            ms = 4 if singlet else 5,
            ls = 'None',
            mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
            mew = 1,
            alpha = 0.2 if singlet else 1,
            )
        if highlight and r['Sample'] not in highlight:
            kw['alpha'] = 0.2
        ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
    x2 = k
    x_sessions[session] = (x1+x2)/2

    ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
    ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
    if not (hist or kde):
        ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f"  SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
        ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f"  95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')

    xmin, xmax, ymin, ymax = ppl.axis()
    if yspan != 1:
        ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
    for s in x_sessions:
        ppl.text(
            x_sessions[s],
            ymax + 1,
            s,
            va = 'bottom',
            **(
                dict(ha = 'center')
                if len(self.sessions[s]['data']) > (0.15 * len(self))
                else dict(ha = 'left', rotation = 45)
                )
            )

    if hist or kde:
        ppl.sca(ax2)

    for s in colors:
        kw['marker'] = '+'
        kw['ms'] = 5
        kw['mec'] = colors[s]
        kw['label'] = s
        kw['alpha'] = 1
        ppl.plot([], [], **kw)

    kw['mec'] = (0,0,0)

    if one_or_more_singlets:
        kw['marker'] = 'x'
        kw['ms'] = 4
        kw['alpha'] = .2
        kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
        ppl.plot([], [], **kw)

    if one_or_more_multiplets:
        kw['marker'] = '+'
        kw['ms'] = 4
        kw['alpha'] = 1
        kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
        ppl.plot([], [], **kw)

    if hist or kde:
        leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform = fig.transFigure, borderaxespad = 1.5, fontsize = 9)
    else:
        leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform = fig.transFigure, borderaxespad = 1.5)
    leg.set_zorder(-1000)

    ppl.sca(ax1)

    ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
    ppl.xticks([])
    ppl.axis([-1, len(self), None, None])

    if hist or kde:
        ppl.sca(ax2)
        X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])

        if kde:
            from scipy.stats import gaussian_kde
            yi = np.linspace(ymin, ymax, 201)
            xi = gaussian_kde(X).evaluate(yi)
            ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
        elif hist:
            ppl.hist(
                X,
                orientation = 'horizontal',
                histtype = 'stepfilled',
                ec = [.4]*3,
                fc = [.25]*3,
                alpha = .25,
                bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
                )
            ppl.text(0, 0,
                f"  SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n  95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
                size = 7.5,
                alpha = 1,
                va = 'center',
                ha = 'left',
                )

        ppl.axis([0, None, ymin, ymax])
        ppl.xticks([])
        ppl.yticks([])
        ax2.spines['right'].set_visible(False)
        ax2.spines['top'].set_visible(False)
        ax2.spines['bottom'].set_visible(False)

    ax1.axis([None, None, ymin, ymax])

    if not os.path.exists(dir):
        os.makedirs(dir)
    if filename is None:
        return fig
    elif filename == '':
        filename = f'D{self._4x}_residuals.pdf'
    ppl.savefig(f'{dir}/{filename}', dpi = dpi)
    ppl.close(fig)
```
Plot residuals of each analysis as a function of time (actually, as a function of the order of analyses in the `D4xdata` object).

+ `kde`: whether to add a kernel density estimate of residuals
+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
+ `binwidth`: the width of the histogram bins, in units of the Δ4x repeatability
+ `dir`: the directory in which to save the plot
+ `filename`: the file name to save to; if `None` (default), the figure is returned instead of being saved
+ `highlight`: a list of samples to highlight
+ `colors`: a dict of `{<sample>: (r, g, b)}` for all samples
+ `figsize`: (width, height) of figure
+ `dpi`: resolution for PNG output
+ `yspan`: factor controlling the range of y values shown in plot (by default: `yspan = 1.5 if kde else 1.0`)
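For example, to save a residual plot with a KDE panel under the default `output` directory (passing an empty `filename` selects the default file name):

```python
mydata.plot_residuals(kde = True, filename = '')
```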
```python
def simulate(self, *args, **kwargs):
    '''
    Legacy function with warning message pointing to `virtual_data()`
    '''
    raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
```
Legacy function with warning message pointing to `virtual_data()`
```python
def plot_anchor_residuals(
    self,
    dir = 'output',
    filename = '',
    figsize = None,
    subplots_adjust = (0.05, 0.1, 0.95, 0.98, .25, .25),
    dpi = 100,
    colors = None,
    ):
    '''
    Plot a summary of the residuals for all anchors, intended to help detect systematic bias.

    **Parameters**

    + `dir`: the directory in which to save the plot
    + `filename`: the file name to save to
    + `figsize`: (width, height) of figure
    + `subplots_adjust`: passed to the figure
    + `dpi`: resolution for PNG output
    + `colors`: a dict of `{<sample>: (r, g, b)}` for all samples
    '''

    # Colors
    N = len(self.anchors)
    if colors is None:
        if N == 3:
            colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
        elif N == 4:
            colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
        else:
            colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}

    if figsize is None:
        figsize = (4, 1.5*N+1)
    fig = ppl.figure(figsize = figsize)
    ppl.subplots_adjust(*subplots_adjust)
    axs = {}
    X = np.array([r[f'D{self._4x}_residual'] for a in self.anchors for r in self.anchors[a]['data']])*1000
    sigma = self.repeatability[f'r_D{self._4x}a'] * 1000  # pooled anchor repeatability, in ppm
    D = max(np.abs(X))

    for k,a in enumerate(self.anchors):
        color = colors[a]
        axs[a] = ppl.subplot(N, 1, 1+k)
        axs[a].text(
            0.02, 1-0.05, a,
            va = 'top',
            ha = 'left',
            weight = 'bold',
            size = 9,
            color = [_*0.75 for _ in color],
            transform = axs[a].transAxes,
            )
        X = np.array([r[f'D{self._4x}_residual'] for r in self.anchors[a]['data']])*1000
        axs[a].axvline(0, lw = 0.5, color = color)
        axs[a].plot(X, X*0, 'o', mew = 0.7, mec = (*color,.5), mfc = (*color, 0), ms = 7, clip_on = False)

        xi = np.linspace(-3*D, 3*D, 601)
        yi = np.array([np.exp(-0.5 * ((xi - x)/sigma)**2) for x in X]).sum(0)
        ppl.fill_between(xi, yi, yi*0, fc = (*color, .15), lw = 1, ec = color)

        axs[a].errorbar(
            X.mean(), yi.max()*.2, None, 1.96*sigma/len(X)**0.5,
            ecolor = color,
            marker = 's',
            ls = 'None',
            mec = color,
            mew = 1,
            mfc = 'w',
            ms = 8,
            elinewidth = 1,
            capsize = 4,
            capthick = 1,
            )

        axs[a].axis([xi[0], xi[-1], 0, yi.max()*1.05])
        ppl.yticks([])

    ppl.xlabel(f'$Δ_{{{self._4x}}}$ residuals (ppm)')

    if not os.path.exists(dir):
        os.makedirs(dir)
    if filename is None:
        return fig
    elif filename == '':
        filename = f'D{self._4x}_anchor_residuals.pdf'
    ppl.savefig(f'{dir}/{filename}', dpi = dpi)
    ppl.close(fig)
```
Plot a summary of the residuals for all anchors, intended to help detect systematic bias.

**Parameters**

+ `dir`: the directory in which to save the plot
+ `filename`: the file name to save to
+ `figsize`: (width, height) of figure
+ `subplots_adjust`: passed to the figure
+ `dpi`: resolution for PNG output
+ `colors`: a dict of `{<sample>: (r, g, b)}` for all samples
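A minimal call, which for a `D47data` object saves the plot as `output/D47_anchor_residuals.pdf`:

```python
mydata.plot_anchor_residuals(dir = 'output')
```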
```python
def plot_distribution_of_analyses(
    self,
    dir = 'output',
    filename = None,
    vs_time = False,
    figsize = (6,4),
    subplots_adjust = (0.02, 0.13, 0.85, 0.8),
    output = None,
    dpi = 100,
    ):
    '''
    Plot temporal distribution of all analyses in the data set.

    **Parameters**

    + `dir`: the directory in which to save the plot
    + `filename`: the file name to save to
    + `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
    + `figsize`: (width, height) of figure
    + `dpi`: resolution for PNG output
    '''

    asamples = [s for s in self.anchors]
    usamples = [s for s in self.unknowns]
    if output is None or output == 'fig':
        fig = ppl.figure(figsize = figsize)
        ppl.subplots_adjust(*subplots_adjust)
    Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
    Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
    Xmax += (Xmax-Xmin)/40
    Xmin -= (Xmax-Xmin)/41
    for k, s in enumerate(asamples + usamples):
        if vs_time:
            X = [r['TimeTag'] for r in self if r['Sample'] == s]
        else:
            X = [x for x,r in enumerate(self) if r['Sample'] == s]
        Y = [-k for x in X]
        ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
        ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
        ppl.text(Xmax, -k, f'  {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
    ppl.axis([Xmin, Xmax, -k-1, 1])
    ppl.xlabel('\ntime')
    ppl.gca().annotate('',
        xy = (0.6, -0.02),
        xycoords = 'axes fraction',
        xytext = (.4, -0.02),
        arrowprops = dict(arrowstyle = "->", color = 'k'),
        )

    x2 = -1
    for session in self.sessions:
        x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
        if vs_time:
            ppl.axvline(x1, color = 'k', lw = .75)
        if x2 > -1:
            if not vs_time:
                ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
        x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
        if vs_time:
            ppl.axvline(x2, color = 'k', lw = .75)
            ppl.axvspan(x1, x2, color = 'k', zorder = -100, alpha = .15)
        ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)

    ppl.xticks([])
    ppl.yticks([])

    if output is None:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_distribution_of_analyses.pdf'
        ppl.savefig(f'{dir}/{filename}', dpi = dpi)
        ppl.close(fig)
    elif output == 'ax':
        return ppl.gca()
    elif output == 'fig':
        return fig
```
Plot temporal distribution of all analyses in the data set.

**Parameters**

+ `dir`: the directory in which to save the plot
+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
+ `figsize`: (width, height) of figure
+ `dpi`: resolution for PNG output
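For example, to plot against analysis time rather than sequence order (this requires each analysis to carry a `TimeTag` field, e.g. assigned by `assign_timestamps()`):

```python
mydata.plot_distribution_of_analyses(vs_time = True)
```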
```python
def plot_bulk_compositions(
    self,
    samples = None,
    dir = 'output/bulk_compositions',
    figsize = (6,6),
    subplots_adjust = (0.15, 0.12, 0.95, 0.92),
    show = False,
    sample_color = (0,.5,1),
    analysis_color = (.7,.7,.7),
    labeldist = 0.3,
    radius = 0.05,
    ):
    '''
    Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

    By default, creates a directory `./output/bulk_compositions` where plots for
    each sample are saved. Another plot named `__all__.pdf` shows all analyses together.

    **Parameters**

    + `samples`: Only these samples are processed (by default: all samples).
    + `dir`: where to save the plots
    + `figsize`: (width, height) of figure
    + `subplots_adjust`: passed to `subplots_adjust()`
    + `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
    allowing for interactive visualization/exploration in (δ13C, δ18O) space.
    + `sample_color`: color used for sample markers/labels
    + `analysis_color`: color used for replicate markers/labels
    + `labeldist`: distance (in inches) from replicate markers to replicate labels
    + `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
    '''

    from matplotlib.patches import Ellipse

    if samples is None:
        samples = [_ for _ in self.samples]

    saved = {}

    for s in samples:

        fig = ppl.figure(figsize = figsize)
        fig.subplots_adjust(*subplots_adjust)
        ax = ppl.subplot(111)
        ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
        ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
        ppl.title(s)

        XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
        UID = [_['UID'] for _ in self.samples[s]['data']]
        XY0 = XY.mean(0)

        for xy in XY:
            ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)

        ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
        ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
        ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
        saved[s] = [XY, XY0]

        x1, x2, y1, y2 = ppl.axis()
        x0, dx = (x1+x2)/2, (x2-x1)/2
        y0, dy = (y1+y2)/2, (y2-y1)/2
        dx, dy = [max(max(dx, dy), radius)]*2

        ppl.axis([
            x0 - 1.2*dx,
            x0 + 1.2*dx,
            y0 - 1.2*dy,
            y0 + 1.2*dy,
            ])

        XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))

        for xy, uid in zip(XY, UID):

            xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
            vector_in_display_space = xy_in_display_space - XY0_in_display_space

            if (vector_in_display_space**2).sum() > 0:

                unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
                label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
                label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
                label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))

                ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)

            else:

                ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)

        if radius:
            ax.add_artist(Ellipse(
                xy = XY0,
                width = radius*2,
                height = radius*2,
                ls = (0, (2,2)),
                lw = .7,
                ec = analysis_color,
                fc = 'None',
                ))
            ppl.text(
                XY0[0],
                XY0[1]-radius,
                f'\n± {radius*1e3:.0f} ppm',
                color = analysis_color,
                va = 'top',
                ha = 'center',
                linespacing = 0.4,
                size = 8,
                )

        if not os.path.exists(dir):
            os.makedirs(dir)
        fig.savefig(f'{dir}/{s}.pdf')
        ppl.close(fig)

    fig = ppl.figure(figsize = figsize)
    fig.subplots_adjust(*subplots_adjust)
    ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
    ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')

    for s in saved:
        for xy in saved[s][0]:
            ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
        ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
        ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
        ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')

    x1, x2, y1, y2 = ppl.axis()
    ppl.axis([
        x1 - (x2-x1)/10,
        x2 + (x2-x1)/10,
        y1 - (y2-y1)/10,
        y2 + (y2-y1)/10,
        ])

    if not os.path.exists(dir):
        os.makedirs(dir)
    fig.savefig(f'{dir}/__all__.pdf')
    if show:
        ppl.show()
    ppl.close(fig)
```
Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

By default, creates a directory `./output/bulk_compositions` where plots for each sample are saved. Another plot named `__all__.pdf` shows all analyses together.

**Parameters**

+ `samples`: Only these samples are processed (by default: all samples).
+ `dir`: where to save the plots
+ `figsize`: (width, height) of figure
+ `subplots_adjust`: passed to `subplots_adjust()`
+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples, allowing for interactive visualization/exploration in (δ13C, δ18O) space.
+ `sample_color`: color used for sample markers/labels
+ `analysis_color`: color used for replicate markers/labels
+ `labeldist`: distance (in inches) from replicate markers to replicate labels
+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
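A minimal call, restricting the plots to the two unknowns from the tutorial:

```python
mydata.plot_bulk_compositions(samples = ['MYSAMPLE-1', 'MYSAMPLE-2'])
```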
````python
class D47data(D4xdata):
    '''
    Store and process data for a large set of Δ47 analyses,
    usually comprising more than one analytical session.
    '''

    Nominal_D4x = {
        'ETH-1':   0.2052,
        'ETH-2':   0.2085,
        'ETH-3':   0.6132,
        'ETH-4':   0.4511,
        'IAEA-C1': 0.3018,
        'IAEA-C2': 0.6409,
        'MERCK':   0.5135,
        } # I-CDES (Bernasconi et al., 2021)
    '''
    Nominal Δ47 values assigned to the Δ47 anchor samples, used by
    `D47data.standardize()` to normalize unknown samples to an absolute Δ47
    reference frame.

    By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
    ```py
    {
        'ETH-1'   : 0.2052,
        'ETH-2'   : 0.2085,
        'ETH-3'   : 0.6132,
        'ETH-4'   : 0.4511,
        'IAEA-C1' : 0.3018,
        'IAEA-C2' : 0.6409,
        'MERCK'   : 0.5135,
    }
    ```
    '''

    @property
    def Nominal_D47(self):
        return self.Nominal_D4x

    @Nominal_D47.setter
    def Nominal_D47(self, new):
        self.Nominal_D4x = dict(**new)
        self.refresh()

    def __init__(self, l = [], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l = l, mass = '47', **kwargs)

    def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
        '''
        Find all samples for which `Teq` is specified, compute the equilibrium Δ47
        value for that temperature, and treat these samples as additional anchors.

        **Parameters**

        + `fCo2eqD47`: Which CO2 equilibrium law to use
        (`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
        `wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
        + `priority`: if `replace`: forget old anchors and only use the new ones;
        if `new`: keep pre-existing anchors but update them in case of conflict
        between old and new Δ47 values;
        if `old`: keep pre-existing anchors but preserve their original Δ47
        values in case of conflict.
        '''
        f = {
            'petersen': fCO2eqD47_Petersen,
            'wang': fCO2eqD47_Wang,
            }[fCo2eqD47]
        foo = {}
        for r in self:
            if 'Teq' in r:
                if r['Sample'] in foo:
                    assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
                else:
                    foo[r['Sample']] = f(r['Teq'])
            else:
                assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'

        if priority == 'replace':
            self.Nominal_D47 = {}
        for s in foo:
            if priority != 'old' or s not in self.Nominal_D47:
                self.Nominal_D47[s] = foo[s]

    def save_D47_correl(self, *args, **kwargs):
        return self._save_D4x_correl(*args, **kwargs)

    save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47')
````
Store and process data for a large set of Δ47 analyses, usually comprising more than one analytical session.
```python
def __init__(self, l = [], **kwargs):
    '''
    **Parameters:** same as `D4xdata.__init__()`
    '''
    D4xdata.__init__(self, l = l, mass = '47', **kwargs)
```
**Parameters:** same as `D4xdata.__init__()`
Nominal Δ47 values assigned to the Δ47 anchor samples, used by `D47data.standardize()` to normalize unknown samples to an absolute Δ47 reference frame.

By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):

```py
{
    'ETH-1'   : 0.2052,
    'ETH-2'   : 0.2085,
    'ETH-3'   : 0.6132,
    'ETH-4'   : 0.4511,
    'IAEA-C1' : 0.3018,
    'IAEA-C2' : 0.6409,
    'MERCK'   : 0.5135,
}
```
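Because `Nominal_D47` has a setter, the anchor set may be customized before standardizing; for example, to use only the three ETH anchors present in the tutorial data:

```python
mydata = D47crunch.D47data()
mydata.Nominal_D47 = {
    'ETH-1': 0.2052,
    'ETH-2': 0.2085,
    'ETH-3': 0.6132,
    }
```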
```python
def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
    '''
    Find all samples for which `Teq` is specified, compute the equilibrium Δ47
    value for that temperature, and treat these samples as additional anchors.

    **Parameters**

    + `fCo2eqD47`: Which CO2 equilibrium law to use
    (`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
    `wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
    + `priority`: if `replace`: forget old anchors and only use the new ones;
    if `new`: keep pre-existing anchors but update them in case of conflict
    between old and new Δ47 values;
    if `old`: keep pre-existing anchors but preserve their original Δ47
    values in case of conflict.
    '''
    f = {
        'petersen': fCO2eqD47_Petersen,
        'wang': fCO2eqD47_Wang,
        }[fCo2eqD47]
    foo = {}
    for r in self:
        if 'Teq' in r:
            if r['Sample'] in foo:
                assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
            else:
                foo[r['Sample']] = f(r['Teq'])
        else:
            assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'

    if priority == 'replace':
        self.Nominal_D47 = {}
    for s in foo:
        if priority != 'old' or s not in self.Nominal_D47:
            self.Nominal_D47[s] = foo[s]
```
Find all samples for which `Teq` is specified, compute the equilibrium Δ47 value for that temperature, and treat these samples as additional anchors.

**Parameters**

+ `fCo2eqD47`: Which CO2 equilibrium law to use (`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127); `wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
+ `priority`: if `replace`: forget old anchors and only use the new ones; if `new`: keep pre-existing anchors but update them in case of conflict between old and new Δ47 values; if `old`: keep pre-existing anchors but preserve their original Δ47 values in case of conflict.
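A sketch of typical usage; the sample name `EQ-GAS-25C` is hypothetical, and `Teq` must be expressed in the units expected by the chosen equilibrium law:

```python
for r in mydata:
    if r['Sample'] == 'EQ-GAS-25C':  # hypothetical equilibrated-gas sample
        r['Teq'] = 25.

mydata.D47fromTeq(fCo2eqD47 = 'petersen', priority = 'new')
```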
Save D47 values along with their SE and correlation matrix.
**Parameters**

+ `samples`: Only these samples are output (by default: all samples).
+ `dir`: the directory in which to save the file (by default: `output`)
+ `filename`: the name of the csv file to write to (by default: `D47_correl.csv`)
+ `D47_precision`: the precision to use when writing `D47` and `D47_SE` values (by default: 4)
+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
+ `save_to_file`: whether to write the output to a file (by default: `True`). If `False`, returns the output as a string.
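For example, to write the correlation matrix of all unknowns to the default location:

```python
mydata.save_D47_correl(dir = 'output', filename = 'D47_correl.csv')
```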
````python
class D48data(D4xdata):
    '''
    Store and process data for a large set of Δ48 analyses,
    usually comprising more than one analytical session.
    '''

    Nominal_D4x = {
        'ETH-1':  0.138,
        'ETH-2':  0.138,
        'ETH-3':  0.270,
        'ETH-4':  0.223,
        'GU-1':  -0.419,
        } # (Fiebig et al., 2019, 2021)
    '''
    Nominal Δ48 values assigned to the Δ48 anchor samples, used by
    `D48data.standardize()` to normalize unknown samples to an absolute Δ48
    reference frame.

    By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
    [Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)):

    ```py
    {
        'ETH-1' :  0.138,
        'ETH-2' :  0.138,
        'ETH-3' :  0.270,
        'ETH-4' :  0.223,
        'GU-1'  : -0.419,
    }
    ```
    '''

    @property
    def Nominal_D48(self):
        return self.Nominal_D4x

    @Nominal_D48.setter
    def Nominal_D48(self, new):
        self.Nominal_D4x = dict(**new)
        self.refresh()

    def __init__(self, l = [], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l = l, mass = '48', **kwargs)

    def save_D48_correl(self, *args, **kwargs):
        return self._save_D4x_correl(*args, **kwargs)

    save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48')
````
Store and process data for a large set of Δ48 analyses, usually comprising more than one analytical session.
```python
def __init__(self, l = [], **kwargs):
    '''
    **Parameters:** same as `D4xdata.__init__()`
    '''
    D4xdata.__init__(self, l = l, mass = '48', **kwargs)
```
**Parameters:** same as `D4xdata.__init__()`
Nominal Δ48 values assigned to the Δ48 anchor samples, used by `D48data.standardize()` to normalize unknown samples to an absolute Δ48 reference frame.

By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019), [Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)):

```py
{
    'ETH-1' :  0.138,
    'ETH-2' :  0.138,
    'ETH-3' :  0.270,
    'ETH-4' :  0.223,
    'GU-1'  : -0.419,
}
```
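Δ48 processing mirrors the Δ47 workflow from the tutorial; a minimal sketch, assuming the raw data file provides the d48 column (as `rawdata.csv` does):

```python
mydata48 = D47crunch.D48data()
mydata48.read('rawdata.csv')
mydata48.wg()
mydata48.crunch()
mydata48.standardize()
```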
Save D48 values along with their SE and correlation matrix.
**Parameters**

+ `samples`: Only these samples are output (by default: all samples).
+ `dir`: the directory in which to save the file (by default: `output`)
+ `filename`: the name of the csv file to write to (by default: `D48_correl.csv`)
+ `D48_precision`: the precision to use when writing `D48` and `D48_SE` values (by default: 4)
+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
+ `save_to_file`: whether to write the output to a file (by default: `True`). If `False`, returns the output as a string.
````python
class D49data(D4xdata):
    '''
    Store and process data for a large set of Δ49 analyses,
    usually comprising more than one analytical session.
    '''

    Nominal_D4x = {"1000C": 0.0, "25C": 2.228} # Wang et al. (2004)
    '''
    Nominal Δ49 values assigned to the Δ49 anchor samples, used by
    `D49data.standardize()` to normalize unknown samples to an absolute Δ49
    reference frame.

    By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)):

    ```py
    {
        "1000C": 0.0,
        "25C": 2.228
    }
    ```
    '''

    @property
    def Nominal_D49(self):
        return self.Nominal_D4x

    @Nominal_D49.setter
    def Nominal_D49(self, new):
        self.Nominal_D4x = dict(**new)
        self.refresh()

    def __init__(self, l=[], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l=l, mass='49', **kwargs)

    def save_D49_correl(self, *args, **kwargs):
        return self._save_D4x_correl(*args, **kwargs)

    save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49')
````
Store and process data for a large set of Δ49 analyses, usually comprising more than one analytical session.
```python
def __init__(self, l=[], **kwargs):
    '''
    **Parameters:** same as `D4xdata.__init__()`
    '''
    D4xdata.__init__(self, l=l, mass='49', **kwargs)
```
**Parameters:** same as `D4xdata.__init__()`
Nominal Δ49 values assigned to the Δ49 anchor samples, used by `D49data.standardize()` to normalize unknown samples to an absolute Δ49 reference frame.

By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)):

```py
{
    "1000C": 0.0,
    "25C": 2.228
}
```
Save D49 values along with their SE and correlation matrix.
**Parameters**

+ `samples`: Only these samples are output (by default: all samples).
+ `dir`: the directory in which to save the file (by default: `output`)
+ `filename`: the name of the csv file to write to (by default: `D49_correl.csv`)
+ `D49_precision`: the precision to use when writing `D49` and `D49_SE` values (by default: 4)
+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
+ `save_to_file`: whether to write the output to a file (by default: `True`). If `False`, returns the output as a string.