D47crunch
Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements
Process and standardize carbonate and/or CO2 clumped-isotope analyses, from low-level data out of a dual-inlet mass spectrometer to final, “absolute” Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates (Daëron, 2021).
The tutorial section takes you through a series of simple steps to import/process data and print out the results. The how-to section provides instructions applicable to various specific tasks.
1. Tutorial
1.1 Installation
The easy option is to use pip; open a shell terminal and simply type:
python -m pip install D47crunch
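To verify the installation, print out the version of the installed package:
python -c "import D47crunch; print(D47crunch.__version__)"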
Those wishing to experiment with the bleeding-edge development version can do so through the following steps:
- Download the dev branch source code here and rename it to D47crunch.py.
- Do any of the following:
  - copy D47crunch.py to somewhere in your Python path
  - copy D47crunch.py to a working directory (import D47crunch will only work if called within that directory)
  - copy D47crunch.py to any other location (e.g., /foo/bar) and then use the following code snippet in your own code to import D47crunch:
import sys
sys.path.append('/foo/bar')
import D47crunch
Documentation for the development version can be downloaded here (save the HTML file and open it locally).
1.2 Usage
Start by creating a file named rawdata.csv with the following contents:
UID, Sample, d45, d46, d47, d48, d49
A01, ETH-1, 5.79502, 11.62767, 16.89351, 24.56708, 0.79486
A02, MYSAMPLE-1, 6.21907, 11.49107, 17.27749, 24.58270, 1.56318
A03, ETH-2, -6.05868, -4.81718, -11.63506, -10.32578, 0.61352
A04, MYSAMPLE-2, -3.86184, 4.94184, 0.60612, 10.52732, 0.57118
A05, ETH-3, 5.54365, 12.05228, 17.40555, 25.96919, 0.74608
A06, ETH-2, -6.06706, -4.87710, -11.69927, -10.64421, 1.61234
A07, ETH-1, 5.78821, 11.55910, 16.80191, 24.56423, 1.47963
A08, MYSAMPLE-2, -3.87692, 4.86889, 0.52185, 10.40390, 1.07032
Then instantiate a D47data object which will store and process this data:
import D47crunch
mydata = D47crunch.D47data()
For now, this object is empty:
>>> print(mydata)
[]
To load the analyses saved in rawdata.csv into our D47data object and process the data:
mydata.read('rawdata.csv')
# compute δ13C, δ18O of working gas:
mydata.wg()
# compute δ13C, δ18O, raw Δ47 values for each analysis:
mydata.crunch()
# compute absolute Δ47 values for each analysis
# as well as average Δ47 values for each sample:
mydata.standardize()
We can now print a summary of the data processing:
>>> mydata.summary(verbose = True, save_to_file = False)
[summary]
––––––––––––––––––––––––––––––– –––––––––
N samples (anchors + unknowns) 5 (3 + 2)
N analyses (anchors + unknowns) 8 (5 + 3)
Repeatability of δ13C_VPDB 4.2 ppm
Repeatability of δ18O_VSMOW 47.5 ppm
Repeatability of Δ47 (anchors) 13.4 ppm
Repeatability of Δ47 (unknowns) 2.5 ppm
Repeatability of Δ47 (all) 9.6 ppm
Model degrees of freedom 3
Student's 95% t-factor 3.18
Standardization method pooled
––––––––––––––––––––––––––––––– –––––––––
This tells us that our data set contains 5 different samples: 3 anchors (ETH-1, ETH-2, ETH-3) and 2 unknowns (MYSAMPLE-1, MYSAMPLE-2). The total number of analyses is 8, with 5 anchor analyses and 3 unknown analyses. We get an estimate of the analytical repeatability (i.e. the overall, pooled standard deviation) for δ13C, δ18O and Δ47, as well as the number of degrees of freedom (here, 3) that these estimated standard deviations are based on, along with the corresponding Student's t-factor (here, 3.18) for 95 % confidence limits. Finally, the summary indicates that we used a “pooled” standardization approach (see [Daëron, 2021]).
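As a quick check (using scipy directly rather than any D47crunch function), the t-factor above follows from the 3 model degrees of freedom:
from scipy.stats import t
print(round(t.ppf(0.975, 3), 2)) # 3.18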
To see the actual results:
>>> mydata.table_of_samples(verbose = True, save_to_file = False)
[table_of_samples]
–––––––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
Sample N d13C_VPDB d18O_VSMOW D47 SE 95% CL SD p_Levene
–––––––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
ETH-1 2 2.01 37.01 0.2052 0.0131
ETH-2 2 -10.17 19.88 0.2085 0.0026
ETH-3 1 1.73 37.49 0.6132
MYSAMPLE-1 1 2.48 36.90 0.2996 0.0091 ± 0.0291
MYSAMPLE-2 2 -8.17 30.05 0.6600 0.0115 ± 0.0366 0.0025
–––––––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
This table lists, for each sample, the number of analytical replicates, average δ13C and δ18O values (for the analyte CO2, not for the carbonate itself), the average Δ47 value, and the SD of Δ47 for all replicates of this sample. For unknown samples, the SE and 95 % confidence limits for the mean Δ47 are also listed. These 95 % CL take into account the number of degrees of freedom of the regression model, so that in large data sets the 95 % CL will tend to 1.96 times the SE, whereas in this small data set the applicable t-factor is much larger.
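As a sanity check (again using scipy directly), the 95 % CL above can be reproduced by multiplying each SE by the Student's t-factor from the summary:
from scipy.stats import t
tfactor = t.ppf(0.975, 3) # 3 model degrees of freedom
print(round(tfactor * 0.0115, 4)) # 0.0366, matching MYSAMPLE-2 above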
We can also generate a table of all analyses in the data set (again, note that d18O_VSMOW is the composition of the CO2 analyte):
>>> mydata.table_of_analyses(verbose = True, save_to_file = False)
[table_of_analyses]
––– ––––––––– –––––––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––– –––––––––– –––––––––– ––––––––– ––––––––– –––––––––– ––––––––
UID Session Sample d13Cwg_VPDB d18Owg_VSMOW d45 d46 d47 d48 d49 d13C_VPDB d18O_VSMOW D47raw D48raw D49raw D47
––– ––––––––– –––––––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––– –––––––––– –––––––––– ––––––––– ––––––––– –––––––––– ––––––––
A01 mySession ETH-1 -3.807 24.921 5.795020 11.627670 16.893510 24.567080 0.794860 2.014086 37.041843 -0.574686 1.149684 -27.690250 0.214454
A02 mySession MYSAMPLE-1 -3.807 24.921 6.219070 11.491070 17.277490 24.582700 1.563180 2.476827 36.898281 -0.499264 1.435380 -27.122614 0.299589
A03 mySession ETH-2 -3.807 24.921 -6.058680 -4.817180 -11.635060 -10.325780 0.613520 -10.166796 19.907706 -0.685979 -0.721617 16.716901 0.206693
A04 mySession MYSAMPLE-2 -3.807 24.921 -3.861840 4.941840 0.606120 10.527320 0.571180 -8.159927 30.087230 -0.248531 0.613099 -4.979413 0.658270
A05 mySession ETH-3 -3.807 24.921 5.543650 12.052280 17.405550 25.969190 0.746080 1.727029 37.485567 -0.226150 1.678699 -28.280301 0.613200
A06 mySession ETH-2 -3.807 24.921 -6.067060 -4.877100 -11.699270 -10.644210 1.612340 -10.173599 19.845192 -0.683054 -0.922832 17.861363 0.210328
A07 mySession ETH-1 -3.807 24.921 5.788210 11.559100 16.801910 24.564230 1.479630 2.009281 36.970298 -0.591129 1.282632 -26.888335 0.195926
A08 mySession MYSAMPLE-2 -3.807 24.921 -3.876920 4.868890 0.521850 10.403900 1.070320 -8.173486 30.011134 -0.245768 0.636159 -4.324964 0.661803
––– ––––––––– –––––––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––– –––––––––– –––––––––– ––––––––– ––––––––– –––––––––– ––––––––
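Both table methods can also write their output to disk instead of (or in addition to) printing it, assuming the same dir, filename, save_to_file, and print_out parameters as the combined table functions documented in the API:
mydata.table_of_samples(dir = 'output', filename = 'samples.csv', print_out = False)
mydata.table_of_analyses(dir = 'output', filename = 'analyses.csv', print_out = False)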
2. How-to
2.1 Simulate a virtual data set to play with
It is sometimes convenient to quickly build a virtual data set of analyses, for instance to assess the final analytical precision achievable for a given combination of anchor and unknown analyses (see also Fig. 6 of Daëron, 2021).
This can be achieved with virtual_data(). The example below creates a dataset with four sessions, each of which comprises three analyses of anchor ETH-1, three of ETH-2, three of ETH-3, and three analyses each of two unknown samples named FOO and BAR with an arbitrarily defined isotopic composition. Analytical repeatabilities for Δ47 and Δ48 are also specified arbitrarily. See the virtual_data() documentation for additional configuration parameters.
from D47crunch import virtual_data, D47data
args = dict(
samples = [
dict(Sample = 'ETH-1', N = 3),
dict(Sample = 'ETH-2', N = 3),
dict(Sample = 'ETH-3', N = 3),
dict(Sample = 'FOO', N = 3,
d13C_VPDB = -5., d18O_VPDB = -10.,
D47 = 0.3, D48 = 0.15),
dict(Sample = 'BAR', N = 3,
d13C_VPDB = -15., d18O_VPDB = -2.,
D47 = 0.6, D48 = 0.2),
], rD47 = 0.010, rD48 = 0.030)
session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)
D = D47data(session1 + session2 + session3 + session4)
D.crunch()
D.standardize()
D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)
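To read off the final precision achieved by this combination of anchors and unknowns, the summary may be printed as in the tutorial:
D.summary(verbose = True, save_to_file = False)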
2.2 Control data quality
D47crunch offers several tools to visualize processed data. The examples below use the same virtual data set, generated with:
from D47crunch import *
from random import shuffle
# generate virtual data:
args = dict(
samples = [
dict(Sample = 'ETH-1', N = 8),
dict(Sample = 'ETH-2', N = 8),
dict(Sample = 'ETH-3', N = 8),
dict(Sample = 'FOO', N = 4,
d13C_VPDB = -5., d18O_VPDB = -10.,
D47 = 0.3, D48 = 0.15),
dict(Sample = 'BAR', N = 4,
d13C_VPDB = -15., d18O_VPDB = -15.,
D47 = 0.5, D48 = 0.2),
])
sessions = [
virtual_data(session = f'Session_{k+1:02.0f}', seed = 123456+k, **args)
for k in range(10)]
# shuffle the data:
data = [r for s in sessions for r in s]
shuffle(data)
data = sorted(data, key = lambda r: r['Session'])
# create D47data instance:
data47 = D47data(data)
# process D47data instance:
data47.crunch()
data47.standardize()
2.2.1 Plotting the distribution of analyses through time
data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf')

The plot above shows the succession of analyses as if they were all distributed at regular time intervals. See D4xdata.plot_distribution_of_analyses() for how to plot analyses as a function of “true” time (based on the TimeTag for each analysis).
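For instance, a minimal sketch of plotting against “true” time, assuming (as in the method's documentation) that each analysis carries a TimeTag field and that a vs_time keyword enables this mode:
# assign hypothetical time stamps (e.g., in days) to each analysis:
for k, r in enumerate(data47):
    r['TimeTag'] = 0.25 * k # here, one analysis every six hours
data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf', vs_time = True)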
2.2.2 Generating session plots
data47.plot_sessions()
Below is one of the resulting session plots. Each cross marker is an analysis. Anchors are in red and unknowns in blue. Short horizontal lines show the nominal Δ47 value for anchors, in red, or the average Δ47 value for unknowns, in blue (overall average for all sessions). Curved grey contours correspond to Δ47 standardization errors in this session.

2.2.3 Plotting Δ47 or Δ48 residuals
data47.plot_residuals(filename = 'residuals.pdf', kde = True)

Again, note that this plot only shows the succession of analyses as if they were all distributed at regular time intervals.
2.2.4 Checking δ13C and δ18O dispersion
mydata = D47data(virtual_data(
session = 'mysession',
samples = [
dict(Sample = 'ETH-1', N = 4),
dict(Sample = 'ETH-2', N = 4),
dict(Sample = 'ETH-3', N = 4),
dict(Sample = 'MYSAMPLE', N = 8, D47 = 0.6, D48 = 0.1, d13C_VPDB = -4.0, d18O_VPDB = -12.0),
], seed = 123))
mydata.refresh()
mydata.wg()
mydata.crunch()
mydata.plot_bulk_compositions()
D4xdata.plot_bulk_compositions() produces a series of plots, one for each sample, and an additional plot with all samples together. For example, here is the plot for sample MYSAMPLE:

2.3 Use a different set of anchors, change anchor nominal values, and/or change oxygen-17 correction parameters
Nominal values for various carbonate standards are defined in four places:
- D4xdata.Nominal_d13C_VPDB
- D4xdata.Nominal_d18O_VPDB
- D47data.Nominal_D4x (also accessible through D47data.Nominal_D47)
- D48data.Nominal_D4x (also accessible through D48data.Nominal_D48)
17O correction parameters are defined by:
- D4xdata.R13_VPDB
- D4xdata.R18_VSMOW
- D4xdata.R18_VPDB
- D4xdata.LAMBDA_17
- D4xdata.R17_VSMOW
- D4xdata.R17_VPDB
When creating a new instance of D47data or D48data, the current values of these variables are copied as properties of the new object. Applying custom values for, e.g., R17_VSMOW and Nominal_D47 can thus be done in several ways:
Option 1: by redefining D4xdata.R17_VSMOW and D47data.Nominal_D47 _before_ creating a D47data object:
from D47crunch import D4xdata, D47data
# redefine R17_VSMOW:
D4xdata.R17_VSMOW = 0.00037 # new value
# redefine R17_VPDB for consistency:
D4xdata.R17_VPDB = D4xdata.R17_VSMOW * (D4xdata.R18_VPDB/D4xdata.R18_VSMOW) ** D4xdata.LAMBDA_17
# edit Nominal_D47 to only include ETH-1/2/3:
D47data.Nominal_D4x = {
a: D47data.Nominal_D4x[a]
for a in ['ETH-1', 'ETH-2', 'ETH-3']
}
# redefine ETH-3:
D47data.Nominal_D4x['ETH-3'] = 0.600
# only now create D47data object:
mydata = D47data()
# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)
# NB: mydata.Nominal_D47 is just an alias for mydata.Nominal_D4x
# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}
Option 2: by redefining R17_VSMOW and Nominal_D47 _after_ creating a D47data object:
from D47crunch import D47data
# first create D47data object:
mydata = D47data()
# redefine R17_VSMOW:
mydata.R17_VSMOW = 0.00037 # new value
# redefine R17_VPDB for consistency:
mydata.R17_VPDB = mydata.R17_VSMOW * (mydata.R18_VPDB/mydata.R18_VSMOW) ** mydata.LAMBDA_17
# edit Nominal_D47 to only include ETH-1/2/3:
mydata.Nominal_D47 = {
a: mydata.Nominal_D47[a]
for a in ['ETH-1', 'ETH-2', 'ETH-3']
}
# redefine ETH-3:
mydata.Nominal_D47['ETH-3'] = 0.600
# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)
# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}
The two options above are equivalent, but the latter provides a simple way to compare different data processing choices:
from D47crunch import D47data
# create two D47data objects:
foo = D47data()
bar = D47data()
# modify foo in various ways:
foo.LAMBDA_17 = 0.52
foo.R17_VSMOW = 0.00037 # new value
foo.R17_VPDB = foo.R17_VSMOW * (foo.R18_VPDB/foo.R18_VSMOW) ** foo.LAMBDA_17
foo.Nominal_D47 = {
'ETH-1': foo.Nominal_D47['ETH-1'],
'ETH-2': foo.Nominal_D47['ETH-2'],
'IAEA-C2': foo.Nominal_D47['IAEA-C2'],
'INLAB_REF_MATERIAL': 0.666,
}
# now import the same raw data into foo and bar:
foo.read('rawdata.csv')
foo.wg() # compute δ13C, δ18O of working gas
foo.crunch() # compute all δ13C, δ18O and raw Δ47 values
foo.standardize() # compute absolute Δ47 values
bar.read('rawdata.csv')
bar.wg() # compute δ13C, δ18O of working gas
bar.crunch() # compute all δ13C, δ18O and raw Δ47 values
bar.standardize() # compute absolute Δ47 values
# and compare the final results:
foo.table_of_samples(verbose = True, save_to_file = False)
bar.table_of_samples(verbose = True, save_to_file = False)
2.4 Process paired Δ47 and Δ48 values
Purely in terms of data processing, it is not obvious why Δ47 and Δ48 data should not be handled separately. For now, D47crunch uses two independent classes — D47data and D48data — which crunch numbers and deal with standardization in very similar ways. The following example demonstrates how to print out combined outputs for D47data and D48data.
from D47crunch import *
# generate virtual data:
args = dict(
samples = [
dict(Sample = 'ETH-1', N = 3),
dict(Sample = 'ETH-2', N = 3),
dict(Sample = 'ETH-3', N = 3),
dict(Sample = 'FOO', N = 3,
d13C_VPDB = -5., d18O_VPDB = -10.,
D47 = 0.3, D48 = 0.15),
], rD47 = 0.010, rD48 = 0.030)
session1 = virtual_data(session = 'Session_01', **args)
session2 = virtual_data(session = 'Session_02', **args)
# create D47data instance:
data47 = D47data(session1 + session2)
# process D47data instance:
data47.crunch()
data47.standardize()
# create D48data instance:
data48 = D48data(data47) # alternatively: data48 = D48data(session1 + session2)
# process D48data instance:
data48.crunch()
data48.standardize()
# output combined results:
table_of_sessions(data47, data48)
table_of_samples(data47, data48)
table_of_analyses(data47, data48)
Expected output:
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––––– –––––––––––––– –––––– ––––––––––––– ––––––––––––––– ––––––––––––––
Session Na Nu d13Cwg_VPDB d18Owg_VSMOW r_d13C r_d18O r_D47 a_47 ± SE 1e3 x b_47 ± SE c_47 ± SE r_D48 a_48 ± SE 1e3 x b_48 ± SE c_48 ± SE
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––––– –––––––––––––– –––––– ––––––––––––– ––––––––––––––– ––––––––––––––
Session_01 9 3 -4.000 26.000 0.0000 0.0000 0.0098 1.021 ± 0.019 -0.398 ± 0.260 -0.903 ± 0.006 0.0486 0.540 ± 0.151 1.235 ± 0.607 -0.390 ± 0.025
Session_02 9 3 -4.000 26.000 0.0000 0.0000 0.0090 1.015 ± 0.019 0.376 ± 0.260 -0.905 ± 0.006 0.0186 1.350 ± 0.156 -0.871 ± 0.608 -0.504 ± 0.027
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––––– –––––––––––––– –––––– ––––––––––––– ––––––––––––––– ––––––––––––––
–––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– –––––––– –––––– –––––– –––––––– –––––– ––––––––
Sample N d13C_VPDB d18O_VSMOW D47 SE 95% CL SD p_Levene D48 SE 95% CL SD p_Levene
–––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– –––––––– –––––– –––––– –––––––– –––––– ––––––––
ETH-1 6 2.02 37.02 0.2052 0.0078 0.1380 0.0223
ETH-2 6 -10.17 19.88 0.2085 0.0036 0.1380 0.0482
ETH-3 6 1.71 37.45 0.6132 0.0080 0.2700 0.0176
FOO 6 -5.00 28.91 0.3026 0.0044 ± 0.0093 0.0121 0.164 0.1397 0.0121 ± 0.0255 0.0267 0.127
–––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– –––––––– –––––– –––––– –––––––– –––––– ––––––––
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– –––––––– ––––––––
UID Session Sample d13Cwg_VPDB d18Owg_VSMOW d45 d46 d47 d48 d49 d13C_VPDB d18O_VSMOW D47raw D48raw D49raw D47 D48
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– –––––––– ––––––––
1 Session_01 ETH-1 -4.000 26.000 6.018962 10.747026 16.120787 21.286237 27.780042 2.020000 37.024281 -0.708176 -0.316435 -0.000013 0.197297 0.087763
2 Session_01 ETH-1 -4.000 26.000 6.018962 10.747026 16.132240 21.307795 27.780042 2.020000 37.024281 -0.696913 -0.295333 -0.000013 0.208328 0.126791
3 Session_01 ETH-1 -4.000 26.000 6.018962 10.747026 16.132438 21.313884 27.780042 2.020000 37.024281 -0.696718 -0.289374 -0.000013 0.208519 0.137813
4 Session_01 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.700300 -12.210735 -18.023381 -10.170000 19.875825 -0.683938 -0.297902 -0.000002 0.209785 0.198705
5 Session_01 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.707421 -12.270781 -18.023381 -10.170000 19.875825 -0.691145 -0.358673 -0.000002 0.202726 0.086308
6 Session_01 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.700061 -12.278310 -18.023381 -10.170000 19.875825 -0.683696 -0.366292 -0.000002 0.210022 0.072215
7 Session_01 ETH-3 -4.000 26.000 5.742374 11.161270 16.684379 22.225827 28.306614 1.710000 37.450394 -0.273094 -0.216392 -0.000014 0.623472 0.270873
8 Session_01 ETH-3 -4.000 26.000 5.742374 11.161270 16.660163 22.233729 28.306614 1.710000 37.450394 -0.296906 -0.208664 -0.000014 0.600150 0.285167
9 Session_01 ETH-3 -4.000 26.000 5.742374 11.161270 16.675191 22.215632 28.306614 1.710000 37.450394 -0.282128 -0.226363 -0.000014 0.614623 0.252432
10 Session_01 FOO -4.000 26.000 -0.840413 2.828738 1.328380 5.374933 4.665655 -5.000000 28.907344 -0.582131 -0.288924 -0.000006 0.314928 0.175105
11 Session_01 FOO -4.000 26.000 -0.840413 2.828738 1.302220 5.384454 4.665655 -5.000000 28.907344 -0.608241 -0.279457 -0.000006 0.289356 0.192614
12 Session_01 FOO -4.000 26.000 -0.840413 2.828738 1.322530 5.372841 4.665655 -5.000000 28.907344 -0.587970 -0.291004 -0.000006 0.309209 0.171257
13 Session_02 ETH-1 -4.000 26.000 6.018962 10.747026 16.140853 21.267202 27.780042 2.020000 37.024281 -0.688442 -0.335067 -0.000013 0.207730 0.138730
14 Session_02 ETH-1 -4.000 26.000 6.018962 10.747026 16.127087 21.256983 27.780042 2.020000 37.024281 -0.701980 -0.345071 -0.000013 0.194396 0.131311
15 Session_02 ETH-1 -4.000 26.000 6.018962 10.747026 16.148253 21.287779 27.780042 2.020000 37.024281 -0.681165 -0.314926 -0.000013 0.214898 0.153668
16 Session_02 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.715859 -12.204791 -18.023381 -10.170000 19.875825 -0.699685 -0.291887 -0.000002 0.207349 0.149128
17 Session_02 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.709763 -12.188685 -18.023381 -10.170000 19.875825 -0.693516 -0.275587 -0.000002 0.213426 0.161217
18 Session_02 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.715427 -12.253049 -18.023381 -10.170000 19.875825 -0.699249 -0.340727 -0.000002 0.207780 0.112907
19 Session_02 ETH-3 -4.000 26.000 5.742374 11.161270 16.685994 22.249463 28.306614 1.710000 37.450394 -0.271506 -0.193275 -0.000014 0.618328 0.244431
20 Session_02 ETH-3 -4.000 26.000 5.742374 11.161270 16.681351 22.298166 28.306614 1.710000 37.450394 -0.276071 -0.145641 -0.000014 0.613831 0.279758
21 Session_02 ETH-3 -4.000 26.000 5.742374 11.161270 16.676169 22.306848 28.306614 1.710000 37.450394 -0.281167 -0.137150 -0.000014 0.608813 0.286056
22 Session_02 FOO -4.000 26.000 -0.840413 2.828738 1.324359 5.339497 4.665655 -5.000000 28.907344 -0.586144 -0.324160 -0.000006 0.314015 0.136535
23 Session_02 FOO -4.000 26.000 -0.840413 2.828738 1.297658 5.325854 4.665655 -5.000000 28.907344 -0.612794 -0.337727 -0.000006 0.287767 0.126473
24 Session_02 FOO -4.000 26.000 -0.840413 2.828738 1.310185 5.339898 4.665655 -5.000000 28.907344 -0.600291 -0.323761 -0.000006 0.300082 0.136830
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– –––––––– ––––––––
3. Command-Line Interface (CLI)
Instead of writing Python code, you may directly use the CLI to process raw Δ47 and Δ48 data using reasonable defaults. The simplest way is to call:
D47crunch rawdata.csv
This will create a directory named output and populate it by calling the following methods:
- D47data.wg()
- D47data.crunch()
- D47data.standardize()
- D47data.summary()
- D47data.table_of_samples()
- D47data.table_of_sessions()
- D47data.plot_sessions()
- D47data.plot_residuals()
- D47data.table_of_analyses()
- D47data.plot_distribution_of_analyses()
- D47data.plot_bulk_compositions()
- D47data.save_D47_correl()
You may specify a custom set of anchors instead of the default ones using the --anchors or -a option:
D47crunch -a anchors.csv rawdata.csv
In this case, the anchors.csv file (you may use any other file name) must have the following format:
Sample, d13C_VPDB, d18O_VPDB, D47
ETH-1, 2.02, -2.19, 0.2052
ETH-2, -10.17, -18.69, 0.2085
ETH-3, 1.71, -1.78, 0.6132
ETH-4, , , 0.4511
The samples with non-empty d13C_VPDB, d18O_VPDB, and D47 values are used to standardize δ13C, δ18O, and Δ47 values, respectively.
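For reference, here is a minimal sketch of the equivalent Python-side configuration, using read_csv() from D47crunch (which drops empty fields, so the blank δ13C/δ18O values of ETH-4 above are simply absent from the corresponding dictionaries):
from D47crunch import D47data, read_csv

anchors = read_csv('anchors.csv')
mydata = D47data()
mydata.Nominal_d13C_VPDB = {r['Sample']: r['d13C_VPDB'] for r in anchors if 'd13C_VPDB' in r}
mydata.Nominal_d18O_VPDB = {r['Sample']: r['d18O_VPDB'] for r in anchors if 'd18O_VPDB' in r}
mydata.Nominal_D47 = {r['Sample']: r['D47'] for r in anchors if 'D47' in r}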
You may also provide a list of analyses and/or samples to exclude from the input. This is done with the --exclude or -e option:
D47crunch -e badbatch.csv rawdata.csv
In this case, the badbatch.csv file (again, you may use a different file name) must have the following format:
UID, Sample
A03
A09
B06
, MYBADSAMPLE-1
, MYBADSAMPLE-2
This will exclude (ignore) analyses with the UIDs A03, A09, and B06, as well as all analyses of samples MYBADSAMPLE-1 and MYBADSAMPLE-2. The exclude file may have only the UID column, only the Sample column, or both, in any order.
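Here is a minimal sketch of equivalent filtering in plain Python; because D47data derives from list, analyses may be filtered in place after reading:
from D47crunch import D47data, read_csv

excluded = read_csv('badbatch.csv')
bad_uids = {r['UID'] for r in excluded if 'UID' in r}
bad_samples = {r['Sample'] for r in excluded if 'Sample' in r}

mydata = D47data()
mydata.read('rawdata.csv')
mydata[:] = [r for r in mydata if r['UID'] not in bad_uids and r['Sample'] not in bad_samples]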
The --output-dir or -o option may be used to specify a custom directory name for the output. For example, in Unix-like shells the following command will create a time-stamped output directory:
D47crunch -o `date "+%Y-%m-%d-%Hh%M"` rawdata.csv
To process Δ48 as well as Δ47, just add the --D48 option.
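For example, to standardize both Δ47 and Δ48 from the same raw data file:
D47crunch --D48 rawdata.csv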
API Documentation
1''' 2Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements 3 4Process and standardize carbonate and/or CO2 clumped-isotope analyses, 5from low-level data out of a dual-inlet mass spectrometer to final, “absolute” 6Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates 7([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). 8 9The **tutorial** section takes you through a series of simple steps to import/process data and print out the results. 10The **how-to** section provides instructions applicable to various specific tasks. 11 12.. include:: ../../docpages/tutorial.md 13.. include:: ../../docpages/howto.md 14.. include:: ../../docpages/cli.md 15 16<h1>API Documentation</h1> 17''' 18 19__docformat__ = "restructuredtext" 20__author__ = 'Mathieu Daëron' 21__contact__ = 'daeron@lsce.ipsl.fr' 22__copyright__ = 'Copyright (c) Mathieu Daëron' 23__license__ = 'MIT License - https://opensource.org/licenses/MIT' 24__date__ = '2025-12-14' 25__version__ = '2.5.1' 26 27import os 28import numpy as np 29import typer 30from typing_extensions import Annotated 31from statistics import stdev 32from scipy.stats import t as tstudent 33from scipy.stats import levene 34from scipy.interpolate import interp1d 35from numpy import linalg 36from lmfit import Minimizer, Parameters, report_fit 37from matplotlib import pyplot as ppl 38from datetime import datetime as dt 39from functools import wraps 40from colorsys import hls_to_rgb 41from matplotlib import rcParams 42from typer import rich_utils 43 44rich_utils.STYLE_HELPTEXT = '' 45 46rcParams['font.family'] = 'sans-serif' 47rcParams['font.sans-serif'] = 'Helvetica' 48rcParams['font.size'] = 10 49rcParams['mathtext.fontset'] = 'custom' 50rcParams['mathtext.rm'] = 'sans' 51rcParams['mathtext.bf'] = 'sans:bold' 52rcParams['mathtext.it'] = 'sans:italic' 53rcParams['mathtext.cal'] = 'sans:italic' 54rcParams['mathtext.default'] = 'rm' 55rcParams['xtick.major.size'] = 4 56rcParams['xtick.major.width'] = 1 57rcParams['ytick.major.size'] = 4 58rcParams['ytick.major.width'] = 1 59rcParams['axes.grid'] = False 60rcParams['axes.linewidth'] = 1 61rcParams['grid.linewidth'] = .75 62rcParams['grid.linestyle'] = '-' 63rcParams['grid.alpha'] = .15 64rcParams['savefig.dpi'] = 150 65 66Petersen_etal_CO2eqD47 = np.array([[-12, 1.147113572], [-11, 1.139961218], [-10, 1.132872856], [-9, 1.125847677], [-8, 1.118884889], [-7, 1.111983708], [-6, 1.105143366], [-5, 1.098363105], [-4, 1.091642182], [-3, 1.084979862], [-2, 1.078375423], [-1, 1.071828156], [0, 1.065337360], [1, 1.058902349], [2, 1.052522443], [3, 1.046196976], [4, 1.039925291], [5, 1.033706741], [6, 1.027540690], [7, 1.021426510], [8, 1.015363585], [9, 1.009351306], [10, 1.003389075], [11, 0.997476303], [12, 0.991612409], [13, 0.985796821], [14, 0.980028975], [15, 0.974308318], [16, 0.968634304], [17, 0.963006392], [18, 0.957424055], [19, 0.951886769], [20, 0.946394020], [21, 0.940945302], [22, 0.935540114], [23, 0.930177964], [24, 0.924858369], [25, 0.919580851], [26, 0.914344938], [27, 0.909150167], [28, 0.903996080], [29, 0.898882228], [30, 0.893808167], [31, 0.888773459], [32, 0.883777672], [33, 0.878820382], [34, 0.873901170], [35, 0.869019623], [36, 0.864175334], [37, 0.859367901], [38, 0.854596929], [39, 0.849862028], [40, 0.845162813], [41, 0.840498905], [42, 0.835869931], [43, 0.831275522], [44, 0.826715314], [45, 0.822188950], [46, 0.817696075], [47, 0.813236341], [48, 0.808809404], [49, 0.804414926], [50, 0.800052572], [51, 0.795722012], [52, 0.791422922], 
[53, 0.787154979], [54, 0.782917869], [55, 0.778711277], [56, 0.774534898], [57, 0.770388426], [58, 0.766271562], [59, 0.762184010], [60, 0.758125479], [61, 0.754095680], [62, 0.750094329], [63, 0.746121147], [64, 0.742175856], [65, 0.738258184], [66, 0.734367860], [67, 0.730504620], [68, 0.726668201], [69, 0.722858343], [70, 0.719074792], [71, 0.715317295], [72, 0.711585602], [73, 0.707879469], [74, 0.704198652], [75, 0.700542912], [76, 0.696912012], [77, 0.693305719], [78, 0.689723802], [79, 0.686166034], [80, 0.682632189], [81, 0.679122047], [82, 0.675635387], [83, 0.672171994], [84, 0.668731654], [85, 0.665314156], [86, 0.661919291], [87, 0.658546854], [88, 0.655196641], [89, 0.651868451], [90, 0.648562087], [91, 0.645277352], [92, 0.642014054], [93, 0.638771999], [94, 0.635551001], [95, 0.632350872], [96, 0.629171428], [97, 0.626012487], [98, 0.622873870], [99, 0.619755397], [100, 0.616656895], [102, 0.610519107], [104, 0.604459143], [106, 0.598475670], [108, 0.592567388], [110, 0.586733026], [112, 0.580971342], [114, 0.575281125], [116, 0.569661187], [118, 0.564110371], [120, 0.558627545], [122, 0.553211600], [124, 0.547861454], [126, 0.542576048], [128, 0.537354347], [130, 0.532195337], [132, 0.527098028], [134, 0.522061450], [136, 0.517084654], [138, 0.512166711], [140, 0.507306712], [142, 0.502503768], [144, 0.497757006], [146, 0.493065573], [148, 0.488428634], [150, 0.483845370], [152, 0.479314980], [154, 0.474836677], [156, 0.470409692], [158, 0.466033271], [160, 0.461706674], [162, 0.457429176], [164, 0.453200067], [166, 0.449018650], [168, 0.444884242], [170, 0.440796174], [172, 0.436753787], [174, 0.432756438], [176, 0.428803494], [178, 0.424894334], [180, 0.421028350], [182, 0.417204944], [184, 0.413423530], [186, 0.409683531], [188, 0.405984383], [190, 0.402325531], [192, 0.398706429], [194, 0.395126543], [196, 0.391585347], [198, 0.388082324], [200, 0.384616967], [202, 0.381188778], [204, 0.377797268], [206, 0.374441954], [208, 0.371122364], [210, 0.367838033], [212, 0.364588505], [214, 0.361373329], [216, 0.358192065], [218, 0.355044277], [220, 0.351929540], [222, 0.348847432], [224, 0.345797540], [226, 0.342779460], [228, 0.339792789], [230, 0.336837136], [232, 0.333912113], [234, 0.331017339], [236, 0.328152439], [238, 0.325317046], [240, 0.322510795], [242, 0.319733329], [244, 0.316984297], [246, 0.314263352], [248, 0.311570153], [250, 0.308904364], [252, 0.306265654], [254, 0.303653699], [256, 0.301068176], [258, 0.298508771], [260, 0.295975171], [262, 0.293467070], [264, 0.290984167], [266, 0.288526163], [268, 0.286092765], [270, 0.283683684], [272, 0.281298636], [274, 0.278937339], [276, 0.276599517], [278, 0.274284898], [280, 0.271993211], [282, 0.269724193], [284, 0.267477582], [286, 0.265253121], [288, 0.263050554], [290, 0.260869633], [292, 0.258710110], [294, 0.256571741], [296, 0.254454286], [298, 0.252357508], [300, 0.250281174], [302, 0.248225053], [304, 0.246188917], [306, 0.244172542], [308, 0.242175707], [310, 0.240198194], [312, 0.238239786], [314, 0.236300272], [316, 0.234379441], [318, 0.232477087], [320, 0.230593005], [322, 0.228726993], [324, 0.226878853], [326, 0.225048388], [328, 0.223235405], [330, 0.221439711], [332, 0.219661118], [334, 0.217899439], [336, 0.216154491], [338, 0.214426091], [340, 0.212714060], [342, 0.211018220], [344, 0.209338398], [346, 0.207674420], [348, 0.206026115], [350, 0.204393315], [355, 0.200378063], [360, 0.196456139], [365, 0.192625077], [370, 0.188882487], [375, 0.185226048], [380, 0.181653511], [385, 0.178162694], 
[390, 0.174751478], [395, 0.171417807], [400, 0.168159686], [405, 0.164975177], [410, 0.161862398], [415, 0.158819521], [420, 0.155844772], [425, 0.152936426], [430, 0.150092806], [435, 0.147312286], [440, 0.144593281], [445, 0.141934254], [450, 0.139333710], [455, 0.136790195], [460, 0.134302294], [465, 0.131868634], [470, 0.129487876], [475, 0.127158722], [480, 0.124879906], [485, 0.122650197], [490, 0.120468398], [495, 0.118333345], [500, 0.116243903], [505, 0.114198970], [510, 0.112197471], [515, 0.110238362], [520, 0.108320625], [525, 0.106443271], [530, 0.104605335], [535, 0.102805877], [540, 0.101043985], [545, 0.099318768], [550, 0.097629359], [555, 0.095974915], [560, 0.094354612], [565, 0.092767650], [570, 0.091213248], [575, 0.089690648], [580, 0.088199108], [585, 0.086737906], [590, 0.085306341], [595, 0.083903726], [600, 0.082529395], [605, 0.081182697], [610, 0.079862998], [615, 0.078569680], [620, 0.077302141], [625, 0.076059794], [630, 0.074842066], [635, 0.073648400], [640, 0.072478251], [645, 0.071331090], [650, 0.070206399], [655, 0.069103674], [660, 0.068022424], [665, 0.066962168], [670, 0.065922439], [675, 0.064902780], [680, 0.063902748], [685, 0.062921909], [690, 0.061959837], [695, 0.061016122], [700, 0.060090360], [705, 0.059182157], [710, 0.058291131], [715, 0.057416907], [720, 0.056559120], [725, 0.055717414], [730, 0.054891440], [735, 0.054080860], [740, 0.053285343], [745, 0.052504565], [750, 0.051738210], [755, 0.050985971], [760, 0.050247546], [765, 0.049522643], [770, 0.048810974], [775, 0.048112260], [780, 0.047426227], [785, 0.046752609], [790, 0.046091145], [795, 0.045441581], [800, 0.044803668], [805, 0.044177164], [810, 0.043561831], [815, 0.042957438], [820, 0.042363759], [825, 0.041780573], [830, 0.041207664], [835, 0.040644822], [840, 0.040091839], [845, 0.039548516], [850, 0.039014654], [855, 0.038490063], [860, 0.037974554], [865, 0.037467944], [870, 0.036970054], [875, 0.036480707], [880, 0.035999734], [885, 0.035526965], [890, 0.035062238], [895, 0.034605393], [900, 0.034156272], [905, 0.033714724], [910, 0.033280598], [915, 0.032853749], [920, 0.032434032], [925, 0.032021309], [930, 0.031615443], [935, 0.031216300], [940, 0.030823749], [945, 0.030437663], [950, 0.030057915], [955, 0.029684385], [960, 0.029316951], [965, 0.028955498], [970, 0.028599910], [975, 0.028250075], [980, 0.027905884], [985, 0.027567229], [990, 0.027234006], [995, 0.026906112], [1000, 0.026583445], [1005, 0.026265908], [1010, 0.025953405], [1015, 0.025645841], [1020, 0.025343124], [1025, 0.025045163], [1030, 0.024751871], [1035, 0.024463160], [1040, 0.024178947], [1045, 0.023899147], [1050, 0.023623680], [1055, 0.023352467], [1060, 0.023085429], [1065, 0.022822491], [1070, 0.022563577], [1075, 0.022308615], [1080, 0.022057533], [1085, 0.021810260], [1090, 0.021566729], [1095, 0.021326872], [1100, 0.021090622]]) 67_fCO2eqD47_Petersen = interp1d(Petersen_etal_CO2eqD47[:,0], Petersen_etal_CO2eqD47[:,1]) 68def fCO2eqD47_Petersen(T): 69 ''' 70 CO2 equilibrium Δ47 value as a function of T (in degrees C) 71 according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127). 
72 73 ''' 74 return float(_fCO2eqD47_Petersen(T)) 75 76 77Wang_etal_CO2eqD47 = np.array([[-83., 1.8954], [-73., 1.7530], [-63., 1.6261], [-53., 1.5126], [-43., 1.4104], [-33., 1.3182], [-23., 1.2345], [-13., 1.1584], [-3., 1.0888], [7., 1.0251], [17., 0.9665], [27., 0.9125], [37., 0.8626], [47., 0.8164], [57., 0.7734], [67., 0.7334], [87., 0.6612], [97., 0.6286], [107., 0.5980], [117., 0.5693], [127., 0.5423], [137., 0.5169], [147., 0.4930], [157., 0.4704], [167., 0.4491], [177., 0.4289], [187., 0.4098], [197., 0.3918], [207., 0.3747], [217., 0.3585], [227., 0.3431], [237., 0.3285], [247., 0.3147], [257., 0.3015], [267., 0.2890], [277., 0.2771], [287., 0.2657], [297., 0.2550], [307., 0.2447], [317., 0.2349], [327., 0.2256], [337., 0.2167], [347., 0.2083], [357., 0.2002], [367., 0.1925], [377., 0.1851], [387., 0.1781], [397., 0.1714], [407., 0.1650], [417., 0.1589], [427., 0.1530], [437., 0.1474], [447., 0.1421], [457., 0.1370], [467., 0.1321], [477., 0.1274], [487., 0.1229], [497., 0.1186], [507., 0.1145], [517., 0.1105], [527., 0.1068], [537., 0.1031], [547., 0.0997], [557., 0.0963], [567., 0.0931], [577., 0.0901], [587., 0.0871], [597., 0.0843], [607., 0.0816], [617., 0.0790], [627., 0.0765], [637., 0.0741], [647., 0.0718], [657., 0.0695], [667., 0.0674], [677., 0.0654], [687., 0.0634], [697., 0.0615], [707., 0.0597], [717., 0.0579], [727., 0.0562], [737., 0.0546], [747., 0.0530], [757., 0.0515], [767., 0.0500], [777., 0.0486], [787., 0.0472], [797., 0.0459], [807., 0.0447], [817., 0.0435], [827., 0.0423], [837., 0.0411], [847., 0.0400], [857., 0.0390], [867., 0.0380], [877., 0.0370], [887., 0.0360], [897., 0.0351], [907., 0.0342], [917., 0.0333], [927., 0.0325], [937., 0.0317], [947., 0.0309], [957., 0.0302], [967., 0.0294], [977., 0.0287], [987., 0.0281], [997., 0.0274], [1007., 0.0268], [1017., 0.0261], [1027., 0.0255], [1037., 0.0249], [1047., 0.0244], [1057., 0.0238], [1067., 0.0233], [1077., 0.0228], [1087., 0.0223], [1097., 0.0218]]) 78_fCO2eqD47_Wang = interp1d(Wang_etal_CO2eqD47[:,0] - 0.15, Wang_etal_CO2eqD47[:,1]) 79def fCO2eqD47_Wang(T): 80 ''' 81 CO2 equilibrium Δ47 value as a function of `T` (in degrees C) 82 according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039) 83 (supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)). 84 ''' 85 return float(_fCO2eqD47_Wang(T)) 86 87 88def correlated_sum(X, C, w = None): 89 ''' 90 Compute covariance-aware linear combinations 91 92 **Parameters** 93 94 + `X`: list or 1-D array of values to sum 95 + `C`: covariance matrix for the elements of `X` 96 + `w`: list or 1-D array of weights to apply to the elements of `X` 97 (all equal to 1 by default) 98 99 Return the sum (and its SE) of the elements of `X`, with optional weights equal 100 to the elements of `w`, accounting for covariances between the elements of `X`. 
101 ''' 102 if w is None: 103 w = [1 for x in X] 104 return np.dot(w,X), (np.dot(w,np.dot(C,w)))**.5 105 106 107def make_csv(x, hsep = ',', vsep = '\n'): 108 ''' 109 Formats a list of lists of strings as a CSV 110 111 **Parameters** 112 113 + `x`: the list of lists of strings to format 114 + `hsep`: the field separator (`,` by default) 115 + `vsep`: the line-ending convention to use (`\\n` by default) 116 117 **Example** 118 119 ```py 120 print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']])) 121 ``` 122 123 outputs: 124 125 ```py 126 a,b,c 127 d,e,f 128 ``` 129 ''' 130 return vsep.join([hsep.join(l) for l in x]) 131 132 133def pf(txt): 134 ''' 135 Modify string `txt` to follow `lmfit.Parameter()` naming rules. 136 ''' 137 return txt.replace('-','_').replace('.','_').replace(' ','_') 138 139 140def smart_type(x): 141 ''' 142 Tries to convert string `x` to a float if it includes a decimal point, or 143 to an integer if it does not. If both attempts fail, return the original 144 string unchanged. 145 ''' 146 try: 147 y = float(x) 148 except ValueError: 149 return x 150 if '.' not in x: 151 return int(y) 152 return y 153 154class _Defaults(): 155 def __init__(self): 156 pass 157 158D47crunch_defaults = _Defaults() 159D47crunch_defaults.PRETTY_TABLE_VSEP = '—' 160 161def pretty_table(x, header = 1, hsep = ' ', vsep = None, align = '<'): 162 ''' 163 Reads a list of lists of strings and outputs an ascii table 164 165 **Parameters** 166 167 + `x`: a list of lists of strings 168 + `header`: the number of lines to treat as header lines 169 + `hsep`: the horizontal separator between columns 170 + `vsep`: the character to use as vertical separator 171 + `align`: string of left (`<`) or right (`>`) alignment characters. 172 173 **Example** 174 175 ```py 176 print(pretty_table([ 177 ['A', 'B', 'C'], 178 ['1', '1.9999', 'foo'], 179 ['10', 'x', 'bar'], 180 ])) 181 ``` 182 yields: 183 ``` 184 —— —————— ——— 185 A B C 186 —— —————— ——— 187 1 1.9999 foo 188 10 x bar 189 —— —————— ——— 190 ``` 191 192 To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`: 193 194 ```py 195 D47crunch_defaults.PRETTY_TABLE_VSEP = '=' 196 print(pretty_table([ 197 ['A', 'B', 'C'], 198 ['1', '1.9999', 'foo'], 199 ['10', 'x', 'bar'], 200 ])) 201 ``` 202 yields: 203 ``` 204 == ====== === 205 A B C 206 == ====== === 207 1 1.9999 foo 208 10 x bar 209 == ====== === 210 ``` 211 ''' 212 213 if vsep is None: 214 vsep = D47crunch_defaults.PRETTY_TABLE_VSEP 215 216 txt = [] 217 widths = [np.max([len(e) for e in c]) for c in zip(*x)] 218 219 if len(widths) > len(align): 220 align += '>' * (len(widths)-len(align)) 221 sepline = hsep.join([vsep*w for w in widths]) 222 txt += [sepline] 223 for k,l in enumerate(x): 224 if k and k == header: 225 txt += [sepline] 226 txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])] 227 txt += [sepline] 228 txt += [''] 229 return '\n'.join(txt) 230 231 232def transpose_table(x): 233 ''' 234 Transpose a list if lists 235 236 **Parameters** 237 238 + `x`: a list of lists 239 240 **Example** 241 242 ```py 243 x = [[1, 2], [3, 4]] 244 print(transpose_table(x)) # yields: [[1, 3], [2, 4]] 245 ``` 246 ''' 247 return [[e for e in c] for c in zip(*x)] 248 249 250def w_avg(X, sX) : 251 ''' 252 Compute variance-weighted average 253 254 Returns the value and SE of the weighted average of the elements of `X`, 255 with relative weights equal to their inverse variances (`1/sX**2`). 
256 257 **Parameters** 258 259 + `X`: array-like of elements to average 260 + `sX`: array-like of the corresponding SE values 261 262 **Tip** 263 264 If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets, 265 they may be rearranged using `zip()`: 266 267 ```python 268 foo = [(0, 1), (1, 0.5), (2, 0.5)] 269 print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333) 270 ``` 271 ''' 272 X = [ x for x in X ] 273 sX = [ sx for sx in sX ] 274 W = [ sx**-2 for sx in sX ] 275 W = [ w/sum(W) for w in W ] 276 Xavg = sum([ w*x for w,x in zip(W,X) ]) 277 sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5 278 return Xavg, sXavg 279 280 281def read_csv(filename, sep = ''): 282 ''' 283 Read contents of `filename` in csv format and return a list of dictionaries. 284 285 In the csv string, spaces before and after field separators (`','` by default) 286 are optional. 287 288 **Parameters** 289 290 + `filename`: the csv file to read 291 + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, 292 whichever appers most often in the contents of `filename`. 293 ''' 294 with open(filename) as fid: 295 txt = fid.read() 296 297 if sep == '': 298 sep = sorted(',;\t', key = lambda x: - txt.count(x))[0] 299 txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()] 300 return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]] 301 302 303def simulate_single_analysis( 304 sample = 'MYSAMPLE', 305 d13Cwg_VPDB = -4., d18Owg_VSMOW = 26., 306 d13C_VPDB = None, d18O_VPDB = None, 307 D47 = None, D48 = None, D49 = 0., D17O = 0., 308 a47 = 1., b47 = 0., c47 = -0.9, 309 a48 = 1., b48 = 0., c48 = -0.45, 310 Nominal_D47 = None, 311 Nominal_D48 = None, 312 Nominal_d13C_VPDB = None, 313 Nominal_d18O_VPDB = None, 314 ALPHA_18O_ACID_REACTION = None, 315 R13_VPDB = None, 316 R17_VSMOW = None, 317 R18_VSMOW = None, 318 LAMBDA_17 = None, 319 R18_VPDB = None, 320 ): 321 ''' 322 Compute working-gas delta values for a single analysis, assuming a stochastic working 323 gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values). 324 325 **Parameters** 326 327 + `sample`: sample name 328 + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas 329 (respectively –4 and +26 ‰ by default) 330 + `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample 331 + `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies 332 of the carbonate sample 333 + `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and 334 Δ48 values if `D47` or `D48` are not specified 335 + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and 336 δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified 337 + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor 338 + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 339 correction parameters (by default equal to the `D4xdata` default values) 340 341 Returns a dictionary with fields 342 `['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`. 
343 ''' 344 345 if Nominal_d13C_VPDB is None: 346 Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB 347 348 if Nominal_d18O_VPDB is None: 349 Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB 350 351 if ALPHA_18O_ACID_REACTION is None: 352 ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION 353 354 if R13_VPDB is None: 355 R13_VPDB = D4xdata().R13_VPDB 356 357 if R17_VSMOW is None: 358 R17_VSMOW = D4xdata().R17_VSMOW 359 360 if R18_VSMOW is None: 361 R18_VSMOW = D4xdata().R18_VSMOW 362 363 if LAMBDA_17 is None: 364 LAMBDA_17 = D4xdata().LAMBDA_17 365 366 if R18_VPDB is None: 367 R18_VPDB = D4xdata().R18_VPDB 368 369 R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17 370 371 if Nominal_D47 is None: 372 Nominal_D47 = D47data().Nominal_D47 373 374 if Nominal_D48 is None: 375 Nominal_D48 = D48data().Nominal_D48 376 377 if d13C_VPDB is None: 378 if sample in Nominal_d13C_VPDB: 379 d13C_VPDB = Nominal_d13C_VPDB[sample] 380 else: 381 raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.") 382 383 if d18O_VPDB is None: 384 if sample in Nominal_d18O_VPDB: 385 d18O_VPDB = Nominal_d18O_VPDB[sample] 386 else: 387 raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.") 388 389 if D47 is None: 390 if sample in Nominal_D47: 391 D47 = Nominal_D47[sample] 392 else: 393 raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.") 394 395 if D48 is None: 396 if sample in Nominal_D48: 397 D48 = Nominal_D48[sample] 398 else: 399 raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.") 400 401 X = D4xdata() 402 X.R13_VPDB = R13_VPDB 403 X.R17_VSMOW = R17_VSMOW 404 X.R18_VSMOW = R18_VSMOW 405 X.LAMBDA_17 = LAMBDA_17 406 X.R18_VPDB = R18_VPDB 407 X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17 408 409 R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios( 410 R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000), 411 R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000), 412 ) 413 R45, R46, R47, R48, R49 = X.compute_isobar_ratios( 414 R13 = R13_VPDB * (1 + d13C_VPDB/1000), 415 R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION, 416 D17O=D17O, D47=D47, D48=D48, D49=D49, 417 ) 418 R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios( 419 R13 = R13_VPDB * (1 + d13C_VPDB/1000), 420 R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION, 421 D17O=D17O, 422 ) 423 424 d45 = 1000 * (R45/R45wg - 1) 425 d46 = 1000 * (R46/R46wg - 1) 426 d47 = 1000 * (R47/R47wg - 1) 427 d48 = 1000 * (R48/R48wg - 1) 428 d49 = 1000 * (R49/R49wg - 1) 429 430 for k in range(3): # dumb iteration to adjust for small changes in d47 431 R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch 432 R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch 433 d47 = 1000 * (R47raw/R47wg - 1) 434 d48 = 1000 * (R48raw/R48wg - 1) 435 436 return dict( 437 Sample = sample, 438 D17O = D17O, 439 d13Cwg_VPDB = d13Cwg_VPDB, 440 d18Owg_VSMOW = d18Owg_VSMOW, 441 d45 = d45, 442 d46 = d46, 443 d47 = d47, 444 d48 = d48, 445 d49 = d49, 446 ) 447 448 449def virtual_data( 450 samples = [], 451 a47 = 1., b47 = 0., c47 = -0.9, 452 a48 = 1., b48 = 0., c48 = -0.45, 453 rd45 = 0.020, rd46 = 0.060, 454 rD47 = 0.015, rD48 = 0.045, 455 d13Cwg_VPDB = None, d18Owg_VSMOW = None, 456 session = None, 457 Nominal_D47 = None, Nominal_D48 = None, 458 Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None, 459 ALPHA_18O_ACID_REACTION = None, 460 R13_VPDB = None, 461 
R17_VSMOW = None, 462 R18_VSMOW = None, 463 LAMBDA_17 = None, 464 R18_VPDB = None, 465 seed = 0, 466 shuffle = True, 467 ): 468 ''' 469 Return list with simulated analyses from a single session. 470 471 **Parameters** 472 473 + `samples`: a list of entries; each entry is a dictionary with the following fields: 474 * `Sample`: the name of the sample 475 * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample 476 * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample 477 * `N`: how many analyses to generate for this sample 478 + `a47`: scrambling factor for Δ47 479 + `b47`: compositional nonlinearity for Δ47 480 + `c47`: working gas offset for Δ47 481 + `a48`: scrambling factor for Δ48 482 + `b48`: compositional nonlinearity for Δ48 483 + `c48`: working gas offset for Δ48 484 + `rd45`: analytical repeatability of δ45 485 + `rd46`: analytical repeatability of δ46 486 + `rD47`: analytical repeatability of Δ47 487 + `rD48`: analytical repeatability of Δ48 488 + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas 489 (by default equal to the `simulate_single_analysis` default values) 490 + `session`: name of the session (no name by default) 491 + `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and Δ48 values 492 if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults) 493 + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and 494 δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified 495 (by default equal to the `simulate_single_analysis` defaults) 496 + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor 497 (by default equal to the `simulate_single_analysis` defaults) 498 + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 499 correction parameters (by default equal to the `simulate_single_analysis` default) 500 + `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations 501 + `shuffle`: randomly reorder the sequence of analyses 502 503 504 Here is an example of using this method to generate an arbitrary combination of 505 anchors and unknowns for a bunch of sessions: 506 507 ```py 508 .. include:: ../../code_examples/virtual_data/example.py 509 ``` 510 511 This should output something like: 512 513 ``` 514 .. 
include:: ../../code_examples/virtual_data/output.txt 515 ``` 516 ''' 517 518 kwargs = locals().copy() 519 520 from numpy import random as nprandom 521 if seed: 522 nprandom.seed(seed) 523 rng = nprandom.default_rng(seed) 524 else: 525 rng = nprandom.default_rng() 526 527 N = sum([s['N'] for s in samples]) 528 errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 529 errors45 *= rd45 / stdev(errors45) # scale errors to rd45 530 errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 531 errors46 *= rd46 / stdev(errors46) # scale errors to rd46 532 errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 533 errors47 *= rD47 / stdev(errors47) # scale errors to rD47 534 errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 535 errors48 *= rD48 / stdev(errors48) # scale errors to rD48 536 537 k = 0 538 out = [] 539 for s in samples: 540 kw = {} 541 kw['sample'] = s['Sample'] 542 kw = { 543 **kw, 544 **{var: kwargs[var] 545 for var in [ 546 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION', 547 'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB', 548 'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB', 549 'a47', 'b47', 'c47', 'a48', 'b48', 'c48', 550 ] 551 if kwargs[var] is not None}, 552 **{var: s[var] 553 for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O'] 554 if var in s}, 555 } 556 557 sN = s['N'] 558 while sN: 559 out.append(simulate_single_analysis(**kw)) 560 out[-1]['d45'] += errors45[k] 561 out[-1]['d46'] += errors46[k] 562 out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47 563 out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48 564 sN -= 1 565 k += 1 566 567 if session is not None: 568 for r in out: 569 r['Session'] = session 570 571 if shuffle: 572 nprandom.shuffle(out) 573 574 return out 575 576def table_of_samples( 577 data47 = None, 578 data48 = None, 579 dir = 'output', 580 filename = None, 581 save_to_file = True, 582 print_out = True, 583 output = None, 584 ): 585 ''' 586 Print out, save to disk and/or return a combined table of samples 587 for a pair of `D47data` and `D48data` objects. 
588 589 **Parameters** 590 591 + `data47`: `D47data` instance 592 + `data48`: `D48data` instance 593 + `dir`: the directory in which to save the table 594 + `filename`: the name to the csv file to write to 595 + `save_to_file`: whether to save the table to disk 596 + `print_out`: whether to print out the table 597 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 598 if set to `'raw'`: return a list of list of strings 599 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 600 ''' 601 if data47 is None: 602 if data48 is None: 603 raise TypeError("Arguments must include at least one D47data() or D48data() instance.") 604 else: 605 return data48.table_of_samples( 606 dir = dir, 607 filename = filename, 608 save_to_file = save_to_file, 609 print_out = print_out, 610 output = output 611 ) 612 else: 613 if data48 is None: 614 return data47.table_of_samples( 615 dir = dir, 616 filename = filename, 617 save_to_file = save_to_file, 618 print_out = print_out, 619 output = output 620 ) 621 else: 622 out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw') 623 out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw') 624 out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:]) 625 626 if save_to_file: 627 if not os.path.exists(dir): 628 os.makedirs(dir) 629 if filename is None: 630 filename = f'D47D48_samples.csv' 631 with open(f'{dir}/{filename}', 'w') as fid: 632 fid.write(make_csv(out)) 633 if print_out: 634 print('\n'+pretty_table(out)) 635 if output == 'raw': 636 return out 637 elif output == 'pretty': 638 return pretty_table(out) 639 640 641def table_of_sessions( 642 data47 = None, 643 data48 = None, 644 dir = 'output', 645 filename = None, 646 save_to_file = True, 647 print_out = True, 648 output = None, 649 ): 650 ''' 651 Print out, save to disk and/or return a combined table of sessions 652 for a pair of `D47data` and `D48data` objects. 
653 ***Only applicable if the sessions in `data47` and those in `data48` 654 consist of the exact same sets of analyses.*** 655 656 **Parameters** 657 658 + `data47`: `D47data` instance 659 + `data48`: `D48data` instance 660 + `dir`: the directory in which to save the table 661 + `filename`: the name to the csv file to write to 662 + `save_to_file`: whether to save the table to disk 663 + `print_out`: whether to print out the table 664 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 665 if set to `'raw'`: return a list of list of strings 666 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 667 ''' 668 if data47 is None: 669 if data48 is None: 670 raise TypeError("Arguments must include at least one D47data() or D48data() instance.") 671 else: 672 return data48.table_of_sessions( 673 dir = dir, 674 filename = filename, 675 save_to_file = save_to_file, 676 print_out = print_out, 677 output = output 678 ) 679 else: 680 if data48 is None: 681 return data47.table_of_sessions( 682 dir = dir, 683 filename = filename, 684 save_to_file = save_to_file, 685 print_out = print_out, 686 output = output 687 ) 688 else: 689 out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw') 690 out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw') 691 for k,x in enumerate(out47[0]): 692 if k>7: 693 out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47') 694 out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48') 695 out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:]) 696 697 if save_to_file: 698 if not os.path.exists(dir): 699 os.makedirs(dir) 700 if filename is None: 701 filename = f'D47D48_sessions.csv' 702 with open(f'{dir}/{filename}', 'w') as fid: 703 fid.write(make_csv(out)) 704 if print_out: 705 print('\n'+pretty_table(out)) 706 if output == 'raw': 707 return out 708 elif output == 'pretty': 709 return pretty_table(out) 710 711 712def table_of_analyses( 713 data47 = None, 714 data48 = None, 715 dir = 'output', 716 filename = None, 717 save_to_file = True, 718 print_out = True, 719 output = None, 720 ): 721 ''' 722 Print out, save to disk and/or return a combined table of analyses 723 for a pair of `D47data` and `D48data` objects. 724 725 If the sessions in `data47` and those in `data48` do not consist of 726 the exact same sets of analyses, the table will have two columns 727 `Session_47` and `Session_48` instead of a single `Session` column. 
728 729 **Parameters** 730 731 + `data47`: `D47data` instance 732 + `data48`: `D48data` instance 733 + `dir`: the directory in which to save the table 734 + `filename`: the name to the csv file to write to 735 + `save_to_file`: whether to save the table to disk 736 + `print_out`: whether to print out the table 737 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 738 if set to `'raw'`: return a list of list of strings 739 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 740 ''' 741 if data47 is None: 742 if data48 is None: 743 raise TypeError("Arguments must include at least one D47data() or D48data() instance.") 744 else: 745 return data48.table_of_analyses( 746 dir = dir, 747 filename = filename, 748 save_to_file = save_to_file, 749 print_out = print_out, 750 output = output 751 ) 752 else: 753 if data48 is None: 754 return data47.table_of_analyses( 755 dir = dir, 756 filename = filename, 757 save_to_file = save_to_file, 758 print_out = print_out, 759 output = output 760 ) 761 else: 762 out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw') 763 out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw') 764 765 if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical 766 out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:]) 767 else: 768 out47[0][1] = 'Session_47' 769 out48[0][1] = 'Session_48' 770 out47 = transpose_table(out47) 771 out48 = transpose_table(out48) 772 out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:]) 773 774 if save_to_file: 775 if not os.path.exists(dir): 776 os.makedirs(dir) 777 if filename is None: 778 filename = f'D47D48_sessions.csv' 779 with open(f'{dir}/{filename}', 'w') as fid: 780 fid.write(make_csv(out)) 781 if print_out: 782 print('\n'+pretty_table(out)) 783 if output == 'raw': 784 return out 785 elif output == 'pretty': 786 return pretty_table(out) 787 788 789def _fullcovar(minresult, epsilon = 0.01, named = False): 790 ''' 791 Construct full covariance matrix in the case of constrained parameters 792 ''' 793 794 import asteval 795 796 def f(values): 797 interp = asteval.Interpreter() 798 for n,v in zip(minresult.var_names, values): 799 interp(f'{n} = {v}') 800 for q in minresult.params: 801 if minresult.params[q].expr: 802 interp(f'{q} = {minresult.params[q].expr}') 803 return np.array([interp.symtable[q] for q in minresult.params]) 804 805 # construct Jacobian 806 J = np.zeros((minresult.nvarys, len(minresult.params))) 807 X = np.array([minresult.params[p].value for p in minresult.var_names]) 808 sX = np.array([minresult.params[p].stderr for p in minresult.var_names]) 809 810 for j in range(minresult.nvarys): 811 x1 = [_ for _ in X] 812 x1[j] += epsilon * sX[j] 813 x2 = [_ for _ in X] 814 x2[j] -= epsilon * sX[j] 815 J[j,:] = (f(x1) - f(x2)) / (2 * epsilon * sX[j]) 816 817 _names = [q for q in minresult.params] 818 _covar = J.T @ minresult.covar @ J 819 _se = np.diag(_covar)**.5 820 _correl = _covar.copy() 821 for k,s in enumerate(_se): 822 if s: 823 _correl[k,:] /= s 824 _correl[:,k] /= s 825 826 if named: 827 _covar = {i: {j:_covar[i,j] for j in minresult.params} for i in minresult.params} 828 _se = {i: _se[i] for i in minresult.params} 829 _correl = {i: {j:_correl[i,j] for j in minresult.params} for i in minresult.params} 830 831 return _names, _covar, _se, _correl 832 833 834class D4xdata(list): 835 ''' 836 Store and process data for a large set of Δ47 and/or Δ48 837 analyses, 

class D4xdata(list):
	'''
	Store and process data for a large set of Δ47 and/or Δ48
	analyses, usually comprising more than one analytical session.
	'''

	### 17O CORRECTION PARAMETERS
	R13_VPDB = 0.01118 # (Chang & Li, 1990)
	'''
	Absolute (13C/12C) ratio of VPDB.
	By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
	'''

	R18_VSMOW = 0.0020052 # (Baertschi, 1976)
	'''
	Absolute (18O/16O) ratio of VSMOW.
	By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
	'''

	LAMBDA_17 = 0.528 # (Barkan & Luz, 2005)
	'''
	Mass-dependent exponent for triple oxygen isotopes.
	By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
	'''

	R17_VSMOW = 0.00038475 # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
	'''
	Absolute (17O/16O) ratio of VSMOW.
	By default equal to 0.00038475
	([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
	rescaled to `R13_VPDB`)
	'''

	R18_VPDB = R18_VSMOW * 1.03092
	'''
	Absolute (18O/16O) ratio of VPDB.
	By definition equal to `R18_VSMOW * 1.03092`.
	'''

	R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
	'''
	Absolute (17O/16O) ratio of VPDB.
	By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
	'''

	LEVENE_REF_SAMPLE = 'ETH-3'
	'''
	After the Δ4x standardization step, each sample is tested to
	assess whether the Δ4x variance within all analyses for that
	sample differs significantly from that observed for a given reference
	sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
	which yields a p-value corresponding to the null hypothesis that the
	underlying variances are equal).

	`LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
	sample should be used as a reference for this test.
	'''

	ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6) # (Kim et al., 2007, calcite)
	'''
	Specifies the 18O/16O fractionation factor generally applicable
	to acid reactions in the dataset. Currently used by `D4xdata.wg()`
	and `D4xdata.standardize_d18O()`.

	By default equal to 1.008129 (calcite reacted at 90 °C,
	[Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
	'''

	Nominal_d13C_VPDB = {
		'ETH-1': 2.02,
		'ETH-2': -10.17,
		'ETH-3': 1.71,
		} # (Bernasconi et al., 2018)
	'''
	Nominal δ13C_VPDB values assigned to carbonate standards, used by
	`D4xdata.standardize_d13C()`.

	By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
	'''

	Nominal_d18O_VPDB = {
		'ETH-1': -2.19,
		'ETH-2': -18.69,
		'ETH-3': -1.78,
		} # (Bernasconi et al., 2018)
	'''
	Nominal δ18O_VPDB values assigned to carbonate standards, used by
	`D4xdata.standardize_d18O()`.

	By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
	'''
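
	# Usage sketch (illustrative): the class-level parameters above may be
	# overridden on an instance before processing, e.g. to register an in-house
	# carbonate standard (here 'MYREF', with made-up nominal values) or to
	# change the reference sample for Levene's test:
	#
	#     mydata = D47data()
	#     mydata.Nominal_d13C_VPDB = {**mydata.Nominal_d13C_VPDB, 'MYREF': 1.23}
	#     mydata.Nominal_d18O_VPDB = {**mydata.Nominal_d18O_VPDB, 'MYREF': -4.56}
	#     mydata.LEVENE_REF_SAMPLE = 'ETH-1'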

	d13C_STANDARDIZATION_METHOD = '2pt'
	'''
	Method by which to standardize δ13C values:

	+ `'none'`: do not apply any δ13C standardization.
	+ `'1pt'`: within each session, offset all initial δ13C values so as to
	minimize the difference between final δ13C_VPDB values and
	`Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
	+ `'2pt'`: within each session, apply an affine transformation to all δ13C
	values so as to minimize the difference between final δ13C_VPDB
	values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
	is defined).
	'''

	d18O_STANDARDIZATION_METHOD = '2pt'
	'''
	Method by which to standardize δ18O values:

	+ `'none'`: do not apply any δ18O standardization.
	+ `'1pt'`: within each session, offset all initial δ18O values so as to
	minimize the difference between final δ18O_VPDB values and
	`Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
	+ `'2pt'`: within each session, apply an affine transformation to all δ18O
	values so as to minimize the difference between final δ18O_VPDB
	values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
	is defined).
	'''

	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
		'''
		**Parameters**

		+ `l`: a list of dictionaries, with each dictionary including at least the keys
		`Sample`, `d45`, `d46`, and `d47` or `d48`.
		+ `mass`: `'47'` or `'48'`
		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
		+ `session`: define session name for analyses without a `Session` key
		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.

		Returns a `D4xdata` object derived from `list`.
		'''
		self._4x = mass
		self.verbose = verbose
		self.prefix = 'D4xdata'
		self.logfile = logfile
		list.__init__(self, l)
		self.Nf = None
		self.repeatability = {}
		self.refresh(session = session)


	def make_verbal(oldfun):
		'''
		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
		'''
		@wraps(oldfun)
		def newfun(*args, verbose = '', **kwargs):
			myself = args[0]
			oldprefix = myself.prefix
			myself.prefix = oldfun.__name__
			if verbose != '':
				oldverbose = myself.verbose
				myself.verbose = verbose
			out = oldfun(*args, **kwargs)
			myself.prefix = oldprefix
			if verbose != '':
				myself.verbose = oldverbose
			return out
		return newfun


	def msg(self, txt):
		'''
		Log a message to `self.logfile`, and print it out if `verbose = True`
		'''
		self.log(txt)
		if self.verbose:
			print(f'{f"[{self.prefix}]":<16} {txt}')


	def vmsg(self, txt):
		'''
		Log a message to `self.logfile` and print it out
		'''
		self.log(txt)
		print(txt)


	def log(self, *txts):
		'''
		Log a message to `self.logfile`
		'''
		if self.logfile:
			with open(self.logfile, 'a') as fid:
				for txt in txts:
					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
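
	# Usage sketch (illustrative; all values and file names are placeholders):
	# besides reading a csv file, a D4xdata object may be built directly from a
	# list of dictionaries:
	#
	#     mydata = D47data([
	#         {'Sample': 'ETH-1', 'd45': 5.795, 'd46': 11.628, 'd47': 16.894},
	#         {'Sample': 'ETH-2', 'd45': -6.059, 'd46': -4.817, 'd47': -11.635},
	#         ], session = 'Session01', logfile = 'D47crunch.log', verbose = True)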

	def refresh(self, session = 'mySession'):
		'''
		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
		'''
		self.fill_in_missing_info(session = session)
		self.refresh_sessions()
		self.refresh_samples()


	def refresh_sessions(self):
		'''
		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
		to `False` for all sessions.
		'''
		self.sessions = {
			s: {'data': [r for r in self if r['Session'] == s]}
			for s in sorted({r['Session'] for r in self})
			}
		for s in self.sessions:
			self.sessions[s]['scrambling_drift'] = False
			self.sessions[s]['slope_drift'] = False
			self.sessions[s]['wg_drift'] = False
			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD


	def refresh_samples(self):
		'''
		Define `self.samples`, `self.anchors`, and `self.unknowns`.
		'''
		self.samples = {
			s: {'data': [r for r in self if r['Sample'] == s]}
			for s in sorted({r['Sample'] for r in self})
			}
		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}


	def read(self, filename, sep = '', session = ''):
		'''
		Read file in csv format to load data into a `D47data` object.

		In the csv file, spaces before and after field separators (`','` by default)
		are optional. Each line corresponds to a single analysis.

		The required fields are:

		+ `UID`: a unique identifier
		+ `Session`: an identifier for the analytical session
		+ `Sample`: a sample identifier
		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas
		`d47`, `d48` and `d49` that are not provided are set to NaN by default.

		**Parameters**

		+ `filename`: the path of the file to read
		+ `sep`: csv separator delimiting the fields
		+ `session`: set `Session` field to this string for all analyses
		'''
		with open(filename) as fid:
			self.input(fid.read(), sep = sep, session = session)
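
	# Usage sketch (illustrative; the file name and session name are placeholders):
	#
	#     mydata = D47data(verbose = True)
	#     mydata.read('rawdata.csv', session = 'Session01')  # csv file lacking a Session column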

	def input(self, txt, sep = '', session = ''):
		'''
		Read `txt` string in csv format to load analysis data into a `D47data` object.

		In the csv string, spaces before and after field separators (`','` by default)
		are optional. Each line corresponds to a single analysis.

		The required fields are:

		+ `UID`: a unique identifier
		+ `Session`: an identifier for the analytical session
		+ `Sample`: a sample identifier
		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas
		`d47`, `d48` and `d49` that are not provided are set to NaN by default.

		**Parameters**

		+ `txt`: the csv string to read
		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
		whichever appears most often in `txt`.
		+ `session`: set `Session` field to this string for all analyses
		'''
		if sep == '':
			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]

		if session != '':
			for r in data:
				r['Session'] = session

		self += data
		self.refresh()


	@make_verbal
	def wg(self,
		samples = None,
		session_groups = None,
		):
		'''
		Compute bulk composition of the working gas for each session based (by default)
		on the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
		`self.Nominal_d18O_VPDB`.

		**Parameters**

		+ `samples`: a list specifying the subset of samples (defined in both
		`self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`) which will be considered
		when computing the working gas. By default, use all samples defined in both
		`self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.
		+ `session_groups`: a list of lists of sessions
		(e.g., `[['session1', 'session2'], ['session3', 'session4', 'session5']]`)
		specifying which session groups, if any, have the exact same WG composition.
		If set to `'all'`, force all sessions to have the same WG composition (use with
		caution and on short time scales, since the WG may drift slowly over long time scales).
		'''

		self.msg('Computing WG composition:')

		a18_acid = self.ALPHA_18O_ACID_REACTION

		if samples is None:
			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
		if session_groups is None:
			session_groups = [[s] for s in self.sessions]
		elif session_groups == 'all':
			session_groups = [[s for s in self.sessions]]

		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
		R45R46_standards = {}
		for sample in samples:
			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid

			C12_s = 1 / (1 + R13_s)
			C13_s = R13_s / (1 + R13_s)
			C16_s = 1 / (1 + R17_s + R18_s)
			C17_s = R17_s / (1 + R17_s + R18_s)
			C18_s = R18_s / (1 + R17_s + R18_s)

			C626_s = C12_s * C16_s ** 2
			C627_s = 2 * C12_s * C16_s * C17_s
			C628_s = 2 * C12_s * C16_s * C18_s
			C636_s = C13_s * C16_s ** 2
			C637_s = 2 * C13_s * C16_s * C17_s
			C727_s = C12_s * C17_s ** 2

			R45_s = (C627_s + C636_s) / C626_s
			R46_s = (C628_s + C637_s + C727_s) / C626_s
			R45R46_standards[sample] = (R45_s, R46_s)

		for sg in session_groups:
			db = [r for s in sg for r in self.sessions[s]['data'] if r['Sample'] in samples]
			assert db, f'No sample from {samples} found in session group {sg}.'

			X = [r['d45'] for r in db]
			Y = [R45R46_standards[r['Sample']][0] for r in db]
			x1, x2 = np.min(X), np.max(X)

			if x1 < x2:
				wgcoord = x1/(x1-x2)
			else:
				wgcoord = 999

			if wgcoord < -.5 or wgcoord > 1.5:
				# unreasonable to extrapolate to d45 = 0
				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
			else:
				# d45 = 0 is reasonably well bracketed
				R45_wg = np.polyfit(X, Y, 1)[1]

			X = [r['d46'] for r in db]
			Y = [R45R46_standards[r['Sample']][1] for r in db]
			x1, x2 = np.min(X), np.max(X)

			if x1 < x2:
				wgcoord = x1/(x1-x2)
			else:
				wgcoord = 999

			if wgcoord < -.5 or wgcoord > 1.5:
				# unreasonable to extrapolate to d46 = 0
				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
			else:
				# d46 = 0 is reasonably well bracketed
				R46_wg = np.polyfit(X, Y, 1)[1]

			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)

			for s in sg:
				self.msg(f'Session {s} WG: δ13C_VPDB = {d13Cwg_VPDB:.3f} δ18O_VSMOW = {d18Owg_VSMOW:.3f}')

				self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
				self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
				for r in self.sessions[s]['data']:
					r['d13Cwg_VPDB'] = d13Cwg_VPDB
					r['d18Owg_VSMOW'] = d18Owg_VSMOW


	def compute_bulk_delta(self, R45, R46, D17O = 0):
		'''
		Compute δ13C_VPDB and δ18O_VSMOW,
		by solving the generalized form of equation (17) from
		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
		solving the corresponding second-order Taylor polynomial.
		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
		'''

		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17

		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
		C = 2 * self.R18_VSMOW
		D = -R46

		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
		cc = A + B + C + D

		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)

		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
		R17 = K * R18 ** self.LAMBDA_17
		R13 = R45 - 2 * R17

		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)

		return d13C_VPDB, d18O_VSMOW


	@make_verbal
	def crunch(self, verbose = ''):
		'''
		Compute bulk composition and raw clumped isotope anomalies for all analyses.
		'''
		for r in self:
			self.compute_bulk_and_clumping_deltas(r)
		self.standardize_d13C()
		self.standardize_d18O()
		self.msg(f'Crunched {len(self)} analyses.')
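
	# Usage sketch (illustrative; 'S01' and 'S02' are placeholder session names):
	#
	#     mydata.wg()                                    # one WG composition per session (default)
	#     mydata.wg(session_groups = 'all')              # single WG composition for all sessions
	#     mydata.wg(session_groups = [['S01', 'S02']])   # tie only these two sessions together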

	def fill_in_missing_info(self, session = 'mySession'):
		'''
		Fill in optional fields with default values
		'''
		for i,r in enumerate(self):
			if 'D17O' not in r:
				r['D17O'] = 0.
			if 'UID' not in r:
				r['UID'] = f'{i+1}'
			if 'Session' not in r:
				r['Session'] = session
			for k in ['d47', 'd48', 'd49']:
				if k not in r:
					r[k] = np.nan


	def standardize_d13C(self):
		'''
		Perform δ13C standardization within each session `s` according to
		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`,
		but may be redefined arbitrarily at a later stage.
		'''
		for s in self.sessions:
			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
				X,Y = zip(*XY)
				if self.sessions[s]['d13C_standardization_method'] == '1pt':
					offset = np.mean(Y) - np.mean(X)
					for r in self.sessions[s]['data']:
						r['d13C_VPDB'] += offset
				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
					a,b = np.polyfit(X,Y,1)
					for r in self.sessions[s]['data']:
						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b


	def standardize_d18O(self):
		'''
		Perform δ18O standardization within each session `s` according to
		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
		which is defined by default by `D47data.refresh_sessions()` as equal to
		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
		'''
		for s in self.sessions:
			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
				X,Y = zip(*XY)
				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
				if self.sessions[s]['d18O_standardization_method'] == '1pt':
					offset = np.mean(Y) - np.mean(X)
					for r in self.sessions[s]['data']:
						r['d18O_VSMOW'] += offset
				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
					a,b = np.polyfit(X,Y,1)
					for r in self.sessions[s]['data']:
						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
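
	# Usage sketch (illustrative; 'Session01' is a placeholder): the per-session
	# standardization methods may be redefined between read() and crunch():
	#
	#     mydata.sessions['Session01']['d13C_standardization_method'] = '1pt'
	#     mydata.sessions['Session01']['d18O_standardization_method'] = 'none'
	#     mydata.crunch()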

	def compute_bulk_and_clumping_deltas(self, r):
		'''
		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
		'''

		# Compute working gas R13, R18, and isobar ratios
		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)

		# Compute analyte isobar ratios
		R45 = (1 + r['d45'] / 1000) * R45_wg
		R46 = (1 + r['d46'] / 1000) * R46_wg
		R47 = (1 + r['d47'] / 1000) * R47_wg
		R48 = (1 + r['d48'] / 1000) * R48_wg
		R49 = (1 + r['d49'] / 1000) * R49_wg

		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW

		# Compute stochastic isobar ratios of the analyte
		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
			R13, R18, D17O = r['D17O']
			)

		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
		# and raise a warning if the corresponding anomalies exceed 0.05 ppm (5e-8).
		if (R45 / R45stoch - 1) > 5e-8:
			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
		if (R46 / R46stoch - 1) > 5e-8:
			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')

		# Compute raw clumped isotope anomalies
		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
		r['D49raw'] = 1000 * (R49 / R49stoch - 1)


	def compute_isobar_ratios(self, R13, R18, D17O = 0, D47 = 0, D48 = 0, D49 = 0):
		'''
		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
		'''

		# Compute R17
		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17

		# Compute isotope concentrations
		C12 = (1 + R13) ** -1
		C13 = C12 * R13
		C16 = (1 + R17 + R18) ** -1
		C17 = C16 * R17
		C18 = C16 * R18

		# Compute stochastic isotopologue concentrations
		C626 = C16 * C12 * C16
		C627 = C16 * C12 * C17 * 2
		C628 = C16 * C12 * C18 * 2
		C636 = C16 * C13 * C16
		C637 = C16 * C13 * C17 * 2
		C638 = C16 * C13 * C18 * 2
		C727 = C17 * C12 * C17
		C728 = C17 * C12 * C18 * 2
		C737 = C17 * C13 * C17
		C738 = C17 * C13 * C18 * 2
		C828 = C18 * C12 * C18
		C838 = C18 * C13 * C18

		# Compute stochastic isobar ratios
		R45 = (C636 + C627) / C626
		R46 = (C628 + C637 + C727) / C626
		R47 = (C638 + C728 + C737) / C626
		R48 = (C738 + C828) / C626
		R49 = C838 / C626

		# Account for clumped-isotope anomalies (departures from the stochastic distribution)
		R47 *= 1 + D47 / 1000
		R48 *= 1 + D48 / 1000
		R49 *= 1 + D49 / 1000

		# Return isobar ratios
		return R45, R46, R47, R48, R49


	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
		'''
		Split unknown samples by UID (treat all analyses as different samples)
		or by session (treat analyses of a given sample in different sessions as
		different samples).

		**Parameters**

		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
		+ `grouping`: `'by_uid'` | `'by_session'`
		'''
		if samples_to_split == 'all':
			samples_to_split = [s for s in self.unknowns]
		gkeys = {'by_uid': 'UID', 'by_session': 'Session'}
		self.grouping = grouping.lower()
		if self.grouping in gkeys:
			gkey = gkeys[self.grouping]
			for r in self:
				if r['Sample'] in samples_to_split:
					r['Sample_original'] = r['Sample']
					r['Sample'] = f"{r['Sample']}__{r[gkey]}"
				elif r['Sample'] in self.unknowns:
					r['Sample_original'] = r['Sample']
			self.refresh_samples()

	def unsplit_samples(self, tables = False):
		'''
		Reverse the effects of `D47data.split_samples()`.

		This should only be used after `D4xdata.standardize()` with `method='pooled'`.

		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
		probably use `D4xdata.combine_samples()` instead to reverse the effects of
		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
		effects of `D47data.split_samples()` with `grouping='by_session'` (because in
		that case session-averaged Δ4x values are statistically independent).
		'''
		unknowns_old = sorted({s for s in self.unknowns})
		CM_old = self.standardization.covar[:,:]
		VD_old = self.standardization.params.valuesdict().copy()
		vars_old = self.standardization.var_names

		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})

		Ns = len(vars_old) - len(unknowns_old)
		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}

		W = np.zeros((len(vars_new), len(vars_old)))
		W[:Ns,:Ns] = np.eye(Ns)
		for u in unknowns_new:
			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
			if self.grouping == 'by_session':
				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
			elif self.grouping == 'by_uid':
				weights = [1 for s in splits]
			sw = sum(weights)
			weights = [w/sw for w in weights]
			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]

		CM_new = W @ CM_old @ W.T
		V = W @ np.array([[VD_old[k]] for k in vars_old])
		VD_new = {k:v[0] for k,v in zip(vars_new, V)}

		self.standardization.covar = CM_new
		self.standardization.params.valuesdict = lambda : VD_new
		self.standardization.var_names = vars_new

		for r in self:
			if r['Sample'] in self.unknowns:
				r['Sample_split'] = r['Sample']
				r['Sample'] = r['Sample_original']

		self.refresh_samples()
		self.consolidate_samples()
		self.repeatabilities()

		if tables:
			self.table_of_analyses()
			self.table_of_samples()


	def assign_timestamps(self):
		'''
		Assign a time field `t` of type `float` to each analysis.

		If `TimeTag` is one of the data fields, `t` is equal within a given session
		to `TimeTag` minus the mean value of `TimeTag` for that session.
		Otherwise, each analysis is assigned a default `TimeTag` equal to its index
		within the session, and `t` is defined as above.
		'''
		for session in self.sessions:
			sdata = self.sessions[session]['data']
			try:
				t0 = np.mean([r['TimeTag'] for r in sdata])
				for r in sdata:
					r['t'] = r['TimeTag'] - t0
			except KeyError:
				t0 = (len(sdata)-1)/2
				for t,r in enumerate(sdata):
					r['t'] = t - t0


	def report(self):
		'''
		Print a report on the standardization fit.
		Only applicable after `D4xdata.standardize(method='pooled')`.
		'''
		report_fit(self.standardization)
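
	# Usage sketch (illustrative; 'MYSAMPLE-1' is a placeholder): test whether an
	# unknown behaves consistently across sessions, then recombine the splits:
	#
	#     mydata.split_samples(['MYSAMPLE-1'], grouping = 'by_session')
	#     mydata.standardize()          # method = 'pooled' by default
	#     mydata.unsplit_samples(tables = True)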

	def combine_samples(self, sample_groups):
		'''
		Combine analyses of different samples to compute weighted average Δ4x
		and new error (co)variances corresponding to the groups defined by the `sample_groups`
		dictionary.

		Caution: samples are weighted by number of replicate analyses, which is a
		reasonable default behavior but is not always optimal (e.g., in the case of strongly
		correlated analytical errors for one or more samples).

		Returns a tuple of:

		+ the list of group names
		+ an array of the corresponding Δ4x values
		+ the corresponding (co)variance matrix

		**Parameters**

		+ `sample_groups`: a dictionary of the form:
		```py
		{'group1': ['sample_1', 'sample_2'],
		 'group2': ['sample_3', 'sample_4', 'sample_5']}
		```
		'''

		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
		groups = sorted(sample_groups.keys())
		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
		W = np.array([
			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
			for j in groups])
		D4x_new = W @ D4x_old
		CM_new = W @ CM_old @ W.T

		return groups, D4x_new[:,0], CM_new
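
	# Usage sketch (illustrative; group and sample names are placeholders):
	#
	#     groups, D47_avg, CM = mydata.combine_samples(
	#         {'group1': ['MYSAMPLE-1', 'MYSAMPLE-2']}
	#         )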

	@make_verbal
	def standardize(self,
		method = 'pooled',
		weighted_sessions = [],
		consolidate = True,
		consolidate_tables = False,
		consolidate_plots = False,
		constraints = {},
		):
		'''
		Compute absolute Δ4x values for all replicate analyses and for sample averages.
		If the `method` argument is set to `'pooled'`, the standardization processes all sessions
		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
		i.e. that their true Δ4x value does not change between sessions
		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If the `method` argument is set to
		`'indep_sessions'`, the standardization processes each session independently, based only
		on anchor analyses.
		'''

		self.standardization_method = method
		self.assign_timestamps()

		if method == 'pooled':
			if weighted_sessions:
				for session_group in weighted_sessions:
					if self._4x == '47':
						X = D47data([r for r in self if r['Session'] in session_group])
					elif self._4x == '48':
						X = D48data([r for r in self if r['Session'] in session_group])
					X.Nominal_D4x = self.Nominal_D4x.copy()
					X.refresh()
					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
					w = np.sqrt(result.redchi)
					self.msg(f'Session group {session_group} MRSWD = {w:.4f}')
					for r in X:
						r[f'wD{self._4x}raw'] *= w
			else:
				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
				for r in self:
					r[f'wD{self._4x}raw'] = 1.

			params = Parameters()
			for k,session in enumerate(self.sessions):
				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
				s = pf(session)
				params.add(f'a_{s}', value = 0.9)
				params.add(f'b_{s}', value = 0.)
				params.add(f'c_{s}', value = -0.9)
				params.add(f'a2_{s}', value = 0.,
# 					vary = self.sessions[session]['scrambling_drift'],
					)
				params.add(f'b2_{s}', value = 0.,
# 					vary = self.sessions[session]['slope_drift'],
					)
				params.add(f'c2_{s}', value = 0.,
# 					vary = self.sessions[session]['wg_drift'],
					)
				if not self.sessions[session]['scrambling_drift']:
					params[f'a2_{s}'].expr = '0'
				if not self.sessions[session]['slope_drift']:
					params[f'b2_{s}'].expr = '0'
				if not self.sessions[session]['wg_drift']:
					params[f'c2_{s}'].expr = '0'

			for sample in self.unknowns:
				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)

			for k in constraints:
				params[k].expr = constraints[k]

			def residuals(p):
				R = []
				for r in self:
					session = pf(r['Session'])
					sample = pf(r['Sample'])
					if r['Sample'] in self.Nominal_D4x:
						R += [ (
							r[f'D{self._4x}raw'] - (
								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
								+ p[f'b_{session}'] * r[f'd{self._4x}']
								+ p[f'c_{session}']
								+ r['t'] * (
									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
									+ p[f'b2_{session}'] * r[f'd{self._4x}']
									+ p[f'c2_{session}']
									)
								)
							) / r[f'wD{self._4x}raw'] ]
					else:
						R += [ (
							r[f'D{self._4x}raw'] - (
								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
								+ p[f'b_{session}'] * r[f'd{self._4x}']
								+ p[f'c_{session}']
								+ r['t'] * (
									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
									+ p[f'b2_{session}'] * r[f'd{self._4x}']
									+ p[f'c2_{session}']
									)
								)
							) / r[f'wD{self._4x}raw'] ]
				return R

			M = Minimizer(residuals, params)
			result = M.least_squares()
			self.Nf = result.nfree
			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
			new_names, new_covar, new_se = _fullcovar(result)[:3]
			result.var_names = new_names
			result.covar = new_covar

			for r in self:
				s = pf(r["Session"])
				a = result.params.valuesdict()[f'a_{s}']
				b = result.params.valuesdict()[f'b_{s}']
				c = result.params.valuesdict()[f'c_{s}']
				a2 = result.params.valuesdict()[f'a2_{s}']
				b2 = result.params.valuesdict()[f'b2_{s}']
				c2 = result.params.valuesdict()[f'c2_{s}']
				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])

			self.standardization = result

			for session in self.sessions:
				self.sessions[session]['Np'] = 3
				for k in ['scrambling', 'slope', 'wg']:
					if self.sessions[session][f'{k}_drift']:
						self.sessions[session]['Np'] += 1

			if consolidate:
				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
			return result

		elif method == 'indep_sessions':

			if weighted_sessions:
				for session_group in weighted_sessions:
					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
					X.Nominal_D4x = self.Nominal_D4x.copy()
					X.refresh()
					# This is only done to assign r['wD47raw'] for r in X:
					X.standardize(method = method, weighted_sessions = [], consolidate = False)
					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
			else:
				self.msg('All weights set to 1 ‰')
				for r in self:
					r[f'wD{self._4x}raw'] = 1

			for session in self.sessions:
				s = self.sessions[session]
				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
				s['Np'] = sum(p_active)
				sdata = s['data']

				A = np.array([
					[
						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
						1 / r[f'wD{self._4x}raw'],
						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
						r['t'] / r[f'wD{self._4x}raw']
						]
					for r in sdata if r['Sample'] in self.anchors
					])[:,p_active] # only keep columns for the active parameters
				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
				s['Na'] = Y.size
				CM = linalg.inv(A.T @ A)
				bf = (CM @ A.T @ Y).T[0,:]
				k = 0
				for n,a in zip(p_names, p_active):
					if a:
						s[n] = bf[k]
# 						self.msg(f'{n} = {bf[k]}')
						k += 1
					else:
						s[n] = 0.
# 						self.msg(f'{n} = 0.0')

				for r in sdata:
					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])

				s['CM'] = np.zeros((6,6))
				i = 0
				k_active = [j for j,a in enumerate(p_active) if a]
				for j,a in enumerate(p_active):
					if a:
						s['CM'][j,k_active] = CM[i,:]
						i += 1

			if not weighted_sessions:
				w = self.rmswd()['rmswd']
				for r in self:
					r[f'wD{self._4x}'] *= w
					r[f'wD{self._4x}raw'] *= w
				for session in self.sessions:
					self.sessions[session]['CM'] *= w**2

			for session in self.sessions:
				s = self.sessions[session]
				s['SE_a'] = s['CM'][0,0]**.5
				s['SE_b'] = s['CM'][1,1]**.5
				s['SE_c'] = s['CM'][2,2]**.5
				s['SE_a2'] = s['CM'][3,3]**.5
				s['SE_b2'] = s['CM'][4,4]**.5
				s['SE_c2'] = s['CM'][5,5]**.5

			if not weighted_sessions:
				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
			else:
				self.Nf = 0
				for sg in weighted_sessions:
					self.Nf += self.rmswd(sessions = sg)['Nf']

			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)

			avgD4x = {
				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
				for sample in self.samples
				}
			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
			rD4x = (chi2/self.Nf)**.5
			self.repeatability[f'sigma_{self._4x}'] = rD4x

			if consolidate:
				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
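
	# Usage sketch (illustrative; session and sample names are placeholders):
	#
	#     mydata.sessions['Session01']['wg_drift'] = True   # allow parameter c to drift with time
	#     mydata.standardize(
	#         method = 'pooled',
	#         # force two unknowns to share a single fitted value; parameter names
	#         # follow the 'D47_<sample>' pattern, with sample names sanitized by pf():
	#         constraints = {'D47_MYSAMPLE_2': 'D47_MYSAMPLE_1'},
	#         )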

	def standardization_error(self, session, d4x, D4x, t = 0):
		'''
		Compute standardization error for a given session and
		(δ47, Δ47) composition.
		'''
		a = self.sessions[session]['a']
		b = self.sessions[session]['b']
		c = self.sessions[session]['c']
		a2 = self.sessions[session]['a2']
		b2 = self.sessions[session]['b2']
		c2 = self.sessions[session]['c2']
		CM = self.sessions[session]['CM']

		x, y = D4x, d4x
		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
		# x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
		dxdy = -(b+b2*t) / (a+a2*t)
		dxdz = 1. / (a+a2*t)
		dxda = -x / (a+a2*t)
		dxdb = -y / (a+a2*t)
		dxdc = -1. / (a+a2*t)
		dxda2 = -x * t / (a+a2*t)
		dxdb2 = -y * t / (a+a2*t)
		dxdc2 = -t / (a+a2*t)
		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
		sx = (V @ CM @ V.T) ** .5
		return sx


	@make_verbal
	def summary(self,
		dir = 'output',
		filename = None,
		save_to_file = True,
		print_out = True,
		):
		'''
		Print out and/or save to disk a summary of the standardization results.

		**Parameters**

		+ `dir`: the directory in which to save the table
		+ `filename`: the name of the csv file to write to
		+ `save_to_file`: whether to save the table to disk
		+ `print_out`: whether to print out the table
		'''

		out = []
		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
		out += [['Model degrees of freedom', f"{self.Nf}"]]
		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
		out += [['Standardization method', self.standardization_method]]

		if save_to_file:
			if not os.path.exists(dir):
				os.makedirs(dir)
			if filename is None:
				filename = f'D{self._4x}_summary.csv'
			with open(f'{dir}/{filename}', 'w') as fid:
				fid.write(make_csv(out))
		if print_out:
			self.msg('\n' + pretty_table(out, header = 0))

	@make_verbal
	def table_of_sessions(self,
		dir = 'output',
		filename = None,
		save_to_file = True,
		print_out = True,
		output = None,
		):
		'''
		Print out and/or save to disk a table of sessions.

		**Parameters**

		+ `dir`: the directory in which to save the table
		+ `filename`: the name of the csv file to write to
		+ `save_to_file`: whether to save the table to disk
		+ `print_out`: whether to print out the table
		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
		'''
		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])

		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
		if include_a2:
			out[-1] += ['a2 ± SE']
		if include_b2:
			out[-1] += ['b2 ± SE']
		if include_c2:
			out[-1] += ['c2 ± SE']
		for session in self.sessions:
			out += [[
				session,
				f"{self.sessions[session]['Na']}",
				f"{self.sessions[session]['Nu']}",
				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
				]]
			if include_a2:
				if self.sessions[session]['scrambling_drift']:
					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
				else:
					out[-1] += ['']
			if include_b2:
				if self.sessions[session]['slope_drift']:
					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
				else:
					out[-1] += ['']
			if include_c2:
				if self.sessions[session]['wg_drift']:
					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
				else:
					out[-1] += ['']

		if save_to_file:
			if not os.path.exists(dir):
				os.makedirs(dir)
			if filename is None:
				filename = f'D{self._4x}_sessions.csv'
			with open(f'{dir}/{filename}', 'w') as fid:
				fid.write(make_csv(out))
		if print_out:
			self.msg('\n' + pretty_table(out))
		if output == 'raw':
			return out
		elif output == 'pretty':
			return pretty_table(out)

	@make_verbal
	def table_of_analyses(
		self,
		dir = 'output',
		filename = None,
		save_to_file = True,
		print_out = True,
		output = None,
		):
		'''
		Print out and/or save to disk a table of analyses.

		**Parameters**

		+ `dir`: the directory in which to save the table
		+ `filename`: the name of the csv file to write to
		+ `save_to_file`: whether to save the table to disk
		+ `print_out`: whether to print out the table
		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
		'''

		out = [['UID','Session','Sample']]
		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
		for f in extra_fields:
			out[-1] += [f[0]]
		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
		for r in self:
			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
			for f in extra_fields:
				out[-1] += [f"{r[f[0]]:{f[1]}}"]
			out[-1] += [
				f"{r['d13Cwg_VPDB']:.3f}",
				f"{r['d18Owg_VSMOW']:.3f}",
				f"{r['d45']:.6f}",
				f"{r['d46']:.6f}",
				f"{r['d47']:.6f}",
				f"{r['d48']:.6f}",
				f"{r['d49']:.6f}",
				f"{r['d13C_VPDB']:.6f}",
				f"{r['d18O_VSMOW']:.6f}",
				f"{r['D47raw']:.6f}",
				f"{r['D48raw']:.6f}",
				f"{r['D49raw']:.6f}",
				f"{r[f'D{self._4x}']:.6f}"
				]
		if save_to_file:
			if not os.path.exists(dir):
				os.makedirs(dir)
			if filename is None:
				filename = f'D{self._4x}_analyses.csv'
			with open(f'{dir}/{filename}', 'w') as fid:
				fid.write(make_csv(out))
		if print_out:
			self.msg('\n' + pretty_table(out))
		return out


	@make_verbal
	def covar_table(
		self,
		correl = False,
		dir = 'output',
		filename = None,
		save_to_file = True,
		print_out = True,
		output = None,
		):
		'''
		Print out, save to disk and/or return the variance-covariance matrix of D4x
		for all unknown samples.

		**Parameters**

		+ `correl`: if `True`, tabulate the correlation matrix instead of the (co)variance matrix
		+ `dir`: the directory in which to save the csv
		+ `filename`: the name of the csv file to write to
		+ `save_to_file`: whether to save the csv
		+ `print_out`: whether to print out the matrix
		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
		'''
		samples = sorted([u for u in self.unknowns])
		out = [[''] + samples]
		for s1 in samples:
			out.append([s1])
			for s2 in samples:
				if correl:
					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
				else:
					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')

		if save_to_file:
			if not os.path.exists(dir):
				os.makedirs(dir)
			if filename is None:
				if correl:
					filename = f'D{self._4x}_correl.csv'
				else:
					filename = f'D{self._4x}_covar.csv'
			with open(f'{dir}/{filename}', 'w') as fid:
				fid.write(make_csv(out))
		if print_out:
			self.msg('\n'+pretty_table(out))
		if output == 'raw':
			return out
		elif output == 'pretty':
			return pretty_table(out)
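
	# Usage sketch (illustrative): inspect error correlations between unknowns,
	# e.g. before computing a Δ47 difference between two samples:
	#
	#     mydata.covar_table(correl = True, save_to_file = False, output = 'pretty')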

	@make_verbal
	def table_of_samples(
		self,
		dir = 'output',
		filename = None,
		save_to_file = True,
		print_out = True,
		output = None,
		):
		'''
		Print out, save to disk and/or return a table of samples.

		**Parameters**

		+ `dir`: the directory in which to save the csv
		+ `filename`: the name of the csv file to write to
		+ `save_to_file`: whether to save the csv
		+ `print_out`: whether to print out the table
		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
		'''

		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
		for sample in self.anchors:
			out += [[
				f"{sample}",
				f"{self.samples[sample]['N']}",
				f"{self.samples[sample]['d13C_VPDB']:.2f}",
				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
				f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
				]]
		for sample in self.unknowns:
			out += [[
				f"{sample}",
				f"{self.samples[sample]['N']}",
				f"{self.samples[sample]['d13C_VPDB']:.2f}",
				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
				f"{self.samples[sample][f'D{self._4x}']:.4f}",
				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
				f"± {self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
				]]
		if save_to_file:
			if not os.path.exists(dir):
				os.makedirs(dir)
			if filename is None:
				filename = f'D{self._4x}_samples.csv'
			with open(f'{dir}/{filename}', 'w') as fid:
				fid.write(make_csv(out))
		if print_out:
			self.msg('\n'+pretty_table(out))
		if output == 'raw':
			return out
		elif output == 'pretty':
			return pretty_table(out)


	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
		'''
		Generate session plots and save them to disk.

		**Parameters**

		+ `dir`: the directory in which to save the plots
		+ `figsize`: the width and height (in inches) of each plot
		+ `filetype`: `'pdf'` or `'png'`
		+ `dpi`: resolution for PNG output
		'''
		if not os.path.exists(dir):
			os.makedirs(dir)

		for session in self.sessions:
			sp = self.plot_single_session(session, xylimits = 'constant')
			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
			ppl.close(sp.fig)
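
	# Usage sketch (illustrative):
	#
	#     mydata.plot_sessions(dir = 'output', filetype = 'png', dpi = 200)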

	@make_verbal
	def consolidate_samples(self):
		'''
		Compile various statistics for each sample.

		For each anchor sample:

		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
		+ `SE_D47` or `SE_D48`: set to zero by definition

		For each unknown sample:

		+ `D47` or `D48`: the standardized Δ4x value for this unknown
		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown

		For each anchor and unknown:

		+ `N`: the total number of analyses of this sample
		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
		variance, indicating whether the Δ4x repeatability of this sample differs significantly from
		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
		'''
		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
		for sample in self.samples:
			self.samples[sample]['N'] = len(self.samples[sample]['data'])
			if self.samples[sample]['N'] > 1:
				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])

			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])

			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
			if len(D4x_pop) > 2:
				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]

		if self.standardization_method == 'pooled':
			for sample in self.anchors:
				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
				self.samples[sample][f'SE_D{self._4x}'] = 0.
			for sample in self.unknowns:
				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
				try:
					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
				except ValueError:
					# when `sample` is constrained by self.standardize(constraints = {...}),
					# it is no longer listed in self.standardization.var_names.
					# Temporary fix: define SE as zero for now
					self.samples[sample][f'SE_D{self._4x}'] = 0.

		elif self.standardization_method == 'indep_sessions':
			for sample in self.anchors:
				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
				self.samples[sample][f'SE_D{self._4x}'] = 0.
			for sample in self.unknowns:
				self.msg(f'Consolidating sample {sample}')
				self.unknowns[sample][f'session_D{self._4x}'] = {}
				session_avg = []
				for session in self.sessions:
					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
					if sdata:
						self.msg(f'{sample} found in session {session}')
						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
						# !! TODO: sigma_s below does not account for temporal changes in standardization error
						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
				wsum = sum([weights[s] for s in weights])
				for s in weights:
					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]

		for r in self:
			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']

	def consolidate_sessions(self):
		'''
		Compute various statistics for each session.

		+ `Na`: Number of anchor analyses in the session
		+ `Nu`: Number of unknown analyses in the session
		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
		+ `a`: scrambling factor
		+ `b`: compositional slope
		+ `c`: WG offset
		+ `SE_a`: Model standard error of `a`
		+ `SE_b`: Model standard error of `b`
		+ `SE_c`: Model standard error of `c`
		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
		+ `a2`: scrambling factor drift
		+ `b2`: compositional slope drift
		+ `c2`: WG offset drift
		+ `Np`: Number of standardization parameters to fit
		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
		'''
		for session in self.sessions:
			if 'd13Cwg_VPDB' not in self.sessions[session]:
				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
			if 'd18Owg_VSMOW' not in self.sessions[session]:
				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])

			self.msg(f'Computing repeatabilities for session {session}')
			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
			self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])

		if self.standardization_method == 'pooled':
			for session in self.sessions:

				# different (better?) computation of D4x repeatability for each session:
				sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
				self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5

				self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
				i = self.standardization.var_names.index(f'a_{pf(session)}')
				self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5

				self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
				i = self.standardization.var_names.index(f'b_{pf(session)}')
				self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5

				self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
				i = self.standardization.var_names.index(f'c_{pf(session)}')
				self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5

				self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
				if self.sessions[session]['scrambling_drift']:
					i = self.standardization.var_names.index(f'a2_{pf(session)}')
					self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
				else:
					self.sessions[session]['SE_a2'] = 0.

				self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
				if self.sessions[session]['slope_drift']:
					i = self.standardization.var_names.index(f'b2_{pf(session)}')
					self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
				else:
					self.sessions[session]['SE_b2'] = 0.

				self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
				if self.sessions[session]['wg_drift']:
					i = self.standardization.var_names.index(f'c2_{pf(session)}')
					self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
				else:
					self.sessions[session]['SE_c2'] = 0.

				i = self.standardization.var_names.index(f'a_{pf(session)}')
				j = self.standardization.var_names.index(f'b_{pf(session)}')
				k = self.standardization.var_names.index(f'c_{pf(session)}')
				CM = np.zeros((6,6))
				CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
				try:
					i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
					CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
					CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
					try:
						j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
						CM[3,4] = self.standardization.covar[i2,j2]
						CM[4,3] = self.standardization.covar[j2,i2]
					except ValueError:
						pass
					try:
						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
						CM[3,5] = self.standardization.covar[i2,k2]
						CM[5,3] = self.standardization.covar[k2,i2]
					except ValueError:
						pass
				except ValueError:
					pass
				try:
					j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
					CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
					CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
					try:
						k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
						CM[4,5] = self.standardization.covar[j2,k2]
						CM[5,4] = self.standardization.covar[k2,j2]
					except ValueError:
						pass
				except ValueError:
					pass
				try:
					k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
					CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
					CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
				except ValueError:
					pass

				self.sessions[session]['CM'] = CM

		elif self.standardization_method == 'indep_sessions':
			pass # Not implemented yet


	@make_verbal
	def repeatabilities(self):
		'''
		Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
		(for all samples, for anchors, and for unknowns).
		'''
		self.msg('Computing repeatabilities for all sessions')

		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')


	@make_verbal
	def consolidate(self, tables = True, plots = True):
		'''
		Collect information about samples, sessions and repeatabilities.
		'''
		self.consolidate_samples()
		self.consolidate_sessions()
		self.repeatabilities()

		if tables:
			self.summary()
			self.table_of_sessions()
			self.table_of_analyses()
			self.table_of_samples()

		if plots:
			self.plot_sessions()
2402 ''' 2403 if samples == 'all samples': 2404 mysamples = [k for k in self.samples] 2405 elif samples == 'anchors': 2406 mysamples = [k for k in self.anchors] 2407 elif samples == 'unknowns': 2408 mysamples = [k for k in self.unknowns] 2409 else: 2410 mysamples = samples 2411 2412 if sessions == 'all sessions': 2413 sessions = [k for k in self.sessions] 2414 2415 chisq, Nf = 0, 0 2416 for sample in mysamples : 2417 G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2418 if len(G) > 1 : 2419 X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G]) 2420 Nf += (len(G) - 1) 2421 chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G]) 2422 r = (chisq / Nf)**.5 if Nf > 0 else 0 2423 self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.') 2424 return {'rmswd': r, 'chisq': chisq, 'Nf': Nf} 2425 2426 2427 @make_verbal 2428 def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'): 2429 ''' 2430 Compute the repeatability of `[r[key] for r in self]` 2431 ''' 2432 2433 if samples == 'all samples': 2434 mysamples = [k for k in self.samples] 2435 elif samples == 'anchors': 2436 mysamples = [k for k in self.anchors] 2437 elif samples == 'unknowns': 2438 mysamples = [k for k in self.unknowns] 2439 else: 2440 mysamples = samples 2441 2442 if sessions == 'all sessions': 2443 sessions = [k for k in self.sessions] 2444 2445 if key in ['D47', 'D48']: 2446 # Full disclosure: the definition of Nf is tricky/debatable 2447 G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions] 2448 chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum() 2449 Nf = len(G) 2450# print(f'len(G) = {Nf}') 2451 Nf -= len([s for s in mysamples if s in self.unknowns]) 2452# print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider') 2453 for session in sessions: 2454 Np = len([ 2455 _ for _ in self.standardization.params 2456 if ( 2457 self.standardization.params[_].expr is not None 2458 and ( 2459 (_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session)) 2460 or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session)) 2461 ) 2462 ) 2463 ]) 2464# print(f'session {session}: {Np} parameters to consider') 2465 Na = len({ 2466 r['Sample'] for r in self.sessions[session]['data'] 2467 if r['Sample'] in self.anchors and r['Sample'] in mysamples 2468 }) 2469# print(f'session {session}: {Na} different anchors in that session') 2470 Nf -= min(Np, Na) 2471# print(f'Nf = {Nf}') 2472 2473# for sample in mysamples : 2474# X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2475# if len(X) > 1 : 2476# chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ]) 2477# if sample in self.unknowns: 2478# Nf += len(X) - 1 2479# else: 2480# Nf += len(X) 2481# if samples in ['anchors', 'all samples']: 2482# Nf -= sum([self.sessions[s]['Np'] for s in sessions]) 2483 r = (chisq / Nf)**.5 if Nf > 0 else 0 2484 2485 else: # if key not in ['D47', 'D48'] 2486 chisq, Nf = 0, 0 2487 for sample in mysamples : 2488 X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2489 if len(X) > 1 : 2490 Nf += len(X) - 1 2491 chisq += np.sum([ (x-np.mean(X))**2 for x in X ]) 2492 r = (chisq / Nf)**.5 if Nf > 0 else 0 2493 2494 self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.') 2495 return r 2496 2497 def sample_average(self, samples, weights = 'equal', normalize = True): 2498 ''' 2499 Weighted average Δ4x value of a group of samples, 
accounting for covariance. 2500 2501 Returns the weighed average Δ4x value and associated SE 2502 of a group of samples. Weights are equal by default. If `normalize` is 2503 true, `weights` will be rescaled so that their sum equals 1. 2504 2505 **Examples** 2506 2507 ```python 2508 self.sample_average(['X','Y'], [1, 2]) 2509 ``` 2510 2511 returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3, 2512 where Δ4x(X) and Δ4x(Y) are the average Δ4x 2513 values of samples X and Y, respectively. 2514 2515 ```python 2516 self.sample_average(['X','Y'], [1, -1], normalize = False) 2517 ``` 2518 2519 returns the value and SE of the difference Δ4x(X) - Δ4x(Y). 2520 ''' 2521 if weights == 'equal': 2522 weights = [1/len(samples)] * len(samples) 2523 2524 if normalize: 2525 s = sum(weights) 2526 if s: 2527 weights = [w/s for w in weights] 2528 2529 try: 2530# indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples] 2531# C = self.standardization.covar[indices,:][:,indices] 2532 C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples]) 2533 X = [self.samples[sample][f'D{self._4x}'] for sample in samples] 2534 return correlated_sum(X, C, weights) 2535 except ValueError: 2536 return (0., 0.) 2537 2538 2539 def sample_D4x_covar(self, sample1, sample2 = None): 2540 ''' 2541 Covariance between Δ4x values of samples 2542 2543 Returns the error covariance between the average Δ4x values of two 2544 samples. If if only `sample_1` is specified, or if `sample_1 == sample_2`), 2545 returns the Δ4x variance for that sample. 2546 ''' 2547 if sample2 is None: 2548 sample2 = sample1 2549 if self.standardization_method == 'pooled': 2550 i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}') 2551 j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}') 2552 return self.standardization.covar[i, j] 2553 elif self.standardization_method == 'indep_sessions': 2554 if sample1 == sample2: 2555 return self.samples[sample1][f'SE_D{self._4x}']**2 2556 else: 2557 c = 0 2558 for session in self.sessions: 2559 sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1] 2560 sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2] 2561 if sdata1 and sdata2: 2562 a = self.sessions[session]['a'] 2563 # !! TODO: CM below does not account for temporal changes in standardization parameters 2564 CM = self.sessions[session]['CM'][:3,:3] 2565 avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1]) 2566 avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1]) 2567 avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2]) 2568 avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2]) 2569 c += ( 2570 self.unknowns[sample1][f'session_D{self._4x}'][session][2] 2571 * self.unknowns[sample2][f'session_D{self._4x}'][session][2] 2572 * np.array([[avg_D4x_1, avg_d4x_1, 1]]) 2573 @ CM 2574 @ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T 2575 ) / a**2 2576 return float(c) 2577 2578 def sample_D4x_correl(self, sample1, sample2 = None): 2579 ''' 2580 Correlation between Δ4x errors of samples 2581 2582 Returns the error correlation between the average Δ4x values of two samples. 2583 ''' 2584 if sample2 is None or sample2 == sample1: 2585 return 1. 
2586 return ( 2587 self.sample_D4x_covar(sample1, sample2) 2588 / self.unknowns[sample1][f'SE_D{self._4x}'] 2589 / self.unknowns[sample2][f'SE_D{self._4x}'] 2590 ) 2591 2592 def plot_single_session(self, 2593 session, 2594 kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4), 2595 kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4), 2596 kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75), 2597 kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75), 2598 kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75), 2599 xylimits = 'free', # | 'constant' 2600 x_label = None, 2601 y_label = None, 2602 error_contour_interval = 'auto', 2603 fig = 'new', 2604 ): 2605 ''' 2606 Generate plot for a single session 2607 ''' 2608 if x_label is None: 2609 x_label = f'δ$_{{{self._4x}}}$ (‰)' 2610 if y_label is None: 2611 y_label = f'Δ$_{{{self._4x}}}$ (‰)' 2612 2613 out = _SessionPlot() 2614 anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]] 2615 unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]] 2616 anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors] 2617 anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors] 2618 unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns] 2619 unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns] 2620 anchor_avg = (np.array([ np.array([ 2621 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2622 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2623 ]) for sample in anchors]).T, 2624 np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T) 2625 unknown_avg = (np.array([ np.array([ 2626 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2627 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2628 ]) for sample in unknowns]).T, 2629 np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T) 2630 2631 2632 if fig == 'new': 2633 out.fig = ppl.figure(figsize = (6,6)) 2634 ppl.subplots_adjust(.1,.1,.9,.9) 2635 2636 out.anchor_analyses, = ppl.plot( 2637 anchors_d, 2638 anchors_D, 2639 **kw_plot_anchors) 2640 out.unknown_analyses, = ppl.plot( 2641 unknowns_d, 2642 unknowns_D, 2643 **kw_plot_unknowns) 2644 out.anchor_avg = ppl.plot( 2645 *anchor_avg, 2646 **kw_plot_anchor_avg) 2647 out.unknown_avg = ppl.plot( 2648 *unknown_avg, 2649 **kw_plot_unknown_avg) 2650 if xylimits == 'constant': 2651 x = [r[f'd{self._4x}'] for r in self] 2652 y = [r[f'D{self._4x}'] for r in self] 2653 x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y) 2654 w, h = x2-x1, y2-y1 2655 x1 -= w/20 2656 x2 += w/20 2657 y1 -= h/20 2658 y2 += h/20 2659 ppl.axis([x1, x2, y1, y2]) 2660 elif xylimits == 'free': 2661 x1, x2, y1, y2 = ppl.axis() 2662 else: 2663 x1, x2, y1, y2 = ppl.axis(xylimits) 2664 2665 if error_contour_interval != 'none': 2666 xi, yi = np.linspace(x1, x2), np.linspace(y1, y2) 2667 XI,YI = np.meshgrid(xi, yi) 2668 SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi]) 2669 if 
error_contour_interval == 'auto': 2670 rng = np.max(SI) - np.min(SI) 2671 if rng <= 0.01: 2672 cinterval = 0.001 2673 elif rng <= 0.03: 2674 cinterval = 0.004 2675 elif rng <= 0.1: 2676 cinterval = 0.01 2677 elif rng <= 0.3: 2678 cinterval = 0.03 2679 elif rng <= 1.: 2680 cinterval = 0.1 2681 else: 2682 cinterval = 0.5 2683 else: 2684 cinterval = error_contour_interval 2685 2686 cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval) 2687 out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error) 2688 out.clabel = ppl.clabel(out.contour) 2689 contour = (XI, YI, SI, cval, cinterval) 2690 2691 if fig == None: 2692 return { 2693 'anchors':anchors, 2694 'unknowns':unknowns, 2695 'anchors_d':anchors_d, 2696 'anchors_D':anchors_D, 2697 'unknowns_d':unknowns_d, 2698 'unknowns_D':unknowns_D, 2699 'anchor_avg':anchor_avg, 2700 'unknown_avg':unknown_avg, 2701 'contour':contour, 2702 } 2703 2704 ppl.xlabel(x_label) 2705 ppl.ylabel(y_label) 2706 ppl.title(session, weight = 'bold') 2707 ppl.grid(alpha = .2) 2708 out.ax = ppl.gca() 2709 2710 return out 2711 2712 def plot_residuals( 2713 self, 2714 kde = False, 2715 hist = False, 2716 binwidth = 2/3, 2717 dir = 'output', 2718 filename = None, 2719 highlight = [], 2720 colors = None, 2721 figsize = None, 2722 dpi = 100, 2723 yspan = None, 2724 ): 2725 ''' 2726 Plot residuals of each analysis as a function of time (actually, as a function of 2727 the order of analyses in the `D4xdata` object) 2728 2729 + `kde`: whether to add a kernel density estimate of residuals 2730 + `hist`: whether to add a histogram of residuals (incompatible with `kde`) 2731 + `histbins`: specify bin edges for the histogram 2732 + `dir`: the directory in which to save the plot 2733 + `highlight`: a list of samples to highlight 2734 + `colors`: a dict of `{<sample>: (r, g, b)}` for all samples 2735 + `figsize`: (width, height) of figure 2736 + `dpi`: resolution for PNG output 2737 + `yspan`: factor controlling the range of y values shown in plot 2738 (by default: `yspan = 1.5 if kde else 1.0`) 2739 ''' 2740 2741 from matplotlib import ticker 2742 2743 if yspan is None: 2744 if kde: 2745 yspan = 1.5 2746 else: 2747 yspan = 1.0 2748 2749 # Layout 2750 fig = ppl.figure(figsize = (8,4) if figsize is None else figsize) 2751 if hist or kde: 2752 ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72) 2753 ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15) 2754 else: 2755 ppl.subplots_adjust(.08,.05,.78,.8) 2756 ax1 = ppl.subplot(111) 2757 2758 # Colors 2759 N = len(self.anchors) 2760 if colors is None: 2761 if len(highlight) > 0: 2762 Nh = len(highlight) 2763 if Nh == 1: 2764 colors = {highlight[0]: (0,0,0)} 2765 elif Nh == 3: 2766 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])} 2767 elif Nh == 4: 2768 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2769 else: 2770 colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)} 2771 else: 2772 if N == 3: 2773 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])} 2774 elif N == 4: 2775 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2776 else: 2777 colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)} 2778 2779 ppl.sca(ax1) 2780 2781 ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75) 2782 2783 ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$')) 2784 2785 session = 
self[0]['Session'] 2786 x1 = 0 2787# ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self]) 2788 x_sessions = {} 2789 one_or_more_singlets = False 2790 one_or_more_multiplets = False 2791 multiplets = set() 2792 for k,r in enumerate(self): 2793 if r['Session'] != session: 2794 x2 = k-1 2795 x_sessions[session] = (x1+x2)/2 2796 ppl.axvline(k - 0.5, color = 'k', lw = .5) 2797 session = r['Session'] 2798 x1 = k 2799 singlet = len(self.samples[r['Sample']]['data']) == 1 2800 if not singlet: 2801 multiplets.add(r['Sample']) 2802 if r['Sample'] in self.unknowns: 2803 if singlet: 2804 one_or_more_singlets = True 2805 else: 2806 one_or_more_multiplets = True 2807 kw = dict( 2808 marker = 'x' if singlet else '+', 2809 ms = 4 if singlet else 5, 2810 ls = 'None', 2811 mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0), 2812 mew = 1, 2813 alpha = 0.2 if singlet else 1, 2814 ) 2815 if highlight and r['Sample'] not in highlight: 2816 kw['alpha'] = 0.2 2817 ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw) 2818 x2 = k 2819 x_sessions[session] = (x1+x2)/2 2820 2821 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1) 2822 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1) 2823 if not (hist or kde): 2824 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center') 2825 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f" 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center') 2826 2827 xmin, xmax, ymin, ymax = ppl.axis() 2828 if yspan != 1: 2829 ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2 2830 for s in x_sessions: 2831 ppl.text( 2832 x_sessions[s], 2833 ymax +1, 2834 s, 2835 va = 'bottom', 2836 **( 2837 dict(ha = 'center') 2838 if len(self.sessions[s]['data']) > (0.15 * len(self)) 2839 else dict(ha = 'left', rotation = 45) 2840 ) 2841 ) 2842 2843 if hist or kde: 2844 ppl.sca(ax2) 2845 2846 for s in colors: 2847 kw['marker'] = '+' 2848 kw['ms'] = 5 2849 kw['mec'] = colors[s] 2850 kw['label'] = s 2851 kw['alpha'] = 1 2852 ppl.plot([], [], **kw) 2853 2854 kw['mec'] = (0,0,0) 2855 2856 if one_or_more_singlets: 2857 kw['marker'] = 'x' 2858 kw['ms'] = 4 2859 kw['alpha'] = .2 2860 kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other' 2861 ppl.plot([], [], **kw) 2862 2863 if one_or_more_multiplets: 2864 kw['marker'] = '+' 2865 kw['ms'] = 4 2866 kw['alpha'] = 1 2867 kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other' 2868 ppl.plot([], [], **kw) 2869 2870 if hist or kde: 2871 leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9) 2872 else: 2873 leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5) 2874 leg.set_zorder(-1000) 2875 2876 ppl.sca(ax1) 2877 2878 ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)') 2879 ppl.xticks([]) 2880 ppl.axis([-1, len(self), None, None]) 2881 2882 if hist or kde: 2883 ppl.sca(ax2) 2884 X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors]) 2885 2886 if kde: 2887 from scipy.stats import 
gaussian_kde 2888 yi = np.linspace(ymin, ymax, 201) 2889 xi = gaussian_kde(X).evaluate(yi) 2890 ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1)) 2891# ppl.plot(xi, yi, 'k-', lw = 1) 2892 elif hist: 2893 ppl.hist( 2894 X, 2895 orientation = 'horizontal', 2896 histtype = 'stepfilled', 2897 ec = [.4]*3, 2898 fc = [.25]*3, 2899 alpha = .25, 2900 bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)), 2901 ) 2902 ppl.text(0, 0, 2903 f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", 2904 size = 7.5, 2905 alpha = 1, 2906 va = 'center', 2907 ha = 'left', 2908 ) 2909 2910 ppl.axis([0, None, ymin, ymax]) 2911 ppl.xticks([]) 2912 ppl.yticks([]) 2913# ax2.spines['left'].set_visible(False) 2914 ax2.spines['right'].set_visible(False) 2915 ax2.spines['top'].set_visible(False) 2916 ax2.spines['bottom'].set_visible(False) 2917 2918 ax1.axis([None, None, ymin, ymax]) 2919 2920 if not os.path.exists(dir): 2921 os.makedirs(dir) 2922 if filename is None: 2923 return fig 2924 elif filename == '': 2925 filename = f'D{self._4x}_residuals.pdf' 2926 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 2927 ppl.close(fig) 2928 2929 2930 def simulate(self, *args, **kwargs): 2931 ''' 2932 Legacy function with warning message pointing to `virtual_data()` 2933 ''' 2934 raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()') 2935 2936 def plot_anchor_residuals( 2937 self, 2938 dir = 'output', 2939 filename = '', 2940 figsize = None, 2941 subplots_adjust = (0.05, 0.1, 0.95, 0.98, .25, .25), 2942 dpi = 100, 2943 colors = None, 2944 ): 2945 ''' 2946 Plot a summary of the residuals for all anchors, intended to help detect systematic bias. 2947 2948 **Parameters** 2949 2950 + `dir`: the directory in which to save the plot 2951 + `filename`: the file name to save to. 
2952 + `dpi`: resolution for PNG output 2953 + `figsize`: (width, height) of figure 2954 + `subplots_adjust`: passed to the figure 2955 + `dpi`: resolution for PNG output 2956 + `colors`: a dict of `{<sample>: (r, g, b)}` for all samples 2957 ''' 2958 2959 # Colors 2960 N = len(self.anchors) 2961 if colors is None: 2962 if N == 3: 2963 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])} 2964 elif N == 4: 2965 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2966 else: 2967 colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)} 2968 2969 if figsize is None: 2970 figsize = (4, 1.5*N+1) 2971 fig = ppl.figure(figsize = figsize) 2972 ppl.subplots_adjust(*subplots_adjust) 2973 axs = {} 2974 X = np.array([r[f'D{self._4x}_residual'] for a in self.anchors for r in self.anchors[a]['data']])*1000 2975 sigma = self.repeatability['r_D47a'] * 1000 2976 D = max(np.abs(X)) 2977 2978 for k,a in enumerate(self.anchors): 2979 color = colors[a] 2980 axs[a] = ppl.subplot(N, 1, 1+k) 2981 axs[a].text( 2982 0.02, 1-0.05, a, 2983 va = 'top', 2984 ha = 'left', 2985 weight = 'bold', 2986 size = 9, 2987 color = [_*0.75 for _ in color], 2988 transform = axs[a].transAxes, 2989 ) 2990 X = np.array([r[f'D{self._4x}_residual'] for r in self.anchors[a]['data']])*1000 2991 axs[a].axvline(0, lw = 0.5, color = color) 2992 axs[a].plot(X, X*0, 'o', mew = 0.7, mec = (*color,.5), mfc = (*color, 0), ms = 7, clip_on = False) 2993 2994 xi = np.linspace(-3*D, 3*D, 601) 2995 yi = np.array([np.exp(-0.5 * ((xi - x)/sigma)**2) for x in X]).sum(0) 2996 ppl.fill_between(xi, yi, yi*0, fc = (*color, .15), lw = 1, ec = color) 2997 2998 axs[a].errorbar( 2999 X.mean(), yi.max()*.2, None, 1.96*sigma/len(X)**0.5, 3000 ecolor = color, 3001 marker = 's', 3002 ls = 'None', 3003 mec = color, 3004 mew = 1, 3005 mfc = 'w', 3006 ms = 8, 3007 elinewidth = 1, 3008 capsize = 4, 3009 capthick = 1, 3010 ) 3011 3012 axs[a].axis([xi[0], xi[-1], 0, yi.max()*1.05]) 3013 ppl.yticks([]) 3014 3015 ppl.xlabel(f'$Δ_{{{self._4x}}}$ residuals (ppm)') 3016 3017 if not os.path.exists(dir): 3018 os.makedirs(dir) 3019 if filename is None: 3020 return fig 3021 elif filename == '': 3022 filename = f'D{self._4x}_anchor_residuals.pdf' 3023 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 3024 ppl.close(fig) 3025 3026 3027 def plot_distribution_of_analyses( 3028 self, 3029 dir = 'output', 3030 filename = None, 3031 vs_time = False, 3032 figsize = (6,4), 3033 subplots_adjust = (0.02, 0.13, 0.85, 0.8), 3034 output = None, 3035 dpi = 100, 3036 ): 3037 ''' 3038 Plot temporal distribution of all analyses in the data set. 3039 3040 **Parameters** 3041 3042 + `dir`: the directory in which to save the plot 3043 + `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially. 
3044 + `dpi`: resolution for PNG output 3045 + `figsize`: (width, height) of figure 3046 + `dpi`: resolution for PNG output 3047 ''' 3048 3049 asamples = [s for s in self.anchors] 3050 usamples = [s for s in self.unknowns] 3051 if output is None or output == 'fig': 3052 fig = ppl.figure(figsize = figsize) 3053 ppl.subplots_adjust(*subplots_adjust) 3054 Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)]) 3055 Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)]) 3056 Xmax += (Xmax-Xmin)/40 3057 Xmin -= (Xmax-Xmin)/41 3058 for k, s in enumerate(asamples + usamples): 3059 if vs_time: 3060 X = [r['TimeTag'] for r in self if r['Sample'] == s] 3061 else: 3062 X = [x for x,r in enumerate(self) if r['Sample'] == s] 3063 Y = [-k for x in X] 3064 ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75) 3065 ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25) 3066 ppl.text(Xmax, -k, f' {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r') 3067 ppl.axis([Xmin, Xmax, -k-1, 1]) 3068 ppl.xlabel('\ntime') 3069 ppl.gca().annotate('', 3070 xy = (0.6, -0.02), 3071 xycoords = 'axes fraction', 3072 xytext = (.4, -0.02), 3073 arrowprops = dict(arrowstyle = "->", color = 'k'), 3074 ) 3075 3076 3077 x2 = -1 3078 for session in self.sessions: 3079 x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session]) 3080 if vs_time: 3081 ppl.axvline(x1, color = 'k', lw = .75) 3082 if x2 > -1: 3083 if not vs_time: 3084 ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5) 3085 x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session]) 3086# from xlrd import xldate_as_datetime 3087# print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0)) 3088 if vs_time: 3089 ppl.axvline(x2, color = 'k', lw = .75) 3090 ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15) 3091 ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8) 3092 3093 ppl.xticks([]) 3094 ppl.yticks([]) 3095 3096 if output is None: 3097 if not os.path.exists(dir): 3098 os.makedirs(dir) 3099 if filename == None: 3100 filename = f'D{self._4x}_distribution_of_analyses.pdf' 3101 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 3102 ppl.close(fig) 3103 elif output == 'ax': 3104 return ppl.gca() 3105 elif output == 'fig': 3106 return fig 3107 3108 3109 def plot_bulk_compositions( 3110 self, 3111 samples = None, 3112 dir = 'output/bulk_compositions', 3113 figsize = (6,6), 3114 subplots_adjust = (0.15, 0.12, 0.95, 0.92), 3115 show = False, 3116 sample_color = (0,.5,1), 3117 analysis_color = (.7,.7,.7), 3118 labeldist = 0.3, 3119 radius = 0.05, 3120 ): 3121 ''' 3122 Plot δ13C_VBDP vs δ18O_VSMOW (of CO2) for all analyses. 3123 3124 By default, creates a directory `./output/bulk_compositions` where plots for 3125 each sample are saved. Another plot named `__all__.pdf` shows all analyses together. 3126 3127 3128 **Parameters** 3129 3130 + `samples`: Only these samples are processed (by default: all samples). 3131 + `dir`: where to save the plots 3132 + `figsize`: (width, height) of figure 3133 + `subplots_adjust`: passed to `subplots_adjust()` 3134 + `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples, 3135 allowing for interactive visualization/exploration in (δ13C, δ18O) space. 
3136 + `sample_color`: color used for replicate markers/labels 3137 + `analysis_color`: color used for sample markers/labels 3138 + `labeldist`: distance (in inches) from replicate markers to replicate labels 3139 + `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`. 3140 ''' 3141 3142 from matplotlib.patches import Ellipse 3143 3144 if samples is None: 3145 samples = [_ for _ in self.samples] 3146 3147 saved = {} 3148 3149 for s in samples: 3150 3151 fig = ppl.figure(figsize = figsize) 3152 fig.subplots_adjust(*subplots_adjust) 3153 ax = ppl.subplot(111) 3154 ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)') 3155 ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)') 3156 ppl.title(s) 3157 3158 3159 XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']]) 3160 UID = [_['UID'] for _ in self.samples[s]['data']] 3161 XY0 = XY.mean(0) 3162 3163 for xy in XY: 3164 ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color) 3165 3166 ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color) 3167 ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color) 3168 ppl.text(*XY0, f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold') 3169 saved[s] = [XY, XY0] 3170 3171 x1, x2, y1, y2 = ppl.axis() 3172 x0, dx = (x1+x2)/2, (x2-x1)/2 3173 y0, dy = (y1+y2)/2, (y2-y1)/2 3174 dx, dy = [max(max(dx, dy), radius)]*2 3175 3176 ppl.axis([ 3177 x0 - 1.2*dx, 3178 x0 + 1.2*dx, 3179 y0 - 1.2*dy, 3180 y0 + 1.2*dy, 3181 ]) 3182 3183 XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0)) 3184 3185 for xy, uid in zip(XY, UID): 3186 3187 xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy)) 3188 vector_in_display_space = xy_in_display_space - XY0_in_display_space 3189 3190 if (vector_in_display_space**2).sum() > 0: 3191 3192 unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5 3193 label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist 3194 label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space 3195 label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space)) 3196 3197 ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color) 3198 3199 else: 3200 3201 ppl.text(*xy, f'{uid} ', va = 'center', ha = 'right', color = analysis_color) 3202 3203 if radius: 3204 ax.add_artist(Ellipse( 3205 xy = XY0, 3206 width = radius*2, 3207 height = radius*2, 3208 ls = (0, (2,2)), 3209 lw = .7, 3210 ec = analysis_color, 3211 fc = 'None', 3212 )) 3213 ppl.text( 3214 XY0[0], 3215 XY0[1]-radius, 3216 f'\n± {radius*1e3:.0f} ppm', 3217 color = analysis_color, 3218 va = 'top', 3219 ha = 'center', 3220 linespacing = 0.4, 3221 size = 8, 3222 ) 3223 3224 if not os.path.exists(dir): 3225 os.makedirs(dir) 3226 fig.savefig(f'{dir}/{s}.pdf') 3227 ppl.close(fig) 3228 3229 fig = ppl.figure(figsize = figsize) 3230 fig.subplots_adjust(*subplots_adjust) 3231 ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)') 3232 ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)') 3233 3234 for s in saved: 3235 for xy in saved[s][0]: 3236 ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color) 3237 ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color) 3238 ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color) 3239 ppl.text(*saved[s][1], f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold') 
3240 3241 x1, x2, y1, y2 = ppl.axis() 3242 ppl.axis([ 3243 x1 - (x2-x1)/10, 3244 x2 + (x2-x1)/10, 3245 y1 - (y2-y1)/10, 3246 y2 + (y2-y1)/10, 3247 ]) 3248 3249 3250 if not os.path.exists(dir): 3251 os.makedirs(dir) 3252 fig.savefig(f'{dir}/__all__.pdf') 3253 if show: 3254 ppl.show() 3255 ppl.close(fig) 3256 3257 3258 def _save_D4x_correl( 3259 self, 3260 samples = None, 3261 dir = 'output', 3262 filename = None, 3263 D4x_precision = 4, 3264 correl_precision = 4, 3265 save_to_file = True, 3266 ): 3267 ''' 3268 Save D4x values along with their SE and correlation matrix. 3269 3270 **Parameters** 3271 3272 + `samples`: Only these samples are output (by default: all samples). 3273 + `dir`: the directory in which to save the faile (by defaut: `output`) 3274 + `filename`: the name to the csv file to write to (by default: `D4x_correl.csv`) 3275 + `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4) 3276 + `correl_precision`: the precision to use when writing correlation factor values (by default: 4) 3277 + `save_to_file`: whether to write the output to a file factor values (by default: True). If `False`, 3278 returns the output as a string 3279 ''' 3280 if samples is None: 3281 samples = sorted([s for s in self.unknowns]) 3282 3283 out = [['Sample']] + [[s] for s in samples] 3284 out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl'] 3285 for k,s in enumerate(samples): 3286 out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.4f}', f'{self.samples[s][f"SE_D{self._4x}"]:.4f}'] 3287 for s2 in samples: 3288 out[k+1] += [f'{self.sample_D4x_correl(s,s2):.4f}'] 3289 3290 if save_to_file: 3291 if not os.path.exists(dir): 3292 os.makedirs(dir) 3293 if filename is None: 3294 filename = f'D{self._4x}_correl.csv' 3295 with open(f'{dir}/{filename}', 'w') as fid: 3296 fid.write(make_csv(out)) 3297 else: 3298 return make_csv(out) 3299 3300 3301class D47data(D4xdata): 3302 ''' 3303 Store and process data for a large set of Δ47 analyses, 3304 usually comprising more than one analytical session. 3305 ''' 3306 3307 Nominal_D4x = { 3308 'ETH-1': 0.2052, 3309 'ETH-2': 0.2085, 3310 'ETH-3': 0.6132, 3311 'ETH-4': 0.4511, 3312 'IAEA-C1': 0.3018, 3313 'IAEA-C2': 0.6409, 3314 'MERCK': 0.5135, 3315 } # I-CDES (Bernasconi et al., 2021) 3316 ''' 3317 Nominal Δ47 values assigned to the Δ47 anchor samples, used by 3318 `D47data.standardize()` to normalize unknown samples to an absolute Δ47 3319 reference frame. 3320 3321 By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)): 3322 ```py 3323 { 3324 'ETH-1' : 0.2052, 3325 'ETH-2' : 0.2085, 3326 'ETH-3' : 0.6132, 3327 'ETH-4' : 0.4511, 3328 'IAEA-C1' : 0.3018, 3329 'IAEA-C2' : 0.6409, 3330 'MERCK' : 0.5135, 3331 } 3332 ``` 3333 ''' 3334 3335 3336 @property 3337 def Nominal_D47(self): 3338 return self.Nominal_D4x 3339 3340 3341 @Nominal_D47.setter 3342 def Nominal_D47(self, new): 3343 self.Nominal_D4x = dict(**new) 3344 self.refresh() 3345 3346 3347 def __init__(self, l = [], **kwargs): 3348 ''' 3349 **Parameters:** same as `D4xdata.__init__()` 3350 ''' 3351 D4xdata.__init__(self, l = l, mass = '47', **kwargs) 3352 3353 3354 def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'): 3355 ''' 3356 Find all samples for which `Teq` is specified, compute equilibrium Δ47 3357 value for that temperature, and add treat these samples as additional anchors. 3358 3359 **Parameters** 3360 3361 + `fCo2eqD47`: Which CO2 equilibrium law to use 3362 (`petersen`: [Petersen et al. 
(2019)](https://doi.org/10.1029/2018GC008127); 3363 `wang`: [Wang et al. (2019)](https://doi.org/10.1016/j.gca.2004.05.039)). 3364 + `priority`: if `replace`: forget old anchors and only use the new ones; 3365 if `new`: keep pre-existing anchors but update them in case of conflict 3366 between old and new Δ47 values; 3367 if `old`: keep pre-existing anchors but preserve their original Δ47 3368 values in case of conflict. 3369 ''' 3370 f = { 3371 'petersen': fCO2eqD47_Petersen, 3372 'wang': fCO2eqD47_Wang, 3373 }[fCo2eqD47] 3374 foo = {} 3375 for r in self: 3376 if 'Teq' in r: 3377 if r['Sample'] in foo: 3378 assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.' 3379 else: 3380 foo[r['Sample']] = f(r['Teq']) 3381 else: 3382 assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.' 3383 3384 if priority == 'replace': 3385 self.Nominal_D47 = {} 3386 for s in foo: 3387 if priority != 'old' or s not in self.Nominal_D47: 3388 self.Nominal_D47[s] = foo[s] 3389 3390 def save_D47_correl(self, *args, **kwargs): 3391 return self._save_D4x_correl(*args, **kwargs) 3392 3393 save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47') 3394 3395 3396class D48data(D4xdata): 3397 ''' 3398 Store and process data for a large set of Δ48 analyses, 3399 usually comprising more than one analytical session. 3400 ''' 3401 3402 Nominal_D4x = { 3403 'ETH-1': 0.138, 3404 'ETH-2': 0.138, 3405 'ETH-3': 0.270, 3406 'ETH-4': 0.223, 3407 'GU-1': -0.419, 3408 } # (Fiebig et al., 2019, 2021) 3409 ''' 3410 Nominal Δ48 values assigned to the Δ48 anchor samples, used by 3411 `D48data.standardize()` to normalize unknown samples to an absolute Δ48 3412 reference frame. 3413 3414 By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019), 3415 [Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)): 3416 3417 ```py 3418 { 3419 'ETH-1' : 0.138, 3420 'ETH-2' : 0.138, 3421 'ETH-3' : 0.270, 3422 'ETH-4' : 0.223, 3423 'GU-1' : -0.419, 3424 } 3425 ``` 3426 ''' 3427 3428 3429 @property 3430 def Nominal_D48(self): 3431 return self.Nominal_D4x 3432 3433 3434 @Nominal_D48.setter 3435 def Nominal_D48(self, new): 3436 self.Nominal_D4x = dict(**new) 3437 self.refresh() 3438 3439 3440 def __init__(self, l = [], **kwargs): 3441 ''' 3442 **Parameters:** same as `D4xdata.__init__()` 3443 ''' 3444 D4xdata.__init__(self, l = l, mass = '48', **kwargs) 3445 3446 def save_D48_correl(self, *args, **kwargs): 3447 return self._save_D4x_correl(*args, **kwargs) 3448 3449 save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48') 3450 3451 3452class D49data(D4xdata): 3453 ''' 3454 Store and process data for a large set of Δ49 analyses, 3455 usually comprising more than one analytical session. 3456 ''' 3457 3458 Nominal_D4x = {"1000C": 0.0, "25C": 2.228} # Wang 2004 3459 ''' 3460 Nominal Δ49 values assigned to the Δ49 anchor samples, used by 3461 `D49data.standardize()` to normalize unknown samples to an absolute Δ49 3462 reference frame. 3463 3464 By default equal to (after [Wang et al. 
(2004)](https://doi.org/10.1016/j.gca.2004.05.039)): 3465 3466 ```py 3467 { 3468 "1000C": 0.0, 3469 "25C": 2.228 3470 } 3471 ``` 3472 ''' 3473 3474 @property 3475 def Nominal_D49(self): 3476 return self.Nominal_D4x 3477 3478 @Nominal_D49.setter 3479 def Nominal_D49(self, new): 3480 self.Nominal_D4x = dict(**new) 3481 self.refresh() 3482 3483 def __init__(self, l=[], **kwargs): 3484 ''' 3485 **Parameters:** same as `D4xdata.__init__()` 3486 ''' 3487 D4xdata.__init__(self, l=l, mass='49', **kwargs) 3488 3489 def save_D49_correl(self, *args, **kwargs): 3490 return self._save_D4x_correl(*args, **kwargs) 3491 3492 save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49') 3493 3494class _SessionPlot(): 3495 ''' 3496 Simple placeholder class 3497 ''' 3498 def __init__(self): 3499 pass 3500 3501_app = typer.Typer( 3502 add_completion = False, 3503 context_settings={'help_option_names': ['-h', '--help']}, 3504 rich_markup_mode = 'rich', 3505 ) 3506 3507@_app.command() 3508def _cli( 3509 rawdata: Annotated[str, typer.Argument(help = "Specify the path of a rawdata input file")], 3510 exclude: Annotated[str, typer.Option('--exclude', '-e', help = 'The path of a file specifying UIDs and/or Samples to exclude')] = 'none', 3511 anchors: Annotated[str, typer.Option('--anchors', '-a', help = 'The path of a file specifying custom anchors')] = 'none', 3512 output_dir: Annotated[str, typer.Option('--output-dir', '-o', help = 'Specify the output directory')] = 'output', 3513 run_D48: Annotated[bool, typer.Option('--D48', help = 'Also standardize D48')] = False, 3514 ): 3515 """ 3516 Process raw D47 data and return standardized results. 3517 3518 See [b]https://mdaeron.github.io/D47crunch/#3-command-line-interface-cli[/b] for more details. 3519 3520 Reads raw data from an input file, optionally excluding some samples and/or analyses, thean standardizes 3521 the data based either on the default [b]d13C_VPDB[/b], [b]d18O_VPDB[/b], [b]D47[/b], and [b]D48[/b] anchors or on different 3522 user-specified anchors. A new directory (named `output` by default) is created to store the results and 3523 the following sequence is applied: 3524 3525 * [b]D47data.wg()[/b] 3526 * [b]D47data.crunch()[/b] 3527 * [b]D47data.standardize()[/b] 3528 * [b]D47data.summary()[/b] 3529 * [b]D47data.table_of_samples()[/b] 3530 * [b]D47data.table_of_sessions()[/b] 3531 * [b]D47data.plot_sessions()[/b] 3532 * [b]D47data.plot_residuals()[/b] 3533 * [b]D47data.table_of_analyses()[/b] 3534 * [b]D47data.plot_distribution_of_analyses()[/b] 3535 * [b]D47data.plot_bulk_compositions()[/b] 3536 * [b]D47data.save_D47_correl()[/b] 3537 3538 Optionally, also apply similar methods for [b]]D48[/b]. 3539 3540 [b]Example CSV file for --anchors option:[/b] 3541 [i] 3542 Sample, d13C_VPDB, d18O_VPDB, D47, D48 3543 ETH-1, 2.02, -2.19, 0.2052, 0.138 3544 ETH-2, -10.17, -18.69, 0.2085, 0.138 3545 ETH-3, 1.71, -1.78, 0.6132, 0.270 3546 ETH-4, , , 0.4511, 0.223 3547 [/i] 3548 Except for [i]Sample[/i], none of the columns above are mandatory. 3549 3550 [b]Example CSV file for --exclude option:[/b] 3551 [i] 3552 Sample, UID 3553 FOO-1, 3554 BAR-2, 3555 , A04 3556 , A17 3557 , A88 3558 [/i] 3559 This will exclude all analyses of samples [i]FOO-1[/i] and [i]BAR-2[/i], 3560 and the analyses with UIDs [i]A04[/i], [i]A17[/i], and [i]A88[/i]. 3561 Neither column is mandatory. 
3562 """ 3563 3564 data = D47data() 3565 data.read(rawdata) 3566 3567 if exclude != 'none': 3568 exclude = read_csv(exclude) 3569 exclude_uid = {r['UID'] for r in exclude if 'UID' in r} 3570 exclude_sample = {r['Sample'] for r in exclude if 'Sample' in r} 3571 else: 3572 exclude_uid = [] 3573 exclude_sample = [] 3574 3575 data = D47data([r for r in data if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample]) 3576 3577 if anchors != 'none': 3578 anchors = read_csv(anchors) 3579 if len([_ for _ in anchors if 'd13C_VPDB' in _]): 3580 data.Nominal_d13C_VPDB = { 3581 _['Sample']: _['d13C_VPDB'] 3582 for _ in anchors 3583 if 'd13C_VPDB' in _ 3584 } 3585 if len([_ for _ in anchors if 'd18O_VPDB' in _]): 3586 data.Nominal_d18O_VPDB = { 3587 _['Sample']: _['d18O_VPDB'] 3588 for _ in anchors 3589 if 'd18O_VPDB' in _ 3590 } 3591 if len([_ for _ in anchors if 'D47' in _]): 3592 data.Nominal_D4x = { 3593 _['Sample']: _['D47'] 3594 for _ in anchors 3595 if 'D47' in _ 3596 } 3597 3598 data.refresh() 3599 data.wg() 3600 data.crunch() 3601 data.standardize() 3602 data.summary(dir = output_dir) 3603 data.plot_residuals(dir = output_dir, filename = 'D47_residuals.pdf', kde = True) 3604 data.plot_bulk_compositions(dir = output_dir + '/bulk_compositions') 3605 data.plot_sessions(dir = output_dir) 3606 data.save_D47_correl(dir = output_dir) 3607 3608 if not run_D48: 3609 data.table_of_samples(dir = output_dir) 3610 data.table_of_analyses(dir = output_dir) 3611 data.table_of_sessions(dir = output_dir) 3612 3613 3614 if run_D48: 3615 data2 = D48data() 3616 print(rawdata) 3617 data2.read(rawdata) 3618 3619 data2 = D48data([r for r in data2 if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample]) 3620 3621 if anchors != 'none': 3622 if len([_ for _ in anchors if 'd13C_VPDB' in _]): 3623 data2.Nominal_d13C_VPDB = { 3624 _['Sample']: _['d13C_VPDB'] 3625 for _ in anchors 3626 if 'd13C_VPDB' in _ 3627 } 3628 if len([_ for _ in anchors if 'd18O_VPDB' in _]): 3629 data2.Nominal_d18O_VPDB = { 3630 _['Sample']: _['d18O_VPDB'] 3631 for _ in anchors 3632 if 'd18O_VPDB' in _ 3633 } 3634 if len([_ for _ in anchors if 'D48' in _]): 3635 data2.Nominal_D4x = { 3636 _['Sample']: _['D48'] 3637 for _ in anchors 3638 if 'D48' in _ 3639 } 3640 3641 data2.refresh() 3642 data2.wg() 3643 data2.crunch() 3644 data2.standardize() 3645 data2.summary(dir = output_dir) 3646 data2.plot_sessions(dir = output_dir) 3647 data2.plot_residuals(dir = output_dir, filename = 'D48_residuals.pdf', kde = True) 3648 data2.plot_distribution_of_analyses(dir = output_dir) 3649 data2.save_D48_correl(dir = output_dir) 3650 3651 table_of_analyses(data, data2, dir = output_dir) 3652 table_of_samples(data, data2, dir = output_dir) 3653 table_of_sessions(data, data2, dir = output_dir) 3654 3655def __cli(): 3656 _app()
```py
def fCO2eqD47_Petersen(T):
    '''
    CO2 equilibrium Δ47 value as a function of T (in degrees C)
    according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
    '''
    return float(_fCO2eqD47_Petersen(T))
```
CO2 equilibrium Δ47 value as a function of T (in degrees C) according to Petersen et al. (2019).
```py
def fCO2eqD47_Wang(T):
    '''
    CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
    according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
    (supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
    '''
    return float(_fCO2eqD47_Wang(T))
```
CO2 equilibrium Δ47 value as a function of T (in degrees C)
according to Wang et al. (2004)
(supplementary data of Dennis et al., 2011).
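Both equilibrium laws can be called directly, for instance to compare their predictions at a given temperature. A minimal sketch (the printed values are not reproduced here):

```py
from D47crunch import fCO2eqD47_Petersen, fCO2eqD47_Wang

# equilibrium Δ47 value of CO2 at 25 °C according to each law:
print(fCO2eqD47_Petersen(25.))
print(fCO2eqD47_Wang(25.))
```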
````py
def make_csv(x, hsep = ',', vsep = '\n'):
    '''
    Formats a list of lists of strings as a CSV

    **Parameters**

    + `x`: the list of lists of strings to format
    + `hsep`: the field separator (`,` by default)
    + `vsep`: the line-ending convention to use (`\\n` by default)

    **Example**

    ```py
    print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
    ```

    outputs:

    ```py
    a,b,c
    d,e,f
    ```
    '''
    return vsep.join([hsep.join(l) for l in x])
````

Formats a list of lists of strings as a CSV.

**Parameters**

+ `x`: the list of lists of strings to format
+ `hsep`: the field separator (`,` by default)
+ `vsep`: the line-ending convention to use (`\n` by default)

**Example**

```py
print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
```

outputs:

```
a,b,c
d,e,f
```
```py
def pf(txt):
    '''
    Modify string `txt` to follow `lmfit.Parameter()` naming rules.
    '''
    return txt.replace('-','_').replace('.','_').replace(' ','_')
```

Modify string `txt` to follow `lmfit.Parameter()` naming rules.
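For example, a session or sample name containing dashes, dots, or spaces is converted into a valid parameter name (the name below is arbitrary):

```py
from D47crunch import pf

print(pf('Session 2023-01.a'))  # prints: Session_2023_01_a
```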
```py
def smart_type(x):
    '''
    Tries to convert string `x` to a float if it includes a decimal point, or
    to an integer if it does not. If both attempts fail, return the original
    string unchanged.
    '''
    try:
        y = float(x)
    except ValueError:
        return x
    if '.' not in x:
        return int(y)
    return y
```

Tries to convert string `x` to a float if it includes a decimal point, or
to an integer if it does not. If both attempts fail, return the original
string unchanged.
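For example:

```py
from D47crunch import smart_type

print(smart_type('5'))      # 5 (converted to int: no decimal point)
print(smart_type('5.0'))    # 5.0 (converted to float)
print(smart_type('ETH-1'))  # ETH-1 (returned unchanged: not a number)
```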
````py
def pretty_table(x, header = 1, hsep = ' ', vsep = None, align = '<'):
    '''
    Reads a list of lists of strings and outputs an ascii table

    **Parameters**

    + `x`: a list of lists of strings
    + `header`: the number of lines to treat as header lines
    + `hsep`: the horizontal separator between columns
    + `vsep`: the character to use as vertical separator
    + `align`: string of left (`<`) or right (`>`) alignment characters.

    **Example**

    ```py
    print(pretty_table([
        ['A', 'B', 'C'],
        ['1', '1.9999', 'foo'],
        ['10', 'x', 'bar'],
    ]))
    ```
    yields:
    ```
    —— —————— ———
    A       B   C
    —— —————— ———
    1  1.9999 foo
    10      x bar
    —— —————— ———
    ```

    To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`:

    ```py
    D47crunch_defaults.PRETTY_TABLE_VSEP = '='
    print(pretty_table([
        ['A', 'B', 'C'],
        ['1', '1.9999', 'foo'],
        ['10', 'x', 'bar'],
    ]))
    ```
    yields:
    ```
    == ====== ===
    A       B   C
    == ====== ===
    1  1.9999 foo
    10      x bar
    == ====== ===
    ```
    '''

    if vsep is None:
        vsep = D47crunch_defaults.PRETTY_TABLE_VSEP

    txt = []
    widths = [np.max([len(e) for e in c]) for c in zip(*x)]

    if len(widths) > len(align):
        align += '>' * (len(widths)-len(align))
    sepline = hsep.join([vsep*w for w in widths])
    txt += [sepline]
    for k,l in enumerate(x):
        if k and k == header:
            txt += [sepline]
        txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
    txt += [sepline]
    txt += ['']
    return '\n'.join(txt)
````

Reads a list of lists of strings and outputs an ascii table.

**Parameters**

+ `x`: a list of lists of strings
+ `header`: the number of lines to treat as header lines
+ `hsep`: the horizontal separator between columns
+ `vsep`: the character to use as vertical separator
+ `align`: string of left (`<`) or right (`>`) alignment characters.

**Example**

```py
print(pretty_table([
    ['A', 'B', 'C'],
    ['1', '1.9999', 'foo'],
    ['10', 'x', 'bar'],
]))
```
yields:
```
—— —————— ———
A       B   C
—— —————— ———
1  1.9999 foo
10      x bar
—— —————— ———
```

To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`:

```py
D47crunch_defaults.PRETTY_TABLE_VSEP = '='
print(pretty_table([
    ['A', 'B', 'C'],
    ['1', '1.9999', 'foo'],
    ['10', 'x', 'bar'],
]))
```
yields:
```
== ====== ===
A       B   C
== ====== ===
1  1.9999 foo
10      x bar
== ====== ===
```
````py
def transpose_table(x):
    '''
    Transpose a list of lists

    **Parameters**

    + `x`: a list of lists

    **Example**

    ```py
    x = [[1, 2], [3, 4]]
    print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
    ```
    '''
    return [[e for e in c] for c in zip(*x)]
````

Transpose a list of lists.

**Parameters**

+ `x`: a list of lists

**Example**

```py
x = [[1, 2], [3, 4]]
print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
```
````py
def w_avg(X, sX) :
    '''
    Compute variance-weighted average

    Returns the value and SE of the weighted average of the elements of `X`,
    with relative weights equal to their inverse variances (`1/sX**2`).

    **Parameters**

    + `X`: array-like of elements to average
    + `sX`: array-like of the corresponding SE values

    **Tip**

    If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
    they may be rearranged using `zip()`:

    ```python
    foo = [(0, 1), (1, 0.5), (2, 0.5)]
    print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
    ```
    '''
    X = [ x for x in X ]
    sX = [ sx for sx in sX ]
    W = [ sx**-2 for sx in sX ]
    W = [ w/sum(W) for w in W ]
    Xavg = sum([ w*x for w,x in zip(W,X) ])
    sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5
    return Xavg, sXavg
````

Compute variance-weighted average.

Returns the value and SE of the weighted average of the elements of `X`,
with relative weights equal to their inverse variances (`1/sX**2`).

**Parameters**

+ `X`: array-like of elements to average
+ `sX`: array-like of the corresponding SE values

**Tip**

If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
they may be rearranged using `zip()`:

```python
foo = [(0, 1), (1, 0.5), (2, 0.5)]
print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
```
```py
def read_csv(filename, sep = ''):
    '''
    Read contents of `filename` in csv format and return a list of dictionaries.

    In the csv string, spaces before and after field separators (`','` by default)
    are optional.

    **Parameters**

    + `filename`: the csv file to read
    + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\\t`,
      whichever appears most often in the contents of `filename`.
    '''
    with open(filename) as fid:
        txt = fid.read()

    if sep == '':
        sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
    txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
    return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]]
```

Read contents of `filename` in csv format and return a list of dictionaries.

In the csv string, spaces before and after field separators (`','` by default) are optional.

**Parameters**

+ `filename`: the csv file to read
+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, whichever appears most often in the contents of `filename`.
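For example, given a hypothetical two-line file `foo.csv`, numeric fields are converted with `smart_type()` along the way:

```py
from D47crunch import read_csv

with open('foo.csv', 'w') as fid:
    fid.write('Sample, d45\nETH-1, 5.795')

print(read_csv('foo.csv'))
# prints: [{'Sample': 'ETH-1', 'd45': 5.795}]
```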
```py
def simulate_single_analysis(
    sample = 'MYSAMPLE',
    d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
    d13C_VPDB = None, d18O_VPDB = None,
    D47 = None, D48 = None, D49 = 0., D17O = 0.,
    a47 = 1., b47 = 0., c47 = -0.9,
    a48 = 1., b48 = 0., c48 = -0.45,
    Nominal_D47 = None,
    Nominal_D48 = None,
    Nominal_d13C_VPDB = None,
    Nominal_d18O_VPDB = None,
    ALPHA_18O_ACID_REACTION = None,
    R13_VPDB = None,
    R17_VSMOW = None,
    R18_VSMOW = None,
    LAMBDA_17 = None,
    R18_VPDB = None,
    ):
    '''
    Compute working-gas delta values for a single analysis, assuming a stochastic working
    gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

    **Parameters**

    + `sample`: sample name
    + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
      (respectively -4 and +26 ‰ by default)
    + `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
    + `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
      of the carbonate sample
    + `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and
      Δ48 values if `D47` or `D48` are not specified
    + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and
      δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
    + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
    + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
      correction parameters (by default equal to the `D4xdata` default values)

    Returns a dictionary with fields
    `['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
    '''

    if Nominal_d13C_VPDB is None:
        Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB

    if Nominal_d18O_VPDB is None:
        Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB

    if ALPHA_18O_ACID_REACTION is None:
        ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION

    if R13_VPDB is None:
        R13_VPDB = D4xdata().R13_VPDB

    if R17_VSMOW is None:
        R17_VSMOW = D4xdata().R17_VSMOW

    if R18_VSMOW is None:
        R18_VSMOW = D4xdata().R18_VSMOW

    if LAMBDA_17 is None:
        LAMBDA_17 = D4xdata().LAMBDA_17

    if R18_VPDB is None:
        R18_VPDB = D4xdata().R18_VPDB

    R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17

    if Nominal_D47 is None:
        Nominal_D47 = D47data().Nominal_D47

    if Nominal_D48 is None:
        Nominal_D48 = D48data().Nominal_D48

    if d13C_VPDB is None:
        if sample in Nominal_d13C_VPDB:
            d13C_VPDB = Nominal_d13C_VPDB[sample]
        else:
            raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")

    if d18O_VPDB is None:
        if sample in Nominal_d18O_VPDB:
            d18O_VPDB = Nominal_d18O_VPDB[sample]
        else:
            raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")

    if D47 is None:
        if sample in Nominal_D47:
            D47 = Nominal_D47[sample]
        else:
            raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")

    if D48 is None:
        if sample in Nominal_D48:
            D48 = Nominal_D48[sample]
        else:
            raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")

    X = D4xdata()
    X.R13_VPDB = R13_VPDB
    X.R17_VSMOW = R17_VSMOW
    X.R18_VSMOW = R18_VSMOW
    X.LAMBDA_17 = LAMBDA_17
    X.R18_VPDB = R18_VPDB
    X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17

    R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
        R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000),
        R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
        )
    R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
        R13 = R13_VPDB * (1 + d13C_VPDB/1000),
        R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
        D17O=D17O, D47=D47, D48=D48, D49=D49,
        )
    R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
        R13 = R13_VPDB * (1 + d13C_VPDB/1000),
        R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
        D17O=D17O,
        )

    d45 = 1000 * (R45/R45wg - 1)
    d46 = 1000 * (R46/R46wg - 1)
    d47 = 1000 * (R47/R47wg - 1)
    d48 = 1000 * (R48/R48wg - 1)
    d49 = 1000 * (R49/R49wg - 1)

    for k in range(3): # dumb iteration to adjust for small changes in d47
        R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
        R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch
        d47 = 1000 * (R47raw/R47wg - 1)
        d48 = 1000 * (R48raw/R48wg - 1)

    return dict(
        Sample = sample,
        D17O = D17O,
        d13Cwg_VPDB = d13Cwg_VPDB,
        d18Owg_VSMOW = d18Owg_VSMOW,
        d45 = d45,
        d46 = d46,
        d47 = d47,
        d48 = d48,
        d49 = d49,
        )
```

Compute working-gas delta values for a single analysis, assuming a stochastic working gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

**Parameters**

+ `sample`: sample name
+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas (respectively -4 and +26 ‰ by default)
+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies of the carbonate sample
+ `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and Δ48 values if `D47` or `D48` are not specified
+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 correction parameters (by default equal to the `D4xdata` default values)

Returns a dictionary with fields
`['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
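A minimal usage sketch: ETH-3 appears in all of the default nominal lookup tables, so neither its bulk nor its clumped compositions need to be specified explicitly (printed values not reproduced here):

```py
from D47crunch import simulate_single_analysis

a = simulate_single_analysis(sample = 'ETH-3')
print(a['Sample'], a['d45'], a['d46'], a['d47'])
```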
def virtual_data(
    samples = [],
    a47 = 1., b47 = 0., c47 = -0.9,
    a48 = 1., b48 = 0., c48 = -0.45,
    rd45 = 0.020, rd46 = 0.060,
    rD47 = 0.015, rD48 = 0.045,
    d13Cwg_VPDB = None, d18Owg_VSMOW = None,
    session = None,
    Nominal_D47 = None, Nominal_D48 = None,
    Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
    ALPHA_18O_ACID_REACTION = None,
    R13_VPDB = None,
    R17_VSMOW = None,
    R18_VSMOW = None,
    LAMBDA_17 = None,
    R18_VPDB = None,
    seed = 0,
    shuffle = True,
    ):
    '''
    Return list with simulated analyses from a single session.

    **Parameters**

    + `samples`: a list of entries; each entry is a dictionary with the following fields:
        * `Sample`: the name of the sample
        * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
        * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
        * `N`: how many analyses to generate for this sample
    + `a47`: scrambling factor for Δ47
    + `b47`: compositional nonlinearity for Δ47
    + `c47`: working gas offset for Δ47
    + `a48`: scrambling factor for Δ48
    + `b48`: compositional nonlinearity for Δ48
    + `c48`: working gas offset for Δ48
    + `rd45`: analytical repeatability of δ45
    + `rd46`: analytical repeatability of δ46
    + `rD47`: analytical repeatability of Δ47
    + `rD48`: analytical repeatability of Δ48
    + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
      (by default equal to the `simulate_single_analysis` default values)
    + `session`: name of the session (no name by default)
    + `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and Δ48 values
      if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
    + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
      δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
      (by default equal to the `simulate_single_analysis` defaults)
    + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
      (by default equal to the `simulate_single_analysis` defaults)
    + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
      correction parameters (by default equal to the `simulate_single_analysis` defaults)
    + `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
    + `shuffle`: randomly reorder the sequence of analyses


    Here is an example of using this function to generate an arbitrary combination of
    anchors and unknowns for a bunch of sessions:

    ```py
    .. include:: ../../code_examples/virtual_data/example.py
    ```

    This should output something like:

    ```
    .. include:: ../../code_examples/virtual_data/output.txt
    ```
    '''

    kwargs = locals().copy()

    from numpy import random as nprandom
    if seed:
        nprandom.seed(seed)
        rng = nprandom.default_rng(seed)
    else:
        rng = nprandom.default_rng()

    N = sum([s['N'] for s in samples])
    errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
    errors45 *= rd45 / stdev(errors45) # scale errors to rd45
    errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
    errors46 *= rd46 / stdev(errors46) # scale errors to rd46
    errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
    errors47 *= rD47 / stdev(errors47) # scale errors to rD47
    errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
    errors48 *= rD48 / stdev(errors48) # scale errors to rD48

    k = 0
    out = []
    for s in samples:
        kw = {}
        kw['sample'] = s['Sample']
        kw = {
            **kw,
            **{var: kwargs[var]
                for var in [
                    'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
                    'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
                    'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
                    'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
                    ]
                if kwargs[var] is not None},
            **{var: s[var]
                for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
                if var in s},
            }

        sN = s['N']
        while sN:
            out.append(simulate_single_analysis(**kw))
            out[-1]['d45'] += errors45[k]
            out[-1]['d46'] += errors46[k]
            out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
            out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
            sN -= 1
            k += 1

    if session is not None:
        for r in out:
            r['Session'] = session

    if shuffle:
        nprandom.shuffle(out)

    return out
Return list with simulated analyses from a single session.
Parameters
- `samples`: a list of entries; each entry is a dictionary with the following fields:
    - `Sample`: the name of the sample
    - `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
    - `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
    - `N`: how many analyses to generate for this sample
- `a47`: scrambling factor for Δ47
- `b47`: compositional nonlinearity for Δ47
- `c47`: working gas offset for Δ47
- `a48`: scrambling factor for Δ48
- `b48`: compositional nonlinearity for Δ48
- `c48`: working gas offset for Δ48
- `rd45`: analytical repeatability of δ45
- `rd46`: analytical repeatability of δ46
- `rD47`: analytical repeatability of Δ47
- `rD48`: analytical repeatability of Δ48
- `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas (by default equal to the `simulate_single_analysis` default values)
- `session`: name of the session (no name by default)
- `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and Δ48 values if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
- `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified (by default equal to the `simulate_single_analysis` defaults)
- `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor (by default equal to the `simulate_single_analysis` defaults)
- `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 correction parameters (by default equal to the `simulate_single_analysis` defaults)
- `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
- `shuffle`: randomly reorder the sequence of analyses
Here is an example of using this function to generate an arbitrary combination of anchors and unknowns for a bunch of sessions:
from D47crunch import virtual_data, D47data
args = dict(
samples = [
dict(Sample = 'ETH-1', N = 3),
dict(Sample = 'ETH-2', N = 3),
dict(Sample = 'ETH-3', N = 3),
dict(Sample = 'FOO', N = 3,
d13C_VPDB = -5., d18O_VPDB = -10.,
D47 = 0.3, D48 = 0.15),
dict(Sample = 'BAR', N = 3,
d13C_VPDB = -15., d18O_VPDB = -2.,
D47 = 0.6, D48 = 0.2),
], rD47 = 0.010, rD48 = 0.030)
session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)
D = D47data(session1 + session2 + session3 + session4)
D.crunch()
D.standardize()
D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)
This should output something like:
[table_of_sessions]
—————————— —— —— ——————————— ———————————— —————— —————— —————— ————————————— ————————————— ——————————————
Session Na Nu d13Cwg_VPDB d18Owg_VSMOW r_d13C r_d18O r_D47 a ± SE 1e3 x b ± SE c ± SE
—————————— —— —— ——————————— ———————————— —————— —————— —————— ————————————— ————————————— ——————————————
Session_01 9 6 -4.000 26.000 0.0205 0.0633 0.0075 1.015 ± 0.015 0.427 ± 0.232 -0.909 ± 0.006
Session_02 9 6 -4.000 26.000 0.0210 0.0882 0.0082 0.990 ± 0.015 0.484 ± 0.232 -0.905 ± 0.006
Session_03 9 6 -4.000 26.000 0.0186 0.0505 0.0091 0.997 ± 0.015 0.167 ± 0.233 -0.901 ± 0.006
Session_04 9 6 -4.000 26.000 0.0192 0.0467 0.0070 1.017 ± 0.015 0.229 ± 0.232 -0.910 ± 0.006
—————————— —— —— ——————————— ———————————— —————— —————— —————— ————————————— ————————————— ——————————————
[table_of_samples]
—————— —— ————————— —————————— —————— —————— ———————— —————— ————————
Sample N d13C_VPDB d18O_VSMOW D47 SE 95% CL SD p_Levene
—————— —— ————————— —————————— —————— —————— ———————— —————— ————————
ETH-1 12 2.02 37.01 0.2052 0.0083
ETH-2 12 -10.17 19.88 0.2085 0.0090
ETH-3 12 1.71 37.46 0.6132 0.0083
BAR 12 -15.02 37.22 0.6057 0.0042 ± 0.0085 0.0088 0.753
FOO 12 -5.00 28.89 0.3024 0.0031 ± 0.0062 0.0070 0.497
—————— —— ————————— —————————— —————— —————— ———————— —————— ————————
[table_of_analyses]
——— —————————— —————— ——————————— ———————————— ————————— ————————— —————————— —————————— —————————— —————————— —————————— ————————— ————————— ————————— ————————
UID Session Sample d13Cwg_VPDB d18Owg_VSMOW d45 d46 d47 d48 d49 d13C_VPDB d18O_VSMOW D47raw D48raw D49raw D47
——— —————————— —————— ——————————— ———————————— ————————— ————————— —————————— —————————— —————————— —————————— —————————— ————————— ————————— ————————— ————————
1 Session_01 ETH-1 -4.000 26.000 5.995601 10.755323 16.116087 21.285428 27.780042 1.998631 36.986704 -0.696924 -0.333640 0.008600 0.201787
2 Session_01 FOO -4.000 26.000 -0.838118 2.819853 1.310384 5.326005 4.665655 -5.004629 28.895933 -0.593755 -0.319861 0.014956 0.309692
3 Session_01 ETH-3 -4.000 26.000 5.727341 11.211663 16.713472 22.364770 28.306614 1.695479 37.453503 -0.278056 -0.180158 -0.082015 0.614365
4 Session_01 BAR -4.000 26.000 -9.959983 10.926995 0.053806 21.724901 10.707292 -15.041279 37.199026 -0.300066 -0.243252 -0.029371 0.599675
5 Session_01 ETH-1 -4.000 26.000 6.010276 10.840276 16.207960 21.475150 27.780042 2.011176 37.073454 -0.704188 -0.315986 -0.172089 0.194589
6 Session_01 ETH-1 -4.000 26.000 6.049381 10.706856 16.135579 21.196941 27.780042 2.057827 36.937067 -0.685751 -0.324384 0.045870 0.212791
7 Session_01 ETH-2 -4.000 26.000 -5.974124 -5.955517 -12.668784 -12.208184 -18.023381 -10.163274 19.943159 -0.694902 -0.336672 -0.063946 0.215880
8 Session_01 ETH-3 -4.000 26.000 5.755174 11.255104 16.792797 22.451660 28.306614 1.723596 37.497816 -0.270825 -0.181089 -0.195908 0.621458
9 Session_01 FOO -4.000 26.000 -0.848028 2.874679 1.346196 5.439150 4.665655 -5.017230 28.951964 -0.601502 -0.316664 -0.081898 0.302042
10 Session_01 BAR -4.000 26.000 -9.915975 10.968470 0.153453 21.749385 10.707292 -14.995822 37.241294 -0.286638 -0.301325 -0.157376 0.612868
11 Session_01 BAR -4.000 26.000 -9.920507 10.903408 0.065076 21.704075 10.707292 -14.998270 37.174839 -0.307018 -0.216978 -0.026076 0.592818
12 Session_01 FOO -4.000 26.000 -0.876454 2.906764 1.341194 5.490264 4.665655 -5.048760 28.984806 -0.608593 -0.329808 -0.114437 0.295055
13 Session_01 ETH-2 -4.000 26.000 -5.982229 -6.110437 -12.827036 -12.492272 -18.023381 -10.166188 19.784916 -0.693555 -0.312598 0.251040 0.217274
14 Session_01 ETH-2 -4.000 26.000 -5.991278 -5.995054 -12.741562 -12.184075 -18.023381 -10.180122 19.902809 -0.711697 -0.232746 0.032602 0.199357
15 Session_01 ETH-3 -4.000 26.000 5.734896 11.229855 16.740410 22.402091 28.306614 1.702875 37.472070 -0.276998 -0.179635 -0.125368 0.615396
16 Session_02 ETH-3 -4.000 26.000 5.716356 11.091821 16.582487 22.123857 28.306614 1.692901 37.370126 -0.279100 -0.178789 0.162540 0.624067
17 Session_02 ETH-2 -4.000 26.000 -5.950370 -5.959974 -12.650784 -12.197864 -18.023381 -10.143809 19.897777 -0.696916 -0.317263 -0.080604 0.216441
18 Session_02 BAR -4.000 26.000 -9.957566 10.903888 0.031785 21.739434 10.707292 -15.048386 37.213724 -0.302139 -0.183327 0.012926 0.608897
19 Session_02 ETH-1 -4.000 26.000 6.030532 10.851030 16.245571 21.457100 27.780042 2.037466 37.122284 -0.698413 -0.354920 -0.214443 0.200795
20 Session_02 FOO -4.000 26.000 -0.819742 2.826793 1.317044 5.330616 4.665655 -4.986618 28.903335 -0.612871 -0.329113 -0.018244 0.294481
21 Session_02 BAR -4.000 26.000 -9.936020 10.862339 0.024660 21.563307 10.707292 -15.023836 37.171034 -0.291333 -0.273498 0.070452 0.619812
22 Session_02 ETH-3 -4.000 26.000 5.719281 11.207303 16.681693 22.370886 28.306614 1.691780 37.488633 -0.296801 -0.165556 -0.065004 0.606143
23 Session_02 ETH-1 -4.000 26.000 5.993918 10.617469 15.991900 21.070358 27.780042 2.006934 36.882679 -0.683329 -0.271476 0.278458 0.216152
24 Session_02 ETH-2 -4.000 26.000 -5.982371 -6.036210 -12.762399 -12.309944 -18.023381 -10.175178 19.819614 -0.701348 -0.277354 0.104418 0.212021
25 Session_02 ETH-1 -4.000 26.000 6.019963 10.773112 16.163825 21.331060 27.780042 2.029040 37.042346 -0.692234 -0.324161 -0.051788 0.207075
26 Session_02 BAR -4.000 26.000 -9.963888 10.865863 -0.023549 21.615868 10.707292 -15.053743 37.174715 -0.313906 -0.229031 0.093637 0.597041
27 Session_02 FOO -4.000 26.000 -0.835046 2.870518 1.355370 5.487896 4.665655 -5.004585 28.948243 -0.601666 -0.259900 -0.087592 0.305777
28 Session_02 FOO -4.000 26.000 -0.848415 2.849823 1.308081 5.427767 4.665655 -5.018107 28.927036 -0.614791 -0.278426 -0.032784 0.292547
29 Session_02 ETH-3 -4.000 26.000 5.757137 11.232751 16.744567 22.398244 28.306614 1.731295 37.514660 -0.298533 -0.189123 -0.154557 0.604363
30 Session_02 ETH-2 -4.000 26.000 -5.993476 -5.944866 -12.696865 -12.149754 -18.023381 -10.190430 19.913381 -0.713779 -0.298963 -0.064251 0.199436
31 Session_03 ETH-3 -4.000 26.000 5.718991 11.146227 16.640814 22.243185 28.306614 1.689442 37.449023 -0.277332 -0.169668 0.053997 0.623187
32 Session_03 ETH-2 -4.000 26.000 -5.997147 -5.905858 -12.655382 -12.081612 -18.023381 -10.165400 19.891551 -0.706536 -0.308464 -0.137414 0.197550
33 Session_03 ETH-1 -4.000 26.000 6.040566 10.786620 16.205283 21.374963 27.780042 2.045244 37.077432 -0.685706 -0.307909 -0.099869 0.213609
34 Session_03 ETH-1 -4.000 26.000 5.994622 10.743980 16.116098 21.243734 27.780042 1.997857 37.033567 -0.684883 -0.352014 0.031692 0.214449
35 Session_03 ETH-3 -4.000 26.000 5.748546 11.079879 16.580826 22.120063 28.306614 1.723364 37.380534 -0.302133 -0.158882 0.151641 0.598318
36 Session_03 ETH-2 -4.000 26.000 -6.000290 -5.947172 -12.697463 -12.164602 -18.023381 -10.167221 19.848953 -0.705037 -0.309350 -0.052386 0.199061
37 Session_03 FOO -4.000 26.000 -0.800284 2.851299 1.376828 5.379547 4.665655 -4.951581 28.910199 -0.597293 -0.329315 -0.087015 0.304784
38 Session_03 FOO -4.000 26.000 -0.873798 2.820799 1.272165 5.370745 4.665655 -5.028782 28.878917 -0.596008 -0.277258 0.051165 0.306090
39 Session_03 ETH-2 -4.000 26.000 -6.008525 -5.909707 -12.647727 -12.075913 -18.023381 -10.177379 19.887608 -0.683183 -0.294956 -0.117608 0.220975
40 Session_03 BAR -4.000 26.000 -9.928709 10.989665 0.148059 21.852677 10.707292 -14.976237 37.324152 -0.299358 -0.242185 -0.184835 0.603855
41 Session_03 ETH-1 -4.000 26.000 6.004078 10.683951 16.045192 21.214355 27.780042 2.010134 36.971642 -0.705956 -0.262026 0.138399 0.193323
42 Session_03 BAR -4.000 26.000 -9.957114 10.898997 0.044946 21.602296 10.707292 -15.003175 37.230716 -0.284699 -0.307849 0.021944 0.618578
43 Session_03 BAR -4.000 26.000 -9.952115 11.034508 0.169809 21.885915 10.707292 -15.002819 37.370451 -0.296804 -0.298351 -0.246731 0.606414
44 Session_03 FOO -4.000 26.000 -0.823857 2.761300 1.258060 5.239992 4.665655 -4.973383 28.817444 -0.603327 -0.288652 0.114488 0.298751
45 Session_03 ETH-3 -4.000 26.000 5.753467 11.206589 16.719131 22.373244 28.306614 1.723960 37.511190 -0.294350 -0.161838 -0.099835 0.606103
46 Session_04 FOO -4.000 26.000 -0.791191 2.708220 1.256167 5.145784 4.665655 -4.960004 28.750896 -0.586913 -0.276505 0.183674 0.317065
47 Session_04 ETH-1 -4.000 26.000 6.017312 10.735930 16.123043 21.270597 27.780042 2.005824 36.995214 -0.693479 -0.309795 0.023309 0.208980
48 Session_04 ETH-2 -4.000 26.000 -5.986501 -5.915157 -12.656583 -12.060382 -18.023381 -10.182247 19.889836 -0.709603 -0.268277 -0.130450 0.199604
49 Session_04 BAR -4.000 26.000 -9.951025 10.951923 0.089386 21.738926 10.707292 -15.031949 37.254709 -0.298065 -0.278834 -0.087463 0.601230
50 Session_04 ETH-2 -4.000 26.000 -5.966627 -5.893789 -12.597717 -12.120719 -18.023381 -10.161842 19.911776 -0.691757 -0.372308 -0.193986 0.217132
51 Session_04 ETH-1 -4.000 26.000 6.029937 10.766997 16.151273 21.345479 27.780042 2.018148 37.027152 -0.708855 -0.297953 -0.050465 0.193862
52 Session_04 FOO -4.000 26.000 -0.853969 2.805035 1.267571 5.353907 4.665655 -5.030523 28.850660 -0.605611 -0.262571 0.060903 0.298685
53 Session_04 ETH-3 -4.000 26.000 5.798016 11.254135 16.832228 22.432473 28.306614 1.752928 37.528936 -0.275047 -0.197935 -0.239408 0.620088
54 Session_04 ETH-1 -4.000 26.000 6.023822 10.730714 16.121184 21.235757 27.780042 2.012958 36.989833 -0.696908 -0.333582 0.026555 0.205610
55 Session_04 ETH-2 -4.000 26.000 -5.973623 -5.975018 -12.694278 -12.194472 -18.023381 -10.166297 19.828211 -0.701951 -0.283570 -0.025935 0.207135
56 Session_04 ETH-3 -4.000 26.000 5.739420 11.128582 16.641344 22.166106 28.306614 1.695046 37.399884 -0.280608 -0.210162 0.066645 0.614665
57 Session_04 BAR -4.000 26.000 -9.931741 10.819830 -0.023748 21.529372 10.707292 -15.006533 37.118743 -0.302866 -0.222623 0.148462 0.596536
58 Session_04 FOO -4.000 26.000 -0.848192 2.777763 1.251297 5.280272 4.665655 -5.023358 28.822585 -0.601094 -0.281419 0.108186 0.303128
59 Session_04 ETH-3 -4.000 26.000 5.751908 11.207110 16.726741 22.380392 28.306614 1.705481 37.480657 -0.285776 -0.155878 -0.099197 0.609567
60 Session_04 BAR -4.000 26.000 -9.926078 10.884823 0.060864 21.650722 10.707292 -15.002880 37.185606 -0.287358 -0.232425 0.016044 0.611760
——— —————————— —————— ——————————— ———————————— ————————— ————————— —————————— —————————— —————————— —————————— —————————— ————————— ————————— ————————— ————————
def table_of_samples(
    data47 = None,
    data48 = None,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a combined table of samples
    for a pair of `D47data` and `D48data` objects.

    **Parameters**

    + `data47`: `D47data` instance
    + `data48`: `D48data` instance
    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
      if set to `'raw'`: return a list of lists of strings
      (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    if data47 is None:
        if data48 is None:
            raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
        else:
            return data48.table_of_samples(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
    else:
        if data48 is None:
            return data47.table_of_samples(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
        else:
            out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
            out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
            out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:])

            if save_to_file:
                if not os.path.exists(dir):
                    os.makedirs(dir)
                if filename is None:
                    filename = f'D47D48_samples.csv'
                with open(f'{dir}/{filename}', 'w') as fid:
                    fid.write(make_csv(out))
            if print_out:
                print('\n'+pretty_table(out))
            if output == 'raw':
                return out
            elif output == 'pretty':
                return pretty_table(out)
Print out, save to disk and/or return a combined table of samples
for a pair of D47data and D48data objects.
Parameters
- `data47`: `D47data` instance
- `data48`: `D48data` instance
- `dir`: the directory in which to save the table
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the table to disk
- `print_out`: whether to print out the table
- `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
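Here is a minimal sketch of this combined workflow, using `virtual_data()` to fabricate a single session (sample names and compositions are arbitrary); the same calling pattern applies to `table_of_sessions()` and `table_of_analyses()` below:

from D47crunch import D47data, D48data, virtual_data, table_of_samples

session = virtual_data(
    samples = [
        dict(Sample = 'ETH-1', N = 4),
        dict(Sample = 'ETH-2', N = 4),
        dict(Sample = 'ETH-3', N = 4),
        dict(Sample = 'FOO', N = 4,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        ], session = 'Session_01', seed = 123)

# Standardize the Δ47 and Δ48 data independently:
data47 = D47data(session)
data47.crunch()
data47.standardize()

data48 = D48data(session)
data48.crunch()
data48.standardize()

# Print out a single table combining the Δ47 and Δ48 results for each sample:
table_of_samples(data47, data48, save_to_file = False)

Processing the same analyses as two separate `D47data` and `D48data` objects keeps the Δ47 and Δ48 standardization models independent of each other; this function only merges their outputs into one table.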
def table_of_sessions(
    data47 = None,
    data48 = None,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a combined table of sessions
    for a pair of `D47data` and `D48data` objects.
    ***Only applicable if the sessions in `data47` and those in `data48`
    consist of the exact same sets of analyses.***

    **Parameters**

    + `data47`: `D47data` instance
    + `data48`: `D48data` instance
    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
      if set to `'raw'`: return a list of lists of strings
      (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    if data47 is None:
        if data48 is None:
            raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
        else:
            return data48.table_of_sessions(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
    else:
        if data48 is None:
            return data47.table_of_sessions(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
        else:
            out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
            out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
            for k,x in enumerate(out47[0]):
                if k>7:
                    out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
                    out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
            out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])

            if save_to_file:
                if not os.path.exists(dir):
                    os.makedirs(dir)
                if filename is None:
                    filename = f'D47D48_sessions.csv'
                with open(f'{dir}/{filename}', 'w') as fid:
                    fid.write(make_csv(out))
            if print_out:
                print('\n'+pretty_table(out))
            if output == 'raw':
                return out
            elif output == 'pretty':
                return pretty_table(out)
Print out, save to disk and/or return a combined table of sessions
for a pair of D47data and D48data objects.
Only applicable if the sessions in data47 and those in data48
consist of the exact same sets of analyses.
Parameters
- `data47`: `D47data` instance
- `data48`: `D48data` instance
- `dir`: the directory in which to save the table
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the table to disk
- `print_out`: whether to print out the table
- `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
def table_of_analyses(
    data47 = None,
    data48 = None,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a combined table of analyses
    for a pair of `D47data` and `D48data` objects.

    If the sessions in `data47` and those in `data48` do not consist of
    the exact same sets of analyses, the table will have two columns
    `Session_47` and `Session_48` instead of a single `Session` column.

    **Parameters**

    + `data47`: `D47data` instance
    + `data48`: `D48data` instance
    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
      if set to `'raw'`: return a list of lists of strings
      (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    if data47 is None:
        if data48 is None:
            raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
        else:
            return data48.table_of_analyses(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
    else:
        if data48 is None:
            return data47.table_of_analyses(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
        else:
            out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
            out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')

            if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
                out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
            else:
                out47[0][1] = 'Session_47'
                out48[0][1] = 'Session_48'
                out47 = transpose_table(out47)
                out48 = transpose_table(out48)
                out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])

            if save_to_file:
                if not os.path.exists(dir):
                    os.makedirs(dir)
                if filename is None:
                    filename = f'D47D48_analyses.csv'
                with open(f'{dir}/{filename}', 'w') as fid:
                    fid.write(make_csv(out))
            if print_out:
                print('\n'+pretty_table(out))
            if output == 'raw':
                return out
            elif output == 'pretty':
                return pretty_table(out)
Print out, save to disk and/or return a combined table of analyses
for a pair of D47data and D48data objects.
If the sessions in data47 and those in data48 do not consist of
the exact same sets of analyses, the table will have two columns
Session_47 and Session_48 instead of a single Session column.
Parameters
- `data47`: `D47data` instance
- `data48`: `D48data` instance
- `dir`: the directory in which to save the table
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the table to disk
- `print_out`: whether to print out the table
- `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
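Continuing from the `data47` and `data48` objects of the `table_of_samples()` sketch above, the combined table may also be captured programmatically rather than printed out, using the `output` options described above:

from D47crunch import table_of_analyses

# Return the combined table as a list of lists of strings,
# without writing to disk or printing anything:
rows = table_of_analyses(data47, data48,
    save_to_file = False, print_out = False, output = 'raw')
header, analyses = rows[0], rows[1:] # one list of strings per analysis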
835class D4xdata(list): 836 ''' 837 Store and process data for a large set of Δ47 and/or Δ48 838 analyses, usually comprising more than one analytical session. 839 ''' 840 841 ### 17O CORRECTION PARAMETERS 842 R13_VPDB = 0.01118 # (Chang & Li, 1990) 843 ''' 844 Absolute (13C/12C) ratio of VPDB. 845 By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm)) 846 ''' 847 848 R18_VSMOW = 0.0020052 # (Baertschi, 1976) 849 ''' 850 Absolute (18O/16C) ratio of VSMOW. 851 By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1)) 852 ''' 853 854 LAMBDA_17 = 0.528 # (Barkan & Luz, 2005) 855 ''' 856 Mass-dependent exponent for triple oxygen isotopes. 857 By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250)) 858 ''' 859 860 R17_VSMOW = 0.00038475 # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB) 861 ''' 862 Absolute (17O/16C) ratio of VSMOW. 863 By default equal to 0.00038475 864 ([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011), 865 rescaled to `R13_VPDB`) 866 ''' 867 868 R18_VPDB = R18_VSMOW * 1.03092 869 ''' 870 Absolute (18O/16C) ratio of VPDB. 871 By definition equal to `R18_VSMOW * 1.03092`. 872 ''' 873 874 R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17 875 ''' 876 Absolute (17O/16C) ratio of VPDB. 877 By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`. 878 ''' 879 880 LEVENE_REF_SAMPLE = 'ETH-3' 881 ''' 882 After the Δ4x standardization step, each sample is tested to 883 assess whether the Δ4x variance within all analyses for that 884 sample differs significantly from that observed for a given reference 885 sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test), 886 which yields a p-value corresponding to the null hypothesis that the 887 underlying variances are equal). 888 889 `LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which 890 sample should be used as a reference for this test. 891 ''' 892 893 ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6) # (Kim et al., 2007, calcite) 894 ''' 895 Specifies the 18O/16O fractionation factor generally applicable 896 to acid reactions in the dataset. Currently used by `D4xdata.wg()`, 897 `D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`. 898 899 By default equal to 1.008129 (calcite reacted at 90 °C, 900 [Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)). 901 ''' 902 903 Nominal_d13C_VPDB = { 904 'ETH-1': 2.02, 905 'ETH-2': -10.17, 906 'ETH-3': 1.71, 907 } # (Bernasconi et al., 2018) 908 ''' 909 Nominal δ13C_VPDB values assigned to carbonate standards, used by 910 `D4xdata.standardize_d13C()`. 911 912 By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after 913 [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385). 914 ''' 915 916 Nominal_d18O_VPDB = { 917 'ETH-1': -2.19, 918 'ETH-2': -18.69, 919 'ETH-3': -1.78, 920 } # (Bernasconi et al., 2018) 921 ''' 922 Nominal δ18O_VPDB values assigned to carbonate standards, used by 923 `D4xdata.standardize_d18O()`. 924 925 By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after 926 [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385). 927 ''' 928 929 d13C_STANDARDIZATION_METHOD = '2pt' 930 ''' 931 Method by which to standardize δ13C values: 932 933 + `none`: do not apply any δ13C standardization. 
934 + `'1pt'`: within each session, offset all initial δ13C values so as to 935 minimize the difference between final δ13C_VPDB values and 936 `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined). 937 + `'2pt'`: within each session, apply a affine trasformation to all δ13C 938 values so as to minimize the difference between final δ13C_VPDB 939 values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` 940 is defined). 941 ''' 942 943 d18O_STANDARDIZATION_METHOD = '2pt' 944 ''' 945 Method by which to standardize δ18O values: 946 947 + `none`: do not apply any δ18O standardization. 948 + `'1pt'`: within each session, offset all initial δ18O values so as to 949 minimize the difference between final δ18O_VPDB values and 950 `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined). 951 + `'2pt'`: within each session, apply a affine trasformation to all δ18O 952 values so as to minimize the difference between final δ18O_VPDB 953 values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` 954 is defined). 955 ''' 956 957 def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False): 958 ''' 959 **Parameters** 960 961 + `l`: a list of dictionaries, with each dictionary including at least the keys 962 `Sample`, `d45`, `d46`, and `d47` or `d48`. 963 + `mass`: `'47'` or `'48'` 964 + `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods. 965 + `session`: define session name for analyses without a `Session` key 966 + `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods. 967 968 Returns a `D4xdata` object derived from `list`. 969 ''' 970 self._4x = mass 971 self.verbose = verbose 972 self.prefix = 'D4xdata' 973 self.logfile = logfile 974 list.__init__(self, l) 975 self.Nf = None 976 self.repeatability = {} 977 self.refresh(session = session) 978 979 980 def make_verbal(oldfun): 981 ''' 982 Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`. 983 ''' 984 @wraps(oldfun) 985 def newfun(*args, verbose = '', **kwargs): 986 myself = args[0] 987 oldprefix = myself.prefix 988 myself.prefix = oldfun.__name__ 989 if verbose != '': 990 oldverbose = myself.verbose 991 myself.verbose = verbose 992 out = oldfun(*args, **kwargs) 993 myself.prefix = oldprefix 994 if verbose != '': 995 myself.verbose = oldverbose 996 return out 997 return newfun 998 999 1000 def msg(self, txt): 1001 ''' 1002 Log a message to `self.logfile`, and print it out if `verbose = True` 1003 ''' 1004 self.log(txt) 1005 if self.verbose: 1006 print(f'{f"[{self.prefix}]":<16} {txt}') 1007 1008 1009 def vmsg(self, txt): 1010 ''' 1011 Log a message to `self.logfile` and print it out 1012 ''' 1013 self.log(txt) 1014 print(txt) 1015 1016 1017 def log(self, *txts): 1018 ''' 1019 Log a message to `self.logfile` 1020 ''' 1021 if self.logfile: 1022 with open(self.logfile, 'a') as fid: 1023 for txt in txts: 1024 fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}') 1025 1026 1027 def refresh(self, session = 'mySession'): 1028 ''' 1029 Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`. 
1030 ''' 1031 self.fill_in_missing_info(session = session) 1032 self.refresh_sessions() 1033 self.refresh_samples() 1034 1035 1036 def refresh_sessions(self): 1037 ''' 1038 Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift` 1039 to `False` for all sessions. 1040 ''' 1041 self.sessions = { 1042 s: {'data': [r for r in self if r['Session'] == s]} 1043 for s in sorted({r['Session'] for r in self}) 1044 } 1045 for s in self.sessions: 1046 self.sessions[s]['scrambling_drift'] = False 1047 self.sessions[s]['slope_drift'] = False 1048 self.sessions[s]['wg_drift'] = False 1049 self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD 1050 self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD 1051 1052 1053 def refresh_samples(self): 1054 ''' 1055 Define `self.samples`, `self.anchors`, and `self.unknowns`. 1056 ''' 1057 self.samples = { 1058 s: {'data': [r for r in self if r['Sample'] == s]} 1059 for s in sorted({r['Sample'] for r in self}) 1060 } 1061 self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x} 1062 self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x} 1063 1064 1065 def read(self, filename, sep = '', session = ''): 1066 ''' 1067 Read file in csv format to load data into a `D47data` object. 1068 1069 In the csv file, spaces before and after field separators (`','` by default) 1070 are optional. Each line corresponds to a single analysis. 1071 1072 The required fields are: 1073 1074 + `UID`: a unique identifier 1075 + `Session`: an identifier for the analytical session 1076 + `Sample`: a sample identifier 1077 + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values 1078 1079 Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to 1080 VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` 1081 and `d49` are optional, and set to NaN by default. 1082 1083 **Parameters** 1084 1085 + `fileneme`: the path of the file to read 1086 + `sep`: csv separator delimiting the fields 1087 + `session`: set `Session` field to this string for all analyses 1088 ''' 1089 with open(filename) as fid: 1090 self.input(fid.read(), sep = sep, session = session) 1091 1092 1093 def input(self, txt, sep = '', session = ''): 1094 ''' 1095 Read `txt` string in csv format to load analysis data into a `D47data` object. 1096 1097 In the csv string, spaces before and after field separators (`','` by default) 1098 are optional. Each line corresponds to a single analysis. 1099 1100 The required fields are: 1101 1102 + `UID`: a unique identifier 1103 + `Session`: an identifier for the analytical session 1104 + `Sample`: a sample identifier 1105 + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values 1106 1107 Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to 1108 VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` 1109 and `d49` are optional, and set to NaN by default. 1110 1111 **Parameters** 1112 1113 + `txt`: the csv string to read 1114 + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, 1115 whichever appers most often in `txt`. 
1116 + `session`: set `Session` field to this string for all analyses 1117 ''' 1118 if sep == '': 1119 sep = sorted(',;\t', key = lambda x: - txt.count(x))[0] 1120 txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()] 1121 data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]] 1122 1123 if session != '': 1124 for r in data: 1125 r['Session'] = session 1126 1127 self += data 1128 self.refresh() 1129 1130 1131 @make_verbal 1132 def wg(self, 1133 samples = None, 1134 session_groups = None, 1135 ): 1136 ''' 1137 Compute bulk composition of the working gas for each session based (by default) 1138 on the carbonate standards defined in both `self.Nominal_d13C_VPDB` and 1139 `self.Nominal_d18O_VPDB`. 1140 1141 **Parameters** 1142 1143 + `samples`: A list of samples specifying the subset of samples (defined in both 1144 `self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`) which will be considered 1145 when computing the working gas. By default, use all samples defined both in 1146 `self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`. 1147 + `session_groups`: a list of lists of sessions 1148 (e.g., `[['session1', 'session2'], ['session3', 'session4', 'session5']]`) 1149 specifying which sessions groups, if any, have the exact same WG composition. 1150 If set to `'all'`, force all sessions to have the same WG composition (use with 1151 caution and on short time scales, since the WG may drift slowly a long time scales). 1152 ''' 1153 1154 self.msg('Computing WG composition:') 1155 1156 a18_acid = self.ALPHA_18O_ACID_REACTION 1157 1158 if samples is None: 1159 samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB] 1160 if session_groups is None: 1161 session_groups = [[s] for s in self.sessions] 1162 elif session_groups == 'all': 1163 session_groups = [[s for s in self.sessions]] 1164 1165 samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB] 1166 R45R46_standards = {} 1167 for sample in samples: 1168 d13C_vpdb = self.Nominal_d13C_VPDB[sample] 1169 d18O_vpdb = self.Nominal_d18O_VPDB[sample] 1170 R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000) 1171 R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17 1172 R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid 1173 1174 C12_s = 1 / (1 + R13_s) 1175 C13_s = R13_s / (1 + R13_s) 1176 C16_s = 1 / (1 + R17_s + R18_s) 1177 C17_s = R17_s / (1 + R17_s + R18_s) 1178 C18_s = R18_s / (1 + R17_s + R18_s) 1179 1180 C626_s = C12_s * C16_s ** 2 1181 C627_s = 2 * C12_s * C16_s * C17_s 1182 C628_s = 2 * C12_s * C16_s * C18_s 1183 C636_s = C13_s * C16_s ** 2 1184 C637_s = 2 * C13_s * C16_s * C17_s 1185 C727_s = C12_s * C17_s ** 2 1186 1187 R45_s = (C627_s + C636_s) / C626_s 1188 R46_s = (C628_s + C637_s + C727_s) / C626_s 1189 R45R46_standards[sample] = (R45_s, R46_s) 1190 1191 for sg in session_groups: 1192 db = [r for s in sg for r in self.sessions[s]['data'] if r['Sample'] in samples] 1193 assert db, f'No sample from {samples} found in session group {sg}.' 
1194 1195 X = [r['d45'] for r in db] 1196 Y = [R45R46_standards[r['Sample']][0] for r in db] 1197 x1, x2 = np.min(X), np.max(X) 1198 1199 if x1 < x2: 1200 wgcoord = x1/(x1-x2) 1201 else: 1202 wgcoord = 999 1203 1204 if wgcoord < -.5 or wgcoord > 1.5: 1205 # unreasonable to extrapolate to d45 = 0 1206 R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)]) 1207 else : 1208 # d45 = 0 is reasonably well bracketed 1209 R45_wg = np.polyfit(X, Y, 1)[1] 1210 1211 X = [r['d46'] for r in db] 1212 Y = [R45R46_standards[r['Sample']][1] for r in db] 1213 x1, x2 = np.min(X), np.max(X) 1214 1215 if x1 < x2: 1216 wgcoord = x1/(x1-x2) 1217 else: 1218 wgcoord = 999 1219 1220 if wgcoord < -.5 or wgcoord > 1.5: 1221 # unreasonable to extrapolate to d46 = 0 1222 R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)]) 1223 else : 1224 # d46 = 0 is reasonably well bracketed 1225 R46_wg = np.polyfit(X, Y, 1)[1] 1226 1227 d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg) 1228 1229 for s in sg: 1230 self.msg(f'Sessions {s} WG: δ13C_VPDB = {d13Cwg_VPDB:.3f} δ18O_VSMOW = {d18Owg_VSMOW:.3f}') 1231 1232 self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB 1233 self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW 1234 for r in self.sessions[s]['data']: 1235 r['d13Cwg_VPDB'] = d13Cwg_VPDB 1236 r['d18Owg_VSMOW'] = d18Owg_VSMOW 1237 1238 1239 def compute_bulk_delta(self, R45, R46, D17O = 0): 1240 ''' 1241 Compute δ13C_VPDB and δ18O_VSMOW, 1242 by solving the generalized form of equation (17) from 1243 [Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05), 1244 assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and 1245 solving the corresponding second-order Taylor polynomial. 1246 (Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014)) 1247 ''' 1248 1249 K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17 1250 1251 A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17) 1252 B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17 1253 C = 2 * self.R18_VSMOW 1254 D = -R46 1255 1256 aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2 1257 bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C 1258 cc = A + B + C + D 1259 1260 d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa) 1261 1262 R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW 1263 R17 = K * R18 ** self.LAMBDA_17 1264 R13 = R45 - 2 * R17 1265 1266 d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1) 1267 1268 return d13C_VPDB, d18O_VSMOW 1269 1270 1271 @make_verbal 1272 def crunch(self, verbose = ''): 1273 ''' 1274 Compute bulk composition and raw clumped isotope anomalies for all analyses. 1275 ''' 1276 for r in self: 1277 self.compute_bulk_and_clumping_deltas(r) 1278 self.standardize_d13C() 1279 self.standardize_d18O() 1280 self.msg(f"Crunched {len(self)} analyses.") 1281 1282 1283 def fill_in_missing_info(self, session = 'mySession'): 1284 ''' 1285 Fill in optional fields with default values 1286 ''' 1287 for i,r in enumerate(self): 1288 if 'D17O' not in r: 1289 r['D17O'] = 0. 
1290 if 'UID' not in r: 1291 r['UID'] = f'{i+1}' 1292 if 'Session' not in r: 1293 r['Session'] = session 1294 for k in ['d47', 'd48', 'd49']: 1295 if k not in r: 1296 r[k] = np.nan 1297 1298 1299 def standardize_d13C(self): 1300 ''' 1301 Perform δ13C standadization within each session `s` according to 1302 `self.sessions[s]['d13C_standardization_method']`, which is defined by default 1303 by `D47data.refresh_sessions()`as equal to `self.d13C_STANDARDIZATION_METHOD`, but 1304 may be redefined abitrarily at a later stage. 1305 ''' 1306 for s in self.sessions: 1307 if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']: 1308 XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB] 1309 X,Y = zip(*XY) 1310 if self.sessions[s]['d13C_standardization_method'] == '1pt': 1311 offset = np.mean(Y) - np.mean(X) 1312 for r in self.sessions[s]['data']: 1313 r['d13C_VPDB'] += offset 1314 elif self.sessions[s]['d13C_standardization_method'] == '2pt': 1315 a,b = np.polyfit(X,Y,1) 1316 for r in self.sessions[s]['data']: 1317 r['d13C_VPDB'] = a * r['d13C_VPDB'] + b 1318 1319 def standardize_d18O(self): 1320 ''' 1321 Perform δ18O standadization within each session `s` according to 1322 `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`, 1323 which is defined by default by `D47data.refresh_sessions()`as equal to 1324 `self.d18O_STANDARDIZATION_METHOD`, but may be redefined abitrarily at a later stage. 1325 ''' 1326 for s in self.sessions: 1327 if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']: 1328 XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB] 1329 X,Y = zip(*XY) 1330 Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y] 1331 if self.sessions[s]['d18O_standardization_method'] == '1pt': 1332 offset = np.mean(Y) - np.mean(X) 1333 for r in self.sessions[s]['data']: 1334 r['d18O_VSMOW'] += offset 1335 elif self.sessions[s]['d18O_standardization_method'] == '2pt': 1336 a,b = np.polyfit(X,Y,1) 1337 for r in self.sessions[s]['data']: 1338 r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b 1339 1340 1341 def compute_bulk_and_clumping_deltas(self, r): 1342 ''' 1343 Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`. 1344 ''' 1345 1346 # Compute working gas R13, R18, and isobar ratios 1347 R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000) 1348 R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000) 1349 R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg) 1350 1351 # Compute analyte isobar ratios 1352 R45 = (1 + r['d45'] / 1000) * R45_wg 1353 R46 = (1 + r['d46'] / 1000) * R46_wg 1354 R47 = (1 + r['d47'] / 1000) * R47_wg 1355 R48 = (1 + r['d48'] / 1000) * R48_wg 1356 R49 = (1 + r['d49'] / 1000) * R49_wg 1357 1358 r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O']) 1359 R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB 1360 R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW 1361 1362 # Compute stochastic isobar ratios of the analyte 1363 R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios( 1364 R13, R18, D17O = r['D17O'] 1365 ) 1366 1367 # Check that R45/R45stoch and R46/R46stoch are undistinguishable from 1, 1368 # and raise a warning if the corresponding anomalies exceed 0.02 ppm. 
1369 if (R45 / R45stoch - 1) > 5e-8: 1370 self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm') 1371 if (R46 / R46stoch - 1) > 5e-8: 1372 self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm') 1373 1374 # Compute raw clumped isotope anomalies 1375 r['D47raw'] = 1000 * (R47 / R47stoch - 1) 1376 r['D48raw'] = 1000 * (R48 / R48stoch - 1) 1377 r['D49raw'] = 1000 * (R49 / R49stoch - 1) 1378 1379 1380 def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0): 1381 ''' 1382 Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`, 1383 optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope 1384 anomalies (`D47`, `D48`, `D49`), all expressed in permil. 1385 ''' 1386 1387 # Compute R17 1388 R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17 1389 1390 # Compute isotope concentrations 1391 C12 = (1 + R13) ** -1 1392 C13 = C12 * R13 1393 C16 = (1 + R17 + R18) ** -1 1394 C17 = C16 * R17 1395 C18 = C16 * R18 1396 1397 # Compute stochastic isotopologue concentrations 1398 C626 = C16 * C12 * C16 1399 C627 = C16 * C12 * C17 * 2 1400 C628 = C16 * C12 * C18 * 2 1401 C636 = C16 * C13 * C16 1402 C637 = C16 * C13 * C17 * 2 1403 C638 = C16 * C13 * C18 * 2 1404 C727 = C17 * C12 * C17 1405 C728 = C17 * C12 * C18 * 2 1406 C737 = C17 * C13 * C17 1407 C738 = C17 * C13 * C18 * 2 1408 C828 = C18 * C12 * C18 1409 C838 = C18 * C13 * C18 1410 1411 # Compute stochastic isobar ratios 1412 R45 = (C636 + C627) / C626 1413 R46 = (C628 + C637 + C727) / C626 1414 R47 = (C638 + C728 + C737) / C626 1415 R48 = (C738 + C828) / C626 1416 R49 = C838 / C626 1417 1418 # Account for stochastic anomalies 1419 R47 *= 1 + D47 / 1000 1420 R48 *= 1 + D48 / 1000 1421 R49 *= 1 + D49 / 1000 1422 1423 # Return isobar ratios 1424 return R45, R46, R47, R48, R49 1425 1426 1427 def split_samples(self, samples_to_split = 'all', grouping = 'by_session'): 1428 ''' 1429 Split unknown samples by UID (treat all analyses as different samples) 1430 or by session (treat analyses of a given sample in different sessions as 1431 different samples). 1432 1433 **Parameters** 1434 1435 + `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']` 1436 + `grouping`: `by_uid` | `by_session` 1437 ''' 1438 if samples_to_split == 'all': 1439 samples_to_split = [s for s in self.unknowns] 1440 gkeys = {'by_uid':'UID', 'by_session':'Session'} 1441 self.grouping = grouping.lower() 1442 if self.grouping in gkeys: 1443 gkey = gkeys[self.grouping] 1444 for r in self: 1445 if r['Sample'] in samples_to_split: 1446 r['Sample_original'] = r['Sample'] 1447 r['Sample'] = f"{r['Sample']}__{r[gkey]}" 1448 elif r['Sample'] in self.unknowns: 1449 r['Sample_original'] = r['Sample'] 1450 self.refresh_samples() 1451 1452 1453 def unsplit_samples(self, tables = False): 1454 ''' 1455 Reverse the effects of `D47data.split_samples()`. 1456 1457 This should only be used after `D4xdata.standardize()` with `method='pooled'`. 1458 1459 After `D4xdata.standardize()` with `method='indep_sessions'`, one should 1460 probably use `D4xdata.combine_samples()` instead to reverse the effects of 1461 `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the 1462 effects of `D47data.split_samples()` with `grouping='by_sessions'` (because in 1463 that case session-averaged Δ4x values are statistically independent). 
1464 ''' 1465 unknowns_old = sorted({s for s in self.unknowns}) 1466 CM_old = self.standardization.covar[:,:] 1467 VD_old = self.standardization.params.valuesdict().copy() 1468 vars_old = self.standardization.var_names 1469 1470 unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r}) 1471 1472 Ns = len(vars_old) - len(unknowns_old) 1473 vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new] 1474 VD_new = {k: VD_old[k] for k in vars_old[:Ns]} 1475 1476 W = np.zeros((len(vars_new), len(vars_old))) 1477 W[:Ns,:Ns] = np.eye(Ns) 1478 for u in unknowns_new: 1479 splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u}) 1480 if self.grouping == 'by_session': 1481 weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits] 1482 elif self.grouping == 'by_uid': 1483 weights = [1 for s in splits] 1484 sw = sum(weights) 1485 weights = [w/sw for w in weights] 1486 W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:] 1487 1488 CM_new = W @ CM_old @ W.T 1489 V = W @ np.array([[VD_old[k]] for k in vars_old]) 1490 VD_new = {k:v[0] for k,v in zip(vars_new, V)} 1491 1492 self.standardization.covar = CM_new 1493 self.standardization.params.valuesdict = lambda : VD_new 1494 self.standardization.var_names = vars_new 1495 1496 for r in self: 1497 if r['Sample'] in self.unknowns: 1498 r['Sample_split'] = r['Sample'] 1499 r['Sample'] = r['Sample_original'] 1500 1501 self.refresh_samples() 1502 self.consolidate_samples() 1503 self.repeatabilities() 1504 1505 if tables: 1506 self.table_of_analyses() 1507 self.table_of_samples() 1508 1509 def assign_timestamps(self): 1510 ''' 1511 Assign a time field `t` of type `float` to each analysis. 1512 1513 If `TimeTag` is one of the data fields, `t` is equal within a given session 1514 to `TimeTag` minus the mean value of `TimeTag` for that session. 1515 Otherwise, `TimeTag` is by default equal to the index of each analysis 1516 in the dataset and `t` is defined as above. 1517 ''' 1518 for session in self.sessions: 1519 sdata = self.sessions[session]['data'] 1520 try: 1521 t0 = np.mean([r['TimeTag'] for r in sdata]) 1522 for r in sdata: 1523 r['t'] = r['TimeTag'] - t0 1524 except KeyError: 1525 t0 = (len(sdata)-1)/2 1526 for t,r in enumerate(sdata): 1527 r['t'] = t - t0 1528 1529 1530 def report(self): 1531 ''' 1532 Prints a report on the standardization fit. 1533 Only applicable after `D4xdata.standardize(method='pooled')`. 1534 ''' 1535 report_fit(self.standardization) 1536 1537 1538 def combine_samples(self, sample_groups): 1539 ''' 1540 Combine analyses of different samples to compute weighted average Δ4x 1541 and new error (co)variances corresponding to the groups defined by the `sample_groups` 1542 dictionary. 1543 1544 Caution: samples are weighted by number of replicate analyses, which is a 1545 reasonable default behavior but is not always optimal (e.g., in the case of strongly 1546 correlated analytical errors for one or more samples). 
1547 1548 Returns a tuplet of: 1549 1550 + the list of group names 1551 + an array of the corresponding Δ4x values 1552 + the corresponding (co)variance matrix 1553 1554 **Parameters** 1555 1556 + `sample_groups`: a dictionary of the form: 1557 ```py 1558 {'group1': ['sample_1', 'sample_2'], 1559 'group2': ['sample_3', 'sample_4', 'sample_5']} 1560 ``` 1561 ''' 1562 1563 samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])] 1564 groups = sorted(sample_groups.keys()) 1565 group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups} 1566 D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples]) 1567 CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples]) 1568 W = np.array([ 1569 [self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples] 1570 for j in groups]) 1571 D4x_new = W @ D4x_old 1572 CM_new = W @ CM_old @ W.T 1573 1574 return groups, D4x_new[:,0], CM_new 1575 1576 1577 @make_verbal 1578 def standardize(self, 1579 method = 'pooled', 1580 weighted_sessions = [], 1581 consolidate = True, 1582 consolidate_tables = False, 1583 consolidate_plots = False, 1584 constraints = {}, 1585 ): 1586 ''' 1587 Compute absolute Δ4x values for all replicate analyses and for sample averages. 1588 If `method` argument is set to `'pooled'`, the standardization processes all sessions 1589 in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous, 1590 i.e. that their true Δ4x value does not change between sessions, 1591 ([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to 1592 `'indep_sessions'`, the standardization processes each session independently, based only 1593 on anchors analyses. 1594 ''' 1595 1596 self.standardization_method = method 1597 self.assign_timestamps() 1598 1599 if method == 'pooled': 1600 if weighted_sessions: 1601 for session_group in weighted_sessions: 1602 if self._4x == '47': 1603 X = D47data([r for r in self if r['Session'] in session_group]) 1604 elif self._4x == '48': 1605 X = D48data([r for r in self if r['Session'] in session_group]) 1606 X.Nominal_D4x = self.Nominal_D4x.copy() 1607 X.refresh() 1608 result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False) 1609 w = np.sqrt(result.redchi) 1610 self.msg(f'Session group {session_group} MRSWD = {w:.4f}') 1611 for r in X: 1612 r[f'wD{self._4x}raw'] *= w 1613 else: 1614 self.msg(f'All D{self._4x}raw weights set to 1 ‰') 1615 for r in self: 1616 r[f'wD{self._4x}raw'] = 1. 1617 1618 params = Parameters() 1619 for k,session in enumerate(self.sessions): 1620 self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.") 1621 self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.") 1622 self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.") 1623 s = pf(session) 1624 params.add(f'a_{s}', value = 0.9) 1625 params.add(f'b_{s}', value = 0.) 
1626 params.add(f'c_{s}', value = -0.9) 1627 params.add(f'a2_{s}', value = 0., 1628# vary = self.sessions[session]['scrambling_drift'], 1629 ) 1630 params.add(f'b2_{s}', value = 0., 1631# vary = self.sessions[session]['slope_drift'], 1632 ) 1633 params.add(f'c2_{s}', value = 0., 1634# vary = self.sessions[session]['wg_drift'], 1635 ) 1636 if not self.sessions[session]['scrambling_drift']: 1637 params[f'a2_{s}'].expr = '0' 1638 if not self.sessions[session]['slope_drift']: 1639 params[f'b2_{s}'].expr = '0' 1640 if not self.sessions[session]['wg_drift']: 1641 params[f'c2_{s}'].expr = '0' 1642 1643 for sample in self.unknowns: 1644 params.add(f'D{self._4x}_{pf(sample)}', value = 0.5) 1645 1646 for k in constraints: 1647 params[k].expr = constraints[k] 1648 1649 def residuals(p): 1650 R = [] 1651 for r in self: 1652 session = pf(r['Session']) 1653 sample = pf(r['Sample']) 1654 if r['Sample'] in self.Nominal_D4x: 1655 R += [ ( 1656 r[f'D{self._4x}raw'] - ( 1657 p[f'a_{session}'] * self.Nominal_D4x[r['Sample']] 1658 + p[f'b_{session}'] * r[f'd{self._4x}'] 1659 + p[f'c_{session}'] 1660 + r['t'] * ( 1661 p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']] 1662 + p[f'b2_{session}'] * r[f'd{self._4x}'] 1663 + p[f'c2_{session}'] 1664 ) 1665 ) 1666 ) / r[f'wD{self._4x}raw'] ] 1667 else: 1668 R += [ ( 1669 r[f'D{self._4x}raw'] - ( 1670 p[f'a_{session}'] * p[f'D{self._4x}_{sample}'] 1671 + p[f'b_{session}'] * r[f'd{self._4x}'] 1672 + p[f'c_{session}'] 1673 + r['t'] * ( 1674 p[f'a2_{session}'] * p[f'D{self._4x}_{sample}'] 1675 + p[f'b2_{session}'] * r[f'd{self._4x}'] 1676 + p[f'c2_{session}'] 1677 ) 1678 ) 1679 ) / r[f'wD{self._4x}raw'] ] 1680 return R 1681 1682 M = Minimizer(residuals, params) 1683 result = M.least_squares() 1684 self.Nf = result.nfree 1685 self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf) 1686 new_names, new_covar, new_se = _fullcovar(result)[:3] 1687 result.var_names = new_names 1688 result.covar = new_covar 1689 1690 for r in self: 1691 s = pf(r["Session"]) 1692 a = result.params.valuesdict()[f'a_{s}'] 1693 b = result.params.valuesdict()[f'b_{s}'] 1694 c = result.params.valuesdict()[f'c_{s}'] 1695 a2 = result.params.valuesdict()[f'a2_{s}'] 1696 b2 = result.params.valuesdict()[f'b2_{s}'] 1697 c2 = result.params.valuesdict()[f'c2_{s}'] 1698 r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t']) 1699 1700 1701 self.standardization = result 1702 1703 for session in self.sessions: 1704 self.sessions[session]['Np'] = 3 1705 for k in ['scrambling', 'slope', 'wg']: 1706 if self.sessions[session][f'{k}_drift']: 1707 self.sessions[session]['Np'] += 1 1708 1709 if consolidate: 1710 self.consolidate(tables = consolidate_tables, plots = consolidate_plots) 1711 return result 1712 1713 1714 elif method == 'indep_sessions': 1715 1716 if weighted_sessions: 1717 for session_group in weighted_sessions: 1718 X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x) 1719 X.Nominal_D4x = self.Nominal_D4x.copy() 1720 X.refresh() 1721 # This is only done to assign r['wD47raw'] for r in X: 1722 X.standardize(method = method, weighted_sessions = [], consolidate = False) 1723 self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}') 1724 else: 1725 self.msg('All weights set to 1 ‰') 1726 for r in self: 1727 r[f'wD{self._4x}raw'] = 1 1728 1729 for session in self.sessions: 1730 s = self.sessions[session] 1731 p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2'] 
1732 p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']] 1733 s['Np'] = sum(p_active) 1734 sdata = s['data'] 1735 1736 A = np.array([ 1737 [ 1738 self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'], 1739 r[f'd{self._4x}'] / r[f'wD{self._4x}raw'], 1740 1 / r[f'wD{self._4x}raw'], 1741 self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'], 1742 r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'], 1743 r['t'] / r[f'wD{self._4x}raw'] 1744 ] 1745 for r in sdata if r['Sample'] in self.anchors 1746 ])[:,p_active] # only keep columns for the active parameters 1747 Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors]) 1748 s['Na'] = Y.size 1749 CM = linalg.inv(A.T @ A) 1750 bf = (CM @ A.T @ Y).T[0,:] 1751 k = 0 1752 for n,a in zip(p_names, p_active): 1753 if a: 1754 s[n] = bf[k] 1755# self.msg(f'{n} = {bf[k]}') 1756 k += 1 1757 else: 1758 s[n] = 0. 1759# self.msg(f'{n} = 0.0') 1760 1761 for r in sdata : 1762 a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2'] 1763 r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t']) 1764 r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t']) 1765 1766 s['CM'] = np.zeros((6,6)) 1767 i = 0 1768 k_active = [j for j,a in enumerate(p_active) if a] 1769 for j,a in enumerate(p_active): 1770 if a: 1771 s['CM'][j,k_active] = CM[i,:] 1772 i += 1 1773 1774 if not weighted_sessions: 1775 w = self.rmswd()['rmswd'] 1776 for r in self: 1777 r[f'wD{self._4x}'] *= w 1778 r[f'wD{self._4x}raw'] *= w 1779 for session in self.sessions: 1780 self.sessions[session]['CM'] *= w**2 1781 1782 for session in self.sessions: 1783 s = self.sessions[session] 1784 s['SE_a'] = s['CM'][0,0]**.5 1785 s['SE_b'] = s['CM'][1,1]**.5 1786 s['SE_c'] = s['CM'][2,2]**.5 1787 s['SE_a2'] = s['CM'][3,3]**.5 1788 s['SE_b2'] = s['CM'][4,4]**.5 1789 s['SE_c2'] = s['CM'][5,5]**.5 1790 1791 if not weighted_sessions: 1792 self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions]) 1793 else: 1794 self.Nf = 0 1795 for sg in weighted_sessions: 1796 self.Nf += self.rmswd(sessions = sg)['Nf'] 1797 1798 self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf) 1799 1800 avgD4x = { 1801 sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample]) 1802 for sample in self.samples 1803 } 1804 chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self]) 1805 rD4x = (chi2/self.Nf)**.5 1806 self.repeatability[f'sigma_{self._4x}'] = rD4x 1807 1808 if consolidate: 1809 self.consolidate(tables = consolidate_tables, plots = consolidate_plots) 1810 1811 1812 def standardization_error(self, session, d4x, D4x, t = 0): 1813 ''' 1814 Compute standardization error for a given session and 1815 (δ47, Δ47) composition. 1816 ''' 1817 a = self.sessions[session]['a'] 1818 b = self.sessions[session]['b'] 1819 c = self.sessions[session]['c'] 1820 a2 = self.sessions[session]['a2'] 1821 b2 = self.sessions[session]['b2'] 1822 c2 = self.sessions[session]['c2'] 1823 CM = self.sessions[session]['CM'] 1824 1825 x, y = D4x, d4x 1826 z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t 1827# x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t) 1828 dxdy = -(b+b2*t) / (a+a2*t) 1829 dxdz = 1. / (a+a2*t) 1830 dxda = -x / (a+a2*t) 1831 dxdb = -y / (a+a2*t) 1832 dxdc = -1. 
/ (a+a2*t) 1833 dxda2 = -x * t / (a+a2*t) 1834 dxdb2 = -y * t / (a+a2*t) 1835 dxdc2 = -t / (a+a2*t) 1836 V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2]) 1837 sx = (V @ CM @ V.T) ** .5 1838 return sx 1839 1840 1841 @make_verbal 1842 def summary(self, 1843 dir = 'output', 1844 filename = None, 1845 save_to_file = True, 1846 print_out = True, 1847 ): 1848 ''' 1849 Print out and/or save to disk a summary of the standardization results. 1850 1851 **Parameters** 1852 1853 + `dir`: the directory in which to save the table 1854 + `filename`: the name of the csv file to write to 1855 + `save_to_file`: whether to save the table to disk 1856 + `print_out`: whether to print out the table 1857 ''' 1858 1859 out = [] 1860 out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]] 1861 out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]] 1862 out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]] 1863 out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]] 1864 out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]] 1865 out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]] 1866 out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]] 1867 out += [['Model degrees of freedom', f"{self.Nf}"]] 1868 out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]] 1869 out += [['Standardization method', self.standardization_method]] 1870 1871 if save_to_file: 1872 if not os.path.exists(dir): 1873 os.makedirs(dir) 1874 if filename is None: 1875 filename = f'D{self._4x}_summary.csv' 1876 with open(f'{dir}/{filename}', 'w') as fid: 1877 fid.write(make_csv(out)) 1878 if print_out: 1879 self.msg('\n' + pretty_table(out, header = 0)) 1880 1881 1882 @make_verbal 1883 def table_of_sessions(self, 1884 dir = 'output', 1885 filename = None, 1886 save_to_file = True, 1887 print_out = True, 1888 output = None, 1889 ): 1890 ''' 1891 Print out and/or save to disk a table of sessions.
1892 1893 **Parameters** 1894 1895 + `dir`: the directory in which to save the table 1896 + `filename`: the name to the csv file to write to 1897 + `save_to_file`: whether to save the table to disk 1898 + `print_out`: whether to print out the table 1899 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 1900 if set to `'raw'`: return a list of list of strings 1901 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 1902 ''' 1903 include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions]) 1904 include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions]) 1905 include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions]) 1906 1907 out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']] 1908 if include_a2: 1909 out[-1] += ['a2 ± SE'] 1910 if include_b2: 1911 out[-1] += ['b2 ± SE'] 1912 if include_c2: 1913 out[-1] += ['c2 ± SE'] 1914 for session in self.sessions: 1915 out += [[ 1916 session, 1917 f"{self.sessions[session]['Na']}", 1918 f"{self.sessions[session]['Nu']}", 1919 f"{self.sessions[session]['d13Cwg_VPDB']:.3f}", 1920 f"{self.sessions[session]['d18Owg_VSMOW']:.3f}", 1921 f"{self.sessions[session]['r_d13C_VPDB']:.4f}", 1922 f"{self.sessions[session]['r_d18O_VSMOW']:.4f}", 1923 f"{self.sessions[session][f'r_D{self._4x}']:.4f}", 1924 f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}", 1925 f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}", 1926 f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}", 1927 ]] 1928 if include_a2: 1929 if self.sessions[session]['scrambling_drift']: 1930 out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"] 1931 else: 1932 out[-1] += [''] 1933 if include_b2: 1934 if self.sessions[session]['slope_drift']: 1935 out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"] 1936 else: 1937 out[-1] += [''] 1938 if include_c2: 1939 if self.sessions[session]['wg_drift']: 1940 out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"] 1941 else: 1942 out[-1] += [''] 1943 1944 if save_to_file: 1945 if not os.path.exists(dir): 1946 os.makedirs(dir) 1947 if filename is None: 1948 filename = f'D{self._4x}_sessions.csv' 1949 with open(f'{dir}/{filename}', 'w') as fid: 1950 fid.write(make_csv(out)) 1951 if print_out: 1952 self.msg('\n' + pretty_table(out)) 1953 if output == 'raw': 1954 return out 1955 elif output == 'pretty': 1956 return pretty_table(out) 1957 1958 1959 @make_verbal 1960 def table_of_analyses( 1961 self, 1962 dir = 'output', 1963 filename = None, 1964 save_to_file = True, 1965 print_out = True, 1966 output = None, 1967 ): 1968 ''' 1969 Print out an/or save to disk a table of analyses. 
1970 1971 **Parameters** 1972 1973 + `dir`: the directory in which to save the table 1974 + `filename`: the name to the csv file to write to 1975 + `save_to_file`: whether to save the table to disk 1976 + `print_out`: whether to print out the table 1977 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 1978 if set to `'raw'`: return a list of list of strings 1979 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 1980 ''' 1981 1982 out = [['UID','Session','Sample']] 1983 extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}] 1984 for f in extra_fields: 1985 out[-1] += [f[0]] 1986 out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}'] 1987 for r in self: 1988 out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]] 1989 for f in extra_fields: 1990 out[-1] += [f"{r[f[0]]:{f[1]}}"] 1991 out[-1] += [ 1992 f"{r['d13Cwg_VPDB']:.3f}", 1993 f"{r['d18Owg_VSMOW']:.3f}", 1994 f"{r['d45']:.6f}", 1995 f"{r['d46']:.6f}", 1996 f"{r['d47']:.6f}", 1997 f"{r['d48']:.6f}", 1998 f"{r['d49']:.6f}", 1999 f"{r['d13C_VPDB']:.6f}", 2000 f"{r['d18O_VSMOW']:.6f}", 2001 f"{r['D47raw']:.6f}", 2002 f"{r['D48raw']:.6f}", 2003 f"{r['D49raw']:.6f}", 2004 f"{r[f'D{self._4x}']:.6f}" 2005 ] 2006 if save_to_file: 2007 if not os.path.exists(dir): 2008 os.makedirs(dir) 2009 if filename is None: 2010 filename = f'D{self._4x}_analyses.csv' 2011 with open(f'{dir}/{filename}', 'w') as fid: 2012 fid.write(make_csv(out)) 2013 if print_out: 2014 self.msg('\n' + pretty_table(out)) 2015 return out 2016 2017 @make_verbal 2018 def covar_table( 2019 self, 2020 correl = False, 2021 dir = 'output', 2022 filename = None, 2023 save_to_file = True, 2024 print_out = True, 2025 output = None, 2026 ): 2027 ''' 2028 Print out, save to disk and/or return the variance-covariance matrix of D4x 2029 for all unknown samples. 2030 2031 **Parameters** 2032 2033 + `dir`: the directory in which to save the csv 2034 + `filename`: the name of the csv file to write to 2035 + `save_to_file`: whether to save the csv 2036 + `print_out`: whether to print out the matrix 2037 + `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`); 2038 if set to `'raw'`: return a list of list of strings 2039 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 2040 ''' 2041 samples = sorted([u for u in self.unknowns]) 2042 out = [[''] + samples] 2043 for s1 in samples: 2044 out.append([s1]) 2045 for s2 in samples: 2046 if correl: 2047 out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}') 2048 else: 2049 out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}') 2050 2051 if save_to_file: 2052 if not os.path.exists(dir): 2053 os.makedirs(dir) 2054 if filename is None: 2055 if correl: 2056 filename = f'D{self._4x}_correl.csv' 2057 else: 2058 filename = f'D{self._4x}_covar.csv' 2059 with open(f'{dir}/{filename}', 'w') as fid: 2060 fid.write(make_csv(out)) 2061 if print_out: 2062 self.msg('\n'+pretty_table(out)) 2063 if output == 'raw': 2064 return out 2065 elif output == 'pretty': 2066 return pretty_table(out) 2067 2068 @make_verbal 2069 def table_of_samples( 2070 self, 2071 dir = 'output', 2072 filename = None, 2073 save_to_file = True, 2074 print_out = True, 2075 output = None, 2076 ): 2077 ''' 2078 Print out, save to disk and/or return a table of samples. 
2079 2080 **Parameters** 2081 2082 + `dir`: the directory in which to save the csv 2083 + `filename`: the name of the csv file to write to 2084 + `save_to_file`: whether to save the csv 2085 + `print_out`: whether to print out the table 2086 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 2087 if set to `'raw'`: return a list of list of strings 2088 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 2089 ''' 2090 2091 out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']] 2092 for sample in self.anchors: 2093 out += [[ 2094 f"{sample}", 2095 f"{self.samples[sample]['N']}", 2096 f"{self.samples[sample]['d13C_VPDB']:.2f}", 2097 f"{self.samples[sample]['d18O_VSMOW']:.2f}", 2098 f"{self.samples[sample][f'D{self._4x}']:.4f}",'','', 2099 f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', '' 2100 ]] 2101 for sample in self.unknowns: 2102 out += [[ 2103 f"{sample}", 2104 f"{self.samples[sample]['N']}", 2105 f"{self.samples[sample]['d13C_VPDB']:.2f}", 2106 f"{self.samples[sample]['d18O_VSMOW']:.2f}", 2107 f"{self.samples[sample][f'D{self._4x}']:.4f}", 2108 f"{self.samples[sample][f'SE_D{self._4x}']:.4f}", 2109 f"± {self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}", 2110 f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', 2111 f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else '' 2112 ]] 2113 if save_to_file: 2114 if not os.path.exists(dir): 2115 os.makedirs(dir) 2116 if filename is None: 2117 filename = f'D{self._4x}_samples.csv' 2118 with open(f'{dir}/{filename}', 'w') as fid: 2119 fid.write(make_csv(out)) 2120 if print_out: 2121 self.msg('\n'+pretty_table(out)) 2122 if output == 'raw': 2123 return out 2124 elif output == 'pretty': 2125 return pretty_table(out) 2126 2127 2128 def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100): 2129 ''' 2130 Generate session plots and save them to disk. 2131 2132 **Parameters** 2133 2134 + `dir`: the directory in which to save the plots 2135 + `figsize`: the width and height (in inches) of each plot 2136 + `filetype`: 'pdf' or 'png' 2137 + `dpi`: resolution for PNG output 2138 ''' 2139 if not os.path.exists(dir): 2140 os.makedirs(dir) 2141 2142 for session in self.sessions: 2143 sp = self.plot_single_session(session, xylimits = 'constant') 2144 ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {})) 2145 ppl.close(sp.fig) 2146 2147 2148 2149 @make_verbal 2150 def consolidate_samples(self): 2151 ''' 2152 Compile various statistics for each sample. 
2153 2154 For each anchor sample: 2155 2156 + `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x` 2157 + `SE_D47` or `SE_D48`: set to zero by definition 2158 2159 For each unknown sample: 2160 2161 + `D47` or `D48`: the standardized Δ4x value for this unknown 2162 + `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown 2163 2164 For each anchor and unknown: 2165 2166 + `N`: the total number of analyses of this sample 2167 + `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample 2168 + `d13C_VPDB`: the average δ13C_VPDB value for this sample 2169 + `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2) 2170 + `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal 2171 variance, indicating whether the Δ4x repeatability of this sample differs significantly from 2172 that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`. 2173 ''' 2174 D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']] 2175 for sample in self.samples: 2176 self.samples[sample]['N'] = len(self.samples[sample]['data']) 2177 if self.samples[sample]['N'] > 1: 2178 self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']]) 2179 2180 self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']]) 2181 self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']]) 2182 2183 D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']] 2184 if len(D4x_pop) > 2: 2185 self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1] 2186 2187 if self.standardization_method == 'pooled': 2188 for sample in self.anchors: 2189 self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample] 2190 self.samples[sample][f'SE_D{self._4x}'] = 0. 2191 for sample in self.unknowns: 2192 self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}'] 2193 try: 2194 self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5 2195 except ValueError: 2196 # when `sample` is constrained by self.standardize(constraints = {...}), 2197 # it is no longer listed in self.standardization.var_names. 2198 # Temporary fix: define SE as zero for now 2199 self.samples[sample][f'SE_D{self._4x}'] = 0. 2200 2201 elif self.standardization_method == 'indep_sessions': 2202 for sample in self.anchors: 2203 self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample] 2204 self.samples[sample][f'SE_D{self._4x}'] = 0. 2205 for sample in self.unknowns: 2206 self.msg(f'Consolidating sample {sample}') 2207 self.unknowns[sample][f'session_D{self._4x}'] = {} 2208 session_avg = [] 2209 for session in self.sessions: 2210 sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample] 2211 if sdata: 2212 self.msg(f'{sample} found in session {session}') 2213 avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata]) 2214 avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata]) 2215 # !!
TODO: sigma_s below does not account for temporal changes in standardization error 2216 sigma_s = self.standardization_error(session, avg_d4x, avg_D4x) 2217 sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5 2218 session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5]) 2219 self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1] 2220 self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg)) 2221 weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']} 2222 wsum = sum([weights[s] for s in weights]) 2223 for s in weights: 2224 self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum] 2225 2226 for r in self: 2227 r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'] 2228 2229 2230 2231 def consolidate_sessions(self): 2232 ''' 2233 Compute various statistics for each session. 2234 2235 + `Na`: Number of anchor analyses in the session 2236 + `Nu`: Number of unknown analyses in the session 2237 + `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session 2238 + `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session 2239 + `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session 2240 + `a`: scrambling factor 2241 + `b`: compositional slope 2242 + `c`: WG offset 2243 + `SE_a`: Model standard error of `a` 2244 + `SE_b`: Model standard error of `b` 2245 + `SE_c`: Model standard error of `c` 2246 + `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`) 2247 + `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`) 2248 + `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`) 2249 + `a2`: scrambling factor drift 2250 + `b2`: compositional slope drift 2251 + `c2`: WG offset drift 2252 + `Np`: Number of standardization parameters to fit 2253 + `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`) 2254 + `d13Cwg_VPDB`: δ13C_VPDB of WG 2255 + `d18Owg_VSMOW`: δ18O_VSMOW of WG 2256 ''' 2257 for session in self.sessions: 2258 if 'd13Cwg_VPDB' not in self.sessions[session]: 2259 self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB'] 2260 if 'd18Owg_VSMOW' not in self.sessions[session]: 2261 self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW'] 2262 self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]) 2263 self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]) 2264 2265 self.msg(f'Computing repeatabilities for session {session}') 2266 self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session]) 2267 self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session]) 2268 self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session]) 2269 2270 if self.standardization_method == 'pooled': 2271 for session in self.sessions: 2272 2273 # different (better?)
computation of D4x repeatability for each session: 2274 sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']] 2275 self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5 2276 2277 self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}'] 2278 i = self.standardization.var_names.index(f'a_{pf(session)}') 2279 self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5 2280 2281 self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}'] 2282 i = self.standardization.var_names.index(f'b_{pf(session)}') 2283 self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5 2284 2285 self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}'] 2286 i = self.standardization.var_names.index(f'c_{pf(session)}') 2287 self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5 2288 2289 self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}'] 2290 if self.sessions[session]['scrambling_drift']: 2291 i = self.standardization.var_names.index(f'a2_{pf(session)}') 2292 self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5 2293 else: 2294 self.sessions[session]['SE_a2'] = 0. 2295 2296 self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}'] 2297 if self.sessions[session]['slope_drift']: 2298 i = self.standardization.var_names.index(f'b2_{pf(session)}') 2299 self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5 2300 else: 2301 self.sessions[session]['SE_b2'] = 0. 2302 2303 self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}'] 2304 if self.sessions[session]['wg_drift']: 2305 i = self.standardization.var_names.index(f'c2_{pf(session)}') 2306 self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5 2307 else: 2308 self.sessions[session]['SE_c2'] = 0. 
2309 2310 i = self.standardization.var_names.index(f'a_{pf(session)}') 2311 j = self.standardization.var_names.index(f'b_{pf(session)}') 2312 k = self.standardization.var_names.index(f'c_{pf(session)}') 2313 CM = np.zeros((6,6)) 2314 CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]] 2315 try: 2316 i2 = self.standardization.var_names.index(f'a2_{pf(session)}') 2317 CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]] 2318 CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2] 2319 try: 2320 j2 = self.standardization.var_names.index(f'b2_{pf(session)}') 2321 CM[3,4] = self.standardization.covar[i2,j2] 2322 CM[4,3] = self.standardization.covar[j2,i2] 2323 except ValueError: 2324 pass 2325 try: 2326 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2327 CM[3,5] = self.standardization.covar[i2,k2] 2328 CM[5,3] = self.standardization.covar[k2,i2] 2329 except ValueError: 2330 pass 2331 except ValueError: 2332 pass 2333 try: 2334 j2 = self.standardization.var_names.index(f'b2_{pf(session)}') 2335 CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]] 2336 CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2] 2337 try: 2338 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2339 CM[4,5] = self.standardization.covar[j2,k2] 2340 CM[5,4] = self.standardization.covar[k2,j2] 2341 except ValueError: 2342 pass 2343 except ValueError: 2344 pass 2345 try: 2346 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2347 CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]] 2348 CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2] 2349 except ValueError: 2350 pass 2351 2352 self.sessions[session]['CM'] = CM 2353 2354 elif self.standardization_method == 'indep_sessions': 2355 pass # Not implemented yet 2356 2357 2358 @make_verbal 2359 def repeatabilities(self): 2360 ''' 2361 Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x 2362 (for all samples, for anchors, and for unknowns). 2363 ''' 2364 self.msg('Computing reproducibilities for all sessions') 2365 2366 self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors') 2367 self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors') 2368 self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors') 2369 self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns') 2370 self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples') 2371 2372 2373 @make_verbal 2374 def consolidate(self, tables = True, plots = True): 2375 ''' 2376 Collect information about samples, sessions and repeatabilities. 2377 ''' 2378 self.consolidate_samples() 2379 self.consolidate_sessions() 2380 self.repeatabilities() 2381 2382 if tables: 2383 self.summary() 2384 self.table_of_sessions() 2385 self.table_of_analyses() 2386 self.table_of_samples() 2387 2388 if plots: 2389 self.plot_sessions() 2390 2391 2392 @make_verbal 2393 def rmswd(self, 2394 samples = 'all samples', 2395 sessions = 'all sessions', 2396 ): 2397 ''' 2398 Compute the χ2, root mean squared weighted deviation 2399 (i.e. reduced χ2), and corresponding degrees of freedom of the 2400 Δ4x values for samples in `samples` and sessions in `sessions`. 2401 2402 Only used in `D4xdata.standardize()` with `method='indep_sessions'`. 
2403 ''' 2404 if samples == 'all samples': 2405 mysamples = [k for k in self.samples] 2406 elif samples == 'anchors': 2407 mysamples = [k for k in self.anchors] 2408 elif samples == 'unknowns': 2409 mysamples = [k for k in self.unknowns] 2410 else: 2411 mysamples = samples 2412 2413 if sessions == 'all sessions': 2414 sessions = [k for k in self.sessions] 2415 2416 chisq, Nf = 0, 0 2417 for sample in mysamples : 2418 G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2419 if len(G) > 1 : 2420 X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G]) 2421 Nf += (len(G) - 1) 2422 chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G]) 2423 r = (chisq / Nf)**.5 if Nf > 0 else 0 2424 self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.') 2425 return {'rmswd': r, 'chisq': chisq, 'Nf': Nf} 2426 2427 2428 @make_verbal 2429 def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'): 2430 ''' 2431 Compute the repeatability of `[r[key] for r in self]` 2432 ''' 2433 2434 if samples == 'all samples': 2435 mysamples = [k for k in self.samples] 2436 elif samples == 'anchors': 2437 mysamples = [k for k in self.anchors] 2438 elif samples == 'unknowns': 2439 mysamples = [k for k in self.unknowns] 2440 else: 2441 mysamples = samples 2442 2443 if sessions == 'all sessions': 2444 sessions = [k for k in self.sessions] 2445 2446 if key in ['D47', 'D48']: 2447 # Full disclosure: the definition of Nf is tricky/debatable 2448 G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions] 2449 chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum() 2450 Nf = len(G) 2451# print(f'len(G) = {Nf}') 2452 Nf -= len([s for s in mysamples if s in self.unknowns]) 2453# print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider') 2454 for session in sessions: 2455 Np = len([ 2456 _ for _ in self.standardization.params 2457 if ( 2458 self.standardization.params[_].expr is not None 2459 and ( 2460 (_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session)) 2461 or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session)) 2462 ) 2463 ) 2464 ]) 2465# print(f'session {session}: {Np} parameters to consider') 2466 Na = len({ 2467 r['Sample'] for r in self.sessions[session]['data'] 2468 if r['Sample'] in self.anchors and r['Sample'] in mysamples 2469 }) 2470# print(f'session {session}: {Na} different anchors in that session') 2471 Nf -= min(Np, Na) 2472# print(f'Nf = {Nf}') 2473 2474# for sample in mysamples : 2475# X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2476# if len(X) > 1 : 2477# chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ]) 2478# if sample in self.unknowns: 2479# Nf += len(X) - 1 2480# else: 2481# Nf += len(X) 2482# if samples in ['anchors', 'all samples']: 2483# Nf -= sum([self.sessions[s]['Np'] for s in sessions]) 2484 r = (chisq / Nf)**.5 if Nf > 0 else 0 2485 2486 else: # if key not in ['D47', 'D48'] 2487 chisq, Nf = 0, 0 2488 for sample in mysamples : 2489 X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2490 if len(X) > 1 : 2491 Nf += len(X) - 1 2492 chisq += np.sum([ (x-np.mean(X))**2 for x in X ]) 2493 r = (chisq / Nf)**.5 if Nf > 0 else 0 2494 2495 self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.') 2496 return r 2497 2498 def sample_average(self, samples, weights = 'equal', normalize = True): 2499 ''' 2500 Weighted average Δ4x value of a group of samples, 
accounting for covariance. 2501 2502 Returns the weighted average Δ4x value and associated SE 2503 of a group of samples. Weights are equal by default. If `normalize` is 2504 `True`, `weights` will be rescaled so that their sum equals 1. 2505 2506 **Examples** 2507 2508 ```python 2509 self.sample_average(['X','Y'], [1, 2]) 2510 ``` 2511 2512 returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3, 2513 where Δ4x(X) and Δ4x(Y) are the average Δ4x 2514 values of samples X and Y, respectively. 2515 2516 ```python 2517 self.sample_average(['X','Y'], [1, -1], normalize = False) 2518 ``` 2519 2520 returns the value and SE of the difference Δ4x(X) - Δ4x(Y). 2521 ''' 2522 if weights == 'equal': 2523 weights = [1/len(samples)] * len(samples) 2524 2525 if normalize: 2526 s = sum(weights) 2527 if s: 2528 weights = [w/s for w in weights] 2529 2530 try: 2531# indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples] 2532# C = self.standardization.covar[indices,:][:,indices] 2533 C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples]) 2534 X = [self.samples[sample][f'D{self._4x}'] for sample in samples] 2535 return correlated_sum(X, C, weights) 2536 except ValueError: 2537 return (0., 0.) 2538 2539 2540 def sample_D4x_covar(self, sample1, sample2 = None): 2541 ''' 2542 Covariance between Δ4x values of samples 2543 2544 Returns the error covariance between the average Δ4x values of two 2545 samples. If only `sample1` is specified, or if `sample1 == sample2`, 2546 returns the Δ4x variance for that sample. 2547 ''' 2548 if sample2 is None: 2549 sample2 = sample1 2550 if self.standardization_method == 'pooled': 2551 i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}') 2552 j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}') 2553 return self.standardization.covar[i, j] 2554 elif self.standardization_method == 'indep_sessions': 2555 if sample1 == sample2: 2556 return self.samples[sample1][f'SE_D{self._4x}']**2 2557 else: 2558 c = 0 2559 for session in self.sessions: 2560 sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1] 2561 sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2] 2562 if sdata1 and sdata2: 2563 a = self.sessions[session]['a'] 2564 # !! TODO: CM below does not account for temporal changes in standardization parameters 2565 CM = self.sessions[session]['CM'][:3,:3] 2566 avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1]) 2567 avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1]) 2568 avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2]) 2569 avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2]) 2570 c += ( 2571 self.unknowns[sample1][f'session_D{self._4x}'][session][2] 2572 * self.unknowns[sample2][f'session_D{self._4x}'][session][2] 2573 * np.array([[avg_D4x_1, avg_d4x_1, 1]]) 2574 @ CM 2575 @ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T 2576 ) / a**2 2577 return float(c) 2578 2579 def sample_D4x_correl(self, sample1, sample2 = None): 2580 ''' 2581 Correlation between Δ4x errors of samples 2582 2583 Returns the error correlation between the average Δ4x values of two samples. 2584 ''' 2585 if sample2 is None or sample2 == sample1: 2586 return 1.
2587 return ( 2588 self.sample_D4x_covar(sample1, sample2) 2589 / self.unknowns[sample1][f'SE_D{self._4x}'] 2590 / self.unknowns[sample2][f'SE_D{self._4x}'] 2591 ) 2592 2593 def plot_single_session(self, 2594 session, 2595 kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4), 2596 kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4), 2597 kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75), 2598 kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75), 2599 kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75), 2600 xylimits = 'free', # | 'constant' 2601 x_label = None, 2602 y_label = None, 2603 error_contour_interval = 'auto', 2604 fig = 'new', 2605 ): 2606 ''' 2607 Generate plot for a single session 2608 ''' 2609 if x_label is None: 2610 x_label = f'δ$_{{{self._4x}}}$ (‰)' 2611 if y_label is None: 2612 y_label = f'Δ$_{{{self._4x}}}$ (‰)' 2613 2614 out = _SessionPlot() 2615 anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]] 2616 unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]] 2617 anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors] 2618 anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors] 2619 unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns] 2620 unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns] 2621 anchor_avg = (np.array([ np.array([ 2622 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2623 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2624 ]) for sample in anchors]).T, 2625 np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T) 2626 unknown_avg = (np.array([ np.array([ 2627 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2628 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2629 ]) for sample in unknowns]).T, 2630 np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T) 2631 2632 2633 if fig == 'new': 2634 out.fig = ppl.figure(figsize = (6,6)) 2635 ppl.subplots_adjust(.1,.1,.9,.9) 2636 2637 out.anchor_analyses, = ppl.plot( 2638 anchors_d, 2639 anchors_D, 2640 **kw_plot_anchors) 2641 out.unknown_analyses, = ppl.plot( 2642 unknowns_d, 2643 unknowns_D, 2644 **kw_plot_unknowns) 2645 out.anchor_avg = ppl.plot( 2646 *anchor_avg, 2647 **kw_plot_anchor_avg) 2648 out.unknown_avg = ppl.plot( 2649 *unknown_avg, 2650 **kw_plot_unknown_avg) 2651 if xylimits == 'constant': 2652 x = [r[f'd{self._4x}'] for r in self] 2653 y = [r[f'D{self._4x}'] for r in self] 2654 x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y) 2655 w, h = x2-x1, y2-y1 2656 x1 -= w/20 2657 x2 += w/20 2658 y1 -= h/20 2659 y2 += h/20 2660 ppl.axis([x1, x2, y1, y2]) 2661 elif xylimits == 'free': 2662 x1, x2, y1, y2 = ppl.axis() 2663 else: 2664 x1, x2, y1, y2 = ppl.axis(xylimits) 2665 2666 if error_contour_interval != 'none': 2667 xi, yi = np.linspace(x1, x2), np.linspace(y1, y2) 2668 XI,YI = np.meshgrid(xi, yi) 2669 SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi]) 2670 if 
error_contour_interval == 'auto': 2671 rng = np.max(SI) - np.min(SI) 2672 if rng <= 0.01: 2673 cinterval = 0.001 2674 elif rng <= 0.03: 2675 cinterval = 0.004 2676 elif rng <= 0.1: 2677 cinterval = 0.01 2678 elif rng <= 0.3: 2679 cinterval = 0.03 2680 elif rng <= 1.: 2681 cinterval = 0.1 2682 else: 2683 cinterval = 0.5 2684 else: 2685 cinterval = error_contour_interval 2686 2687 cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval) 2688 out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error) 2689 out.clabel = ppl.clabel(out.contour) 2690 contour = (XI, YI, SI, cval, cinterval) 2691 2692 if fig == None: 2693 return { 2694 'anchors':anchors, 2695 'unknowns':unknowns, 2696 'anchors_d':anchors_d, 2697 'anchors_D':anchors_D, 2698 'unknowns_d':unknowns_d, 2699 'unknowns_D':unknowns_D, 2700 'anchor_avg':anchor_avg, 2701 'unknown_avg':unknown_avg, 2702 'contour':contour, 2703 } 2704 2705 ppl.xlabel(x_label) 2706 ppl.ylabel(y_label) 2707 ppl.title(session, weight = 'bold') 2708 ppl.grid(alpha = .2) 2709 out.ax = ppl.gca() 2710 2711 return out 2712 2713 def plot_residuals( 2714 self, 2715 kde = False, 2716 hist = False, 2717 binwidth = 2/3, 2718 dir = 'output', 2719 filename = None, 2720 highlight = [], 2721 colors = None, 2722 figsize = None, 2723 dpi = 100, 2724 yspan = None, 2725 ): 2726 ''' 2727 Plot residuals of each analysis as a function of time (actually, as a function of 2728 the order of analyses in the `D4xdata` object) 2729 2730 + `kde`: whether to add a kernel density estimate of residuals 2731 + `hist`: whether to add a histogram of residuals (incompatible with `kde`) 2732 + `histbins`: specify bin edges for the histogram 2733 + `dir`: the directory in which to save the plot 2734 + `highlight`: a list of samples to highlight 2735 + `colors`: a dict of `{<sample>: (r, g, b)}` for all samples 2736 + `figsize`: (width, height) of figure 2737 + `dpi`: resolution for PNG output 2738 + `yspan`: factor controlling the range of y values shown in plot 2739 (by default: `yspan = 1.5 if kde else 1.0`) 2740 ''' 2741 2742 from matplotlib import ticker 2743 2744 if yspan is None: 2745 if kde: 2746 yspan = 1.5 2747 else: 2748 yspan = 1.0 2749 2750 # Layout 2751 fig = ppl.figure(figsize = (8,4) if figsize is None else figsize) 2752 if hist or kde: 2753 ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72) 2754 ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15) 2755 else: 2756 ppl.subplots_adjust(.08,.05,.78,.8) 2757 ax1 = ppl.subplot(111) 2758 2759 # Colors 2760 N = len(self.anchors) 2761 if colors is None: 2762 if len(highlight) > 0: 2763 Nh = len(highlight) 2764 if Nh == 1: 2765 colors = {highlight[0]: (0,0,0)} 2766 elif Nh == 3: 2767 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])} 2768 elif Nh == 4: 2769 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2770 else: 2771 colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)} 2772 else: 2773 if N == 3: 2774 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])} 2775 elif N == 4: 2776 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2777 else: 2778 colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)} 2779 2780 ppl.sca(ax1) 2781 2782 ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75) 2783 2784 ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$')) 2785 2786 session = 
self[0]['Session'] 2787 x1 = 0 2788# ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self]) 2789 x_sessions = {} 2790 one_or_more_singlets = False 2791 one_or_more_multiplets = False 2792 multiplets = set() 2793 for k,r in enumerate(self): 2794 if r['Session'] != session: 2795 x2 = k-1 2796 x_sessions[session] = (x1+x2)/2 2797 ppl.axvline(k - 0.5, color = 'k', lw = .5) 2798 session = r['Session'] 2799 x1 = k 2800 singlet = len(self.samples[r['Sample']]['data']) == 1 2801 if not singlet: 2802 multiplets.add(r['Sample']) 2803 if r['Sample'] in self.unknowns: 2804 if singlet: 2805 one_or_more_singlets = True 2806 else: 2807 one_or_more_multiplets = True 2808 kw = dict( 2809 marker = 'x' if singlet else '+', 2810 ms = 4 if singlet else 5, 2811 ls = 'None', 2812 mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0), 2813 mew = 1, 2814 alpha = 0.2 if singlet else 1, 2815 ) 2816 if highlight and r['Sample'] not in highlight: 2817 kw['alpha'] = 0.2 2818 ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw) 2819 x2 = k 2820 x_sessions[session] = (x1+x2)/2 2821 2822 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1) 2823 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1) 2824 if not (hist or kde): 2825 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center') 2826 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f" 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center') 2827 2828 xmin, xmax, ymin, ymax = ppl.axis() 2829 if yspan != 1: 2830 ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2 2831 for s in x_sessions: 2832 ppl.text( 2833 x_sessions[s], 2834 ymax +1, 2835 s, 2836 va = 'bottom', 2837 **( 2838 dict(ha = 'center') 2839 if len(self.sessions[s]['data']) > (0.15 * len(self)) 2840 else dict(ha = 'left', rotation = 45) 2841 ) 2842 ) 2843 2844 if hist or kde: 2845 ppl.sca(ax2) 2846 2847 for s in colors: 2848 kw['marker'] = '+' 2849 kw['ms'] = 5 2850 kw['mec'] = colors[s] 2851 kw['label'] = s 2852 kw['alpha'] = 1 2853 ppl.plot([], [], **kw) 2854 2855 kw['mec'] = (0,0,0) 2856 2857 if one_or_more_singlets: 2858 kw['marker'] = 'x' 2859 kw['ms'] = 4 2860 kw['alpha'] = .2 2861 kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other' 2862 ppl.plot([], [], **kw) 2863 2864 if one_or_more_multiplets: 2865 kw['marker'] = '+' 2866 kw['ms'] = 4 2867 kw['alpha'] = 1 2868 kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other' 2869 ppl.plot([], [], **kw) 2870 2871 if hist or kde: 2872 leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9) 2873 else: 2874 leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5) 2875 leg.set_zorder(-1000) 2876 2877 ppl.sca(ax1) 2878 2879 ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)') 2880 ppl.xticks([]) 2881 ppl.axis([-1, len(self), None, None]) 2882 2883 if hist or kde: 2884 ppl.sca(ax2) 2885 X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors]) 2886 2887 if kde: 2888 from scipy.stats import 
gaussian_kde 2889 yi = np.linspace(ymin, ymax, 201) 2890 xi = gaussian_kde(X).evaluate(yi) 2891 ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1)) 2892# ppl.plot(xi, yi, 'k-', lw = 1) 2893 elif hist: 2894 ppl.hist( 2895 X, 2896 orientation = 'horizontal', 2897 histtype = 'stepfilled', 2898 ec = [.4]*3, 2899 fc = [.25]*3, 2900 alpha = .25, 2901 bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)), 2902 ) 2903 ppl.text(0, 0, 2904 f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", 2905 size = 7.5, 2906 alpha = 1, 2907 va = 'center', 2908 ha = 'left', 2909 ) 2910 2911 ppl.axis([0, None, ymin, ymax]) 2912 ppl.xticks([]) 2913 ppl.yticks([]) 2914# ax2.spines['left'].set_visible(False) 2915 ax2.spines['right'].set_visible(False) 2916 ax2.spines['top'].set_visible(False) 2917 ax2.spines['bottom'].set_visible(False) 2918 2919 ax1.axis([None, None, ymin, ymax]) 2920 2921 if not os.path.exists(dir): 2922 os.makedirs(dir) 2923 if filename is None: 2924 return fig 2925 elif filename == '': 2926 filename = f'D{self._4x}_residuals.pdf' 2927 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 2928 ppl.close(fig) 2929 2930 2931 def simulate(self, *args, **kwargs): 2932 ''' 2933 Legacy function with warning message pointing to `virtual_data()` 2934 ''' 2935 raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()') 2936 2937 def plot_anchor_residuals( 2938 self, 2939 dir = 'output', 2940 filename = '', 2941 figsize = None, 2942 subplots_adjust = (0.05, 0.1, 0.95, 0.98, .25, .25), 2943 dpi = 100, 2944 colors = None, 2945 ): 2946 ''' 2947 Plot a summary of the residuals for all anchors, intended to help detect systematic bias. 2948 2949 **Parameters** 2950 2951 + `dir`: the directory in which to save the plot 2952 + `filename`: the file name to save to. 
2953 + `dpi`: resolution for PNG output 2954 + `figsize`: (width, height) of figure 2955 + `subplots_adjust`: passed to the figure 2956 + `dpi`: resolution for PNG output 2957 + `colors`: a dict of `{<sample>: (r, g, b)}` for all samples 2958 ''' 2959 2960 # Colors 2961 N = len(self.anchors) 2962 if colors is None: 2963 if N == 3: 2964 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])} 2965 elif N == 4: 2966 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2967 else: 2968 colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)} 2969 2970 if figsize is None: 2971 figsize = (4, 1.5*N+1) 2972 fig = ppl.figure(figsize = figsize) 2973 ppl.subplots_adjust(*subplots_adjust) 2974 axs = {} 2975 X = np.array([r[f'D{self._4x}_residual'] for a in self.anchors for r in self.anchors[a]['data']])*1000 2976 sigma = self.repeatability['r_D47a'] * 1000 2977 D = max(np.abs(X)) 2978 2979 for k,a in enumerate(self.anchors): 2980 color = colors[a] 2981 axs[a] = ppl.subplot(N, 1, 1+k) 2982 axs[a].text( 2983 0.02, 1-0.05, a, 2984 va = 'top', 2985 ha = 'left', 2986 weight = 'bold', 2987 size = 9, 2988 color = [_*0.75 for _ in color], 2989 transform = axs[a].transAxes, 2990 ) 2991 X = np.array([r[f'D{self._4x}_residual'] for r in self.anchors[a]['data']])*1000 2992 axs[a].axvline(0, lw = 0.5, color = color) 2993 axs[a].plot(X, X*0, 'o', mew = 0.7, mec = (*color,.5), mfc = (*color, 0), ms = 7, clip_on = False) 2994 2995 xi = np.linspace(-3*D, 3*D, 601) 2996 yi = np.array([np.exp(-0.5 * ((xi - x)/sigma)**2) for x in X]).sum(0) 2997 ppl.fill_between(xi, yi, yi*0, fc = (*color, .15), lw = 1, ec = color) 2998 2999 axs[a].errorbar( 3000 X.mean(), yi.max()*.2, None, 1.96*sigma/len(X)**0.5, 3001 ecolor = color, 3002 marker = 's', 3003 ls = 'None', 3004 mec = color, 3005 mew = 1, 3006 mfc = 'w', 3007 ms = 8, 3008 elinewidth = 1, 3009 capsize = 4, 3010 capthick = 1, 3011 ) 3012 3013 axs[a].axis([xi[0], xi[-1], 0, yi.max()*1.05]) 3014 ppl.yticks([]) 3015 3016 ppl.xlabel(f'$Δ_{{{self._4x}}}$ residuals (ppm)') 3017 3018 if not os.path.exists(dir): 3019 os.makedirs(dir) 3020 if filename is None: 3021 return fig 3022 elif filename == '': 3023 filename = f'D{self._4x}_anchor_residuals.pdf' 3024 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 3025 ppl.close(fig) 3026 3027 3028 def plot_distribution_of_analyses( 3029 self, 3030 dir = 'output', 3031 filename = None, 3032 vs_time = False, 3033 figsize = (6,4), 3034 subplots_adjust = (0.02, 0.13, 0.85, 0.8), 3035 output = None, 3036 dpi = 100, 3037 ): 3038 ''' 3039 Plot temporal distribution of all analyses in the data set. 3040 3041 **Parameters** 3042 3043 + `dir`: the directory in which to save the plot 3044 + `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially. 
3045 + `dpi`: resolution for PNG output 3046 + `figsize`: (width, height) of figure 3047 + `dpi`: resolution for PNG output 3048 ''' 3049 3050 asamples = [s for s in self.anchors] 3051 usamples = [s for s in self.unknowns] 3052 if output is None or output == 'fig': 3053 fig = ppl.figure(figsize = figsize) 3054 ppl.subplots_adjust(*subplots_adjust) 3055 Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)]) 3056 Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)]) 3057 Xmax += (Xmax-Xmin)/40 3058 Xmin -= (Xmax-Xmin)/41 3059 for k, s in enumerate(asamples + usamples): 3060 if vs_time: 3061 X = [r['TimeTag'] for r in self if r['Sample'] == s] 3062 else: 3063 X = [x for x,r in enumerate(self) if r['Sample'] == s] 3064 Y = [-k for x in X] 3065 ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75) 3066 ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25) 3067 ppl.text(Xmax, -k, f' {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r') 3068 ppl.axis([Xmin, Xmax, -k-1, 1]) 3069 ppl.xlabel('\ntime') 3070 ppl.gca().annotate('', 3071 xy = (0.6, -0.02), 3072 xycoords = 'axes fraction', 3073 xytext = (.4, -0.02), 3074 arrowprops = dict(arrowstyle = "->", color = 'k'), 3075 ) 3076 3077 3078 x2 = -1 3079 for session in self.sessions: 3080 x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session]) 3081 if vs_time: 3082 ppl.axvline(x1, color = 'k', lw = .75) 3083 if x2 > -1: 3084 if not vs_time: 3085 ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5) 3086 x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session]) 3087# from xlrd import xldate_as_datetime 3088# print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0)) 3089 if vs_time: 3090 ppl.axvline(x2, color = 'k', lw = .75) 3091 ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15) 3092 ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8) 3093 3094 ppl.xticks([]) 3095 ppl.yticks([]) 3096 3097 if output is None: 3098 if not os.path.exists(dir): 3099 os.makedirs(dir) 3100 if filename == None: 3101 filename = f'D{self._4x}_distribution_of_analyses.pdf' 3102 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 3103 ppl.close(fig) 3104 elif output == 'ax': 3105 return ppl.gca() 3106 elif output == 'fig': 3107 return fig 3108 3109 3110 def plot_bulk_compositions( 3111 self, 3112 samples = None, 3113 dir = 'output/bulk_compositions', 3114 figsize = (6,6), 3115 subplots_adjust = (0.15, 0.12, 0.95, 0.92), 3116 show = False, 3117 sample_color = (0,.5,1), 3118 analysis_color = (.7,.7,.7), 3119 labeldist = 0.3, 3120 radius = 0.05, 3121 ): 3122 ''' 3123 Plot δ13C_VBDP vs δ18O_VSMOW (of CO2) for all analyses. 3124 3125 By default, creates a directory `./output/bulk_compositions` where plots for 3126 each sample are saved. Another plot named `__all__.pdf` shows all analyses together. 3127 3128 3129 **Parameters** 3130 3131 + `samples`: Only these samples are processed (by default: all samples). 3132 + `dir`: where to save the plots 3133 + `figsize`: (width, height) of figure 3134 + `subplots_adjust`: passed to `subplots_adjust()` 3135 + `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples, 3136 allowing for interactive visualization/exploration in (δ13C, δ18O) space. 
3137 + `sample_color`: color used for sample markers/labels 3138 + `analysis_color`: color used for replicate (individual analysis) markers/labels 3139 + `labeldist`: distance (in inches) from replicate markers to replicate labels 3140 + `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`. 3141 ''' 3142 3143 from matplotlib.patches import Ellipse 3144 3145 if samples is None: 3146 samples = [_ for _ in self.samples] 3147 3148 saved = {} 3149 3150 for s in samples: 3151 3152 fig = ppl.figure(figsize = figsize) 3153 fig.subplots_adjust(*subplots_adjust) 3154 ax = ppl.subplot(111) 3155 ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)') 3156 ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)') 3157 ppl.title(s) 3158 3159 3160 XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']]) 3161 UID = [_['UID'] for _ in self.samples[s]['data']] 3162 XY0 = XY.mean(0) 3163 3164 for xy in XY: 3165 ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color) 3166 3167 ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color) 3168 ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color) 3169 ppl.text(*XY0, f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold') 3170 saved[s] = [XY, XY0] 3171 3172 x1, x2, y1, y2 = ppl.axis() 3173 x0, dx = (x1+x2)/2, (x2-x1)/2 3174 y0, dy = (y1+y2)/2, (y2-y1)/2 3175 dx, dy = [max(max(dx, dy), radius)]*2 3176 3177 ppl.axis([ 3178 x0 - 1.2*dx, 3179 x0 + 1.2*dx, 3180 y0 - 1.2*dy, 3181 y0 + 1.2*dy, 3182 ]) 3183 3184 XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0)) 3185 3186 for xy, uid in zip(XY, UID): 3187 3188 xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy)) 3189 vector_in_display_space = xy_in_display_space - XY0_in_display_space 3190 3191 if (vector_in_display_space**2).sum() > 0: 3192 3193 unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5 3194 label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist 3195 label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space 3196 label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space)) 3197 3198 ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color) 3199 3200 else: 3201 3202 ppl.text(*xy, f'{uid} ', va = 'center', ha = 'right', color = analysis_color) 3203 3204 if radius: 3205 ax.add_artist(Ellipse( 3206 xy = XY0, 3207 width = radius*2, 3208 height = radius*2, 3209 ls = (0, (2,2)), 3210 lw = .7, 3211 ec = analysis_color, 3212 fc = 'None', 3213 )) 3214 ppl.text( 3215 XY0[0], 3216 XY0[1]-radius, 3217 f'\n± {radius*1e3:.0f} ppm', 3218 color = analysis_color, 3219 va = 'top', 3220 ha = 'center', 3221 linespacing = 0.4, 3222 size = 8, 3223 ) 3224 3225 if not os.path.exists(dir): 3226 os.makedirs(dir) 3227 fig.savefig(f'{dir}/{s}.pdf') 3228 ppl.close(fig) 3229 3230 fig = ppl.figure(figsize = figsize) 3231 fig.subplots_adjust(*subplots_adjust) 3232 ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)') 3233 ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)') 3234 3235 for s in saved: 3236 for xy in saved[s][0]: 3237 ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color) 3238 ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color) 3239 ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color) 3240 ppl.text(*saved[s][1], f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
3241 3242 x1, x2, y1, y2 = ppl.axis() 3243 ppl.axis([ 3244 x1 - (x2-x1)/10, 3245 x2 + (x2-x1)/10, 3246 y1 - (y2-y1)/10, 3247 y2 + (y2-y1)/10, 3248 ]) 3249 3250 3251 if not os.path.exists(dir): 3252 os.makedirs(dir) 3253 fig.savefig(f'{dir}/__all__.pdf') 3254 if show: 3255 ppl.show() 3256 ppl.close(fig) 3257 3258 3259 def _save_D4x_correl( 3260 self, 3261 samples = None, 3262 dir = 'output', 3263 filename = None, 3264 D4x_precision = 4, 3265 correl_precision = 4, 3266 save_to_file = True, 3267 ): 3268 ''' 3269 Save D4x values along with their SE and correlation matrix. 3270 3271 **Parameters** 3272 3273 + `samples`: Only these samples are output (by default: all unknown samples). 3274 + `dir`: the directory in which to save the file (by default: `output`) 3275 + `filename`: the name of the csv file to write to (by default: `D4x_correl.csv`) 3276 + `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4) 3277 + `correl_precision`: the precision to use when writing correlation factor values (by default: 4) 3278 + `save_to_file`: whether to write the output to a file (by default: `True`). If `False`, 3279 returns the output as a string 3280 ''' 3281 if samples is None: 3282 samples = sorted([s for s in self.unknowns]) 3283 3284 out = [['Sample']] + [[s] for s in samples] 3285 out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl'] 3286 for k,s in enumerate(samples): 3287 out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.{D4x_precision}f}', f'{self.samples[s][f"SE_D{self._4x}"]:.{D4x_precision}f}'] 3288 for s2 in samples: 3289 out[k+1] += [f'{self.sample_D4x_correl(s,s2):.{correl_precision}f}'] 3290 3291 if save_to_file: 3292 if not os.path.exists(dir): 3293 os.makedirs(dir) 3294 if filename is None: 3295 filename = f'D{self._4x}_correl.csv' 3296 with open(f'{dir}/{filename}', 'w') as fid: 3297 fid.write(make_csv(out)) 3298 else: 3299 return make_csv(out)
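As an illustration, here is a minimal sketch of calling the `_save_D4x_correl()` method documented above; it assumes `mydata` is a `D47data` object that has already been read, crunched, and standardized (note the leading underscore: this method is semi-private and its interface may change):

# sketch only: `mydata` is assumed to be a fully standardized D47data object
mydata._save_D4x_correl()  # writes the Δ47 values, SE and correlation matrix to 'output/D47_correl.csv'
txt = mydata._save_D4x_correl(save_to_file = False)  # or return the same table as a csv string
print(txt)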
Store and process data for a large set of Δ47 and/or Δ48 analyses, usually comprising more than one analytical session.
def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
    '''
    **Parameters**

    + `l`: a list of dictionaries, with each dictionary including at least the keys
    `Sample`, `d45`, `d46`, and `d47` or `d48`.
    + `mass`: `'47'` or `'48'`
    + `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
    + `session`: define session name for analyses without a `Session` key
    + `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.

    Returns a `D4xdata` object derived from `list`.
    '''
    self._4x = mass
    self.verbose = verbose
    self.prefix = 'D4xdata'
    self.logfile = logfile
    list.__init__(self, l)
    self.Nf = None
    self.repeatability = {}
    self.refresh(session = session)
Parameters
- `l`: a list of dictionaries, with each dictionary including at least the keys `Sample`, `d45`, `d46`, and `d47` or `d48`.
- `mass`: `'47'` or `'48'`
- `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
- `session`: define session name for analyses without a `Session` key
- `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.
Returns a D4xdata object derived from list.
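For instance, here is a minimal sketch (with made-up numbers) of building a `D4xdata` object directly from such a list of dictionaries, rather than reading a csv file from disk:

import D47crunch

# hypothetical analyses; only the Sample, d45, d46 and d47 keys are strictly required here:
rawdata = [
    dict(UID = 'X01', Session = 'S1', Sample = 'ETH-1', d45 = 5.8, d46 = 11.6, d47 = 16.9),
    dict(UID = 'X02', Session = 'S1', Sample = 'FOO-1', d45 = 6.2, d46 = 11.5, d47 = 17.3),
]

mydata = D47crunch.D4xdata(rawdata, mass = '47', verbose = True)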
Absolute (18O/16O) ratio of VSMOW. By default equal to 0.0020052 (Baertschi, 1976).
Mass-dependent exponent for triple oxygen isotopes. By default equal to 0.528 (Barkan & Luz, 2005)
Absolute (17O/16O) ratio of VSMOW. By default equal to 0.00038475 (Assonov & Brenninkmeijer, 2003, rescaled to `R13_VPDB`).
Absolute (18O/16O) ratio of VPDB. By definition equal to `R18_VSMOW * 1.03092`.
Absolute (17O/16O) ratio of VPDB. By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
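These ratios are exposed as attributes of `D4xdata` objects, so the two defining relations above can be checked directly; a quick sketch:

import D47crunch

d = D47crunch.D47data()
assert abs(d.R18_VPDB - d.R18_VSMOW * 1.03092) < 1e-12
assert abs(d.R17_VPDB - d.R17_VSMOW * 1.03092 ** d.LAMBDA_17) < 1e-12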
After the Δ4x standardization step, each sample is tested to assess whether the Δ4x variance within all analyses for that sample differs significantly from that observed for a given reference sample (using Levene's test, which yields a p-value corresponding to the null hypothesis that the underlying variances are equal). `LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which sample should be used as a reference for this test; a short sketch of the underlying test follows below.
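For reference, here is a self-contained sketch (with made-up replicate values) of the underlying test; it mirrors the median-centered call to `scipy.stats.levene()` made in `D4xdata.consolidate_samples()`:

from scipy.stats import levene

ref_pop = [0.612, 0.598, 0.605, 0.601]  # hypothetical Δ47 replicates of the reference sample
smp_pop = [0.712, 0.745, 0.689]         # hypothetical Δ47 replicates of another sample

p_Levene = levene(ref_pop, smp_pop, center = 'median')[1]
print(f'p_Levene = {p_Levene:.3f}')  # a small p-value suggests significantly different variances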
Specifies the 18O/16O fractionation factor generally applicable to acid reactions in the dataset. Currently used by `D4xdata.wg()`, `D4xdata.standardize_d13C()`, and `D4xdata.standardize_d18O()`. By default equal to 1.008129 (calcite reacted at 90 °C, Kim et al., 2007).
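A laboratory using different acid reaction conditions may override this attribute before processing its data; in the sketch below, the assigned value is just the 90 °C default, to be replaced with the fractionation factor appropriate to one's own acid and temperature:

import D47crunch

mydata = D47crunch.D47data()
mydata.ALPHA_18O_ACID_REACTION = 1.008129  # placeholder: substitute your own value here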
`Nominal_d13C_VPDB`: Nominal δ13C_VPDB values assigned to carbonate standards, used by `D4xdata.standardize_d13C()`. By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after Bernasconi et al. (2018).
`Nominal_d18O_VPDB`: Nominal δ18O_VPDB values assigned to carbonate standards, used by `D4xdata.standardize_d18O()`. By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after Bernasconi et al. (2018).
`d13C_STANDARDIZATION_METHOD`: Method by which to standardize δ13C values:

- `none`: do not apply any δ13C standardization.
- `'1pt'`: within each session, offset all initial δ13C values so as to minimize the difference between final δ13C_VPDB values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
- `'2pt'`: within each session, apply an affine transformation to all δ13C values so as to minimize the difference between final δ13C_VPDB values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
`d18O_STANDARDIZATION_METHOD`: Method by which to standardize δ18O values (see the configuration sketch after this list):

- `none`: do not apply any δ18O standardization.
- `'1pt'`: within each session, offset all initial δ18O values so as to minimize the difference between final δ18O_VPDB values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
- `'2pt'`: within each session, apply an affine transformation to all δ18O values so as to minimize the difference between final δ18O_VPDB values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
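The attributes above may be overridden before crunching and standardizing. A minimal configuration sketch (the acid fractionation value is an illustrative placeholder, not a recommendation):

```py
import D47crunch

mydata = D47crunch.D47data()
mydata.read('rawdata.csv')

# use ETH-1 rather than ETH-3 as the reference sample for Levene's test:
mydata.LEVENE_REF_SAMPLE = 'ETH-1'

# placeholder acid fractionation factor; use the value appropriate to
# your own reaction temperature and mineralogy:
mydata.ALPHA_18O_ACID_REACTION = 1.0086

# affine (two-point) standardization of δ13C and δ18O within each session:
mydata.d13C_STANDARDIZATION_METHOD = '2pt'
mydata.d18O_STANDARDIZATION_METHOD = '2pt'

mydata.wg()
mydata.crunch()
```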
```py
	def make_verbal(oldfun):
		'''
		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
		'''
		@wraps(oldfun)
		def newfun(*args, verbose = '', **kwargs):
			myself = args[0]
			oldprefix = myself.prefix
			myself.prefix = oldfun.__name__
			if verbose != '':
				oldverbose = myself.verbose
				myself.verbose = verbose
			out = oldfun(*args, **kwargs)
			myself.prefix = oldprefix
			if verbose != '':
				myself.verbose = oldverbose
			return out
		return newfun
```
Decorator: allow temporarily changing self.prefix and overriding self.verbose.
```py
	def msg(self, txt):
		'''
		Log a message to `self.logfile`, and print it out if `verbose = True`
		'''
		self.log(txt)
		if self.verbose:
			print(f'{f"[{self.prefix}]":<16} {txt}')
```
Log a message to self.logfile, and print it out if verbose = True
```py
	def vmsg(self, txt):
		'''
		Log a message to `self.logfile` and print it out
		'''
		self.log(txt)
		print(txt)
```
Log a message to self.logfile and print it out
```py
	def log(self, *txts):
		'''
		Log a message to `self.logfile`
		'''
		if self.logfile:
			with open(self.logfile, 'a') as fid:
				for txt in txts:
					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
```
Log a message to self.logfile
```py
	def refresh(self, session = 'mySession'):
		'''
		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
		'''
		self.fill_in_missing_info(session = session)
		self.refresh_sessions()
		self.refresh_samples()
```
Update self.sessions, self.samples, self.anchors, and self.unknowns.
```py
	def refresh_sessions(self):
		'''
		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
		to `False` for all sessions.
		'''
		self.sessions = {
			s: {'data': [r for r in self if r['Session'] == s]}
			for s in sorted({r['Session'] for r in self})
			}
		for s in self.sessions:
			self.sessions[s]['scrambling_drift'] = False
			self.sessions[s]['slope_drift'] = False
			self.sessions[s]['wg_drift'] = False
			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
```
Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift` to `False` for all sessions.
```py
	def refresh_samples(self):
		'''
		Define `self.samples`, `self.anchors`, and `self.unknowns`.
		'''
		self.samples = {
			s: {'data': [r for r in self if r['Sample'] == s]}
			for s in sorted({r['Sample'] for r in self})
			}
		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
```
Define self.samples, self.anchors, and self.unknowns.
```py
	def read(self, filename, sep = '', session = ''):
		'''
		Read file in csv format to load data into a `D47data` object.

		In the csv file, spaces before and after field separators (`','` by default)
		are optional. Each line corresponds to a single analysis.

		The required fields are:

		+ `UID`: a unique identifier
		+ `Session`: an identifier for the analytical session
		+ `Sample`: a sample identifier
		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas
		`d47`, `d48` and `d49` are optional, and set to NaN by default.

		**Parameters**

		+ `filename`: the path of the file to read
		+ `sep`: csv separator delimiting the fields
		+ `session`: set `Session` field to this string for all analyses
		'''
		with open(filename) as fid:
			self.input(fid.read(), sep = sep, session = session)
```
Read file in csv format to load data into a D47data object.
In the csv file, spaces before and after field separators (`','` by default) are optional. Each line corresponds to a single analysis.

The required fields are:

- `UID`: a unique identifier
- `Session`: an identifier for the analytical session
- `Sample`: a sample identifier
- `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` and `d49` are optional, and set to NaN by default.

Parameters

- `filename`: the path of the file to read
- `sep`: csv separator delimiting the fields
- `session`: set `Session` field to this string for all analyses
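For example, to load the tutorial's csv file while assigning all analyses to one explicitly named session (the session name is illustrative):

```py
import D47crunch

mydata = D47crunch.D47data()
mydata.read('rawdata.csv', session = 'Session01')
```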
```py
	def input(self, txt, sep = '', session = ''):
		'''
		Read `txt` string in csv format to load analysis data into a `D47data` object.

		In the csv string, spaces before and after field separators (`','` by default)
		are optional. Each line corresponds to a single analysis.

		The required fields are:

		+ `UID`: a unique identifier
		+ `Session`: an identifier for the analytical session
		+ `Sample`: a sample identifier
		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas
		`d47`, `d48` and `d49` are optional, and set to NaN by default.

		**Parameters**

		+ `txt`: the csv string to read
		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
		whichever appears most often in `txt`.
		+ `session`: set `Session` field to this string for all analyses
		'''
		if sep == '':
			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]

		if session != '':
			for r in data:
				r['Session'] = session

		self += data
		self.refresh()
```
Read txt string in csv format to load analysis data into a D47data object.
In the csv string, spaces before and after field separators (`','` by default) are optional. Each line corresponds to a single analysis.

The required fields are:

- `UID`: a unique identifier
- `Session`: an identifier for the analytical session
- `Sample`: a sample identifier
- `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` and `d49` are optional, and set to NaN by default.

Parameters

- `txt`: the csv string to read
- `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, whichever appears most often in `txt`.
- `session`: set `Session` field to this string for all analyses
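Because `input()` takes a string rather than a file path, it is convenient for quick tests. A minimal sketch (placeholder values):

```py
import D47crunch

mydata = D47crunch.D47data()
mydata.input('''UID, Session, Sample, d45, d46, d47
A01, S1, ETH-1, 5.795, 11.628, 16.894
A02, S1, ETH-2, -6.059, -4.817, -11.635''')
```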
```py
	@make_verbal
	def wg(self,
		samples = None,
		session_groups = None,
		):
		'''
		Compute bulk composition of the working gas for each session based (by default)
		on the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
		`self.Nominal_d18O_VPDB`.

		**Parameters**

		+ `samples`: A list of samples specifying the subset of samples (defined in both
		`self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`) which will be considered
		when computing the working gas. By default, use all samples defined both in
		`self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.
		+ `session_groups`: a list of lists of sessions
		(e.g., `[['session1', 'session2'], ['session3', 'session4', 'session5']]`)
		specifying which session groups, if any, have the exact same WG composition.
		If set to `'all'`, force all sessions to have the same WG composition (use with
		caution and on short time scales, since the WG may drift slowly over long time scales).
		'''

		self.msg('Computing WG composition:')

		a18_acid = self.ALPHA_18O_ACID_REACTION

		if samples is None:
			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]
		if session_groups is None:
			session_groups = [[s] for s in self.sessions]
		elif session_groups == 'all':
			session_groups = [[s for s in self.sessions]]

		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
		R45R46_standards = {}
		for sample in samples:
			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid

			C12_s = 1 / (1 + R13_s)
			C13_s = R13_s / (1 + R13_s)
			C16_s = 1 / (1 + R17_s + R18_s)
			C17_s = R17_s / (1 + R17_s + R18_s)
			C18_s = R18_s / (1 + R17_s + R18_s)

			C626_s = C12_s * C16_s ** 2
			C627_s = 2 * C12_s * C16_s * C17_s
			C628_s = 2 * C12_s * C16_s * C18_s
			C636_s = C13_s * C16_s ** 2
			C637_s = 2 * C13_s * C16_s * C17_s
			C727_s = C12_s * C17_s ** 2

			R45_s = (C627_s + C636_s) / C626_s
			R46_s = (C628_s + C637_s + C727_s) / C626_s
			R45R46_standards[sample] = (R45_s, R46_s)

		for sg in session_groups:
			db = [r for s in sg for r in self.sessions[s]['data'] if r['Sample'] in samples]
			assert db, f'No sample from {samples} found in session group {sg}.'

			X = [r['d45'] for r in db]
			Y = [R45R46_standards[r['Sample']][0] for r in db]
			x1, x2 = np.min(X), np.max(X)

			if x1 < x2:
				wgcoord = x1/(x1-x2)
			else:
				wgcoord = 999

			if wgcoord < -.5 or wgcoord > 1.5:
				# unreasonable to extrapolate to d45 = 0
				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
			else:
				# d45 = 0 is reasonably well bracketed
				R45_wg = np.polyfit(X, Y, 1)[1]

			X = [r['d46'] for r in db]
			Y = [R45R46_standards[r['Sample']][1] for r in db]
			x1, x2 = np.min(X), np.max(X)

			if x1 < x2:
				wgcoord = x1/(x1-x2)
			else:
				wgcoord = 999

			if wgcoord < -.5 or wgcoord > 1.5:
				# unreasonable to extrapolate to d46 = 0
				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
			else:
				# d46 = 0 is reasonably well bracketed
				R46_wg = np.polyfit(X, Y, 1)[1]

			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)

			for s in sg:
				self.msg(f'Session {s} WG: δ13C_VPDB = {d13Cwg_VPDB:.3f}  δ18O_VSMOW = {d18Owg_VSMOW:.3f}')

				self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
				self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
				for r in self.sessions[s]['data']:
					r['d13Cwg_VPDB'] = d13Cwg_VPDB
					r['d18Owg_VSMOW'] = d18Owg_VSMOW
```
Compute bulk composition of the working gas for each session based (by default) on the carbonate standards defined in both `self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.

Parameters

- `samples`: a list of samples specifying the subset of samples (defined in both `self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`) which will be considered when computing the working gas. By default, use all samples defined in both `self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.
- `session_groups`: a list of lists of sessions (e.g., `[['session1', 'session2'], ['session3', 'session4', 'session5']]`) specifying which session groups, if any, have the exact same WG composition. If set to `'all'`, force all sessions to have the same WG composition (use with caution and on short time scales, since the WG may drift slowly over long time scales). See the sketch after this list.
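A minimal sketch of both modes (session names are illustrative):

```py
# default: each session gets its own WG composition
mydata.wg()

# force 'Session01' and 'Session02' to share one WG composition,
# while 'Session03' keeps its own:
mydata.wg(session_groups = [['Session01', 'Session02'], ['Session03']])
```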
```py
	def compute_bulk_delta(self, R45, R46, D17O = 0):
		'''
		Compute δ13C_VPDB and δ18O_VSMOW,
		by solving the generalized form of equation (17) from
		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
		solving the corresponding second-order Taylor polynomial.
		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
		'''

		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17

		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
		C = 2 * self.R18_VSMOW
		D = -R46

		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
		cc = A + B + C + D

		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)

		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
		R17 = K * R18 ** self.LAMBDA_17
		R13 = R45 - 2 * R17

		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)

		return d13C_VPDB, d18O_VSMOW
```
Compute δ13C_VPDB and δ18O_VSMOW, by solving the generalized form of equation (17) from Brand et al. (2010), assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and solving the corresponding second-order Taylor polynomial (Appendix A of Daëron et al., 2016).
```py
	@make_verbal
	def crunch(self, verbose = ''):
		'''
		Compute bulk composition and raw clumped isotope anomalies for all analyses.
		'''
		for r in self:
			self.compute_bulk_and_clumping_deltas(r)
		self.standardize_d13C()
		self.standardize_d18O()
		self.msg(f"Crunched {len(self)} analyses.")
```
Compute bulk composition and raw clumped isotope anomalies for all analyses.
```py
	def fill_in_missing_info(self, session = 'mySession'):
		'''
		Fill in optional fields with default values
		'''
		for i,r in enumerate(self):
			if 'D17O' not in r:
				r['D17O'] = 0.
			if 'UID' not in r:
				r['UID'] = f'{i+1}'
			if 'Session' not in r:
				r['Session'] = session
			for k in ['d47', 'd48', 'd49']:
				if k not in r:
					r[k] = np.nan
```
Fill in optional fields with default values
```py
	def standardize_d13C(self):
		'''
		Perform δ13C standardization within each session `s` according to
		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
		may be redefined arbitrarily at a later stage.
		'''
		for s in self.sessions:
			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
				X,Y = zip(*XY)
				if self.sessions[s]['d13C_standardization_method'] == '1pt':
					offset = np.mean(Y) - np.mean(X)
					for r in self.sessions[s]['data']:
						r['d13C_VPDB'] += offset
				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
					a,b = np.polyfit(X,Y,1)
					for r in self.sessions[s]['data']:
						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b
```
Perform δ13C standardization within each session `s` according to `self.sessions[s]['d13C_standardization_method']`, which is defined by default by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
```py
	def standardize_d18O(self):
		'''
		Perform δ18O standardization within each session `s` according to
		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
		which is defined by default by `D47data.refresh_sessions()` as equal to
		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
		'''
		for s in self.sessions:
			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
				X,Y = zip(*XY)
				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
				if self.sessions[s]['d18O_standardization_method'] == '1pt':
					offset = np.mean(Y) - np.mean(X)
					for r in self.sessions[s]['data']:
						r['d18O_VSMOW'] += offset
				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
					a,b = np.polyfit(X,Y,1)
					for r in self.sessions[s]['data']:
						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
```
Perform δ18O standardization within each session `s` according to `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`, which is defined by default by `D47data.refresh_sessions()` as equal to `self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
```py
	def compute_bulk_and_clumping_deltas(self, r):
		'''
		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
		'''

		# Compute working gas R13, R18, and isobar ratios
		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)

		# Compute analyte isobar ratios
		R45 = (1 + r['d45'] / 1000) * R45_wg
		R46 = (1 + r['d46'] / 1000) * R46_wg
		R47 = (1 + r['d47'] / 1000) * R47_wg
		R48 = (1 + r['d48'] / 1000) * R48_wg
		R49 = (1 + r['d49'] / 1000) * R49_wg

		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW

		# Compute stochastic isobar ratios of the analyte
		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
			R13, R18, D17O = r['D17O']
			)

		# Check that R45/R45stoch and R46/R46stoch are undistinguishable from 1,
		# and raise a warning if the corresponding anomalies exceed 0.02 ppm.
		if (R45 / R45stoch - 1) > 5e-8:
			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
		if (R46 / R46stoch - 1) > 5e-8:
			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')

		# Compute raw clumped isotope anomalies
		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
		r['D49raw'] = 1000 * (R49 / R49stoch - 1)
```
Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
```py
	def compute_isobar_ratios(self, R13, R18, D17O = 0, D47 = 0, D48 = 0, D49 = 0):
		'''
		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
		'''

		# Compute R17
		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17

		# Compute isotope concentrations
		C12 = (1 + R13) ** -1
		C13 = C12 * R13
		C16 = (1 + R17 + R18) ** -1
		C17 = C16 * R17
		C18 = C16 * R18

		# Compute stochastic isotopologue concentrations
		C626 = C16 * C12 * C16
		C627 = C16 * C12 * C17 * 2
		C628 = C16 * C12 * C18 * 2
		C636 = C16 * C13 * C16
		C637 = C16 * C13 * C17 * 2
		C638 = C16 * C13 * C18 * 2
		C727 = C17 * C12 * C17
		C728 = C17 * C12 * C18 * 2
		C737 = C17 * C13 * C17
		C738 = C17 * C13 * C18 * 2
		C828 = C18 * C12 * C18
		C838 = C18 * C13 * C18

		# Compute stochastic isobar ratios
		R45 = (C636 + C627) / C626
		R46 = (C628 + C637 + C727) / C626
		R47 = (C638 + C728 + C737) / C626
		R48 = (C738 + C828) / C626
		R49 = C838 / C626

		# Account for stochastic anomalies
		R47 *= 1 + D47 / 1000
		R48 *= 1 + D48 / 1000
		R49 *= 1 + D49 / 1000

		# Return isobar ratios
		return R45, R46, R47, R48, R49
```
Compute isobar ratios for a sample with isotopic ratios R13 and R18,
optionally accounting for non-zero values of Δ17O (D17O) and clumped isotope
anomalies (D47, D48, D49), all expressed in permil.
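The two helpers above are mutually consistent, which is easy to check with a round trip (the bulk composition below is arbitrary):

```py
import D47crunch

d = D47crunch.D47data()

# pick an arbitrary bulk composition:
d13C, d18O = 2.0, 30.0
R13 = d.R13_VPDB * (1 + d13C / 1000)
R18 = d.R18_VSMOW * (1 + d18O / 1000)

# stochastic isobar ratios for that composition:
R45, R46, R47, R48, R49 = d.compute_isobar_ratios(R13, R18)

# solving Brand et al. (2010) eq. (17) recovers the bulk composition:
print(d.compute_bulk_delta(R45, R46))  # approximately (2.0, 30.0)
```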
```py
	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
		'''
		Split unknown samples by UID (treat all analyses as different samples)
		or by session (treat analyses of a given sample in different sessions as
		different samples).

		**Parameters**

		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
		+ `grouping`: `by_uid` | `by_session`
		'''
		if samples_to_split == 'all':
			samples_to_split = [s for s in self.unknowns]
		gkeys = {'by_uid':'UID', 'by_session':'Session'}
		self.grouping = grouping.lower()
		if self.grouping in gkeys:
			gkey = gkeys[self.grouping]
			for r in self:
				if r['Sample'] in samples_to_split:
					r['Sample_original'] = r['Sample']
					r['Sample'] = f"{r['Sample']}__{r[gkey]}"
				elif r['Sample'] in self.unknowns:
					r['Sample_original'] = r['Sample']
			self.refresh_samples()
```
Split unknown samples by UID (treat all analyses as different samples) or by session (treat analyses of a given sample in different sessions as different samples).
Parameters
- `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
- `grouping`: `by_uid` | `by_session`
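For instance, to check whether an unknown behaves consistently from session to session (sample names are illustrative):

```py
mydata.split_samples(['IAEA-C1', 'IAEA-C2'], grouping = 'by_session')
mydata.standardize()
# the split samples now appear as, e.g., 'IAEA-C1__Session01', 'IAEA-C1__Session02', ...
```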
```py
	def unsplit_samples(self, tables = False):
		'''
		Reverse the effects of `D47data.split_samples()`.

		This should only be used after `D4xdata.standardize()` with `method='pooled'`.

		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
		probably use `D4xdata.combine_samples()` instead to reverse the effects of
		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
		effects of `D47data.split_samples()` with `grouping='by_session'` (because in
		that case session-averaged Δ4x values are statistically independent).
		'''
		unknowns_old = sorted({s for s in self.unknowns})
		CM_old = self.standardization.covar[:,:]
		VD_old = self.standardization.params.valuesdict().copy()
		vars_old = self.standardization.var_names

		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})

		Ns = len(vars_old) - len(unknowns_old)
		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}

		W = np.zeros((len(vars_new), len(vars_old)))
		W[:Ns,:Ns] = np.eye(Ns)
		for u in unknowns_new:
			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
			if self.grouping == 'by_session':
				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
			elif self.grouping == 'by_uid':
				weights = [1 for s in splits]
			sw = sum(weights)
			weights = [w/sw for w in weights]
			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]

		CM_new = W @ CM_old @ W.T
		V = W @ np.array([[VD_old[k]] for k in vars_old])
		VD_new = {k:v[0] for k,v in zip(vars_new, V)}

		self.standardization.covar = CM_new
		self.standardization.params.valuesdict = lambda : VD_new
		self.standardization.var_names = vars_new

		for r in self:
			if r['Sample'] in self.unknowns:
				r['Sample_split'] = r['Sample']
				r['Sample'] = r['Sample_original']

		self.refresh_samples()
		self.consolidate_samples()
		self.repeatabilities()

		if tables:
			self.table_of_analyses()
			self.table_of_samples()
```
Reverse the effects of D47data.split_samples().
This should only be used after D4xdata.standardize() with method='pooled'.
After `D4xdata.standardize()` with `method='indep_sessions'`, one should probably use `D4xdata.combine_samples()` instead to reverse the effects of `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the effects of `D47data.split_samples()` with `grouping='by_session'` (because in that case session-averaged Δ4x values are statistically independent).
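A sketch of the full split/standardize/unsplit workflow under a pooled standardization:

```py
mydata.split_samples(grouping = 'by_session')   # split all unknowns by session
mydata.standardize(method = 'pooled')
mydata.unsplit_samples()                        # recombine samples and their covariances
```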
```py
	def assign_timestamps(self):
		'''
		Assign a time field `t` of type `float` to each analysis.

		If `TimeTag` is one of the data fields, `t` is equal within a given session
		to `TimeTag` minus the mean value of `TimeTag` for that session.
		Otherwise, `TimeTag` is by default equal to the index of each analysis
		in the dataset and `t` is defined as above.
		'''
		for session in self.sessions:
			sdata = self.sessions[session]['data']
			try:
				t0 = np.mean([r['TimeTag'] for r in sdata])
				for r in sdata:
					r['t'] = r['TimeTag'] - t0
			except KeyError:
				t0 = (len(sdata)-1)/2
				for t,r in enumerate(sdata):
					r['t'] = t - t0
```
Assign a time field t of type float to each analysis.
If TimeTag is one of the data fields, t is equal within a given session
to TimeTag minus the mean value of TimeTag for that session.
Otherwise, TimeTag is by default equal to the index of each analysis
in the dataset and t is defined as above.
```py
	def report(self):
		'''
		Prints a report on the standardization fit.
		Only applicable after `D4xdata.standardize(method='pooled')`.
		'''
		report_fit(self.standardization)
```
Prints a report on the standardization fit.
Only applicable after D4xdata.standardize(method='pooled').
````py
	def combine_samples(self, sample_groups):
		'''
		Combine analyses of different samples to compute weighted average Δ4x
		and new error (co)variances corresponding to the groups defined by the `sample_groups`
		dictionary.

		Caution: samples are weighted by number of replicate analyses, which is a
		reasonable default behavior but is not always optimal (e.g., in the case of strongly
		correlated analytical errors for one or more samples).

		Returns a tuple of:

		+ the list of group names
		+ an array of the corresponding Δ4x values
		+ the corresponding (co)variance matrix

		**Parameters**

		+ `sample_groups`: a dictionary of the form:
		```py
		{'group1': ['sample_1', 'sample_2'],
		 'group2': ['sample_3', 'sample_4', 'sample_5']}
		```
		'''

		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
		groups = sorted(sample_groups.keys())
		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
		W = np.array([
			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
			for j in groups])
		D4x_new = W @ D4x_old
		CM_new = W @ CM_old @ W.T

		return groups, D4x_new[:,0], CM_new
````
Combine analyses of different samples to compute weighted average Δ4x
and new error (co)variances corresponding to the groups defined by the sample_groups
dictionary.
Caution: samples are weighted by number of replicate analyses, which is a reasonable default behavior but is not always optimal (e.g., in the case of strongly correlated analytical errors for one or more samples).
Returns a tuple of:
- the list of group names
- an array of the corresponding Δ4x values
- the corresponding (co)variance matrix
Parameters
- `sample_groups`: a dictionary of the form:

```py
{'group1': ['sample_1', 'sample_2'],
 'group2': ['sample_3', 'sample_4', 'sample_5']}
```
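For instance, to pool the tutorial's two unknowns into a single group (group and sample names are illustrative):

```py
groups, D47_new, CM_new = mydata.combine_samples(
	{'MYGROUP': ['MYSAMPLE-1', 'MYSAMPLE-2']}
	)
```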
```py
	@make_verbal
	def standardize(self,
		method = 'pooled',
		weighted_sessions = [],
		consolidate = True,
		consolidate_tables = False,
		consolidate_plots = False,
		constraints = {},
		):
		'''
		Compute absolute Δ4x values for all replicate analyses and for sample averages.
		If `method` argument is set to `'pooled'`, the standardization processes all sessions
		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
		i.e. that their true Δ4x value does not change between sessions
		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
		`'indep_sessions'`, the standardization processes each session independently, based only
		on anchor analyses.
		'''

		self.standardization_method = method
		self.assign_timestamps()

		if method == 'pooled':
			if weighted_sessions:
				for session_group in weighted_sessions:
					if self._4x == '47':
						X = D47data([r for r in self if r['Session'] in session_group])
					elif self._4x == '48':
						X = D48data([r for r in self if r['Session'] in session_group])
					X.Nominal_D4x = self.Nominal_D4x.copy()
					X.refresh()
					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
					w = np.sqrt(result.redchi)
					self.msg(f'Session group {session_group} MRSWD = {w:.4f}')
					for r in X:
						r[f'wD{self._4x}raw'] *= w
			else:
				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
				for r in self:
					r[f'wD{self._4x}raw'] = 1.

			params = Parameters()
			for k, session in enumerate(self.sessions):
				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
				s = pf(session)
				params.add(f'a_{s}', value = 0.9)
				params.add(f'b_{s}', value = 0.)
				params.add(f'c_{s}', value = -0.9)
				params.add(f'a2_{s}', value = 0.,
#					vary = self.sessions[session]['scrambling_drift'],
					)
				params.add(f'b2_{s}', value = 0.,
#					vary = self.sessions[session]['slope_drift'],
					)
				params.add(f'c2_{s}', value = 0.,
#					vary = self.sessions[session]['wg_drift'],
					)
				if not self.sessions[session]['scrambling_drift']:
					params[f'a2_{s}'].expr = '0'
				if not self.sessions[session]['slope_drift']:
					params[f'b2_{s}'].expr = '0'
				if not self.sessions[session]['wg_drift']:
					params[f'c2_{s}'].expr = '0'

			for sample in self.unknowns:
				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)

			for k in constraints:
				params[k].expr = constraints[k]

			def residuals(p):
				R = []
				for r in self:
					session = pf(r['Session'])
					sample = pf(r['Sample'])
					if r['Sample'] in self.Nominal_D4x:
						R += [ (
							r[f'D{self._4x}raw'] - (
								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
								+ p[f'b_{session}'] * r[f'd{self._4x}']
								+ p[f'c_{session}']
								+ r['t'] * (
									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
									+ p[f'b2_{session}'] * r[f'd{self._4x}']
									+ p[f'c2_{session}']
									)
								)
							) / r[f'wD{self._4x}raw'] ]
					else:
						R += [ (
							r[f'D{self._4x}raw'] - (
								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
								+ p[f'b_{session}'] * r[f'd{self._4x}']
								+ p[f'c_{session}']
								+ r['t'] * (
									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
									+ p[f'b2_{session}'] * r[f'd{self._4x}']
									+ p[f'c2_{session}']
									)
								)
							) / r[f'wD{self._4x}raw'] ]
				return R

			M = Minimizer(residuals, params)
			result = M.least_squares()
			self.Nf = result.nfree
			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
			new_names, new_covar, new_se = _fullcovar(result)[:3]
			result.var_names = new_names
			result.covar = new_covar

			for r in self:
				s = pf(r["Session"])
				a = result.params.valuesdict()[f'a_{s}']
				b = result.params.valuesdict()[f'b_{s}']
				c = result.params.valuesdict()[f'c_{s}']
				a2 = result.params.valuesdict()[f'a2_{s}']
				b2 = result.params.valuesdict()[f'b2_{s}']
				c2 = result.params.valuesdict()[f'c2_{s}']
				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])

			self.standardization = result

			for session in self.sessions:
				self.sessions[session]['Np'] = 3
				for k in ['scrambling', 'slope', 'wg']:
					if self.sessions[session][f'{k}_drift']:
						self.sessions[session]['Np'] += 1

			if consolidate:
				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
			return result


		elif method == 'indep_sessions':

			if weighted_sessions:
				for session_group in weighted_sessions:
					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
					X.Nominal_D4x = self.Nominal_D4x.copy()
					X.refresh()
					# This is only done to assign r['wD47raw'] for r in X:
					X.standardize(method = method, weighted_sessions = [], consolidate = False)
					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
			else:
				self.msg('All weights set to 1 ‰')
				for r in self:
					r[f'wD{self._4x}raw'] = 1

			for session in self.sessions:
				s = self.sessions[session]
				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
				s['Np'] = sum(p_active)
				sdata = s['data']

				A = np.array([
					[
						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
						1 / r[f'wD{self._4x}raw'],
						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
						r['t'] / r[f'wD{self._4x}raw']
						]
					for r in sdata if r['Sample'] in self.anchors
					])[:,p_active] # only keep columns for the active parameters
				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
				s['Na'] = Y.size
				CM = linalg.inv(A.T @ A)
				bf = (CM @ A.T @ Y).T[0,:]
				k = 0
				for n, a in zip(p_names, p_active):
					if a:
						s[n] = bf[k]
#						self.msg(f'{n} = {bf[k]}')
						k += 1
					else:
						s[n] = 0.
#						self.msg(f'{n} = 0.0')

				for r in sdata:
					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])

				s['CM'] = np.zeros((6,6))
				i = 0
				k_active = [j for j,a in enumerate(p_active) if a]
				for j,a in enumerate(p_active):
					if a:
						s['CM'][j,k_active] = CM[i,:]
						i += 1

			if not weighted_sessions:
				w = self.rmswd()['rmswd']
				for r in self:
					r[f'wD{self._4x}'] *= w
					r[f'wD{self._4x}raw'] *= w
				for session in self.sessions:
					self.sessions[session]['CM'] *= w**2

			for session in self.sessions:
				s = self.sessions[session]
				s['SE_a'] = s['CM'][0,0]**.5
				s['SE_b'] = s['CM'][1,1]**.5
				s['SE_c'] = s['CM'][2,2]**.5
				s['SE_a2'] = s['CM'][3,3]**.5
				s['SE_b2'] = s['CM'][4,4]**.5
				s['SE_c2'] = s['CM'][5,5]**.5

			if not weighted_sessions:
				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
			else:
				self.Nf = 0
				for sg in weighted_sessions:
					self.Nf += self.rmswd(sessions = sg)['Nf']

			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)

			avgD4x = {
				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
				for sample in self.samples
				}
			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
			rD4x = (chi2/self.Nf)**.5
			self.repeatability[f'sigma_{self._4x}'] = rD4x

			if consolidate:
				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
```
Compute absolute Δ4x values for all replicate analyses and for sample averages. If the `method` argument is set to `'pooled'`, the standardization processes all sessions in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous, i.e. that their true Δ4x value does not change between sessions (Daëron, 2021). If the `method` argument is set to `'indep_sessions'`, the standardization processes each session independently, based only on anchor analyses.
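A minimal sketch of both standardization modes (the session name is illustrative):

```py
# pooled standardization, allowing the WG offset of one session to drift:
mydata.sessions['Session01']['wg_drift'] = True
mydata.standardize(method = 'pooled')

# or standardize each session independently, based only on anchor analyses:
mydata.standardize(method = 'indep_sessions')
```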
```py
	def standardization_error(self, session, d4x, D4x, t = 0):
		'''
		Compute standardization error for a given session and
		(δ47, Δ47) composition.
		'''
		a = self.sessions[session]['a']
		b = self.sessions[session]['b']
		c = self.sessions[session]['c']
		a2 = self.sessions[session]['a2']
		b2 = self.sessions[session]['b2']
		c2 = self.sessions[session]['c2']
		CM = self.sessions[session]['CM']

		x, y = D4x, d4x
		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
#		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
		dxdy = -(b+b2*t) / (a+a2*t)
		dxdz = 1. / (a+a2*t)
		dxda = -x / (a+a2*t)
		dxdb = -y / (a+a2*t)
		dxdc = -1. / (a+a2*t)
		# partial derivatives of x with respect to the drift parameters are
		# t times those with respect to (a, b, c):
		dxda2 = -x * t / (a+a2*t)
		dxdb2 = -y * t / (a+a2*t)
		dxdc2 = -t / (a+a2*t)
		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
		sx = (V @ CM @ V.T) ** .5
		return sx
```
Compute standardization error for a given session and (δ47, Δ47) composition.
```py
	@make_verbal
	def summary(self,
		dir = 'output',
		filename = None,
		save_to_file = True,
		print_out = True,
		):
		'''
		Print out and/or save to disk a summary of the standardization results.

		**Parameters**

		+ `dir`: the directory in which to save the table
		+ `filename`: the name of the csv file to write to
		+ `save_to_file`: whether to save the table to disk
		+ `print_out`: whether to print out the table
		'''

		out = []
		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
		out += [['Model degrees of freedom', f"{self.Nf}"]]
		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
		out += [['Standardization method', self.standardization_method]]

		if save_to_file:
			if not os.path.exists(dir):
				os.makedirs(dir)
			if filename is None:
				filename = f'D{self._4x}_summary.csv'
			with open(f'{dir}/{filename}', 'w') as fid:
				fid.write(make_csv(out))
		if print_out:
			self.msg('\n' + pretty_table(out, header = 0))
```
Print out and/or save to disk a summary of the standardization results.
Parameters
- `dir`: the directory in which to save the table
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the table to disk
- `print_out`: whether to print out the table
```py
	@make_verbal
	def table_of_sessions(self,
		dir = 'output',
		filename = None,
		save_to_file = True,
		print_out = True,
		output = None,
		):
		'''
		Print out and/or save to disk a table of sessions.

		**Parameters**

		+ `dir`: the directory in which to save the table
		+ `filename`: the name of the csv file to write to
		+ `save_to_file`: whether to save the table to disk
		+ `print_out`: whether to print out the table
		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
		'''
		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])

		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
		if include_a2:
			out[-1] += ['a2 ± SE']
		if include_b2:
			out[-1] += ['b2 ± SE']
		if include_c2:
			out[-1] += ['c2 ± SE']
		for session in self.sessions:
			out += [[
				session,
				f"{self.sessions[session]['Na']}",
				f"{self.sessions[session]['Nu']}",
				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
				]]
			if include_a2:
				if self.sessions[session]['scrambling_drift']:
					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
				else:
					out[-1] += ['']
			if include_b2:
				if self.sessions[session]['slope_drift']:
					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
				else:
					out[-1] += ['']
			if include_c2:
				if self.sessions[session]['wg_drift']:
					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
				else:
					out[-1] += ['']

		if save_to_file:
			if not os.path.exists(dir):
				os.makedirs(dir)
			if filename is None:
				filename = f'D{self._4x}_sessions.csv'
			with open(f'{dir}/{filename}', 'w') as fid:
				fid.write(make_csv(out))
		if print_out:
			self.msg('\n' + pretty_table(out))
		if output == 'raw':
			return out
		elif output == 'pretty':
			return pretty_table(out)
```
Print out and/or save to disk a table of sessions.
Parameters
- `dir`: the directory in which to save the table
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the table to disk
- `print_out`: whether to print out the table
- `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
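For example, to retrieve the table programmatically instead of writing it to disk:

```py
tbl = mydata.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
# tbl[0] is the header row, tbl[1:] are the session rows
```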
```py
	@make_verbal
	def table_of_analyses(
		self,
		dir = 'output',
		filename = None,
		save_to_file = True,
		print_out = True,
		output = None,
		):
		'''
		Print out and/or save to disk a table of analyses.

		**Parameters**

		+ `dir`: the directory in which to save the table
		+ `filename`: the name of the csv file to write to
		+ `save_to_file`: whether to save the table to disk
		+ `print_out`: whether to print out the table
		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
		'''

		out = [['UID','Session','Sample']]
		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
		for f in extra_fields:
			out[-1] += [f[0]]
		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
		for r in self:
			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
			for f in extra_fields:
				out[-1] += [f"{r[f[0]]:{f[1]}}"]
			out[-1] += [
				f"{r['d13Cwg_VPDB']:.3f}",
				f"{r['d18Owg_VSMOW']:.3f}",
				f"{r['d45']:.6f}",
				f"{r['d46']:.6f}",
				f"{r['d47']:.6f}",
				f"{r['d48']:.6f}",
				f"{r['d49']:.6f}",
				f"{r['d13C_VPDB']:.6f}",
				f"{r['d18O_VSMOW']:.6f}",
				f"{r['D47raw']:.6f}",
				f"{r['D48raw']:.6f}",
				f"{r['D49raw']:.6f}",
				f"{r[f'D{self._4x}']:.6f}"
				]
		if save_to_file:
			if not os.path.exists(dir):
				os.makedirs(dir)
			if filename is None:
				filename = f'D{self._4x}_analyses.csv'
			with open(f'{dir}/{filename}', 'w') as fid:
				fid.write(make_csv(out))
		if print_out:
			self.msg('\n' + pretty_table(out))
		return out
```
Print out and/or save to disk a table of analyses.
Parameters
- `dir`: the directory in which to save the table
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the table to disk
- `print_out`: whether to print out the table
- `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
```py
	@make_verbal
	def covar_table(
		self,
		correl = False,
		dir = 'output',
		filename = None,
		save_to_file = True,
		print_out = True,
		output = None,
		):
		'''
		Print out, save to disk and/or return the variance-covariance matrix of D4x
		for all unknown samples.

		**Parameters**

		+ `correl`: if set to `True`, tabulate correlation coefficients instead of (co)variances
		+ `dir`: the directory in which to save the csv
		+ `filename`: the name of the csv file to write to
		+ `save_to_file`: whether to save the csv
		+ `print_out`: whether to print out the matrix
		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
		'''
		samples = sorted([u for u in self.unknowns])
		out = [[''] + samples]
		for s1 in samples:
			out.append([s1])
			for s2 in samples:
				if correl:
					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
				else:
					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')

		if save_to_file:
			if not os.path.exists(dir):
				os.makedirs(dir)
			if filename is None:
				if correl:
					filename = f'D{self._4x}_correl.csv'
				else:
					filename = f'D{self._4x}_covar.csv'
			with open(f'{dir}/{filename}', 'w') as fid:
				fid.write(make_csv(out))
		if print_out:
			self.msg('\n'+pretty_table(out))
		if output == 'raw':
			return out
		elif output == 'pretty':
			return pretty_table(out)
```
Print out, save to disk and/or return the variance-covariance matrix of D4x for all unknown samples.
Parameters
- `correl`: if set to `True`, tabulate correlation coefficients instead of (co)variances
- `dir`: the directory in which to save the csv
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the csv
- `print_out`: whether to print out the matrix
- `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
```py
	@make_verbal
	def table_of_samples(
		self,
		dir = 'output',
		filename = None,
		save_to_file = True,
		print_out = True,
		output = None,
		):
		'''
		Print out, save to disk and/or return a table of samples.

		**Parameters**

		+ `dir`: the directory in which to save the csv
		+ `filename`: the name of the csv file to write to
		+ `save_to_file`: whether to save the csv
		+ `print_out`: whether to print out the table
		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		if set to `'raw'`: return a list of lists of strings
		(e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
		'''

		out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
		for sample in self.anchors:
			out += [[
				f"{sample}",
				f"{self.samples[sample]['N']}",
				f"{self.samples[sample]['d13C_VPDB']:.2f}",
				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
				f"{self.samples[sample][f'D{self._4x}']:.4f}", '', '',
				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
				]]
		for sample in self.unknowns:
			out += [[
				f"{sample}",
				f"{self.samples[sample]['N']}",
				f"{self.samples[sample]['d13C_VPDB']:.2f}",
				f"{self.samples[sample]['d18O_VSMOW']:.2f}",
				f"{self.samples[sample][f'D{self._4x}']:.4f}",
				f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
				f"± {self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
				f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
				f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
				]]
		if save_to_file:
			if not os.path.exists(dir):
				os.makedirs(dir)
			if filename is None:
				filename = f'D{self._4x}_samples.csv'
			with open(f'{dir}/{filename}', 'w') as fid:
				fid.write(make_csv(out))
		if print_out:
			self.msg('\n'+pretty_table(out))
		if output == 'raw':
			return out
		elif output == 'pretty':
			return pretty_table(out)
```
Print out, save to disk and/or return a table of samples.
Parameters
- `dir`: the directory in which to save the csv
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the csv
- `print_out`: whether to print out the table
- `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
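For example, to display the sample table without saving it:

```py
print(mydata.table_of_samples(save_to_file = False, print_out = False, output = 'pretty'))
```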
```py
	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
		'''
		Generate session plots and save them to disk.

		**Parameters**

		+ `dir`: the directory in which to save the plots
		+ `figsize`: the width and height (in inches) of each plot
		+ `filetype`: 'pdf' or 'png'
		+ `dpi`: resolution for PNG output
		'''
		if not os.path.exists(dir):
			os.makedirs(dir)

		for session in self.sessions:
			sp = self.plot_single_session(session, xylimits = 'constant')
			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
			ppl.close(sp.fig)
```
Generate session plots and save them to disk.
Parameters
- `dir`: the directory in which to save the plots
- `figsize`: the width and height (in inches) of each plot
- `filetype`: `'pdf'` or `'png'`
- `dpi`: resolution for PNG output
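For example, to save one PNG plot per session at higher resolution:

```py
mydata.plot_sessions(dir = 'output', filetype = 'png', dpi = 200)
```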
```py
	@make_verbal
	def consolidate_samples(self):
		'''
		Compile various statistics for each sample.

		For each anchor sample:

		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
		+ `SE_D47` or `SE_D48`: set to zero by definition

		For each unknown sample:

		+ `D47` or `D48`: the standardized Δ4x value for this unknown
		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown

		For each anchor and unknown:

		+ `N`: the total number of analyses of this sample
		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
		variance, indicating whether the Δ4x repeatability of this sample differs significantly from
		that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
		'''
		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
		for sample in self.samples:
			self.samples[sample]['N'] = len(self.samples[sample]['data'])
			if self.samples[sample]['N'] > 1:
				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])

			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])

			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
			if len(D4x_pop) > 2:
				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]

		if self.standardization_method == 'pooled':
			for sample in self.anchors:
				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
				self.samples[sample][f'SE_D{self._4x}'] = 0.
			for sample in self.unknowns:
				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
				try:
					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
				except ValueError:
					# when `sample` is constrained by self.standardize(constraints = {...}),
					# it is no longer listed in self.standardization.var_names.
					# Temporary fix: define SE as zero for now
					self.samples[sample][f'SE_D{self._4x}'] = 0.

		elif self.standardization_method == 'indep_sessions':
			for sample in self.anchors:
				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
				self.samples[sample][f'SE_D{self._4x}'] = 0.
			for sample in self.unknowns:
				self.msg(f'Consolidating sample {sample}')
				self.unknowns[sample][f'session_D{self._4x}'] = {}
				session_avg = []
				for session in self.sessions:
					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
					if sdata:
						self.msg(f'{sample} found in session {session}')
						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
						# !! TODO: sigma_s below does not account for temporal changes in standardization error
						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
				wsum = sum([weights[s] for s in weights])
				for s in weights:
					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]

		for r in self:
			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
```
Compile various statistics for each sample.
For each anchor sample:
- `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
- `SE_D47` or `SE_D48`: set to zero by definition
For each unknown sample:
- `D47` or `D48`: the standardized Δ4x value for this unknown
- `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown
For each anchor and unknown:
- `N`: the total number of analyses of this sample
- `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
- `d13C_VPDB`: the average δ13C_VPDB value for this sample
- `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
- `p_Levene`: the p-value from a Levene test of equal variance, indicating whether the Δ4x repeatability of this sample differs significantly from that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
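After standardization and consolidation, these statistics can be read directly from `mydata.samples`; a minimal sketch using the tutorial's unknown (note that `SD_D47` requires more than one analysis and `p_Levene` more than two, hence `.get()`):

```py
s = mydata.samples['MYSAMPLE-1']
print(s['N'], s['D47'], s['SE_D47'], s.get('SD_D47'), s.get('p_Levene'))
```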
2231 def consolidate_sessions(self): 2232 ''' 2233 Compute various statistics for each session. 2234 2235 + `Na`: Number of anchor analyses in the session 2236 + `Nu`: Number of unknown analyses in the session 2237 + `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session 2238 + `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session 2239 + `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session 2240 + `a`: scrambling factor 2241 + `b`: compositional slope 2242 + `c`: WG offset 2243 + `SE_a`: Model stadard erorr of `a` 2244 + `SE_b`: Model stadard erorr of `b` 2245 + `SE_c`: Model stadard erorr of `c` 2246 + `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`) 2247 + `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`) 2248 + `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`) 2249 + `a2`: scrambling factor drift 2250 + `b2`: compositional slope drift 2251 + `c2`: WG offset drift 2252 + `Np`: Number of standardization parameters to fit 2253 + `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`) 2254 + `d13Cwg_VPDB`: δ13C_VPDB of WG 2255 + `d18Owg_VSMOW`: δ18O_VSMOW of WG 2256 ''' 2257 for session in self.sessions: 2258 if 'd13Cwg_VPDB' not in self.sessions[session]: 2259 self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB'] 2260 if 'd18Owg_VSMOW' not in self.sessions[session]: 2261 self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW'] 2262 self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]) 2263 self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]) 2264 2265 self.msg(f'Computing repeatabilities for session {session}') 2266 self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session]) 2267 self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session]) 2268 self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session]) 2269 2270 if self.standardization_method == 'pooled': 2271 for session in self.sessions: 2272 2273 # different (better?) 
computation of D4x repeatability for each session: 2274 sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']] 2275 self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5 2276 2277 self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}'] 2278 i = self.standardization.var_names.index(f'a_{pf(session)}') 2279 self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5 2280 2281 self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}'] 2282 i = self.standardization.var_names.index(f'b_{pf(session)}') 2283 self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5 2284 2285 self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}'] 2286 i = self.standardization.var_names.index(f'c_{pf(session)}') 2287 self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5 2288 2289 self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}'] 2290 if self.sessions[session]['scrambling_drift']: 2291 i = self.standardization.var_names.index(f'a2_{pf(session)}') 2292 self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5 2293 else: 2294 self.sessions[session]['SE_a2'] = 0. 2295 2296 self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}'] 2297 if self.sessions[session]['slope_drift']: 2298 i = self.standardization.var_names.index(f'b2_{pf(session)}') 2299 self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5 2300 else: 2301 self.sessions[session]['SE_b2'] = 0. 2302 2303 self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}'] 2304 if self.sessions[session]['wg_drift']: 2305 i = self.standardization.var_names.index(f'c2_{pf(session)}') 2306 self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5 2307 else: 2308 self.sessions[session]['SE_c2'] = 0. 
2309 2310 i = self.standardization.var_names.index(f'a_{pf(session)}') 2311 j = self.standardization.var_names.index(f'b_{pf(session)}') 2312 k = self.standardization.var_names.index(f'c_{pf(session)}') 2313 CM = np.zeros((6,6)) 2314 CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]] 2315 try: 2316 i2 = self.standardization.var_names.index(f'a2_{pf(session)}') 2317 CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]] 2318 CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2] 2319 try: 2320 j2 = self.standardization.var_names.index(f'b2_{pf(session)}') 2321 CM[3,4] = self.standardization.covar[i2,j2] 2322 CM[4,3] = self.standardization.covar[j2,i2] 2323 except ValueError: 2324 pass 2325 try: 2326 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2327 CM[3,5] = self.standardization.covar[i2,k2] 2328 CM[5,3] = self.standardization.covar[k2,i2] 2329 except ValueError: 2330 pass 2331 except ValueError: 2332 pass 2333 try: 2334 j2 = self.standardization.var_names.index(f'b2_{pf(session)}') 2335 CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]] 2336 CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2] 2337 try: 2338 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2339 CM[4,5] = self.standardization.covar[j2,k2] 2340 CM[5,4] = self.standardization.covar[k2,j2] 2341 except ValueError: 2342 pass 2343 except ValueError: 2344 pass 2345 try: 2346 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2347 CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]] 2348 CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2] 2349 except ValueError: 2350 pass 2351 2352 self.sessions[session]['CM'] = CM 2353 2354 elif self.standardization_method == 'indep_sessions': 2355 pass # Not implemented yet
Compute various statistics for each session.
- `Na`: number of anchor analyses in the session
- `Nu`: number of unknown analyses in the session
- `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
- `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
- `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
- `a`: scrambling factor
- `b`: compositional slope
- `c`: WG offset
- `SE_a`: model standard error of `a`
- `SE_b`: model standard error of `b`
- `SE_c`: model standard error of `c`
- `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
- `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
- `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
- `a2`: scrambling factor drift
- `b2`: compositional slope drift
- `c2`: WG offset drift
- `Np`: number of standardization parameters to fit
- `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
- `d13Cwg_VPDB`: δ13C_VPDB of WG
- `d18Owg_VSMOW`: δ18O_VSMOW of WG
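The corresponding per-session values end up in the `sessions` dictionary. A minimal sketch, assuming the same standardized `mydata` object:

```py
for session in mydata.sessions:
    S = mydata.sessions[session]
    print(f"{session}: a = {S['a']:.4f} ± {S['SE_a']:.4f}, "
          f"b = {S['b']:.2e} ± {S['SE_b']:.2e}, "
          f"c = {S['c']:.4f} ± {S['SE_c']:.4f} "
          f"({S['Na']} anchor / {S['Nu']} unknown analyses)")
```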
2358 @make_verbal 2359 def repeatabilities(self): 2360 ''' 2361 Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x 2362 (for all samples, for anchors, and for unknowns). 2363 ''' 2364 self.msg('Computing repeatabilities for all sessions') 2365 2366 self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors') 2367 self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors') 2368 self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors') 2369 self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns') 2370 self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x (for all samples, for anchors, and for unknowns).
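After standardization these values (in ‰) are stored in the `repeatability` dictionary. A minimal sketch for a `D47data` object:

```py
print(f"Δ47 repeatability (anchors):     {1e3 * mydata.repeatability['r_D47a']:.1f} ppm")
print(f"Δ47 repeatability (unknowns):    {1e3 * mydata.repeatability['r_D47u']:.1f} ppm")
print(f"Δ47 repeatability (all samples): {1e3 * mydata.repeatability['r_D47']:.1f} ppm")
```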
2373 @make_verbal 2374 def consolidate(self, tables = True, plots = True): 2375 ''' 2376 Collect information about samples, sessions and repeatabilities. 2377 ''' 2378 self.consolidate_samples() 2379 self.consolidate_sessions() 2380 self.repeatabilities() 2381 2382 if tables: 2383 self.summary() 2384 self.table_of_sessions() 2385 self.table_of_analyses() 2386 self.table_of_samples() 2387 2388 if plots: 2389 self.plot_sessions()
Collect information about samples, sessions and repeatabilities.
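`consolidate()` accepts two flags controlling its side effects; a sketch regenerating the text tables without plots:

```py
mydata.consolidate(tables = True, plots = False)
```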
2392 @make_verbal 2393 def rmswd(self, 2394 samples = 'all samples', 2395 sessions = 'all sessions', 2396 ): 2397 ''' 2398 Compute the χ2, root mean squared weighted deviation 2399 (i.e. reduced χ2), and corresponding degrees of freedom of the 2400 Δ4x values for samples in `samples` and sessions in `sessions`. 2401 2402 Only used in `D4xdata.standardize()` with `method='indep_sessions'`. 2403 ''' 2404 if samples == 'all samples': 2405 mysamples = [k for k in self.samples] 2406 elif samples == 'anchors': 2407 mysamples = [k for k in self.anchors] 2408 elif samples == 'unknowns': 2409 mysamples = [k for k in self.unknowns] 2410 else: 2411 mysamples = samples 2412 2413 if sessions == 'all sessions': 2414 sessions = [k for k in self.sessions] 2415 2416 chisq, Nf = 0, 0 2417 for sample in mysamples : 2418 G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2419 if len(G) > 1 : 2420 X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G]) 2421 Nf += (len(G) - 1) 2422 chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G]) 2423 r = (chisq / Nf)**.5 if Nf > 0 else 0 2424 self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.') 2425 return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
Compute the χ², the root mean squared weighted deviation (i.e., the square root of the reduced χ²), and the corresponding degrees of freedom of the Δ4x values for samples in `samples` and sessions in `sessions`.
Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
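A sketch of the returned dictionary, assuming `mydata` was standardized with `method = 'indep_sessions'` (the only case where this quantity is meaningful):

```py
out = mydata.rmswd(samples = 'anchors')
print(f"RMSWD = {out['rmswd']:.3f} (χ² = {out['chisq']:.2f}, Nf = {out['Nf']})")
```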
2428 @make_verbal 2429 def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'): 2430 ''' 2431 Compute the repeatability of `[r[key] for r in self]` 2432 ''' 2433 2434 if samples == 'all samples': 2435 mysamples = [k for k in self.samples] 2436 elif samples == 'anchors': 2437 mysamples = [k for k in self.anchors] 2438 elif samples == 'unknowns': 2439 mysamples = [k for k in self.unknowns] 2440 else: 2441 mysamples = samples 2442 2443 if sessions == 'all sessions': 2444 sessions = [k for k in self.sessions] 2445 2446 if key in ['D47', 'D48']: 2447 # Full disclosure: the definition of Nf is tricky/debatable 2448 G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions] 2449 chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum() 2450 Nf = len(G) 2451# print(f'len(G) = {Nf}') 2452 Nf -= len([s for s in mysamples if s in self.unknowns]) 2453# print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider') 2454 for session in sessions: 2455 Np = len([ 2456 _ for _ in self.standardization.params 2457 if ( 2458 self.standardization.params[_].expr is not None 2459 and ( 2460 (_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session)) 2461 or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session)) 2462 ) 2463 ) 2464 ]) 2465# print(f'session {session}: {Np} parameters to consider') 2466 Na = len({ 2467 r['Sample'] for r in self.sessions[session]['data'] 2468 if r['Sample'] in self.anchors and r['Sample'] in mysamples 2469 }) 2470# print(f'session {session}: {Na} different anchors in that session') 2471 Nf -= min(Np, Na) 2472# print(f'Nf = {Nf}') 2473 2474# for sample in mysamples : 2475# X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2476# if len(X) > 1 : 2477# chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ]) 2478# if sample in self.unknowns: 2479# Nf += len(X) - 1 2480# else: 2481# Nf += len(X) 2482# if samples in ['anchors', 'all samples']: 2483# Nf -= sum([self.sessions[s]['Np'] for s in sessions]) 2484 r = (chisq / Nf)**.5 if Nf > 0 else 0 2485 2486 else: # if key not in ['D47', 'D48'] 2487 chisq, Nf = 0, 0 2488 for sample in mysamples : 2489 X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2490 if len(X) > 1 : 2491 Nf += len(X) - 1 2492 chisq += np.sum([ (x-np.mean(X))**2 for x in X ]) 2493 r = (chisq / Nf)**.5 if Nf > 0 else 0 2494 2495 self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.') 2496 return r
Compute the repeatability of `[r[key] for r in self]`
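For example, to recompute the Δ47 repeatability of anchor analyses only (a sketch):

```py
r = mydata.compute_r('D47', samples = 'anchors')
print(f"Δ47 repeatability of anchors: {1e3 * r:.1f} ppm")
```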
2498 def sample_average(self, samples, weights = 'equal', normalize = True): 2499 ''' 2500 Weighted average Δ4x value of a group of samples, accounting for covariance. 2501 2502 Returns the weighed average Δ4x value and associated SE 2503 of a group of samples. Weights are equal by default. If `normalize` is 2504 true, `weights` will be rescaled so that their sum equals 1. 2505 2506 **Examples** 2507 2508 ```python 2509 self.sample_average(['X','Y'], [1, 2]) 2510 ``` 2511 2512 returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3, 2513 where Δ4x(X) and Δ4x(Y) are the average Δ4x 2514 values of samples X and Y, respectively. 2515 2516 ```python 2517 self.sample_average(['X','Y'], [1, -1], normalize = False) 2518 ``` 2519 2520 returns the value and SE of the difference Δ4x(X) - Δ4x(Y). 2521 ''' 2522 if weights == 'equal': 2523 weights = [1/len(samples)] * len(samples) 2524 2525 if normalize: 2526 s = sum(weights) 2527 if s: 2528 weights = [w/s for w in weights] 2529 2530 try: 2531# indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples] 2532# C = self.standardization.covar[indices,:][:,indices] 2533 C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples]) 2534 X = [self.samples[sample][f'D{self._4x}'] for sample in samples] 2535 return correlated_sum(X, C, weights) 2536 except ValueError: 2537 return (0., 0.)
Weighted average Δ4x value of a group of samples, accounting for covariance.
Returns the weighted average Δ4x value and associated SE of a group of samples. Weights are equal by default. If `normalize` is true, `weights` will be rescaled so that their sum equals 1.
Examples
self.sample_average(['X','Y'], [1, 2])
returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3, where Δ4x(X) and Δ4x(Y) are the average Δ4x values of samples X and Y, respectively.
self.sample_average(['X','Y'], [1, -1], normalize = False)
returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
2540 def sample_D4x_covar(self, sample1, sample2 = None): 2541 ''' 2542 Covariance between Δ4x values of samples 2543 2544 Returns the error covariance between the average Δ4x values of two 2545 samples. If if only `sample_1` is specified, or if `sample_1 == sample_2`), 2546 returns the Δ4x variance for that sample. 2547 ''' 2548 if sample2 is None: 2549 sample2 = sample1 2550 if self.standardization_method == 'pooled': 2551 i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}') 2552 j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}') 2553 return self.standardization.covar[i, j] 2554 elif self.standardization_method == 'indep_sessions': 2555 if sample1 == sample2: 2556 return self.samples[sample1][f'SE_D{self._4x}']**2 2557 else: 2558 c = 0 2559 for session in self.sessions: 2560 sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1] 2561 sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2] 2562 if sdata1 and sdata2: 2563 a = self.sessions[session]['a'] 2564 # !! TODO: CM below does not account for temporal changes in standardization parameters 2565 CM = self.sessions[session]['CM'][:3,:3] 2566 avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1]) 2567 avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1]) 2568 avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2]) 2569 avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2]) 2570 c += ( 2571 self.unknowns[sample1][f'session_D{self._4x}'][session][2] 2572 * self.unknowns[sample2][f'session_D{self._4x}'][session][2] 2573 * np.array([[avg_D4x_1, avg_d4x_1, 1]]) 2574 @ CM 2575 @ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T 2576 ) / a**2 2577 return float(c)
Covariance between Δ4x values of samples
Returns the error covariance between the average Δ4x values of two samples. If only `sample1` is specified, or if `sample1 == sample2`, returns the Δ4x variance for that sample.
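For example, the SE of the difference between two unknowns can be assembled from their variances and covariance; a sketch using the two hypothetical unknowns from the example data:

```py
# Error covariance terms for the difference Δ47(MYSAMPLE-1) - Δ47(MYSAMPLE-2):
var_1 = mydata.sample_D4x_covar('MYSAMPLE-1')
var_2 = mydata.sample_D4x_covar('MYSAMPLE-2')
cov_12 = mydata.sample_D4x_covar('MYSAMPLE-1', 'MYSAMPLE-2')
SE_diff = (var_1 + var_2 - 2 * cov_12) ** 0.5
```

This is equivalent to calling `mydata.sample_average(['MYSAMPLE-1', 'MYSAMPLE-2'], [1, -1], normalize = False)` and reading the second element of the result.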
2579 def sample_D4x_correl(self, sample1, sample2 = None): 2580 ''' 2581 Correlation between Δ4x errors of samples 2582 2583 Returns the error correlation between the average Δ4x values of two samples. 2584 ''' 2585 if sample2 is None or sample2 == sample1: 2586 return 1. 2587 return ( 2588 self.sample_D4x_covar(sample1, sample2) 2589 / self.unknowns[sample1][f'SE_D{self._4x}'] 2590 / self.unknowns[sample2][f'SE_D{self._4x}'] 2591 )
Correlation between Δ4x errors of samples
Returns the error correlation between the average Δ4x values of two samples.
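A sketch, for the same two hypothetical unknowns:

```py
rho = mydata.sample_D4x_correl('MYSAMPLE-1', 'MYSAMPLE-2')
print(f"Error correlation between the two samples: {rho:.3f}")
```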
2593 def plot_single_session(self, 2594 session, 2595 kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4), 2596 kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4), 2597 kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75), 2598 kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75), 2599 kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75), 2600 xylimits = 'free', # | 'constant' 2601 x_label = None, 2602 y_label = None, 2603 error_contour_interval = 'auto', 2604 fig = 'new', 2605 ): 2606 ''' 2607 Generate plot for a single session 2608 ''' 2609 if x_label is None: 2610 x_label = f'δ$_{{{self._4x}}}$ (‰)' 2611 if y_label is None: 2612 y_label = f'Δ$_{{{self._4x}}}$ (‰)' 2613 2614 out = _SessionPlot() 2615 anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]] 2616 unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]] 2617 anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors] 2618 anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors] 2619 unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns] 2620 unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns] 2621 anchor_avg = (np.array([ np.array([ 2622 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2623 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2624 ]) for sample in anchors]).T, 2625 np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T) 2626 unknown_avg = (np.array([ np.array([ 2627 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2628 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2629 ]) for sample in unknowns]).T, 2630 np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T) 2631 2632 2633 if fig == 'new': 2634 out.fig = ppl.figure(figsize = (6,6)) 2635 ppl.subplots_adjust(.1,.1,.9,.9) 2636 2637 out.anchor_analyses, = ppl.plot( 2638 anchors_d, 2639 anchors_D, 2640 **kw_plot_anchors) 2641 out.unknown_analyses, = ppl.plot( 2642 unknowns_d, 2643 unknowns_D, 2644 **kw_plot_unknowns) 2645 out.anchor_avg = ppl.plot( 2646 *anchor_avg, 2647 **kw_plot_anchor_avg) 2648 out.unknown_avg = ppl.plot( 2649 *unknown_avg, 2650 **kw_plot_unknown_avg) 2651 if xylimits == 'constant': 2652 x = [r[f'd{self._4x}'] for r in self] 2653 y = [r[f'D{self._4x}'] for r in self] 2654 x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y) 2655 w, h = x2-x1, y2-y1 2656 x1 -= w/20 2657 x2 += w/20 2658 y1 -= h/20 2659 y2 += h/20 2660 ppl.axis([x1, x2, y1, y2]) 2661 elif xylimits == 'free': 2662 x1, x2, y1, y2 = ppl.axis() 2663 else: 2664 x1, x2, y1, y2 = ppl.axis(xylimits) 2665 2666 if error_contour_interval != 'none': 2667 xi, yi = np.linspace(x1, x2), np.linspace(y1, y2) 2668 XI,YI = np.meshgrid(xi, yi) 2669 SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi]) 2670 if error_contour_interval == 'auto': 2671 rng = np.max(SI) - np.min(SI) 2672 if rng <= 0.01: 2673 cinterval = 0.001 2674 elif rng <= 0.03: 2675 cinterval = 0.004 2676 elif rng <= 0.1: 2677 
cinterval = 0.01 2678 elif rng <= 0.3: 2679 cinterval = 0.03 2680 elif rng <= 1.: 2681 cinterval = 0.1 2682 else: 2683 cinterval = 0.5 2684 else: 2685 cinterval = error_contour_interval 2686 2687 cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval) 2688 out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error) 2689 out.clabel = ppl.clabel(out.contour) 2690 contour = (XI, YI, SI, cval, cinterval) 2691 2692 if fig == None: 2693 return { 2694 'anchors':anchors, 2695 'unknowns':unknowns, 2696 'anchors_d':anchors_d, 2697 'anchors_D':anchors_D, 2698 'unknowns_d':unknowns_d, 2699 'unknowns_D':unknowns_D, 2700 'anchor_avg':anchor_avg, 2701 'unknown_avg':unknown_avg, 2702 'contour':contour, 2703 } 2704 2705 ppl.xlabel(x_label) 2706 ppl.ylabel(y_label) 2707 ppl.title(session, weight = 'bold') 2708 ppl.grid(alpha = .2) 2709 out.ax = ppl.gca() 2710 2711 return out
Generate plot for a single session
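A minimal sketch, assuming one of the sessions in `mydata` is named `'Session01'` (session names come from the raw data):

```py
from matplotlib import pyplot as ppl

out = mydata.plot_single_session('Session01')
ppl.savefig('Session01.pdf')
ppl.close(out.fig)
```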
2713 def plot_residuals( 2714 self, 2715 kde = False, 2716 hist = False, 2717 binwidth = 2/3, 2718 dir = 'output', 2719 filename = None, 2720 highlight = [], 2721 colors = None, 2722 figsize = None, 2723 dpi = 100, 2724 yspan = None, 2725 ): 2726 ''' 2727 Plot residuals of each analysis as a function of time (actually, as a function of 2728 the order of analyses in the `D4xdata` object) 2729 2730 + `kde`: whether to add a kernel density estimate of residuals 2731 + `hist`: whether to add a histogram of residuals (incompatible with `kde`) 2732 + `histbins`: specify bin edges for the histogram 2733 + `dir`: the directory in which to save the plot 2734 + `highlight`: a list of samples to highlight 2735 + `colors`: a dict of `{<sample>: (r, g, b)}` for all samples 2736 + `figsize`: (width, height) of figure 2737 + `dpi`: resolution for PNG output 2738 + `yspan`: factor controlling the range of y values shown in plot 2739 (by default: `yspan = 1.5 if kde else 1.0`) 2740 ''' 2741 2742 from matplotlib import ticker 2743 2744 if yspan is None: 2745 if kde: 2746 yspan = 1.5 2747 else: 2748 yspan = 1.0 2749 2750 # Layout 2751 fig = ppl.figure(figsize = (8,4) if figsize is None else figsize) 2752 if hist or kde: 2753 ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72) 2754 ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15) 2755 else: 2756 ppl.subplots_adjust(.08,.05,.78,.8) 2757 ax1 = ppl.subplot(111) 2758 2759 # Colors 2760 N = len(self.anchors) 2761 if colors is None: 2762 if len(highlight) > 0: 2763 Nh = len(highlight) 2764 if Nh == 1: 2765 colors = {highlight[0]: (0,0,0)} 2766 elif Nh == 3: 2767 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])} 2768 elif Nh == 4: 2769 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2770 else: 2771 colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)} 2772 else: 2773 if N == 3: 2774 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])} 2775 elif N == 4: 2776 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2777 else: 2778 colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)} 2779 2780 ppl.sca(ax1) 2781 2782 ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75) 2783 2784 ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$')) 2785 2786 session = self[0]['Session'] 2787 x1 = 0 2788# ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self]) 2789 x_sessions = {} 2790 one_or_more_singlets = False 2791 one_or_more_multiplets = False 2792 multiplets = set() 2793 for k,r in enumerate(self): 2794 if r['Session'] != session: 2795 x2 = k-1 2796 x_sessions[session] = (x1+x2)/2 2797 ppl.axvline(k - 0.5, color = 'k', lw = .5) 2798 session = r['Session'] 2799 x1 = k 2800 singlet = len(self.samples[r['Sample']]['data']) == 1 2801 if not singlet: 2802 multiplets.add(r['Sample']) 2803 if r['Sample'] in self.unknowns: 2804 if singlet: 2805 one_or_more_singlets = True 2806 else: 2807 one_or_more_multiplets = True 2808 kw = dict( 2809 marker = 'x' if singlet else '+', 2810 ms = 4 if singlet else 5, 2811 ls = 'None', 2812 mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0), 2813 mew = 1, 2814 alpha = 0.2 if singlet else 1, 2815 ) 2816 if highlight and r['Sample'] not in highlight: 2817 kw['alpha'] = 0.2 2818 ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw) 2819 x2 = k 2820 x_sessions[session] = (x1+x2)/2 2821 2822 
ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1) 2823 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1) 2824 if not (hist or kde): 2825 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center') 2826 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f" 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center') 2827 2828 xmin, xmax, ymin, ymax = ppl.axis() 2829 if yspan != 1: 2830 ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2 2831 for s in x_sessions: 2832 ppl.text( 2833 x_sessions[s], 2834 ymax +1, 2835 s, 2836 va = 'bottom', 2837 **( 2838 dict(ha = 'center') 2839 if len(self.sessions[s]['data']) > (0.15 * len(self)) 2840 else dict(ha = 'left', rotation = 45) 2841 ) 2842 ) 2843 2844 if hist or kde: 2845 ppl.sca(ax2) 2846 2847 for s in colors: 2848 kw['marker'] = '+' 2849 kw['ms'] = 5 2850 kw['mec'] = colors[s] 2851 kw['label'] = s 2852 kw['alpha'] = 1 2853 ppl.plot([], [], **kw) 2854 2855 kw['mec'] = (0,0,0) 2856 2857 if one_or_more_singlets: 2858 kw['marker'] = 'x' 2859 kw['ms'] = 4 2860 kw['alpha'] = .2 2861 kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other' 2862 ppl.plot([], [], **kw) 2863 2864 if one_or_more_multiplets: 2865 kw['marker'] = '+' 2866 kw['ms'] = 4 2867 kw['alpha'] = 1 2868 kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other' 2869 ppl.plot([], [], **kw) 2870 2871 if hist or kde: 2872 leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9) 2873 else: 2874 leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5) 2875 leg.set_zorder(-1000) 2876 2877 ppl.sca(ax1) 2878 2879 ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)') 2880 ppl.xticks([]) 2881 ppl.axis([-1, len(self), None, None]) 2882 2883 if hist or kde: 2884 ppl.sca(ax2) 2885 X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors]) 2886 2887 if kde: 2888 from scipy.stats import gaussian_kde 2889 yi = np.linspace(ymin, ymax, 201) 2890 xi = gaussian_kde(X).evaluate(yi) 2891 ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1)) 2892# ppl.plot(xi, yi, 'k-', lw = 1) 2893 elif hist: 2894 ppl.hist( 2895 X, 2896 orientation = 'horizontal', 2897 histtype = 'stepfilled', 2898 ec = [.4]*3, 2899 fc = [.25]*3, 2900 alpha = .25, 2901 bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)), 2902 ) 2903 ppl.text(0, 0, 2904 f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", 2905 size = 7.5, 2906 alpha = 1, 2907 va = 'center', 2908 ha = 'left', 2909 ) 2910 2911 ppl.axis([0, None, ymin, ymax]) 2912 ppl.xticks([]) 2913 ppl.yticks([]) 2914# ax2.spines['left'].set_visible(False) 2915 ax2.spines['right'].set_visible(False) 2916 ax2.spines['top'].set_visible(False) 2917 ax2.spines['bottom'].set_visible(False) 2918 2919 ax1.axis([None, None, ymin, ymax]) 2920 2921 if not os.path.exists(dir): 2922 os.makedirs(dir) 2923 if 
filename is None: 2924 return fig 2925 elif filename == '': 2926 filename = f'D{self._4x}_residuals.pdf' 2927 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 2928 ppl.close(fig)
Plot residuals of each analysis as a function of time (actually, as a function of the order of analyses in the `D4xdata` object)
- `kde`: whether to add a kernel density estimate of residuals
- `hist`: whether to add a histogram of residuals (incompatible with `kde`)
- `binwidth`: the bin width of the histogram, in units of the Δ4x repeatability
- `dir`: the directory in which to save the plot
- `filename`: the file name to save to; if `None` (default), return the figure without saving it; if `''`, save as `D4x_residuals.pdf`
- `highlight`: a list of samples to highlight
- `colors`: a dict of `{<sample>: (r, g, b)}` for all samples
- `figsize`: (width, height) of figure
- `dpi`: resolution for PNG output
- `yspan`: factor controlling the range of y values shown in plot (by default: `yspan = 1.5 if kde else 1.0`)
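For example, to save a residual plot with a kernel density estimate as `output/residuals.pdf` (a sketch):

```py
mydata.plot_residuals(kde = True, filename = 'residuals.pdf')
```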
2931 def simulate(self, *args, **kwargs): 2932 ''' 2933 Legacy function with warning message pointing to `virtual_data()` 2934 ''' 2935 raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
Legacy function with warning message pointing to virtual_data()
2937 def plot_anchor_residuals( 2938 self, 2939 dir = 'output', 2940 filename = '', 2941 figsize = None, 2942 subplots_adjust = (0.05, 0.1, 0.95, 0.98, .25, .25), 2943 dpi = 100, 2944 colors = None, 2945 ): 2946 ''' 2947 Plot a summary of the residuals for all anchors, intended to help detect systematic bias. 2948 2949 **Parameters** 2950 2951 + `dir`: the directory in which to save the plot 2952 + `filename`: the file name to save to. 2953 + `dpi`: resolution for PNG output 2954 + `figsize`: (width, height) of figure 2955 + `subplots_adjust`: passed to the figure 2956 + `dpi`: resolution for PNG output 2957 + `colors`: a dict of `{<sample>: (r, g, b)}` for all samples 2958 ''' 2959 2960 # Colors 2961 N = len(self.anchors) 2962 if colors is None: 2963 if N == 3: 2964 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])} 2965 elif N == 4: 2966 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2967 else: 2968 colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)} 2969 2970 if figsize is None: 2971 figsize = (4, 1.5*N+1) 2972 fig = ppl.figure(figsize = figsize) 2973 ppl.subplots_adjust(*subplots_adjust) 2974 axs = {} 2975 X = np.array([r[f'D{self._4x}_residual'] for a in self.anchors for r in self.anchors[a]['data']])*1000 2976 sigma = self.repeatability['r_D47a'] * 1000 2977 D = max(np.abs(X)) 2978 2979 for k,a in enumerate(self.anchors): 2980 color = colors[a] 2981 axs[a] = ppl.subplot(N, 1, 1+k) 2982 axs[a].text( 2983 0.02, 1-0.05, a, 2984 va = 'top', 2985 ha = 'left', 2986 weight = 'bold', 2987 size = 9, 2988 color = [_*0.75 for _ in color], 2989 transform = axs[a].transAxes, 2990 ) 2991 X = np.array([r[f'D{self._4x}_residual'] for r in self.anchors[a]['data']])*1000 2992 axs[a].axvline(0, lw = 0.5, color = color) 2993 axs[a].plot(X, X*0, 'o', mew = 0.7, mec = (*color,.5), mfc = (*color, 0), ms = 7, clip_on = False) 2994 2995 xi = np.linspace(-3*D, 3*D, 601) 2996 yi = np.array([np.exp(-0.5 * ((xi - x)/sigma)**2) for x in X]).sum(0) 2997 ppl.fill_between(xi, yi, yi*0, fc = (*color, .15), lw = 1, ec = color) 2998 2999 axs[a].errorbar( 3000 X.mean(), yi.max()*.2, None, 1.96*sigma/len(X)**0.5, 3001 ecolor = color, 3002 marker = 's', 3003 ls = 'None', 3004 mec = color, 3005 mew = 1, 3006 mfc = 'w', 3007 ms = 8, 3008 elinewidth = 1, 3009 capsize = 4, 3010 capthick = 1, 3011 ) 3012 3013 axs[a].axis([xi[0], xi[-1], 0, yi.max()*1.05]) 3014 ppl.yticks([]) 3015 3016 ppl.xlabel(f'$Δ_{{{self._4x}}}$ residuals (ppm)') 3017 3018 if not os.path.exists(dir): 3019 os.makedirs(dir) 3020 if filename is None: 3021 return fig 3022 elif filename == '': 3023 filename = f'D{self._4x}_anchor_residuals.pdf' 3024 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 3025 ppl.close(fig)
Plot a summary of the residuals for all anchors, intended to help detect systematic bias.
Parameters
- `dir`: the directory in which to save the plot
- `filename`: the file name to save to; if `''` (default), save as `D4x_anchor_residuals.pdf`; if `None`, return the figure without saving it
- `figsize`: (width, height) of figure
- `subplots_adjust`: passed to the figure
- `dpi`: resolution for PNG output
- `colors`: a dict of `{<sample>: (r, g, b)}` for all samples
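A sketch, saving the plot under the default name `D47_anchor_residuals.pdf` in the `output` directory:

```py
mydata.plot_anchor_residuals()
```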
3028 def plot_distribution_of_analyses( 3029 self, 3030 dir = 'output', 3031 filename = None, 3032 vs_time = False, 3033 figsize = (6,4), 3034 subplots_adjust = (0.02, 0.13, 0.85, 0.8), 3035 output = None, 3036 dpi = 100, 3037 ): 3038 ''' 3039 Plot temporal distribution of all analyses in the data set. 3040 3041 **Parameters** 3042 3043 + `dir`: the directory in which to save the plot 3044 + `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially. 3045 + `dpi`: resolution for PNG output 3046 + `figsize`: (width, height) of figure 3047 + `dpi`: resolution for PNG output 3048 ''' 3049 3050 asamples = [s for s in self.anchors] 3051 usamples = [s for s in self.unknowns] 3052 if output is None or output == 'fig': 3053 fig = ppl.figure(figsize = figsize) 3054 ppl.subplots_adjust(*subplots_adjust) 3055 Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)]) 3056 Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)]) 3057 Xmax += (Xmax-Xmin)/40 3058 Xmin -= (Xmax-Xmin)/41 3059 for k, s in enumerate(asamples + usamples): 3060 if vs_time: 3061 X = [r['TimeTag'] for r in self if r['Sample'] == s] 3062 else: 3063 X = [x for x,r in enumerate(self) if r['Sample'] == s] 3064 Y = [-k for x in X] 3065 ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75) 3066 ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25) 3067 ppl.text(Xmax, -k, f' {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r') 3068 ppl.axis([Xmin, Xmax, -k-1, 1]) 3069 ppl.xlabel('\ntime') 3070 ppl.gca().annotate('', 3071 xy = (0.6, -0.02), 3072 xycoords = 'axes fraction', 3073 xytext = (.4, -0.02), 3074 arrowprops = dict(arrowstyle = "->", color = 'k'), 3075 ) 3076 3077 3078 x2 = -1 3079 for session in self.sessions: 3080 x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session]) 3081 if vs_time: 3082 ppl.axvline(x1, color = 'k', lw = .75) 3083 if x2 > -1: 3084 if not vs_time: 3085 ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5) 3086 x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session]) 3087# from xlrd import xldate_as_datetime 3088# print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0)) 3089 if vs_time: 3090 ppl.axvline(x2, color = 'k', lw = .75) 3091 ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15) 3092 ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8) 3093 3094 ppl.xticks([]) 3095 ppl.yticks([]) 3096 3097 if output is None: 3098 if not os.path.exists(dir): 3099 os.makedirs(dir) 3100 if filename == None: 3101 filename = f'D{self._4x}_distribution_of_analyses.pdf' 3102 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 3103 ppl.close(fig) 3104 elif output == 'ax': 3105 return ppl.gca() 3106 elif output == 'fig': 3107 return fig
Plot temporal distribution of all analyses in the data set.
Parameters
- `dir`: the directory in which to save the plot
- `filename`: the file name to save to (by default: `D4x_distribution_of_analyses.pdf`)
- `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially
- `figsize`: (width, height) of figure
- `subplots_adjust`: passed to the figure
- `output`: if `'ax'`, return the plot axes; if `'fig'`, return the figure; otherwise save the plot to `dir/filename`
- `dpi`: resolution for PNG output
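A sketch, assuming each analysis carries a `TimeTag` field (see `assign_timestamps()` among the inherited members):

```py
mydata.plot_distribution_of_analyses(vs_time = True)
```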
3110 def plot_bulk_compositions( 3111 self, 3112 samples = None, 3113 dir = 'output/bulk_compositions', 3114 figsize = (6,6), 3115 subplots_adjust = (0.15, 0.12, 0.95, 0.92), 3116 show = False, 3117 sample_color = (0,.5,1), 3118 analysis_color = (.7,.7,.7), 3119 labeldist = 0.3, 3120 radius = 0.05, 3121 ): 3122 ''' 3123 Plot δ13C_VBDP vs δ18O_VSMOW (of CO2) for all analyses. 3124 3125 By default, creates a directory `./output/bulk_compositions` where plots for 3126 each sample are saved. Another plot named `__all__.pdf` shows all analyses together. 3127 3128 3129 **Parameters** 3130 3131 + `samples`: Only these samples are processed (by default: all samples). 3132 + `dir`: where to save the plots 3133 + `figsize`: (width, height) of figure 3134 + `subplots_adjust`: passed to `subplots_adjust()` 3135 + `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples, 3136 allowing for interactive visualization/exploration in (δ13C, δ18O) space. 3137 + `sample_color`: color used for replicate markers/labels 3138 + `analysis_color`: color used for sample markers/labels 3139 + `labeldist`: distance (in inches) from replicate markers to replicate labels 3140 + `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`. 3141 ''' 3142 3143 from matplotlib.patches import Ellipse 3144 3145 if samples is None: 3146 samples = [_ for _ in self.samples] 3147 3148 saved = {} 3149 3150 for s in samples: 3151 3152 fig = ppl.figure(figsize = figsize) 3153 fig.subplots_adjust(*subplots_adjust) 3154 ax = ppl.subplot(111) 3155 ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)') 3156 ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)') 3157 ppl.title(s) 3158 3159 3160 XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']]) 3161 UID = [_['UID'] for _ in self.samples[s]['data']] 3162 XY0 = XY.mean(0) 3163 3164 for xy in XY: 3165 ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color) 3166 3167 ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color) 3168 ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color) 3169 ppl.text(*XY0, f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold') 3170 saved[s] = [XY, XY0] 3171 3172 x1, x2, y1, y2 = ppl.axis() 3173 x0, dx = (x1+x2)/2, (x2-x1)/2 3174 y0, dy = (y1+y2)/2, (y2-y1)/2 3175 dx, dy = [max(max(dx, dy), radius)]*2 3176 3177 ppl.axis([ 3178 x0 - 1.2*dx, 3179 x0 + 1.2*dx, 3180 y0 - 1.2*dy, 3181 y0 + 1.2*dy, 3182 ]) 3183 3184 XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0)) 3185 3186 for xy, uid in zip(XY, UID): 3187 3188 xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy)) 3189 vector_in_display_space = xy_in_display_space - XY0_in_display_space 3190 3191 if (vector_in_display_space**2).sum() > 0: 3192 3193 unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5 3194 label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist 3195 label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space 3196 label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space)) 3197 3198 ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color) 3199 3200 else: 3201 3202 ppl.text(*xy, f'{uid} ', va = 'center', ha = 'right', color = analysis_color) 3203 3204 if radius: 3205 ax.add_artist(Ellipse( 3206 xy = XY0, 3207 width = radius*2, 3208 height = 
radius*2, 3209 ls = (0, (2,2)), 3210 lw = .7, 3211 ec = analysis_color, 3212 fc = 'None', 3213 )) 3214 ppl.text( 3215 XY0[0], 3216 XY0[1]-radius, 3217 f'\n± {radius*1e3:.0f} ppm', 3218 color = analysis_color, 3219 va = 'top', 3220 ha = 'center', 3221 linespacing = 0.4, 3222 size = 8, 3223 ) 3224 3225 if not os.path.exists(dir): 3226 os.makedirs(dir) 3227 fig.savefig(f'{dir}/{s}.pdf') 3228 ppl.close(fig) 3229 3230 fig = ppl.figure(figsize = figsize) 3231 fig.subplots_adjust(*subplots_adjust) 3232 ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)') 3233 ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)') 3234 3235 for s in saved: 3236 for xy in saved[s][0]: 3237 ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color) 3238 ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color) 3239 ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color) 3240 ppl.text(*saved[s][1], f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold') 3241 3242 x1, x2, y1, y2 = ppl.axis() 3243 ppl.axis([ 3244 x1 - (x2-x1)/10, 3245 x2 + (x2-x1)/10, 3246 y1 - (y2-y1)/10, 3247 y2 + (y2-y1)/10, 3248 ]) 3249 3250 3251 if not os.path.exists(dir): 3252 os.makedirs(dir) 3253 fig.savefig(f'{dir}/__all__.pdf') 3254 if show: 3255 ppl.show() 3256 ppl.close(fig)
Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.
By default, creates a directory `./output/bulk_compositions` where plots for each sample are saved. Another plot named `__all__.pdf` shows all analyses together.
Parameters
- `samples`: only these samples are processed (by default: all samples)
- `dir`: where to save the plots
- `figsize`: (width, height) of figure
- `subplots_adjust`: passed to `subplots_adjust()`
- `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples, allowing for interactive visualization/exploration in (δ13C, δ18O) space
- `sample_color`: color used for sample markers/labels
- `analysis_color`: color used for replicate (analysis) markers/labels
- `labeldist`: distance (in inches) from replicate markers to replicate labels
- `radius`: radius of the dashed circle providing scale (no circle if `radius = 0`)
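A sketch, restricting the plots to the two hypothetical unknowns and opening an interactive window for the combined plot:

```py
mydata.plot_bulk_compositions(samples = ['MYSAMPLE-1', 'MYSAMPLE-2'], show = True)
```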
Inherited Members
- builtins.list
- clear
- copy
- append
- insert
- extend
- pop
- remove
- index
- count
- reverse
- sort
3302class D47data(D4xdata): 3303 ''' 3304 Store and process data for a large set of Δ47 analyses, 3305 usually comprising more than one analytical session. 3306 ''' 3307 3308 Nominal_D4x = { 3309 'ETH-1': 0.2052, 3310 'ETH-2': 0.2085, 3311 'ETH-3': 0.6132, 3312 'ETH-4': 0.4511, 3313 'IAEA-C1': 0.3018, 3314 'IAEA-C2': 0.6409, 3315 'MERCK': 0.5135, 3316 } # I-CDES (Bernasconi et al., 2021) 3317 ''' 3318 Nominal Δ47 values assigned to the Δ47 anchor samples, used by 3319 `D47data.standardize()` to normalize unknown samples to an absolute Δ47 3320 reference frame. 3321 3322 By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)): 3323 ```py 3324 { 3325 'ETH-1' : 0.2052, 3326 'ETH-2' : 0.2085, 3327 'ETH-3' : 0.6132, 3328 'ETH-4' : 0.4511, 3329 'IAEA-C1' : 0.3018, 3330 'IAEA-C2' : 0.6409, 3331 'MERCK' : 0.5135, 3332 } 3333 ``` 3334 ''' 3335 3336 3337 @property 3338 def Nominal_D47(self): 3339 return self.Nominal_D4x 3340 3341 3342 @Nominal_D47.setter 3343 def Nominal_D47(self, new): 3344 self.Nominal_D4x = dict(**new) 3345 self.refresh() 3346 3347 3348 def __init__(self, l = [], **kwargs): 3349 ''' 3350 **Parameters:** same as `D4xdata.__init__()` 3351 ''' 3352 D4xdata.__init__(self, l = l, mass = '47', **kwargs) 3353 3354 3355 def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'): 3356 ''' 3357 Find all samples for which `Teq` is specified, compute equilibrium Δ47 3358 value for that temperature, and add treat these samples as additional anchors. 3359 3360 **Parameters** 3361 3362 + `fCo2eqD47`: Which CO2 equilibrium law to use 3363 (`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127); 3364 `wang`: [Wang et al. (2019)](https://doi.org/10.1016/j.gca.2004.05.039)). 3365 + `priority`: if `replace`: forget old anchors and only use the new ones; 3366 if `new`: keep pre-existing anchors but update them in case of conflict 3367 between old and new Δ47 values; 3368 if `old`: keep pre-existing anchors but preserve their original Δ47 3369 values in case of conflict. 3370 ''' 3371 f = { 3372 'petersen': fCO2eqD47_Petersen, 3373 'wang': fCO2eqD47_Wang, 3374 }[fCo2eqD47] 3375 foo = {} 3376 for r in self: 3377 if 'Teq' in r: 3378 if r['Sample'] in foo: 3379 assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.' 3380 else: 3381 foo[r['Sample']] = f(r['Teq']) 3382 else: 3383 assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.' 3384 3385 if priority == 'replace': 3386 self.Nominal_D47 = {} 3387 for s in foo: 3388 if priority != 'old' or s not in self.Nominal_D47: 3389 self.Nominal_D47[s] = foo[s] 3390 3391 def save_D47_correl(self, *args, **kwargs): 3392 return self._save_D4x_correl(*args, **kwargs) 3393 3394 save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47')
Store and process data for a large set of Δ47 analyses, usually comprising more than one analytical session.
3348 def __init__(self, l = [], **kwargs): 3349 ''' 3350 **Parameters:** same as `D4xdata.__init__()` 3351 ''' 3352 D4xdata.__init__(self, l = l, mass = '47', **kwargs)
Parameters: same as D4xdata.__init__()
Nominal Δ47 values assigned to the Δ47 anchor samples, used by
D47data.standardize() to normalize unknown samples to an absolute Δ47
reference frame.
By default equal to (after Bernasconi et al. (2021)):
{
'ETH-1' : 0.2052,
'ETH-2' : 0.2085,
'ETH-3' : 0.6132,
'ETH-4' : 0.4511,
'IAEA-C1' : 0.3018,
'IAEA-C2' : 0.6409,
'MERCK' : 0.5135,
}
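Because `Nominal_D47` is a settable property, the anchor set may be customized before standardization; a sketch restricting it to the three ETH anchors:

```py
import D47crunch

mydata = D47crunch.D47data()
mydata.Nominal_D47 = {
    'ETH-1': 0.2052,
    'ETH-2': 0.2085,
    'ETH-3': 0.6132,
}
```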
3355 def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'): 3356 ''' 3357 Find all samples for which `Teq` is specified, compute equilibrium Δ47 3358 value for that temperature, and add treat these samples as additional anchors. 3359 3360 **Parameters** 3361 3362 + `fCo2eqD47`: Which CO2 equilibrium law to use 3363 (`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127); 3364 `wang`: [Wang et al. (2019)](https://doi.org/10.1016/j.gca.2004.05.039)). 3365 + `priority`: if `replace`: forget old anchors and only use the new ones; 3366 if `new`: keep pre-existing anchors but update them in case of conflict 3367 between old and new Δ47 values; 3368 if `old`: keep pre-existing anchors but preserve their original Δ47 3369 values in case of conflict. 3370 ''' 3371 f = { 3372 'petersen': fCO2eqD47_Petersen, 3373 'wang': fCO2eqD47_Wang, 3374 }[fCo2eqD47] 3375 foo = {} 3376 for r in self: 3377 if 'Teq' in r: 3378 if r['Sample'] in foo: 3379 assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.' 3380 else: 3381 foo[r['Sample']] = f(r['Teq']) 3382 else: 3383 assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.' 3384 3385 if priority == 'replace': 3386 self.Nominal_D47 = {} 3387 for s in foo: 3388 if priority != 'old' or s not in self.Nominal_D47: 3389 self.Nominal_D47[s] = foo[s]
Find all samples for which `Teq` is specified, compute the equilibrium Δ47 value for that temperature, and treat these samples as additional anchors.
Parameters
- `fCo2eqD47`: which CO2 equilibrium law to use (`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127); `wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039))
- `priority`: if `replace`, forget old anchors and only use the new ones; if `new`, keep pre-existing anchors but update them in case of conflict between old and new Δ47 values; if `old`, keep pre-existing anchors but preserve their original Δ47 values in case of conflict
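A sketch, assuming the analyses of a hypothetical equilibrated-gas sample `'EQ-25C'` should be anchored to its 25 °C equilibrium value (the units of `Teq` must match those expected by the chosen equilibrium law):

```py
for r in mydata:
    if r['Sample'] == 'EQ-25C':
        r['Teq'] = 25.
mydata.D47fromTeq(fCo2eqD47 = 'petersen', priority = 'new')
```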
Save D47 values along with their SE and correlation matrix.
Parameters
- `samples`: only these samples are output (by default: all samples)
- `dir`: the directory in which to save the file (by default: `output`)
- `filename`: the name of the csv file to write to (by default: `D47_correl.csv`)
- `D47_precision`: the precision to use when writing `D47` and `D47_SE` values (by default: 4)
- `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
- `save_to_file`: whether to write the output to a file (by default: `True`); if `False`, return the output as a string
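A sketch, writing the default `D47_correl.csv` file to the `output` directory:

```py
mydata.save_D47_correl(dir = 'output', filename = 'D47_correl.csv')
```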
Inherited Members
- D4xdata
- R13_VPDB
- R18_VSMOW
- LAMBDA_17
- R17_VSMOW
- R18_VPDB
- R17_VPDB
- LEVENE_REF_SAMPLE
- ALPHA_18O_ACID_REACTION
- Nominal_d13C_VPDB
- Nominal_d18O_VPDB
- d13C_STANDARDIZATION_METHOD
- d18O_STANDARDIZATION_METHOD
- verbose
- prefix
- logfile
- Nf
- repeatability
- make_verbal
- msg
- vmsg
- log
- refresh
- refresh_sessions
- refresh_samples
- read
- input
- wg
- compute_bulk_delta
- crunch
- fill_in_missing_info
- standardize_d13C
- standardize_d18O
- compute_bulk_and_clumping_deltas
- compute_isobar_ratios
- split_samples
- unsplit_samples
- assign_timestamps
- report
- combine_samples
- standardize
- standardization_error
- summary
- table_of_sessions
- table_of_analyses
- covar_table
- table_of_samples
- plot_sessions
- consolidate_samples
- consolidate_sessions
- repeatabilities
- consolidate
- rmswd
- compute_r
- sample_average
- sample_D4x_covar
- sample_D4x_correl
- plot_single_session
- plot_residuals
- simulate
- plot_anchor_residuals
- plot_distribution_of_analyses
- plot_bulk_compositions
- builtins.list
- clear
- copy
- append
- insert
- extend
- pop
- remove
- index
- count
- reverse
- sort
3397class D48data(D4xdata): 3398 ''' 3399 Store and process data for a large set of Δ48 analyses, 3400 usually comprising more than one analytical session. 3401 ''' 3402 3403 Nominal_D4x = { 3404 'ETH-1': 0.138, 3405 'ETH-2': 0.138, 3406 'ETH-3': 0.270, 3407 'ETH-4': 0.223, 3408 'GU-1': -0.419, 3409 } # (Fiebig et al., 2019, 2021) 3410 ''' 3411 Nominal Δ48 values assigned to the Δ48 anchor samples, used by 3412 `D48data.standardize()` to normalize unknown samples to an absolute Δ48 3413 reference frame. 3414 3415 By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019), 3416 [Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)): 3417 3418 ```py 3419 { 3420 'ETH-1' : 0.138, 3421 'ETH-2' : 0.138, 3422 'ETH-3' : 0.270, 3423 'ETH-4' : 0.223, 3424 'GU-1' : -0.419, 3425 } 3426 ``` 3427 ''' 3428 3429 3430 @property 3431 def Nominal_D48(self): 3432 return self.Nominal_D4x 3433 3434 3435 @Nominal_D48.setter 3436 def Nominal_D48(self, new): 3437 self.Nominal_D4x = dict(**new) 3438 self.refresh() 3439 3440 3441 def __init__(self, l = [], **kwargs): 3442 ''' 3443 **Parameters:** same as `D4xdata.__init__()` 3444 ''' 3445 D4xdata.__init__(self, l = l, mass = '48', **kwargs) 3446 3447 def save_D48_correl(self, *args, **kwargs): 3448 return self._save_D4x_correl(*args, **kwargs) 3449 3450 save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48')
Store and process data for a large set of Δ48 analyses, usually comprising more than one analytical session.
3441 def __init__(self, l = [], **kwargs): 3442 ''' 3443 **Parameters:** same as `D4xdata.__init__()` 3444 ''' 3445 D4xdata.__init__(self, l = l, mass = '48', **kwargs)
Parameters: same as D4xdata.__init__()
Nominal Δ48 values assigned to the Δ48 anchor samples, used by
D48data.standardize() to normalize unknown samples to an absolute Δ48
reference frame.
By default equal to (after Fiebig et al. (2019), Fiebig et al. (2021)):
{
'ETH-1' : 0.138,
'ETH-2' : 0.138,
'ETH-3' : 0.270,
'ETH-4' : 0.223,
'GU-1' : -0.419,
}
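Δ48 analyses are processed exactly like Δ47 analyses; a sketch, assuming a raw data file with usable `d48` values:

```py
import D47crunch

mydata48 = D47crunch.D48data()
mydata48.read('rawdata.csv')
mydata48.wg()
mydata48.crunch()
mydata48.standardize()
```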
Save D48 values along with their SE and correlation matrix.
Parameters
- `samples`: only these samples are output (by default: all samples)
- `dir`: the directory in which to save the file (by default: `output`)
- `filename`: the name of the csv file to write to (by default: `D48_correl.csv`)
- `D48_precision`: the precision to use when writing `D48` and `D48_SE` values (by default: 4)
- `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
- `save_to_file`: whether to write the output to a file (by default: `True`); if `False`, return the output as a string
Inherited Members
- D4xdata and builtins.list: same members as listed above for D47data.
3453class D49data(D4xdata): 3454 ''' 3455 Store and process data for a large set of Δ49 analyses, 3456 usually comprising more than one analytical session. 3457 ''' 3458 3459 Nominal_D4x = {"1000C": 0.0, "25C": 2.228} # Wang 2004 3460 ''' 3461 Nominal Δ49 values assigned to the Δ49 anchor samples, used by 3462 `D49data.standardize()` to normalize unknown samples to an absolute Δ49 3463 reference frame. 3464 3465 By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)): 3466 3467 ```py 3468 { 3469 "1000C": 0.0, 3470 "25C": 2.228 3471 } 3472 ``` 3473 ''' 3474 3475 @property 3476 def Nominal_D49(self): 3477 return self.Nominal_D4x 3478 3479 @Nominal_D49.setter 3480 def Nominal_D49(self, new): 3481 self.Nominal_D4x = dict(**new) 3482 self.refresh() 3483 3484 def __init__(self, l=[], **kwargs): 3485 ''' 3486 **Parameters:** same as `D4xdata.__init__()` 3487 ''' 3488 D4xdata.__init__(self, l=l, mass='49', **kwargs) 3489 3490 def save_D49_correl(self, *args, **kwargs): 3491 return self._save_D4x_correl(*args, **kwargs) 3492 3493 save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49')
Store and process data for a large set of Δ49 analyses, usually comprising more than one analytical session.
3484 def __init__(self, l=[], **kwargs): 3485 ''' 3486 **Parameters:** same as `D4xdata.__init__()` 3487 ''' 3488 D4xdata.__init__(self, l=l, mass='49', **kwargs)
Parameters: same as D4xdata.__init__()
Nominal Δ49 values assigned to the Δ49 anchor samples, used by
D49data.standardize() to normalize unknown samples to an absolute Δ49
reference frame.
By default equal to (after Wang et al. (2004)):
{
"1000C": 0.0,
"25C": 2.228
}
Save D49 values along with their SE and correlation matrix.
Parameters
- `samples`: only these samples are output (by default: all samples)
- `dir`: the directory in which to save the file (by default: `output`)
- `filename`: the name of the csv file to write to (by default: `D49_correl.csv`)
- `D49_precision`: the precision to use when writing `D49` and `D49_SE` values (by default: 4)
- `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
- `save_to_file`: whether to write the output to a file (by default: `True`); if `False`, return the output as a string
Inherited Members
- D4xdata and builtins.list: same members as listed above for D47data.