D47crunch
Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements
Process and standardize carbonate and/or CO2 clumped-isotope analyses, from low-level data out of a dual-inlet mass spectrometer to final, “absolute” Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates (Daëron, 2021).
The **tutorial** section takes you through a series of simple steps to import/process data and print out the results. The **how-to** section provides instructions applicable to various specific tasks.
1. Tutorial
1.1 Installation
The easy option is to use `pip`; open a shell terminal and simply type:

```sh
python -m pip install D47crunch
```
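After installing, you can check that the package imports correctly by printing its version string (the module defines `__version__` at the top level):

```py
import D47crunch

# quick sanity check after installation:
print(D47crunch.__version__)
```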
For those wishing to experiment with the bleeding-edge development version, this can be done through the following steps:

- Download the `dev` branch source code here and rename it to `D47crunch.py`.
- Do any of the following:
  - copy `D47crunch.py` to somewhere in your Python path
  - copy `D47crunch.py` to a working directory (`import D47crunch` will only work if called within that directory)
  - copy `D47crunch.py` to any other location (e.g., `/foo/bar`) and then use the following code snippet in your own code to import `D47crunch`:

```py
import sys
sys.path.append('/foo/bar')
import D47crunch
```
Documentation for the development version can be downloaded here (save the HTML file and open it locally).
1.2 Usage
Start by creating a file named `rawdata.csv` with the following contents:

```csv
UID, Sample, d45, d46, d47, d48, d49
A01, ETH-1, 5.79502, 11.62767, 16.89351, 24.56708, 0.79486
A02, MYSAMPLE-1, 6.21907, 11.49107, 17.27749, 24.58270, 1.56318
A03, ETH-2, -6.05868, -4.81718, -11.63506, -10.32578, 0.61352
A04, MYSAMPLE-2, -3.86184, 4.94184, 0.60612, 10.52732, 0.57118
A05, ETH-3, 5.54365, 12.05228, 17.40555, 25.96919, 0.74608
A06, ETH-2, -6.06706, -4.87710, -11.69927, -10.64421, 1.61234
A07, ETH-1, 5.78821, 11.55910, 16.80191, 24.56423, 1.47963
A08, MYSAMPLE-2, -3.87692, 4.86889, 0.52185, 10.40390, 1.07032
```
Then instantiate a `D47data` object which will store and process this data:

```py
import D47crunch
mydata = D47crunch.D47data()
```
For now, this object is empty:

```py
>>> print(mydata)
[]
```
To load the analyses saved in `rawdata.csv` into our `D47data` object and process the data:

```py
mydata.read('rawdata.csv')

# compute δ13C, δ18O of working gas:
mydata.wg()

# compute δ13C, δ18O, raw Δ47 values for each analysis:
mydata.crunch()

# compute absolute Δ47 values for each analysis
# as well as average Δ47 values for each sample:
mydata.standardize()
```
We can now print a summary of the data processing:
```py
>>> mydata.summary(verbose = True, save_to_file = False)
[summary]
––––––––––––––––––––––––––––––– –––––––––
N samples (anchors + unknowns)  5 (3 + 2)
N analyses (anchors + unknowns) 8 (5 + 3)
Repeatability of δ13C_VPDB        4.2 ppm
Repeatability of δ18O_VSMOW      47.5 ppm
Repeatability of Δ47 (anchors)   13.4 ppm
Repeatability of Δ47 (unknowns)   2.5 ppm
Repeatability of Δ47 (all)        9.6 ppm
Model degrees of freedom                3
Student's 95% t-factor               3.18
Standardization method             pooled
––––––––––––––––––––––––––––––– –––––––––
```
This tells us that our data set contains 5 different samples: 3 anchors (ETH-1, ETH-2, ETH-3) and 2 unknowns (MYSAMPLE-1, MYSAMPLE-2). The total number of analyses is 8, with 5 anchor analyses and 3 unknown analyses. We get an estimate of the analytical repeatability (i.e. the overall, pooled standard deviation) for δ13C, δ18O and Δ47, as well as the number of degrees of freedom (here, 3) that these estimated standard deviations are based on, along with the corresponding Student's t-factor (here, 3.18) for 95 % confidence limits. Finally, the summary indicates that we used a “pooled” standardization approach (see [Daëron, 2021]).
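These repeatability estimates can also be accessed programmatically: after calling `standardize()`, they are stored in the `repeatability` dictionary attribute. A minimal sketch (the key name `r_D47` is an assumption here; inspect the dictionary to see which keys your version provides):

```py
# pooled repeatabilities are stored in a dict after standardization;
# inspect the available keys before relying on them:
print(mydata.repeatability)

# assuming a key named 'r_D47' holds the pooled Δ47 repeatability (in permil):
print(f"{1000 * mydata.repeatability['r_D47']:.1f} ppm")
```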
To see the actual results:
```py
>>> mydata.table_of_samples(verbose = True, save_to_file = False)
[table_of_samples]
–––––––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
Sample     N d13C_VPDB d18O_VSMOW    D47     SE   95% CL     SD p_Levene
–––––––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
ETH-1      2      2.01      37.01 0.2052                 0.0131
ETH-2      2    -10.17      19.88 0.2085                 0.0026
ETH-3      1      1.73      37.49 0.6132
MYSAMPLE-1 1      2.48      36.90 0.2996 0.0091 ± 0.0291
MYSAMPLE-2 2     -8.17      30.05 0.6600 0.0115 ± 0.0366 0.0025
–––––––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
```
This table lists, for each sample, the number of analytical replicates, average δ13C and δ18O values (for the analyte CO2, not for the carbonate itself), the average Δ47 value and the SD of Δ47 for all replicates of this sample. For unknown samples, the SE and 95 % confidence limits for the mean Δ47 are also listed. These 95 % CL take into account the number of degrees of freedom of the regression model, so that in large datasets the 95 % CL will tend to 1.96 times the SE, but in this case the applicable t-factor is much larger.
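Instead of printing to the console, the same table may be saved to a csv file or returned as a list of lists of strings for further processing (see the `save_to_file`, `dir`, `filename`, and `output` parameters). A short sketch:

```py
# write the table to 'output/samples.csv' instead of printing it:
mydata.table_of_samples(save_to_file = True, dir = 'output', filename = 'samples.csv', print_out = False)

# or get it back as a list of lists of strings (first row = header):
table = mydata.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
print(table[0]) # column headers
```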
We can also generate a table of all analyses in the data set (again, note that `d18O_VSMOW` is the composition of the CO2 analyte):

```py
>>> mydata.table_of_analyses(verbose = True, save_to_file = False)
[table_of_analyses]
––– ––––––––– –––––––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––– –––––––––– –––––––––– ––––––––– ––––––––– –––––––––– ––––––––
UID Session   Sample     d13Cwg_VPDB d18Owg_VSMOW       d45       d46        d47        d48      d49  d13C_VPDB d18O_VSMOW    D47raw    D48raw     D49raw      D47
––– ––––––––– –––––––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––– –––––––––– –––––––––– ––––––––– ––––––––– –––––––––– ––––––––
A01 mySession ETH-1           -3.807       24.921  5.795020 11.627670  16.893510  24.567080 0.794860   2.014086  37.041843 -0.574686  1.149684 -27.690250 0.214454
A02 mySession MYSAMPLE-1      -3.807       24.921  6.219070 11.491070  17.277490  24.582700 1.563180   2.476827  36.898281 -0.499264  1.435380 -27.122614 0.299589
A03 mySession ETH-2           -3.807       24.921 -6.058680 -4.817180 -11.635060 -10.325780 0.613520 -10.166796  19.907706 -0.685979 -0.721617  16.716901 0.206693
A04 mySession MYSAMPLE-2      -3.807       24.921 -3.861840  4.941840   0.606120  10.527320 0.571180  -8.159927  30.087230 -0.248531  0.613099  -4.979413 0.658270
A05 mySession ETH-3           -3.807       24.921  5.543650 12.052280  17.405550  25.969190 0.746080   1.727029  37.485567 -0.226150  1.678699 -28.280301 0.613200
A06 mySession ETH-2           -3.807       24.921 -6.067060 -4.877100 -11.699270 -10.644210 1.612340 -10.173599  19.845192 -0.683054 -0.922832  17.861363 0.210328
A07 mySession ETH-1           -3.807       24.921  5.788210 11.559100  16.801910  24.564230 1.479630   2.009281  36.970298 -0.591129  1.282632 -26.888335 0.195926
A08 mySession MYSAMPLE-2      -3.807       24.921 -3.876920  4.868890   0.521850  10.403900 1.070320  -8.173486  30.011134 -0.245768  0.636159  -4.324964 0.661803
––– ––––––––– –––––––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––– –––––––––– –––––––––– ––––––––– ––––––––– –––––––––– ––––––––
```
2. How-to
2.1 Simulate a virtual data set to play with
It is sometimes convenient to quickly build a virtual data set of analyses, for instance to assess the final analytical precision achievable for a given combination of anchor and unknown analyses (see also Fig. 6 of Daëron, 2021).
This can be achieved with `virtual_data()`. The example below creates a dataset with four sessions, each of which comprises three analyses of anchor ETH-1, three of ETH-2, three of ETH-3, and three analyses each of two unknown samples named `FOO` and `BAR` with arbitrarily defined isotopic compositions. Analytical repeatabilities for Δ47 and Δ48 are also specified arbitrarily. See the `virtual_data()` documentation for additional configuration parameters.
```py
from D47crunch import virtual_data, D47data

args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 3,
            d13C_VPDB = -15., d18O_VPDB = -2.,
            D47 = 0.6, D48 = 0.2),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)

D = D47data(session1 + session2 + session3 + session4)

D.crunch()
D.standardize()

D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)
```
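Because a non-zero `seed` makes the simulation repeatable, two identical calls return identical analyses; a quick sketch to verify this:

```py
from D47crunch import virtual_data

# with the same non-zero seed, virtual_data() is fully repeatable:
a = virtual_data(session = 'S1', samples = [dict(Sample = 'ETH-1', N = 2)], seed = 42)
b = virtual_data(session = 'S1', samples = [dict(Sample = 'ETH-1', N = 2)], seed = 42)
assert a == b # same simulated analyses, in the same (shuffled) order
```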
2.2 Control data quality
`D47crunch` offers several tools to visualize processed data. The examples below use the same virtual data set, generated with:
```py
from D47crunch import *
from random import shuffle

# generate virtual data:
args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 8),
        dict(Sample = 'ETH-2', N = 8),
        dict(Sample = 'ETH-3', N = 8),
        dict(Sample = 'FOO', N = 4,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 4,
            d13C_VPDB = -15., d18O_VPDB = -15.,
            D47 = 0.5, D48 = 0.2),
        ])

sessions = [
    virtual_data(session = f'Session_{k+1:02.0f}', seed = 123456+k, **args)
    for k in range(10)]

# shuffle the data:
data = [r for s in sessions for r in s]
shuffle(data)
data = sorted(data, key = lambda r: r['Session'])

# create D47data instance:
data47 = D47data(data)

# process D47data instance:
data47.crunch()
data47.standardize()
```
2.2.1 Plotting the distribution of analyses through time
```py
data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf')
```
The plot above shows the succession of analyses as if they were all distributed at regular time intervals. See `D4xdata.plot_distribution_of_analyses()` for how to plot analyses as a function of “true” time (based on the `TimeTag` for each analysis).
2.2.2 Generating session plots
```py
data47.plot_sessions()
```
Below is one of the resulting session plots. Each cross marker is an analysis. Anchors are shown in red and unknowns in blue. Short horizontal lines show the nominal Δ47 value for anchors (in red) or the average Δ47 value for unknowns (in blue; overall average for all sessions). Curved grey contours correspond to Δ47 standardization errors in this session.
2.2.3 Plotting Δ47 or Δ48 residuals
```py
data47.plot_residuals(filename = 'residuals.pdf', kde = True)
```
Again, note that this plot only shows the succession of analyses as if they were all distributed at regular time intervals.
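The plotted residuals can also be computed by hand. A hedged sketch, assuming that after standardization each analysis record carries a final `D47` value and that anchor residuals are taken relative to the nominal values in `Nominal_D47`:

```py
# Δ47 anchor residuals relative to nominal values, in ppm:
residuals = [
    1000 * (r['D47'] - data47.Nominal_D47[r['Sample']])
    for r in data47
    if r['Sample'] in data47.Nominal_D47]

print(f'RMS anchor residual: {(sum(x**2 for x in residuals)/len(residuals))**0.5:.1f} ppm')
```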
2.2.4 Checking δ13C and δ18O dispersion
```py
mydata = D47data(virtual_data(
    session = 'mysession',
    samples = [
        dict(Sample = 'ETH-1', N = 4),
        dict(Sample = 'ETH-2', N = 4),
        dict(Sample = 'ETH-3', N = 4),
        dict(Sample = 'MYSAMPLE', N = 8, D47 = 0.6, D48 = 0.1, d13C_VPDB = -4.0, d18O_VPDB = -12.0),
        ], seed = 123))

mydata.refresh()
mydata.wg()
mydata.crunch()
mydata.plot_bulk_compositions()
```
`D4xdata.plot_bulk_compositions()` produces a series of plots, one for each sample, and an additional plot with all samples together. For example, here is the plot for sample `MYSAMPLE`:
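As a numeric complement to these plots, the dispersion of bulk compositions may be quantified directly from the analysis records; a sketch assuming that each record carries `d13C_VPDB` and `d18O_VSMOW` values after crunching:

```py
from statistics import stdev

# dispersion of bulk compositions for MYSAMPLE, after crunch():
d13C = [r['d13C_VPDB'] for r in mydata if r['Sample'] == 'MYSAMPLE']
d18O = [r['d18O_VSMOW'] for r in mydata if r['Sample'] == 'MYSAMPLE']
print(f'SD of d13C_VPDB: {stdev(d13C):.4f} permil')
print(f'SD of d18O_VSMOW: {stdev(d18O):.4f} permil')
```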
2.3 Use a different set of anchors, change anchor nominal values, and/or change oxygen-17 correction parameters
Nominal values for various carbonate standards are defined in four places:
- `D4xdata.Nominal_d13C_VPDB`
- `D4xdata.Nominal_d18O_VPDB`
- `D47data.Nominal_D4x` (also accessible through `D47data.Nominal_D47`)
- `D48data.Nominal_D4x` (also accessible through `D48data.Nominal_D48`)
17O correction parameters are defined by:
- `D4xdata.R13_VPDB`
- `D4xdata.R18_VSMOW`
- `D4xdata.R18_VPDB`
- `D4xdata.LAMBDA_17`
- `D4xdata.R17_VSMOW`
- `D4xdata.R17_VPDB`
When creating a new instance of `D47data` or `D48data`, the current values of these variables are copied as properties of the new object. Applying custom values for, e.g., `R17_VSMOW` and `Nominal_D47` can thus be done in several ways:
Option 1: by redefining `D4xdata.R17_VSMOW` and `D47data.Nominal_D47` _before_ creating a `D47data` object:
```py
from D47crunch import D4xdata, D47data

# redefine R17_VSMOW:
D4xdata.R17_VSMOW = 0.00037 # new value

# redefine R17_VPDB for consistency:
D4xdata.R17_VPDB = D4xdata.R17_VSMOW * (D4xdata.R18_VPDB/D4xdata.R18_VSMOW) ** D4xdata.LAMBDA_17

# edit Nominal_D47 to only include ETH-1/2/3:
D47data.Nominal_D4x = {
    a: D47data.Nominal_D4x[a]
    for a in ['ETH-1', 'ETH-2', 'ETH-3']
    }

# redefine ETH-3:
D47data.Nominal_D4x['ETH-3'] = 0.600

# only now create D47data object:
mydata = D47data()

# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)
# NB: mydata.Nominal_D47 is just an alias for mydata.Nominal_D4x

# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}
```
Option 2: by redefining `R17_VSMOW` and `Nominal_D47` _after_ creating a `D47data` object:
```py
from D47crunch import D47data

# first create D47data object:
mydata = D47data()

# redefine R17_VSMOW:
mydata.R17_VSMOW = 0.00037 # new value

# redefine R17_VPDB for consistency:
mydata.R17_VPDB = mydata.R17_VSMOW * (mydata.R18_VPDB/mydata.R18_VSMOW) ** mydata.LAMBDA_17

# edit Nominal_D47 to only include ETH-1/2/3:
mydata.Nominal_D47 = {
    a: mydata.Nominal_D47[a]
    for a in ['ETH-1', 'ETH-2', 'ETH-3']
    }

# redefine ETH-3:
mydata.Nominal_D47['ETH-3'] = 0.600

# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)

# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}
```
The two options above are equivalent, but the latter provides a simple way to compare different data processing choices:
```py
from D47crunch import D47data

# create two D47data objects:
foo = D47data()
bar = D47data()

# modify foo in various ways:
foo.LAMBDA_17 = 0.52
foo.R17_VSMOW = 0.00037 # new value
foo.R17_VPDB = foo.R17_VSMOW * (foo.R18_VPDB/foo.R18_VSMOW) ** foo.LAMBDA_17
foo.Nominal_D47 = {
    'ETH-1': foo.Nominal_D47['ETH-1'],
    'ETH-2': foo.Nominal_D47['ETH-2'],
    'IAEA-C2': foo.Nominal_D47['IAEA-C2'],
    'INLAB_REF_MATERIAL': 0.666,
    }

# now import the same raw data into foo and bar:
foo.read('rawdata.csv')
foo.wg()          # compute δ13C, δ18O of working gas
foo.crunch()      # compute all δ13C, δ18O and raw Δ47 values
foo.standardize() # compute absolute Δ47 values

bar.read('rawdata.csv')
bar.wg()          # compute δ13C, δ18O of working gas
bar.crunch()      # compute all δ13C, δ18O and raw Δ47 values
bar.standardize() # compute absolute Δ47 values

# and compare the final results:
foo.table_of_samples(verbose = True, save_to_file = False)
bar.table_of_samples(verbose = True, save_to_file = False)
```
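To compare the two sets of results programmatically rather than by eye, the raw table output may be used. A sketch, assuming the sample name is in the first column and Δ47 in the fifth, as in the tables shown above:

```py
# compare final Δ47 values between the two processing choices:
t_foo = foo.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
t_bar = bar.table_of_samples(save_to_file = False, print_out = False, output = 'raw')

D47_bar = {row[0]: row[4] for row in t_bar[1:]}
for row in t_foo[1:]:
    print(f"{row[0]}: D47 = {row[4]} (foo) vs {D47_bar.get(row[0])} (bar)")
```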
2.4 Process paired Δ47 and Δ48 values
Purely in terms of data processing, it is not obvious why Δ47 and Δ48 data should not be handled separately. For now, `D47crunch` uses two independent classes — `D47data` and `D48data` — which crunch numbers and deal with standardization in very similar ways. The following example demonstrates how to print out combined outputs for `D47data` and `D48data`.
```py
from D47crunch import *

# generate virtual data:
args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args)
session2 = virtual_data(session = 'Session_02', **args)

# create D47data instance:
data47 = D47data(session1 + session2)

# process D47data instance:
data47.crunch()
data47.standardize()

# create D48data instance:
data48 = D48data(data47) # alternatively: data48 = D48data(session1 + session2)

# process D48data instance:
data48.crunch()
data48.standardize()

# output combined results:
table_of_sessions(data47, data48)
table_of_samples(data47, data48)
table_of_analyses(data47, data48)
```
Expected output:
```
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––––– –––––––––––––– –––––– ––––––––––––– ––––––––––––––– ––––––––––––––
Session    Na Nu d13Cwg_VPDB d18Owg_VSMOW r_d13C r_d18O  r_D47     a_47 ± SE 1e3 x b_47 ± SE      c_47 ± SE  r_D48     a_48 ± SE 1e3 x b_48 ± SE      c_48 ± SE
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––––– –––––––––––––– –––––– ––––––––––––– ––––––––––––––– ––––––––––––––
Session_01  9  3      -4.000       26.000 0.0000 0.0000 0.0098 1.021 ± 0.019  -0.398 ± 0.260 -0.903 ± 0.006 0.0486 0.540 ± 0.151   1.235 ± 0.607 -0.390 ± 0.025
Session_02  9  3      -4.000       26.000 0.0000 0.0000 0.0090 1.015 ± 0.019   0.376 ± 0.260 -0.905 ± 0.006 0.0186 1.350 ± 0.156  -0.871 ± 0.608 -0.504 ± 0.027
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––––– –––––––––––––– –––––– ––––––––––––– ––––––––––––––– ––––––––––––––

–––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– –––––––– –––––– –––––– –––––––– –––––– ––––––––
Sample N d13C_VPDB d18O_VSMOW    D47     SE   95% CL     SD p_Levene    D48     SE   95% CL     SD p_Levene
–––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– –––––––– –––––– –––––– –––––––– –––––– ––––––––
ETH-1  6      2.02      37.02 0.2052                 0.0078          0.1380                 0.0223
ETH-2  6    -10.17      19.88 0.2085                 0.0036          0.1380                 0.0482
ETH-3  6      1.71      37.45 0.6132                 0.0080          0.2700                 0.0176
FOO    6     -5.00      28.91 0.3026 0.0044 ± 0.0093 0.0121    0.164 0.1397 0.0121 ± 0.0255 0.0267    0.127
–––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– –––––––– –––––– –––––– –––––––– –––––– ––––––––

––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– –––––––– ––––––––
UID Session    Sample d13Cwg_VPDB d18Owg_VSMOW       d45       d46        d47        d48        d49  d13C_VPDB d18O_VSMOW    D47raw    D48raw    D49raw      D47      D48
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– –––––––– ––––––––
1   Session_01 ETH-1       -4.000       26.000  6.018962 10.747026  16.120787  21.286237  27.780042   2.020000  37.024281 -0.708176 -0.316435 -0.000013 0.197297 0.087763
2   Session_01 ETH-1       -4.000       26.000  6.018962 10.747026  16.132240  21.307795  27.780042   2.020000  37.024281 -0.696913 -0.295333 -0.000013 0.208328 0.126791
3   Session_01 ETH-1       -4.000       26.000  6.018962 10.747026  16.132438  21.313884  27.780042   2.020000  37.024281 -0.696718 -0.289374 -0.000013 0.208519 0.137813
4   Session_01 ETH-2       -4.000       26.000 -5.995859 -5.976076 -12.700300 -12.210735 -18.023381 -10.170000  19.875825 -0.683938 -0.297902 -0.000002 0.209785 0.198705
5   Session_01 ETH-2       -4.000       26.000 -5.995859 -5.976076 -12.707421 -12.270781 -18.023381 -10.170000  19.875825 -0.691145 -0.358673 -0.000002 0.202726 0.086308
6   Session_01 ETH-2       -4.000       26.000 -5.995859 -5.976076 -12.700061 -12.278310 -18.023381 -10.170000  19.875825 -0.683696 -0.366292 -0.000002 0.210022 0.072215
7   Session_01 ETH-3       -4.000       26.000  5.742374 11.161270  16.684379  22.225827  28.306614   1.710000  37.450394 -0.273094 -0.216392 -0.000014 0.623472 0.270873
8   Session_01 ETH-3       -4.000       26.000  5.742374 11.161270  16.660163  22.233729  28.306614   1.710000  37.450394 -0.296906 -0.208664 -0.000014 0.600150 0.285167
9   Session_01 ETH-3       -4.000       26.000  5.742374 11.161270  16.675191  22.215632  28.306614   1.710000  37.450394 -0.282128 -0.226363 -0.000014 0.614623 0.252432
10  Session_01 FOO         -4.000       26.000 -0.840413  2.828738   1.328380   5.374933   4.665655  -5.000000  28.907344 -0.582131 -0.288924 -0.000006 0.314928 0.175105
11  Session_01 FOO         -4.000       26.000 -0.840413  2.828738   1.302220   5.384454   4.665655  -5.000000  28.907344 -0.608241 -0.279457 -0.000006 0.289356 0.192614
12  Session_01 FOO         -4.000       26.000 -0.840413  2.828738   1.322530   5.372841   4.665655  -5.000000  28.907344 -0.587970 -0.291004 -0.000006 0.309209 0.171257
13  Session_02 ETH-1       -4.000       26.000  6.018962 10.747026  16.140853  21.267202  27.780042   2.020000  37.024281 -0.688442 -0.335067 -0.000013 0.207730 0.138730
14  Session_02 ETH-1       -4.000       26.000  6.018962 10.747026  16.127087  21.256983  27.780042   2.020000  37.024281 -0.701980 -0.345071 -0.000013 0.194396 0.131311
15  Session_02 ETH-1       -4.000       26.000  6.018962 10.747026  16.148253  21.287779  27.780042   2.020000  37.024281 -0.681165 -0.314926 -0.000013 0.214898 0.153668
16  Session_02 ETH-2       -4.000       26.000 -5.995859 -5.976076 -12.715859 -12.204791 -18.023381 -10.170000  19.875825 -0.699685 -0.291887 -0.000002 0.207349 0.149128
17  Session_02 ETH-2       -4.000       26.000 -5.995859 -5.976076 -12.709763 -12.188685 -18.023381 -10.170000  19.875825 -0.693516 -0.275587 -0.000002 0.213426 0.161217
18  Session_02 ETH-2       -4.000       26.000 -5.995859 -5.976076 -12.715427 -12.253049 -18.023381 -10.170000  19.875825 -0.699249 -0.340727 -0.000002 0.207780 0.112907
19  Session_02 ETH-3       -4.000       26.000  5.742374 11.161270  16.685994  22.249463  28.306614   1.710000  37.450394 -0.271506 -0.193275 -0.000014 0.618328 0.244431
20  Session_02 ETH-3       -4.000       26.000  5.742374 11.161270  16.681351  22.298166  28.306614   1.710000  37.450394 -0.276071 -0.145641 -0.000014 0.613831 0.279758
21  Session_02 ETH-3       -4.000       26.000  5.742374 11.161270  16.676169  22.306848  28.306614   1.710000  37.450394 -0.281167 -0.137150 -0.000014 0.608813 0.286056
22  Session_02 FOO         -4.000       26.000 -0.840413  2.828738   1.324359   5.339497   4.665655  -5.000000  28.907344 -0.586144 -0.324160 -0.000006 0.314015 0.136535
23  Session_02 FOO         -4.000       26.000 -0.840413  2.828738   1.297658   5.325854   4.665655  -5.000000  28.907344 -0.612794 -0.337727 -0.000006 0.287767 0.126473
24  Session_02 FOO         -4.000       26.000 -0.840413  2.828738   1.310185   5.339898   4.665655  -5.000000  28.907344 -0.600291 -0.323761 -0.000006 0.300082 0.136830
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– –––––––– ––––––––
```
3. Command-Line Interface (CLI)
Instead of writing Python code, you may directly use the CLI to process raw Δ47 and Δ48 data using reasonable defaults. The simplest way is to call:

```sh
D47crunch rawdata.csv
```
This will create a directory named `output` and populate it by calling the following methods:

- `D47data.wg()`
- `D47data.crunch()`
- `D47data.standardize()`
- `D47data.summary()`
- `D47data.table_of_samples()`
- `D47data.table_of_sessions()`
- `D47data.plot_sessions()`
- `D47data.plot_residuals()`
- `D47data.table_of_analyses()`
- `D47data.plot_distribution_of_analyses()`
- `D47data.plot_bulk_compositions()`
- `D47data.save_D47_correl()`
You may specify a custom set of anchors instead of the default ones using the `--anchors` or `-a` option:

```sh
D47crunch -a anchors.csv rawdata.csv
```
In this case, the `anchors.csv` file (you may use any other file name) must have the following format:

```csv
Sample, d13C_VPDB, d18O_VPDB, D47
ETH-1, 2.02, -2.19, 0.2052
ETH-2, -10.17, -18.69, 0.2085
ETH-3, 1.71, -1.78, 0.6132
ETH-4, , , 0.4511
```
The samples with non-empty `d13C_VPDB`, `d18O_VPDB`, and `D47` values are used to standardize δ13C, δ18O, and Δ47 values, respectively.
You may also provide a list of analyses and/or samples to exclude from the input. This is done with the `--exclude` or `-e` option:

```sh
D47crunch -e badbatch.csv rawdata.csv
```
In this case, the `badbatch.csv` file (again, you may use a different file name) must have the following format:

```csv
UID, Sample
A03
A09
B06
, MYBADSAMPLE-1
, MYBADSAMPLE-2
```
This will exclude (ignore) analyses with the UIDs `A03`, `A09`, and `B06`, as well as all analyses of samples `MYBADSAMPLE-1` and `MYBADSAMPLE-2`. It is possible to have an exclude file with only the `UID` column, or only the `Sample` column, or both, in any order.
The `--output-dir` or `-o` option may be used to specify a custom directory name for the output. For example, in unix-like shells the following command will create a time-stamped output directory:

```sh
D47crunch -o `date "+%Y-%m-%d-%Hh%M"` rawdata.csv
```
To process Δ48 as well as Δ47, just add the `--D48` option.
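These options can be combined. For example, the following invocation (with hypothetical file names) uses custom anchors, excludes a bad batch, writes to a custom output directory, and processes Δ48 as well as Δ47:

```sh
D47crunch --D48 -a anchors.csv -e badbatch.csv -o myoutput rawdata.csv
```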
API Documentation
1''' 2Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements 3 4Process and standardize carbonate and/or CO2 clumped-isotope analyses, 5from low-level data out of a dual-inlet mass spectrometer to final, “absolute” 6Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates 7([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). 8 9The **tutorial** section takes you through a series of simple steps to import/process data and print out the results. 10The **how-to** section provides instructions applicable to various specific tasks. 11 12.. include:: ../../docpages/tutorial.md 13.. include:: ../../docpages/howto.md 14.. include:: ../../docpages/cli.md 15 16<h1>API Documentation</h1> 17''' 18 19__docformat__ = "restructuredtext" 20__author__ = 'Mathieu Daëron' 21__contact__ = 'daeron@lsce.ipsl.fr' 22__copyright__ = 'Copyright (c) Mathieu Daëron' 23__license__ = 'MIT License - https://opensource.org/licenses/MIT' 24__date__ = '2025-09-04' 25__version__ = '2.4.3' 26 27import os 28import numpy as np 29import typer 30from typing_extensions import Annotated 31from statistics import stdev 32from scipy.stats import t as tstudent 33from scipy.stats import levene 34from scipy.interpolate import interp1d 35from numpy import linalg 36from lmfit import Minimizer, Parameters, report_fit 37from matplotlib import pyplot as ppl 38from datetime import datetime as dt 39from functools import wraps 40from colorsys import hls_to_rgb 41from matplotlib import rcParams 42from typer import rich_utils 43 44rich_utils.STYLE_HELPTEXT = '' 45 46rcParams['font.family'] = 'sans-serif' 47rcParams['font.sans-serif'] = 'Helvetica' 48rcParams['font.size'] = 10 49rcParams['mathtext.fontset'] = 'custom' 50rcParams['mathtext.rm'] = 'sans' 51rcParams['mathtext.bf'] = 'sans:bold' 52rcParams['mathtext.it'] = 'sans:italic' 53rcParams['mathtext.cal'] = 'sans:italic' 54rcParams['mathtext.default'] = 'rm' 55rcParams['xtick.major.size'] = 4 56rcParams['xtick.major.width'] = 1 57rcParams['ytick.major.size'] = 4 58rcParams['ytick.major.width'] = 1 59rcParams['axes.grid'] = False 60rcParams['axes.linewidth'] = 1 61rcParams['grid.linewidth'] = .75 62rcParams['grid.linestyle'] = '-' 63rcParams['grid.alpha'] = .15 64rcParams['savefig.dpi'] = 150 65 66Petersen_etal_CO2eqD47 = np.array([[-12, 1.147113572], [-11, 1.139961218], [-10, 1.132872856], [-9, 1.125847677], [-8, 1.118884889], [-7, 1.111983708], [-6, 1.105143366], [-5, 1.098363105], [-4, 1.091642182], [-3, 1.084979862], [-2, 1.078375423], [-1, 1.071828156], [0, 1.065337360], [1, 1.058902349], [2, 1.052522443], [3, 1.046196976], [4, 1.039925291], [5, 1.033706741], [6, 1.027540690], [7, 1.021426510], [8, 1.015363585], [9, 1.009351306], [10, 1.003389075], [11, 0.997476303], [12, 0.991612409], [13, 0.985796821], [14, 0.980028975], [15, 0.974308318], [16, 0.968634304], [17, 0.963006392], [18, 0.957424055], [19, 0.951886769], [20, 0.946394020], [21, 0.940945302], [22, 0.935540114], [23, 0.930177964], [24, 0.924858369], [25, 0.919580851], [26, 0.914344938], [27, 0.909150167], [28, 0.903996080], [29, 0.898882228], [30, 0.893808167], [31, 0.888773459], [32, 0.883777672], [33, 0.878820382], [34, 0.873901170], [35, 0.869019623], [36, 0.864175334], [37, 0.859367901], [38, 0.854596929], [39, 0.849862028], [40, 0.845162813], [41, 0.840498905], [42, 0.835869931], [43, 0.831275522], [44, 0.826715314], [45, 0.822188950], [46, 0.817696075], [47, 0.813236341], [48, 0.808809404], [49, 0.804414926], [50, 0.800052572], [51, 0.795722012], [52, 0.791422922], 
[53, 0.787154979], [54, 0.782917869], [55, 0.778711277], [56, 0.774534898], [57, 0.770388426], [58, 0.766271562], [59, 0.762184010], [60, 0.758125479], [61, 0.754095680], [62, 0.750094329], [63, 0.746121147], [64, 0.742175856], [65, 0.738258184], [66, 0.734367860], [67, 0.730504620], [68, 0.726668201], [69, 0.722858343], [70, 0.719074792], [71, 0.715317295], [72, 0.711585602], [73, 0.707879469], [74, 0.704198652], [75, 0.700542912], [76, 0.696912012], [77, 0.693305719], [78, 0.689723802], [79, 0.686166034], [80, 0.682632189], [81, 0.679122047], [82, 0.675635387], [83, 0.672171994], [84, 0.668731654], [85, 0.665314156], [86, 0.661919291], [87, 0.658546854], [88, 0.655196641], [89, 0.651868451], [90, 0.648562087], [91, 0.645277352], [92, 0.642014054], [93, 0.638771999], [94, 0.635551001], [95, 0.632350872], [96, 0.629171428], [97, 0.626012487], [98, 0.622873870], [99, 0.619755397], [100, 0.616656895], [102, 0.610519107], [104, 0.604459143], [106, 0.598475670], [108, 0.592567388], [110, 0.586733026], [112, 0.580971342], [114, 0.575281125], [116, 0.569661187], [118, 0.564110371], [120, 0.558627545], [122, 0.553211600], [124, 0.547861454], [126, 0.542576048], [128, 0.537354347], [130, 0.532195337], [132, 0.527098028], [134, 0.522061450], [136, 0.517084654], [138, 0.512166711], [140, 0.507306712], [142, 0.502503768], [144, 0.497757006], [146, 0.493065573], [148, 0.488428634], [150, 0.483845370], [152, 0.479314980], [154, 0.474836677], [156, 0.470409692], [158, 0.466033271], [160, 0.461706674], [162, 0.457429176], [164, 0.453200067], [166, 0.449018650], [168, 0.444884242], [170, 0.440796174], [172, 0.436753787], [174, 0.432756438], [176, 0.428803494], [178, 0.424894334], [180, 0.421028350], [182, 0.417204944], [184, 0.413423530], [186, 0.409683531], [188, 0.405984383], [190, 0.402325531], [192, 0.398706429], [194, 0.395126543], [196, 0.391585347], [198, 0.388082324], [200, 0.384616967], [202, 0.381188778], [204, 0.377797268], [206, 0.374441954], [208, 0.371122364], [210, 0.367838033], [212, 0.364588505], [214, 0.361373329], [216, 0.358192065], [218, 0.355044277], [220, 0.351929540], [222, 0.348847432], [224, 0.345797540], [226, 0.342779460], [228, 0.339792789], [230, 0.336837136], [232, 0.333912113], [234, 0.331017339], [236, 0.328152439], [238, 0.325317046], [240, 0.322510795], [242, 0.319733329], [244, 0.316984297], [246, 0.314263352], [248, 0.311570153], [250, 0.308904364], [252, 0.306265654], [254, 0.303653699], [256, 0.301068176], [258, 0.298508771], [260, 0.295975171], [262, 0.293467070], [264, 0.290984167], [266, 0.288526163], [268, 0.286092765], [270, 0.283683684], [272, 0.281298636], [274, 0.278937339], [276, 0.276599517], [278, 0.274284898], [280, 0.271993211], [282, 0.269724193], [284, 0.267477582], [286, 0.265253121], [288, 0.263050554], [290, 0.260869633], [292, 0.258710110], [294, 0.256571741], [296, 0.254454286], [298, 0.252357508], [300, 0.250281174], [302, 0.248225053], [304, 0.246188917], [306, 0.244172542], [308, 0.242175707], [310, 0.240198194], [312, 0.238239786], [314, 0.236300272], [316, 0.234379441], [318, 0.232477087], [320, 0.230593005], [322, 0.228726993], [324, 0.226878853], [326, 0.225048388], [328, 0.223235405], [330, 0.221439711], [332, 0.219661118], [334, 0.217899439], [336, 0.216154491], [338, 0.214426091], [340, 0.212714060], [342, 0.211018220], [344, 0.209338398], [346, 0.207674420], [348, 0.206026115], [350, 0.204393315], [355, 0.200378063], [360, 0.196456139], [365, 0.192625077], [370, 0.188882487], [375, 0.185226048], [380, 0.181653511], [385, 0.178162694], 
[390, 0.174751478], [395, 0.171417807], [400, 0.168159686], [405, 0.164975177], [410, 0.161862398], [415, 0.158819521], [420, 0.155844772], [425, 0.152936426], [430, 0.150092806], [435, 0.147312286], [440, 0.144593281], [445, 0.141934254], [450, 0.139333710], [455, 0.136790195], [460, 0.134302294], [465, 0.131868634], [470, 0.129487876], [475, 0.127158722], [480, 0.124879906], [485, 0.122650197], [490, 0.120468398], [495, 0.118333345], [500, 0.116243903], [505, 0.114198970], [510, 0.112197471], [515, 0.110238362], [520, 0.108320625], [525, 0.106443271], [530, 0.104605335], [535, 0.102805877], [540, 0.101043985], [545, 0.099318768], [550, 0.097629359], [555, 0.095974915], [560, 0.094354612], [565, 0.092767650], [570, 0.091213248], [575, 0.089690648], [580, 0.088199108], [585, 0.086737906], [590, 0.085306341], [595, 0.083903726], [600, 0.082529395], [605, 0.081182697], [610, 0.079862998], [615, 0.078569680], [620, 0.077302141], [625, 0.076059794], [630, 0.074842066], [635, 0.073648400], [640, 0.072478251], [645, 0.071331090], [650, 0.070206399], [655, 0.069103674], [660, 0.068022424], [665, 0.066962168], [670, 0.065922439], [675, 0.064902780], [680, 0.063902748], [685, 0.062921909], [690, 0.061959837], [695, 0.061016122], [700, 0.060090360], [705, 0.059182157], [710, 0.058291131], [715, 0.057416907], [720, 0.056559120], [725, 0.055717414], [730, 0.054891440], [735, 0.054080860], [740, 0.053285343], [745, 0.052504565], [750, 0.051738210], [755, 0.050985971], [760, 0.050247546], [765, 0.049522643], [770, 0.048810974], [775, 0.048112260], [780, 0.047426227], [785, 0.046752609], [790, 0.046091145], [795, 0.045441581], [800, 0.044803668], [805, 0.044177164], [810, 0.043561831], [815, 0.042957438], [820, 0.042363759], [825, 0.041780573], [830, 0.041207664], [835, 0.040644822], [840, 0.040091839], [845, 0.039548516], [850, 0.039014654], [855, 0.038490063], [860, 0.037974554], [865, 0.037467944], [870, 0.036970054], [875, 0.036480707], [880, 0.035999734], [885, 0.035526965], [890, 0.035062238], [895, 0.034605393], [900, 0.034156272], [905, 0.033714724], [910, 0.033280598], [915, 0.032853749], [920, 0.032434032], [925, 0.032021309], [930, 0.031615443], [935, 0.031216300], [940, 0.030823749], [945, 0.030437663], [950, 0.030057915], [955, 0.029684385], [960, 0.029316951], [965, 0.028955498], [970, 0.028599910], [975, 0.028250075], [980, 0.027905884], [985, 0.027567229], [990, 0.027234006], [995, 0.026906112], [1000, 0.026583445], [1005, 0.026265908], [1010, 0.025953405], [1015, 0.025645841], [1020, 0.025343124], [1025, 0.025045163], [1030, 0.024751871], [1035, 0.024463160], [1040, 0.024178947], [1045, 0.023899147], [1050, 0.023623680], [1055, 0.023352467], [1060, 0.023085429], [1065, 0.022822491], [1070, 0.022563577], [1075, 0.022308615], [1080, 0.022057533], [1085, 0.021810260], [1090, 0.021566729], [1095, 0.021326872], [1100, 0.021090622]]) 67_fCO2eqD47_Petersen = interp1d(Petersen_etal_CO2eqD47[:,0], Petersen_etal_CO2eqD47[:,1]) 68def fCO2eqD47_Petersen(T): 69 ''' 70 CO2 equilibrium Δ47 value as a function of T (in degrees C) 71 according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127). 
72 73 ''' 74 return float(_fCO2eqD47_Petersen(T)) 75 76 77Wang_etal_CO2eqD47 = np.array([[-83., 1.8954], [-73., 1.7530], [-63., 1.6261], [-53., 1.5126], [-43., 1.4104], [-33., 1.3182], [-23., 1.2345], [-13., 1.1584], [-3., 1.0888], [7., 1.0251], [17., 0.9665], [27., 0.9125], [37., 0.8626], [47., 0.8164], [57., 0.7734], [67., 0.7334], [87., 0.6612], [97., 0.6286], [107., 0.5980], [117., 0.5693], [127., 0.5423], [137., 0.5169], [147., 0.4930], [157., 0.4704], [167., 0.4491], [177., 0.4289], [187., 0.4098], [197., 0.3918], [207., 0.3747], [217., 0.3585], [227., 0.3431], [237., 0.3285], [247., 0.3147], [257., 0.3015], [267., 0.2890], [277., 0.2771], [287., 0.2657], [297., 0.2550], [307., 0.2447], [317., 0.2349], [327., 0.2256], [337., 0.2167], [347., 0.2083], [357., 0.2002], [367., 0.1925], [377., 0.1851], [387., 0.1781], [397., 0.1714], [407., 0.1650], [417., 0.1589], [427., 0.1530], [437., 0.1474], [447., 0.1421], [457., 0.1370], [467., 0.1321], [477., 0.1274], [487., 0.1229], [497., 0.1186], [507., 0.1145], [517., 0.1105], [527., 0.1068], [537., 0.1031], [547., 0.0997], [557., 0.0963], [567., 0.0931], [577., 0.0901], [587., 0.0871], [597., 0.0843], [607., 0.0816], [617., 0.0790], [627., 0.0765], [637., 0.0741], [647., 0.0718], [657., 0.0695], [667., 0.0674], [677., 0.0654], [687., 0.0634], [697., 0.0615], [707., 0.0597], [717., 0.0579], [727., 0.0562], [737., 0.0546], [747., 0.0530], [757., 0.0515], [767., 0.0500], [777., 0.0486], [787., 0.0472], [797., 0.0459], [807., 0.0447], [817., 0.0435], [827., 0.0423], [837., 0.0411], [847., 0.0400], [857., 0.0390], [867., 0.0380], [877., 0.0370], [887., 0.0360], [897., 0.0351], [907., 0.0342], [917., 0.0333], [927., 0.0325], [937., 0.0317], [947., 0.0309], [957., 0.0302], [967., 0.0294], [977., 0.0287], [987., 0.0281], [997., 0.0274], [1007., 0.0268], [1017., 0.0261], [1027., 0.0255], [1037., 0.0249], [1047., 0.0244], [1057., 0.0238], [1067., 0.0233], [1077., 0.0228], [1087., 0.0223], [1097., 0.0218]]) 78_fCO2eqD47_Wang = interp1d(Wang_etal_CO2eqD47[:,0] - 0.15, Wang_etal_CO2eqD47[:,1]) 79def fCO2eqD47_Wang(T): 80 ''' 81 CO2 equilibrium Δ47 value as a function of `T` (in degrees C) 82 according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039) 83 (supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)). 84 ''' 85 return float(_fCO2eqD47_Wang(T)) 86 87 88def correlated_sum(X, C, w = None): 89 ''' 90 Compute covariance-aware linear combinations 91 92 **Parameters** 93 94 + `X`: list or 1-D array of values to sum 95 + `C`: covariance matrix for the elements of `X` 96 + `w`: list or 1-D array of weights to apply to the elements of `X` 97 (all equal to 1 by default) 98 99 Return the sum (and its SE) of the elements of `X`, with optional weights equal 100 to the elements of `w`, accounting for covariances between the elements of `X`. 
101 ''' 102 if w is None: 103 w = [1 for x in X] 104 return np.dot(w,X), (np.dot(w,np.dot(C,w)))**.5 105 106 107def make_csv(x, hsep = ',', vsep = '\n'): 108 ''' 109 Formats a list of lists of strings as a CSV 110 111 **Parameters** 112 113 + `x`: the list of lists of strings to format 114 + `hsep`: the field separator (`,` by default) 115 + `vsep`: the line-ending convention to use (`\\n` by default) 116 117 **Example** 118 119 ```py 120 print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']])) 121 ``` 122 123 outputs: 124 125 ```py 126 a,b,c 127 d,e,f 128 ``` 129 ''' 130 return vsep.join([hsep.join(l) for l in x]) 131 132 133def pf(txt): 134 ''' 135 Modify string `txt` to follow `lmfit.Parameter()` naming rules. 136 ''' 137 return txt.replace('-','_').replace('.','_').replace(' ','_') 138 139 140def smart_type(x): 141 ''' 142 Tries to convert string `x` to a float if it includes a decimal point, or 143 to an integer if it does not. If both attempts fail, return the original 144 string unchanged. 145 ''' 146 try: 147 y = float(x) 148 except ValueError: 149 return x 150 if '.' not in x: 151 return int(y) 152 return y 153 154class _Defaults(): 155 def __init__(self): 156 pass 157 158D47crunch_defaults = _Defaults() 159D47crunch_defaults.PRETTY_TABLE_VSEP = '—' 160 161def pretty_table(x, header = 1, hsep = ' ', vsep = None, align = '<'): 162 ''' 163 Reads a list of lists of strings and outputs an ascii table 164 165 **Parameters** 166 167 + `x`: a list of lists of strings 168 + `header`: the number of lines to treat as header lines 169 + `hsep`: the horizontal separator between columns 170 + `vsep`: the character to use as vertical separator 171 + `align`: string of left (`<`) or right (`>`) alignment characters. 172 173 **Example** 174 175 ```py 176 print(pretty_table([ 177 ['A', 'B', 'C'], 178 ['1', '1.9999', 'foo'], 179 ['10', 'x', 'bar'], 180 ])) 181 ``` 182 yields: 183 ``` 184 —— —————— ——— 185 A B C 186 —— —————— ——— 187 1 1.9999 foo 188 10 x bar 189 —— —————— ——— 190 ``` 191 192 To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`: 193 194 ```py 195 D47crunch_defaults.PRETTY_TABLE_VSEP = '=' 196 print(pretty_table([ 197 ['A', 'B', 'C'], 198 ['1', '1.9999', 'foo'], 199 ['10', 'x', 'bar'], 200 ])) 201 ``` 202 yields: 203 ``` 204 == ====== === 205 A B C 206 == ====== === 207 1 1.9999 foo 208 10 x bar 209 == ====== === 210 ``` 211 ''' 212 213 if vsep is None: 214 vsep = D47crunch_defaults.PRETTY_TABLE_VSEP 215 216 txt = [] 217 widths = [np.max([len(e) for e in c]) for c in zip(*x)] 218 219 if len(widths) > len(align): 220 align += '>' * (len(widths)-len(align)) 221 sepline = hsep.join([vsep*w for w in widths]) 222 txt += [sepline] 223 for k,l in enumerate(x): 224 if k and k == header: 225 txt += [sepline] 226 txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])] 227 txt += [sepline] 228 txt += [''] 229 return '\n'.join(txt) 230 231 232def transpose_table(x): 233 ''' 234 Transpose a list if lists 235 236 **Parameters** 237 238 + `x`: a list of lists 239 240 **Example** 241 242 ```py 243 x = [[1, 2], [3, 4]] 244 print(transpose_table(x)) # yields: [[1, 3], [2, 4]] 245 ``` 246 ''' 247 return [[e for e in c] for c in zip(*x)] 248 249 250def w_avg(X, sX) : 251 ''' 252 Compute variance-weighted average 253 254 Returns the value and SE of the weighted average of the elements of `X`, 255 with relative weights equal to their inverse variances (`1/sX**2`). 
256 257 **Parameters** 258 259 + `X`: array-like of elements to average 260 + `sX`: array-like of the corresponding SE values 261 262 **Tip** 263 264 If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets, 265 they may be rearranged using `zip()`: 266 267 ```python 268 foo = [(0, 1), (1, 0.5), (2, 0.5)] 269 print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333) 270 ``` 271 ''' 272 X = [ x for x in X ] 273 sX = [ sx for sx in sX ] 274 W = [ sx**-2 for sx in sX ] 275 W = [ w/sum(W) for w in W ] 276 Xavg = sum([ w*x for w,x in zip(W,X) ]) 277 sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5 278 return Xavg, sXavg 279 280 281def read_csv(filename, sep = ''): 282 ''' 283 Read contents of `filename` in csv format and return a list of dictionaries. 284 285 In the csv string, spaces before and after field separators (`','` by default) 286 are optional. 287 288 **Parameters** 289 290 + `filename`: the csv file to read 291 + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, 292 whichever appers most often in the contents of `filename`. 293 ''' 294 with open(filename) as fid: 295 txt = fid.read() 296 297 if sep == '': 298 sep = sorted(',;\t', key = lambda x: - txt.count(x))[0] 299 txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()] 300 return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]] 301 302 303def simulate_single_analysis( 304 sample = 'MYSAMPLE', 305 d13Cwg_VPDB = -4., d18Owg_VSMOW = 26., 306 d13C_VPDB = None, d18O_VPDB = None, 307 D47 = None, D48 = None, D49 = 0., D17O = 0., 308 a47 = 1., b47 = 0., c47 = -0.9, 309 a48 = 1., b48 = 0., c48 = -0.45, 310 Nominal_D47 = None, 311 Nominal_D48 = None, 312 Nominal_d13C_VPDB = None, 313 Nominal_d18O_VPDB = None, 314 ALPHA_18O_ACID_REACTION = None, 315 R13_VPDB = None, 316 R17_VSMOW = None, 317 R18_VSMOW = None, 318 LAMBDA_17 = None, 319 R18_VPDB = None, 320 ): 321 ''' 322 Compute working-gas delta values for a single analysis, assuming a stochastic working 323 gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values). 324 325 **Parameters** 326 327 + `sample`: sample name 328 + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas 329 (respectively –4 and +26 ‰ by default) 330 + `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample 331 + `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies 332 of the carbonate sample 333 + `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and 334 Δ48 values if `D47` or `D48` are not specified 335 + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and 336 δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified 337 + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor 338 + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 339 correction parameters (by default equal to the `D4xdata` default values) 340 341 Returns a dictionary with fields 342 `['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`. 
343 ''' 344 345 if Nominal_d13C_VPDB is None: 346 Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB 347 348 if Nominal_d18O_VPDB is None: 349 Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB 350 351 if ALPHA_18O_ACID_REACTION is None: 352 ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION 353 354 if R13_VPDB is None: 355 R13_VPDB = D4xdata().R13_VPDB 356 357 if R17_VSMOW is None: 358 R17_VSMOW = D4xdata().R17_VSMOW 359 360 if R18_VSMOW is None: 361 R18_VSMOW = D4xdata().R18_VSMOW 362 363 if LAMBDA_17 is None: 364 LAMBDA_17 = D4xdata().LAMBDA_17 365 366 if R18_VPDB is None: 367 R18_VPDB = D4xdata().R18_VPDB 368 369 R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17 370 371 if Nominal_D47 is None: 372 Nominal_D47 = D47data().Nominal_D47 373 374 if Nominal_D48 is None: 375 Nominal_D48 = D48data().Nominal_D48 376 377 if d13C_VPDB is None: 378 if sample in Nominal_d13C_VPDB: 379 d13C_VPDB = Nominal_d13C_VPDB[sample] 380 else: 381 raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.") 382 383 if d18O_VPDB is None: 384 if sample in Nominal_d18O_VPDB: 385 d18O_VPDB = Nominal_d18O_VPDB[sample] 386 else: 387 raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.") 388 389 if D47 is None: 390 if sample in Nominal_D47: 391 D47 = Nominal_D47[sample] 392 else: 393 raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.") 394 395 if D48 is None: 396 if sample in Nominal_D48: 397 D48 = Nominal_D48[sample] 398 else: 399 raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.") 400 401 X = D4xdata() 402 X.R13_VPDB = R13_VPDB 403 X.R17_VSMOW = R17_VSMOW 404 X.R18_VSMOW = R18_VSMOW 405 X.LAMBDA_17 = LAMBDA_17 406 X.R18_VPDB = R18_VPDB 407 X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17 408 409 R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios( 410 R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000), 411 R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000), 412 ) 413 R45, R46, R47, R48, R49 = X.compute_isobar_ratios( 414 R13 = R13_VPDB * (1 + d13C_VPDB/1000), 415 R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION, 416 D17O=D17O, D47=D47, D48=D48, D49=D49, 417 ) 418 R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios( 419 R13 = R13_VPDB * (1 + d13C_VPDB/1000), 420 R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION, 421 D17O=D17O, 422 ) 423 424 d45 = 1000 * (R45/R45wg - 1) 425 d46 = 1000 * (R46/R46wg - 1) 426 d47 = 1000 * (R47/R47wg - 1) 427 d48 = 1000 * (R48/R48wg - 1) 428 d49 = 1000 * (R49/R49wg - 1) 429 430 for k in range(3): # dumb iteration to adjust for small changes in d47 431 R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch 432 R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch 433 d47 = 1000 * (R47raw/R47wg - 1) 434 d48 = 1000 * (R48raw/R48wg - 1) 435 436 return dict( 437 Sample = sample, 438 D17O = D17O, 439 d13Cwg_VPDB = d13Cwg_VPDB, 440 d18Owg_VSMOW = d18Owg_VSMOW, 441 d45 = d45, 442 d46 = d46, 443 d47 = d47, 444 d48 = d48, 445 d49 = d49, 446 ) 447 448 449def virtual_data( 450 samples = [], 451 a47 = 1., b47 = 0., c47 = -0.9, 452 a48 = 1., b48 = 0., c48 = -0.45, 453 rd45 = 0.020, rd46 = 0.060, 454 rD47 = 0.015, rD48 = 0.045, 455 d13Cwg_VPDB = None, d18Owg_VSMOW = None, 456 session = None, 457 Nominal_D47 = None, Nominal_D48 = None, 458 Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None, 459 ALPHA_18O_ACID_REACTION = None, 460 R13_VPDB = None, 461 
R17_VSMOW = None, 462 R18_VSMOW = None, 463 LAMBDA_17 = None, 464 R18_VPDB = None, 465 seed = 0, 466 shuffle = True, 467 ): 468 ''' 469 Return list with simulated analyses from a single session. 470 471 **Parameters** 472 473 + `samples`: a list of entries; each entry is a dictionary with the following fields: 474 * `Sample`: the name of the sample 475 * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample 476 * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample 477 * `N`: how many analyses to generate for this sample 478 + `a47`: scrambling factor for Δ47 479 + `b47`: compositional nonlinearity for Δ47 480 + `c47`: working gas offset for Δ47 481 + `a48`: scrambling factor for Δ48 482 + `b48`: compositional nonlinearity for Δ48 483 + `c48`: working gas offset for Δ48 484 + `rd45`: analytical repeatability of δ45 485 + `rd46`: analytical repeatability of δ46 486 + `rD47`: analytical repeatability of Δ47 487 + `rD48`: analytical repeatability of Δ48 488 + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas 489 (by default equal to the `simulate_single_analysis` default values) 490 + `session`: name of the session (no name by default) 491 + `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and Δ48 values 492 if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults) 493 + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and 494 δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified 495 (by default equal to the `simulate_single_analysis` defaults) 496 + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor 497 (by default equal to the `simulate_single_analysis` defaults) 498 + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 499 correction parameters (by default equal to the `simulate_single_analysis` default) 500 + `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations 501 + `shuffle`: randomly reorder the sequence of analyses 502 503 504 Here is an example of using this method to generate an arbitrary combination of 505 anchors and unknowns for a bunch of sessions: 506 507 ```py 508 .. include:: ../../code_examples/virtual_data/example.py 509 ``` 510 511 This should output something like: 512 513 ``` 514 .. 
include:: ../../code_examples/virtual_data/output.txt 515 ``` 516 ''' 517 518 kwargs = locals().copy() 519 520 from numpy import random as nprandom 521 if seed: 522 nprandom.seed(seed) 523 rng = nprandom.default_rng(seed) 524 else: 525 rng = nprandom.default_rng() 526 527 N = sum([s['N'] for s in samples]) 528 errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 529 errors45 *= rd45 / stdev(errors45) # scale errors to rd45 530 errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 531 errors46 *= rd46 / stdev(errors46) # scale errors to rd46 532 errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 533 errors47 *= rD47 / stdev(errors47) # scale errors to rD47 534 errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 535 errors48 *= rD48 / stdev(errors48) # scale errors to rD48 536 537 k = 0 538 out = [] 539 for s in samples: 540 kw = {} 541 kw['sample'] = s['Sample'] 542 kw = { 543 **kw, 544 **{var: kwargs[var] 545 for var in [ 546 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION', 547 'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB', 548 'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB', 549 'a47', 'b47', 'c47', 'a48', 'b48', 'c48', 550 ] 551 if kwargs[var] is not None}, 552 **{var: s[var] 553 for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O'] 554 if var in s}, 555 } 556 557 sN = s['N'] 558 while sN: 559 out.append(simulate_single_analysis(**kw)) 560 out[-1]['d45'] += errors45[k] 561 out[-1]['d46'] += errors46[k] 562 out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47 563 out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48 564 sN -= 1 565 k += 1 566 567 if session is not None: 568 for r in out: 569 r['Session'] = session 570 571 if shuffle: 572 nprandom.shuffle(out) 573 574 return out 575 576def table_of_samples( 577 data47 = None, 578 data48 = None, 579 dir = 'output', 580 filename = None, 581 save_to_file = True, 582 print_out = True, 583 output = None, 584 ): 585 ''' 586 Print out, save to disk and/or return a combined table of samples 587 for a pair of `D47data` and `D48data` objects. 
588 589 **Parameters** 590 591 + `data47`: `D47data` instance 592 + `data48`: `D48data` instance 593 + `dir`: the directory in which to save the table 594 + `filename`: the name to the csv file to write to 595 + `save_to_file`: whether to save the table to disk 596 + `print_out`: whether to print out the table 597 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 598 if set to `'raw'`: return a list of list of strings 599 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 600 ''' 601 if data47 is None: 602 if data48 is None: 603 raise TypeError("Arguments must include at least one D47data() or D48data() instance.") 604 else: 605 return data48.table_of_samples( 606 dir = dir, 607 filename = filename, 608 save_to_file = save_to_file, 609 print_out = print_out, 610 output = output 611 ) 612 else: 613 if data48 is None: 614 return data47.table_of_samples( 615 dir = dir, 616 filename = filename, 617 save_to_file = save_to_file, 618 print_out = print_out, 619 output = output 620 ) 621 else: 622 out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw') 623 out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw') 624 out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:]) 625 626 if save_to_file: 627 if not os.path.exists(dir): 628 os.makedirs(dir) 629 if filename is None: 630 filename = f'D47D48_samples.csv' 631 with open(f'{dir}/{filename}', 'w') as fid: 632 fid.write(make_csv(out)) 633 if print_out: 634 print('\n'+pretty_table(out)) 635 if output == 'raw': 636 return out 637 elif output == 'pretty': 638 return pretty_table(out) 639 640 641def table_of_sessions( 642 data47 = None, 643 data48 = None, 644 dir = 'output', 645 filename = None, 646 save_to_file = True, 647 print_out = True, 648 output = None, 649 ): 650 ''' 651 Print out, save to disk and/or return a combined table of sessions 652 for a pair of `D47data` and `D48data` objects. 
653 ***Only applicable if the sessions in `data47` and those in `data48` 654 consist of the exact same sets of analyses.*** 655 656 **Parameters** 657 658 + `data47`: `D47data` instance 659 + `data48`: `D48data` instance 660 + `dir`: the directory in which to save the table 661 + `filename`: the name to the csv file to write to 662 + `save_to_file`: whether to save the table to disk 663 + `print_out`: whether to print out the table 664 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 665 if set to `'raw'`: return a list of list of strings 666 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 667 ''' 668 if data47 is None: 669 if data48 is None: 670 raise TypeError("Arguments must include at least one D47data() or D48data() instance.") 671 else: 672 return data48.table_of_sessions( 673 dir = dir, 674 filename = filename, 675 save_to_file = save_to_file, 676 print_out = print_out, 677 output = output 678 ) 679 else: 680 if data48 is None: 681 return data47.table_of_sessions( 682 dir = dir, 683 filename = filename, 684 save_to_file = save_to_file, 685 print_out = print_out, 686 output = output 687 ) 688 else: 689 out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw') 690 out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw') 691 for k,x in enumerate(out47[0]): 692 if k>7: 693 out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47') 694 out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48') 695 out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:]) 696 697 if save_to_file: 698 if not os.path.exists(dir): 699 os.makedirs(dir) 700 if filename is None: 701 filename = f'D47D48_sessions.csv' 702 with open(f'{dir}/{filename}', 'w') as fid: 703 fid.write(make_csv(out)) 704 if print_out: 705 print('\n'+pretty_table(out)) 706 if output == 'raw': 707 return out 708 elif output == 'pretty': 709 return pretty_table(out) 710 711 712def table_of_analyses( 713 data47 = None, 714 data48 = None, 715 dir = 'output', 716 filename = None, 717 save_to_file = True, 718 print_out = True, 719 output = None, 720 ): 721 ''' 722 Print out, save to disk and/or return a combined table of analyses 723 for a pair of `D47data` and `D48data` objects. 724 725 If the sessions in `data47` and those in `data48` do not consist of 726 the exact same sets of analyses, the table will have two columns 727 `Session_47` and `Session_48` instead of a single `Session` column. 
728 729 **Parameters** 730 731 + `data47`: `D47data` instance 732 + `data48`: `D48data` instance 733 + `dir`: the directory in which to save the table 734 + `filename`: the name to the csv file to write to 735 + `save_to_file`: whether to save the table to disk 736 + `print_out`: whether to print out the table 737 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 738 if set to `'raw'`: return a list of list of strings 739 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 740 ''' 741 if data47 is None: 742 if data48 is None: 743 raise TypeError("Arguments must include at least one D47data() or D48data() instance.") 744 else: 745 return data48.table_of_analyses( 746 dir = dir, 747 filename = filename, 748 save_to_file = save_to_file, 749 print_out = print_out, 750 output = output 751 ) 752 else: 753 if data48 is None: 754 return data47.table_of_analyses( 755 dir = dir, 756 filename = filename, 757 save_to_file = save_to_file, 758 print_out = print_out, 759 output = output 760 ) 761 else: 762 out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw') 763 out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw') 764 765 if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical 766 out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:]) 767 else: 768 out47[0][1] = 'Session_47' 769 out48[0][1] = 'Session_48' 770 out47 = transpose_table(out47) 771 out48 = transpose_table(out48) 772 out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:]) 773 774 if save_to_file: 775 if not os.path.exists(dir): 776 os.makedirs(dir) 777 if filename is None: 778 filename = f'D47D48_sessions.csv' 779 with open(f'{dir}/{filename}', 'w') as fid: 780 fid.write(make_csv(out)) 781 if print_out: 782 print('\n'+pretty_table(out)) 783 if output == 'raw': 784 return out 785 elif output == 'pretty': 786 return pretty_table(out) 787 788 789def _fullcovar(minresult, epsilon = 0.01, named = False): 790 ''' 791 Construct full covariance matrix in the case of constrained parameters 792 ''' 793 794 import asteval 795 796 def f(values): 797 interp = asteval.Interpreter() 798 for n,v in zip(minresult.var_names, values): 799 interp(f'{n} = {v}') 800 for q in minresult.params: 801 if minresult.params[q].expr: 802 interp(f'{q} = {minresult.params[q].expr}') 803 return np.array([interp.symtable[q] for q in minresult.params]) 804 805 # construct Jacobian 806 J = np.zeros((minresult.nvarys, len(minresult.params))) 807 X = np.array([minresult.params[p].value for p in minresult.var_names]) 808 sX = np.array([minresult.params[p].stderr for p in minresult.var_names]) 809 810 for j in range(minresult.nvarys): 811 x1 = [_ for _ in X] 812 x1[j] += epsilon * sX[j] 813 x2 = [_ for _ in X] 814 x2[j] -= epsilon * sX[j] 815 J[j,:] = (f(x1) - f(x2)) / (2 * epsilon * sX[j]) 816 817 _names = [q for q in minresult.params] 818 _covar = J.T @ minresult.covar @ J 819 _se = np.diag(_covar)**.5 820 _correl = _covar.copy() 821 for k,s in enumerate(_se): 822 if s: 823 _correl[k,:] /= s 824 _correl[:,k] /= s 825 826 if named: 827 _covar = {i: {j:_covar[i,j] for j in minresult.params} for i in minresult.params} 828 _se = {i: _se[i] for i in minresult.params} 829 _correl = {i: {j:_correl[i,j] for j in minresult.params} for i in minresult.params} 830 831 return _names, _covar, _se, _correl 832 833 834class D4xdata(list): 835 ''' 836 Store and process data for a large set of Δ47 and/or Δ48 837 analyses, 
usually comprising more than one analytical session. 838 ''' 839 840 ### 17O CORRECTION PARAMETERS 841 R13_VPDB = 0.01118 # (Chang & Li, 1990) 842 ''' 843 Absolute (13C/12C) ratio of VPDB. 844 By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm)) 845 ''' 846 847 R18_VSMOW = 0.0020052 # (Baertschi, 1976) 848 ''' 849 Absolute (18O/16O) ratio of VSMOW. 850 By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1)) 851 ''' 852 853 LAMBDA_17 = 0.528 # (Barkan & Luz, 2005) 854 ''' 855 Mass-dependent exponent for triple oxygen isotopes. 856 By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250)) 857 ''' 858 859 R17_VSMOW = 0.00038475 # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB) 860 ''' 861 Absolute (17O/16O) ratio of VSMOW. 862 By default equal to 0.00038475 863 ([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011), 864 rescaled to `R13_VPDB`) 865 ''' 866 867 R18_VPDB = R18_VSMOW * 1.03092 868 ''' 869 Absolute (18O/16O) ratio of VPDB. 870 By definition equal to `R18_VSMOW * 1.03092`. 871 ''' 872 873 R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17 874 ''' 875 Absolute (17O/16O) ratio of VPDB. 876 By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`. 877 ''' 878 879 LEVENE_REF_SAMPLE = 'ETH-3' 880 ''' 881 After the Δ4x standardization step, each sample is tested to 882 assess whether the Δ4x variance within all analyses for that 883 sample differs significantly from that observed for a given reference 884 sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test), 885 which yields a p-value corresponding to the null hypothesis that the 886 underlying variances are equal). 887 888 `LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which 889 sample should be used as a reference for this test. 890 ''' 891 892 ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6) # (Kim et al., 2007, calcite) 893 ''' 894 Specifies the 18O/16O fractionation factor generally applicable 895 to acid reactions in the dataset. Currently used by `D4xdata.wg()`, 896 `D4xdata.standardize_d13C()`, and `D4xdata.standardize_d18O()`. 897 898 By default equal to 1.008129 (calcite reacted at 90 °C, 899 [Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)). 900 ''' 901 902 Nominal_d13C_VPDB = { 903 'ETH-1': 2.02, 904 'ETH-2': -10.17, 905 'ETH-3': 1.71, 906 } # (Bernasconi et al., 2018) 907 ''' 908 Nominal δ13C_VPDB values assigned to carbonate standards, used by 909 `D4xdata.standardize_d13C()`. 910 911 By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after 912 [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385). 913 ''' 914 915 Nominal_d18O_VPDB = { 916 'ETH-1': -2.19, 917 'ETH-2': -18.69, 918 'ETH-3': -1.78, 919 } # (Bernasconi et al., 2018) 920 ''' 921 Nominal δ18O_VPDB values assigned to carbonate standards, used by 922 `D4xdata.standardize_d18O()`. 923 924 By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after 925 [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385). 926 ''' 927 928 d13C_STANDARDIZATION_METHOD = '2pt' 929 ''' 930 Method by which to standardize δ13C values: 931 932 + `'none'`: do not apply any δ13C standardization.
933 + `'1pt'`: within each session, offset all initial δ13C values so as to 934 minimize the difference between final δ13C_VPDB values and 935 `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined). 936 + `'2pt'`: within each session, apply an affine transformation to all δ13C 937 values so as to minimize the difference between final δ13C_VPDB 938 values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` 939 is defined). 940 ''' 941 942 d18O_STANDARDIZATION_METHOD = '2pt' 943 ''' 944 Method by which to standardize δ18O values: 945 946 + `'none'`: do not apply any δ18O standardization. 947 + `'1pt'`: within each session, offset all initial δ18O values so as to 948 minimize the difference between final δ18O_VPDB values and 949 `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined). 950 + `'2pt'`: within each session, apply an affine transformation to all δ18O 951 values so as to minimize the difference between final δ18O_VPDB 952 values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` 953 is defined). 954 ''' 955 956 def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False): 957 ''' 958 **Parameters** 959 960 + `l`: a list of dictionaries, with each dictionary including at least the keys 961 `Sample`, `d45`, `d46`, and `d47` or `d48`. 962 + `mass`: `'47'` or `'48'` 963 + `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods. 964 + `session`: define session name for analyses without a `Session` key 965 + `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods. 966 967 Returns a `D4xdata` object derived from `list`. 968 ''' 969 self._4x = mass 970 self.verbose = verbose 971 self.prefix = 'D4xdata' 972 self.logfile = logfile 973 list.__init__(self, l) 974 self.Nf = None 975 self.repeatability = {} 976 self.refresh(session = session) 977 978 979 def make_verbal(oldfun): 980 ''' 981 Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`. 982 ''' 983 @wraps(oldfun) 984 def newfun(*args, verbose = '', **kwargs): 985 myself = args[0] 986 oldprefix = myself.prefix 987 myself.prefix = oldfun.__name__ 988 if verbose != '': 989 oldverbose = myself.verbose 990 myself.verbose = verbose 991 out = oldfun(*args, **kwargs) 992 myself.prefix = oldprefix 993 if verbose != '': 994 myself.verbose = oldverbose 995 return out 996 return newfun 997 998 999 def msg(self, txt): 1000 ''' 1001 Log a message to `self.logfile`, and print it out if `verbose = True` 1002 ''' 1003 self.log(txt) 1004 if self.verbose: 1005 print(f'{f"[{self.prefix}]":<16} {txt}') 1006 1007 1008 def vmsg(self, txt): 1009 ''' 1010 Log a message to `self.logfile` and print it out 1011 ''' 1012 self.log(txt) 1013 print(txt) 1014 1015 1016 def log(self, *txts): 1017 ''' 1018 Log a message to `self.logfile` 1019 ''' 1020 if self.logfile: 1021 with open(self.logfile, 'a') as fid: 1022 for txt in txts: 1023 fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}') 1024 1025 1026 def refresh(self, session = 'mySession'): 1027 ''' 1028 Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
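A minimal sketch of when an explicit call is needed (assuming `mydata` is an existing `D4xdata` object and `extra` is a list of analysis dictionaries):

```py
mydata += extra   # D4xdata derives from list, so in-place extension works
mydata.refresh()  # rebuild session/sample dictionaries to include the new analyses
```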
1029 ''' 1030 self.fill_in_missing_info(session = session) 1031 self.refresh_sessions() 1032 self.refresh_samples() 1033 1034 1035 def refresh_sessions(self): 1036 ''' 1037 Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift` 1038 to `False` for all sessions. 1039 ''' 1040 self.sessions = { 1041 s: {'data': [r for r in self if r['Session'] == s]} 1042 for s in sorted({r['Session'] for r in self}) 1043 } 1044 for s in self.sessions: 1045 self.sessions[s]['scrambling_drift'] = False 1046 self.sessions[s]['slope_drift'] = False 1047 self.sessions[s]['wg_drift'] = False 1048 self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD 1049 self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD 1050 1051 1052 def refresh_samples(self): 1053 ''' 1054 Define `self.samples`, `self.anchors`, and `self.unknowns`. 1055 ''' 1056 self.samples = { 1057 s: {'data': [r for r in self if r['Sample'] == s]} 1058 for s in sorted({r['Sample'] for r in self}) 1059 } 1060 self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x} 1061 self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x} 1062 1063 1064 def read(self, filename, sep = '', session = ''): 1065 ''' 1066 Read file in csv format to load data into a `D4xdata` object. 1067 1068 In the csv file, spaces before and after field separators (`','` by default) 1069 are optional. Each line corresponds to a single analysis. 1070 1071 The required fields are: 1072 1073 + `UID`: a unique identifier 1074 + `Session`: an identifier for the analytical session 1075 + `Sample`: a sample identifier 1076 + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values 1077 1078 Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to 1079 VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` 1080 and `d49` are optional, and set to NaN by default. 1081 1082 **Parameters** 1083 1084 + `filename`: the path of the file to read 1085 + `sep`: csv separator delimiting the fields 1086 + `session`: set `Session` field to this string for all analyses 1087 ''' 1088 with open(filename) as fid: 1089 self.input(fid.read(), sep = sep, session = session) 1090 1091 1092 def input(self, txt, sep = '', session = ''): 1093 ''' 1094 Read `txt` string in csv format to load analysis data into a `D4xdata` object. 1095 1096 In the csv string, spaces before and after field separators (`','` by default) 1097 are optional. Each line corresponds to a single analysis. 1098 1099 The required fields are: 1100 1101 + `UID`: a unique identifier 1102 + `Session`: an identifier for the analytical session 1103 + `Sample`: a sample identifier 1104 + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values 1105 1106 Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to 1107 VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` 1108 and `d49` are optional, and set to NaN by default. 1109 1110 **Parameters** 1111 1112 + `txt`: the csv string to read 1113 + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, 1114 whichever appears most often in `txt`.
1115 + `session`: set `Session` field to this string for all analyses 1116 ''' 1117 if sep == '': 1118 sep = sorted(',;\t', key = lambda x: - txt.count(x))[0] 1119 txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()] 1120 data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]] 1121 1122 if session != '': 1123 for r in data: 1124 r['Session'] = session 1125 1126 self += data 1127 self.refresh() 1128 1129 1130 @make_verbal 1131 def wg(self, samples = None, a18_acid = None): 1132 ''' 1133 Compute bulk composition of the working gas for each session based on 1134 the carbonate standards defined in both `self.Nominal_d13C_VPDB` and 1135 `self.Nominal_d18O_VPDB`. 1136 ''' 1137 1138 self.msg('Computing WG composition:') 1139 1140 if a18_acid is None: 1141 a18_acid = self.ALPHA_18O_ACID_REACTION 1142 if samples is None: 1143 samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB] 1144 1145 assert a18_acid, f'Acid fractionation factor should not be zero.' 1146 1147 samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB] 1148 R45R46_standards = {} 1149 for sample in samples: 1150 d13C_vpdb = self.Nominal_d13C_VPDB[sample] 1151 d18O_vpdb = self.Nominal_d18O_VPDB[sample] 1152 R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000) 1153 R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17 1154 R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid 1155 1156 C12_s = 1 / (1 + R13_s) 1157 C13_s = R13_s / (1 + R13_s) 1158 C16_s = 1 / (1 + R17_s + R18_s) 1159 C17_s = R17_s / (1 + R17_s + R18_s) 1160 C18_s = R18_s / (1 + R17_s + R18_s) 1161 1162 C626_s = C12_s * C16_s ** 2 1163 C627_s = 2 * C12_s * C16_s * C17_s 1164 C628_s = 2 * C12_s * C16_s * C18_s 1165 C636_s = C13_s * C16_s ** 2 1166 C637_s = 2 * C13_s * C16_s * C17_s 1167 C727_s = C12_s * C17_s ** 2 1168 1169 R45_s = (C627_s + C636_s) / C626_s 1170 R46_s = (C628_s + C637_s + C727_s) / C626_s 1171 R45R46_standards[sample] = (R45_s, R46_s) 1172 1173 for s in self.sessions: 1174 db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples] 1175 assert db, f'No sample from {samples} found in session "{s}".' 
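# The per-session logic below: the nominal R45 of each standard is regressed
# against its measured d45, so that the intercept at d45 = 0 estimates the
# working-gas R45 (and likewise for R46); when d45 = 0 is not reasonably
# bracketed by the data, the code falls back to averaging y / (1 + x/1000)
# over all standard analyses instead.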
1176# dbsamples = sorted({r['Sample'] for r in db}) 1177 1178 X = [r['d45'] for r in db] 1179 Y = [R45R46_standards[r['Sample']][0] for r in db] 1180 x1, x2 = np.min(X), np.max(X) 1181 1182 if x1 < x2: 1183 wgcoord = x1/(x1-x2) 1184 else: 1185 wgcoord = 999 1186 1187 if wgcoord < -.5 or wgcoord > 1.5: 1188 # unreasonable to extrapolate to d45 = 0 1189 R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)]) 1190 else : 1191 # d45 = 0 is reasonably well bracketed 1192 R45_wg = np.polyfit(X, Y, 1)[1] 1193 1194 X = [r['d46'] for r in db] 1195 Y = [R45R46_standards[r['Sample']][1] for r in db] 1196 x1, x2 = np.min(X), np.max(X) 1197 1198 if x1 < x2: 1199 wgcoord = x1/(x1-x2) 1200 else: 1201 wgcoord = 999 1202 1203 if wgcoord < -.5 or wgcoord > 1.5: 1204 # unreasonable to extrapolate to d46 = 0 1205 R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)]) 1206 else : 1207 # d46 = 0 is reasonably well bracketed 1208 R46_wg = np.polyfit(X, Y, 1)[1] 1209 1210 d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg) 1211 1212 self.msg(f'Session {s} WG: δ13C_VPDB = {d13Cwg_VPDB:.3f} δ18O_VSMOW = {d18Owg_VSMOW:.3f}') 1213 1214 self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB 1215 self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW 1216 for r in self.sessions[s]['data']: 1217 r['d13Cwg_VPDB'] = d13Cwg_VPDB 1218 r['d18Owg_VSMOW'] = d18Owg_VSMOW 1219 1220 1221 def compute_bulk_delta(self, R45, R46, D17O = 0): 1222 ''' 1223 Compute δ13C_VPDB and δ18O_VSMOW, 1224 by solving the generalized form of equation (17) from 1225 [Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05), 1226 assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and 1227 solving the corresponding second-order Taylor polynomial. 1228 (Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014)) 1229 ''' 1230 1231 K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17 1232 1233 A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17) 1234 B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17 1235 C = 2 * self.R18_VSMOW 1236 D = -R46 1237 1238 aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2 1239 bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C 1240 cc = A + B + C + D 1241 1242 d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa) 1243 1244 R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW 1245 R17 = K * R18 ** self.LAMBDA_17 1246 R13 = R45 - 2 * R17 1247 1248 d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1) 1249 1250 return d13C_VPDB, d18O_VSMOW 1251 1252 1253 @make_verbal 1254 def crunch(self, verbose = ''): 1255 ''' 1256 Compute bulk composition and raw clumped isotope anomalies for all analyses. 1257 ''' 1258 for r in self: 1259 self.compute_bulk_and_clumping_deltas(r) 1260 self.standardize_d13C() 1261 self.standardize_d18O() 1262 self.msg(f"Crunched {len(self)} analyses.") 1263 1264 1265 def fill_in_missing_info(self, session = 'mySession'): 1266 ''' 1267 Fill in optional fields with default values 1268 ''' 1269 for i,r in enumerate(self): 1270 if 'D17O' not in r: 1271 r['D17O'] = 0. 
1272 if 'UID' not in r: 1273 r['UID'] = f'{i+1}' 1274 if 'Session' not in r: 1275 r['Session'] = session 1276 for k in ['d47', 'd48', 'd49']: 1277 if k not in r: 1278 r[k] = np.nan 1279 1280 1281 def standardize_d13C(self): 1282 ''' 1283 Perform δ13C standardization within each session `s` according to 1284 `self.sessions[s]['d13C_standardization_method']`, which is defined by default 1285 by `D4xdata.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but 1286 may be redefined arbitrarily at a later stage. 1287 ''' 1288 for s in self.sessions: 1289 if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']: 1290 XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB] 1291 X,Y = zip(*XY) 1292 if self.sessions[s]['d13C_standardization_method'] == '1pt': 1293 offset = np.mean(Y) - np.mean(X) 1294 for r in self.sessions[s]['data']: 1295 r['d13C_VPDB'] += offset 1296 elif self.sessions[s]['d13C_standardization_method'] == '2pt': 1297 a,b = np.polyfit(X,Y,1) 1298 for r in self.sessions[s]['data']: 1299 r['d13C_VPDB'] = a * r['d13C_VPDB'] + b 1300 1301 def standardize_d18O(self): 1302 ''' 1303 Perform δ18O standardization within each session `s` according to 1304 `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`, 1305 which is defined by default by `D4xdata.refresh_sessions()` as equal to 1306 `self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage. 1307 ''' 1308 for s in self.sessions: 1309 if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']: 1310 XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB] 1311 X,Y = zip(*XY) 1312 Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y] 1313 if self.sessions[s]['d18O_standardization_method'] == '1pt': 1314 offset = np.mean(Y) - np.mean(X) 1315 for r in self.sessions[s]['data']: 1316 r['d18O_VSMOW'] += offset 1317 elif self.sessions[s]['d18O_standardization_method'] == '2pt': 1318 a,b = np.polyfit(X,Y,1) 1319 for r in self.sessions[s]['data']: 1320 r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b 1321 1322 1323 def compute_bulk_and_clumping_deltas(self, r): 1324 ''' 1325 Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`. 1326 ''' 1327 1328 # Compute working gas R13, R18, and isobar ratios 1329 R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000) 1330 R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000) 1331 R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg) 1332 1333 # Compute analyte isobar ratios 1334 R45 = (1 + r['d45'] / 1000) * R45_wg 1335 R46 = (1 + r['d46'] / 1000) * R46_wg 1336 R47 = (1 + r['d47'] / 1000) * R47_wg 1337 R48 = (1 + r['d48'] / 1000) * R48_wg 1338 R49 = (1 + r['d49'] / 1000) * R49_wg 1339 1340 r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O']) 1341 R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB 1342 R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW 1343 1344 # Compute stochastic isobar ratios of the analyte 1345 R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios( 1346 R13, R18, D17O = r['D17O'] 1347 ) 1348 1349 # Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1, 1350 # and raise a warning if the corresponding anomalies exceed 0.05 ppm.
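# (A large anomaly here would point to an inconsistency between the bulk
# composition derived from (R45, R46) and its stochastic recomputation,
# e.g. a mis-specified D17O value or corrupted raw data.)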
1351 if (R45 / R45stoch - 1) > 5e-8: 1352 self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm') 1353 if (R46 / R46stoch - 1) > 5e-8: 1354 self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm') 1355 1356 # Compute raw clumped isotope anomalies 1357 r['D47raw'] = 1000 * (R47 / R47stoch - 1) 1358 r['D48raw'] = 1000 * (R48 / R48stoch - 1) 1359 r['D49raw'] = 1000 * (R49 / R49stoch - 1) 1360 1361 1362 def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0): 1363 ''' 1364 Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`, 1365 optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope 1366 anomalies (`D47`, `D48`, `D49`), all expressed in permil. 1367 ''' 1368 1369 # Compute R17 1370 R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17 1371 1372 # Compute isotope concentrations 1373 C12 = (1 + R13) ** -1 1374 C13 = C12 * R13 1375 C16 = (1 + R17 + R18) ** -1 1376 C17 = C16 * R17 1377 C18 = C16 * R18 1378 1379 # Compute stochastic isotopologue concentrations 1380 C626 = C16 * C12 * C16 1381 C627 = C16 * C12 * C17 * 2 1382 C628 = C16 * C12 * C18 * 2 1383 C636 = C16 * C13 * C16 1384 C637 = C16 * C13 * C17 * 2 1385 C638 = C16 * C13 * C18 * 2 1386 C727 = C17 * C12 * C17 1387 C728 = C17 * C12 * C18 * 2 1388 C737 = C17 * C13 * C17 1389 C738 = C17 * C13 * C18 * 2 1390 C828 = C18 * C12 * C18 1391 C838 = C18 * C13 * C18 1392 1393 # Compute stochastic isobar ratios 1394 R45 = (C636 + C627) / C626 1395 R46 = (C628 + C637 + C727) / C626 1396 R47 = (C638 + C728 + C737) / C626 1397 R48 = (C738 + C828) / C626 1398 R49 = C838 / C626 1399 1400 # Account for clumped-isotope anomalies (departures from the stochastic distribution) 1401 R47 *= 1 + D47 / 1000 1402 R48 *= 1 + D48 / 1000 1403 R49 *= 1 + D49 / 1000 1404 1405 # Return isobar ratios 1406 return R45, R46, R47, R48, R49 1407 1408 1409 def split_samples(self, samples_to_split = 'all', grouping = 'by_session'): 1410 ''' 1411 Split unknown samples by UID (treat all analyses as different samples) 1412 or by session (treat analyses of a given sample in different sessions as 1413 different samples). 1414 1415 **Parameters** 1416 1417 + `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']` 1418 + `grouping`: `by_uid` | `by_session` 1419 ''' 1420 if samples_to_split == 'all': 1421 samples_to_split = [s for s in self.unknowns] 1422 gkeys = {'by_uid':'UID', 'by_session':'Session'} 1423 self.grouping = grouping.lower() 1424 if self.grouping in gkeys: 1425 gkey = gkeys[self.grouping] 1426 for r in self: 1427 if r['Sample'] in samples_to_split: 1428 r['Sample_original'] = r['Sample'] 1429 r['Sample'] = f"{r['Sample']}__{r[gkey]}" 1430 elif r['Sample'] in self.unknowns: 1431 r['Sample_original'] = r['Sample'] 1432 self.refresh_samples() 1433 1434 1435 def unsplit_samples(self, tables = False): 1436 ''' 1437 Reverse the effects of `D47data.split_samples()`. 1438 1439 This should only be used after `D4xdata.standardize()` with `method='pooled'`. 1440 1441 After `D4xdata.standardize()` with `method='indep_sessions'`, one should 1442 probably use `D4xdata.combine_samples()` instead to reverse the effects of 1443 `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the 1444 effects of `D47data.split_samples()` with `grouping='by_session'` (because in 1445 that case session-averaged Δ4x values are statistically independent)
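. A minimal sketch of the intended workflow (assuming `mydata` is a `D47data` object with replicate analyses of an unknown sample `'IAEA-C1'` spread over several sessions):

```py
mydata.split_samples(['IAEA-C1'], grouping = 'by_session')
mydata.standardize(method = 'pooled')
mydata.unsplit_samples()  # recombine the per-session splits of 'IAEA-C1'
```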
1446 ''' 1447 unknowns_old = sorted({s for s in self.unknowns}) 1448 CM_old = self.standardization.covar[:,:] 1449 VD_old = self.standardization.params.valuesdict().copy() 1450 vars_old = self.standardization.var_names 1451 1452 unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r}) 1453 1454 Ns = len(vars_old) - len(unknowns_old) 1455 vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new] 1456 VD_new = {k: VD_old[k] for k in vars_old[:Ns]} 1457 1458 W = np.zeros((len(vars_new), len(vars_old))) 1459 W[:Ns,:Ns] = np.eye(Ns) 1460 for u in unknowns_new: 1461 splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u}) 1462 if self.grouping == 'by_session': 1463 weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits] 1464 elif self.grouping == 'by_uid': 1465 weights = [1 for s in splits] 1466 sw = sum(weights) 1467 weights = [w/sw for w in weights] 1468 W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:] 1469 1470 CM_new = W @ CM_old @ W.T 1471 V = W @ np.array([[VD_old[k]] for k in vars_old]) 1472 VD_new = {k:v[0] for k,v in zip(vars_new, V)} 1473 1474 self.standardization.covar = CM_new 1475 self.standardization.params.valuesdict = lambda : VD_new 1476 self.standardization.var_names = vars_new 1477 1478 for r in self: 1479 if r['Sample'] in self.unknowns: 1480 r['Sample_split'] = r['Sample'] 1481 r['Sample'] = r['Sample_original'] 1482 1483 self.refresh_samples() 1484 self.consolidate_samples() 1485 self.repeatabilities() 1486 1487 if tables: 1488 self.table_of_analyses() 1489 self.table_of_samples() 1490 1491 def assign_timestamps(self): 1492 ''' 1493 Assign a time field `t` of type `float` to each analysis. 1494 1495 If `TimeTag` is one of the data fields, `t` is equal within a given session 1496 to `TimeTag` minus the mean value of `TimeTag` for that session. 1497 Otherwise, `TimeTag` is by default equal to the index of each analysis 1498 in the dataset and `t` is defined as above. 1499 ''' 1500 for session in self.sessions: 1501 sdata = self.sessions[session]['data'] 1502 try: 1503 t0 = np.mean([r['TimeTag'] for r in sdata]) 1504 for r in sdata: 1505 r['t'] = r['TimeTag'] - t0 1506 except KeyError: 1507 t0 = (len(sdata)-1)/2 1508 for t,r in enumerate(sdata): 1509 r['t'] = t - t0 1510 1511 1512 def report(self): 1513 ''' 1514 Prints a report on the standardization fit. 1515 Only applicable after `D4xdata.standardize(method='pooled')`. 1516 ''' 1517 report_fit(self.standardization) 1518 1519 1520 def combine_samples(self, sample_groups): 1521 ''' 1522 Combine analyses of different samples to compute weighted average Δ4x 1523 and new error (co)variances corresponding to the groups defined by the `sample_groups` 1524 dictionary. 1525 1526 Caution: samples are weighted by number of replicate analyses, which is a 1527 reasonable default behavior but is not always optimal (e.g., in the case of strongly 1528 correlated analytical errors for one or more samples). 
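For instance, a minimal sketch (assuming `'A'`, `'B'`, and `'C'` are previously standardized unknowns in `mydata`):

```py
groups, D4x_avg, CM = mydata.combine_samples({'early': ['A', 'B'], 'late': ['C']})
```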
1529 1530 Returns a tuple of: 1531 1532 + the list of group names 1533 + an array of the corresponding Δ4x values 1534 + the corresponding (co)variance matrix 1535 1536 **Parameters** 1537 1538 + `sample_groups`: a dictionary of the form: 1539 ```py 1540 {'group1': ['sample_1', 'sample_2'], 1541 'group2': ['sample_3', 'sample_4', 'sample_5']} 1542 ``` 1543 ''' 1544 1545 samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])] 1546 groups = sorted(sample_groups.keys()) 1547 group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups} 1548 D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples]) 1549 CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples]) 1550 W = np.array([ 1551 [self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples] 1552 for j in groups]) 1553 D4x_new = W @ D4x_old 1554 CM_new = W @ CM_old @ W.T 1555 1556 return groups, D4x_new[:,0], CM_new 1557 1558 1559 @make_verbal 1560 def standardize(self, 1561 method = 'pooled', 1562 weighted_sessions = [], 1563 consolidate = True, 1564 consolidate_tables = False, 1565 consolidate_plots = False, 1566 constraints = {}, 1567 ): 1568 ''' 1569 Compute absolute Δ4x values for all replicate analyses and for sample averages. 1570 If the `method` argument is set to `'pooled'`, the standardization processes all sessions 1571 in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous, 1572 i.e. that their true Δ4x value does not change between sessions 1573 ([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If the `method` argument is set to 1574 `'indep_sessions'`, the standardization processes each session independently, based only 1575 on anchor analyses. 1576 ''' 1577 1578 self.standardization_method = method 1579 self.assign_timestamps() 1580 1581 if method == 'pooled': 1582 if weighted_sessions: 1583 for session_group in weighted_sessions: 1584 if self._4x == '47': 1585 X = D47data([r for r in self if r['Session'] in session_group]) 1586 elif self._4x == '48': 1587 X = D48data([r for r in self if r['Session'] in session_group]) 1588 X.Nominal_D4x = self.Nominal_D4x.copy() 1589 X.refresh() 1590 result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False) 1591 w = np.sqrt(result.redchi) 1592 self.msg(f'Session group {session_group} RMSWD = {w:.4f}') 1593 for r in X: 1594 r[f'wD{self._4x}raw'] *= w 1595 else: 1596 self.msg(f'All D{self._4x}raw weights set to 1 ‰') 1597 for r in self: 1598 r[f'wD{self._4x}raw'] = 1. 1599 1600 params = Parameters() 1601 for k,session in enumerate(self.sessions): 1602 self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.") 1603 self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.") 1604 self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.") 1605 s = pf(session) 1606 params.add(f'a_{s}', value = 0.9) 1607 params.add(f'b_{s}', value = 0.)
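# (a: scrambling factor, b: compositional slope, c: WG offset, cf. Daëron, 2021;
# the a2, b2, c2 parameters added next are the corresponding temporal drifts,
# constrained to zero below unless the session explicitly allows drift)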
1608 params.add(f'c_{s}', value = -0.9) 1609 params.add(f'a2_{s}', value = 0., 1610# vary = self.sessions[session]['scrambling_drift'], 1611 ) 1612 params.add(f'b2_{s}', value = 0., 1613# vary = self.sessions[session]['slope_drift'], 1614 ) 1615 params.add(f'c2_{s}', value = 0., 1616# vary = self.sessions[session]['wg_drift'], 1617 ) 1618 if not self.sessions[session]['scrambling_drift']: 1619 params[f'a2_{s}'].expr = '0' 1620 if not self.sessions[session]['slope_drift']: 1621 params[f'b2_{s}'].expr = '0' 1622 if not self.sessions[session]['wg_drift']: 1623 params[f'c2_{s}'].expr = '0' 1624 1625 for sample in self.unknowns: 1626 params.add(f'D{self._4x}_{pf(sample)}', value = 0.5) 1627 1628 for k in constraints: 1629 params[k].expr = constraints[k] 1630 1631 def residuals(p): 1632 R = [] 1633 for r in self: 1634 session = pf(r['Session']) 1635 sample = pf(r['Sample']) 1636 if r['Sample'] in self.Nominal_D4x: 1637 R += [ ( 1638 r[f'D{self._4x}raw'] - ( 1639 p[f'a_{session}'] * self.Nominal_D4x[r['Sample']] 1640 + p[f'b_{session}'] * r[f'd{self._4x}'] 1641 + p[f'c_{session}'] 1642 + r['t'] * ( 1643 p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']] 1644 + p[f'b2_{session}'] * r[f'd{self._4x}'] 1645 + p[f'c2_{session}'] 1646 ) 1647 ) 1648 ) / r[f'wD{self._4x}raw'] ] 1649 else: 1650 R += [ ( 1651 r[f'D{self._4x}raw'] - ( 1652 p[f'a_{session}'] * p[f'D{self._4x}_{sample}'] 1653 + p[f'b_{session}'] * r[f'd{self._4x}'] 1654 + p[f'c_{session}'] 1655 + r['t'] * ( 1656 p[f'a2_{session}'] * p[f'D{self._4x}_{sample}'] 1657 + p[f'b2_{session}'] * r[f'd{self._4x}'] 1658 + p[f'c2_{session}'] 1659 ) 1660 ) 1661 ) / r[f'wD{self._4x}raw'] ] 1662 return R 1663 1664 M = Minimizer(residuals, params) 1665 result = M.least_squares() 1666 self.Nf = result.nfree 1667 self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf) 1668 new_names, new_covar, new_se = _fullcovar(result)[:3] 1669 result.var_names = new_names 1670 result.covar = new_covar 1671 1672 for r in self: 1673 s = pf(r["Session"]) 1674 a = result.params.valuesdict()[f'a_{s}'] 1675 b = result.params.valuesdict()[f'b_{s}'] 1676 c = result.params.valuesdict()[f'c_{s}'] 1677 a2 = result.params.valuesdict()[f'a2_{s}'] 1678 b2 = result.params.valuesdict()[f'b2_{s}'] 1679 c2 = result.params.valuesdict()[f'c2_{s}'] 1680 r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t']) 1681 1682 1683 self.standardization = result 1684 1685 for session in self.sessions: 1686 self.sessions[session]['Np'] = 3 1687 for k in ['scrambling', 'slope', 'wg']: 1688 if self.sessions[session][f'{k}_drift']: 1689 self.sessions[session]['Np'] += 1 1690 1691 if consolidate: 1692 self.consolidate(tables = consolidate_tables, plots = consolidate_plots) 1693 return result 1694 1695 1696 elif method == 'indep_sessions': 1697 1698 if weighted_sessions: 1699 for session_group in weighted_sessions: 1700 X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x) 1701 X.Nominal_D4x = self.Nominal_D4x.copy() 1702 X.refresh() 1703 # This is only done to assign r['wD47raw'] for r in X: 1704 X.standardize(method = method, weighted_sessions = [], consolidate = False) 1705 self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}') 1706 else: 1707 self.msg('All weights set to 1 ‰') 1708 for r in self: 1709 r[f'wD{self._4x}raw'] = 1 1710 1711 for session in self.sessions: 1712 s = self.sessions[session] 1713 p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2'] 
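# Each anchor analysis contributes one weighted least-squares equation,
# D4xraw = a * Nominal_D4x + b * d4x + c (plus the active drift terms times t),
# assembled below as design matrix A and observation vector Y.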
1714 p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']] 1715 s['Np'] = sum(p_active) 1716 sdata = s['data'] 1717 1718 A = np.array([ 1719 [ 1720 self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'], 1721 r[f'd{self._4x}'] / r[f'wD{self._4x}raw'], 1722 1 / r[f'wD{self._4x}raw'], 1723 self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'], 1724 r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'], 1725 r['t'] / r[f'wD{self._4x}raw'] 1726 ] 1727 for r in sdata if r['Sample'] in self.anchors 1728 ])[:,p_active] # only keep columns for the active parameters 1729 Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors]) 1730 s['Na'] = Y.size 1731 CM = linalg.inv(A.T @ A) 1732 bf = (CM @ A.T @ Y).T[0,:] 1733 k = 0 1734 for n,a in zip(p_names, p_active): 1735 if a: 1736 s[n] = bf[k] 1737# self.msg(f'{n} = {bf[k]}') 1738 k += 1 1739 else: 1740 s[n] = 0. 1741# self.msg(f'{n} = 0.0') 1742 1743 for r in sdata : 1744 a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2'] 1745 r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t']) 1746 r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t']) 1747 1748 s['CM'] = np.zeros((6,6)) 1749 i = 0 1750 k_active = [j for j,a in enumerate(p_active) if a] 1751 for j,a in enumerate(p_active): 1752 if a: 1753 s['CM'][j,k_active] = CM[i,:] 1754 i += 1 1755 1756 if not weighted_sessions: 1757 w = self.rmswd()['rmswd'] 1758 for r in self: 1759 r[f'wD{self._4x}'] *= w 1760 r[f'wD{self._4x}raw'] *= w 1761 for session in self.sessions: 1762 self.sessions[session]['CM'] *= w**2 1763 1764 for session in self.sessions: 1765 s = self.sessions[session] 1766 s['SE_a'] = s['CM'][0,0]**.5 1767 s['SE_b'] = s['CM'][1,1]**.5 1768 s['SE_c'] = s['CM'][2,2]**.5 1769 s['SE_a2'] = s['CM'][3,3]**.5 1770 s['SE_b2'] = s['CM'][4,4]**.5 1771 s['SE_c2'] = s['CM'][5,5]**.5 1772 1773 if not weighted_sessions: 1774 self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions]) 1775 else: 1776 self.Nf = 0 1777 for sg in weighted_sessions: 1778 self.Nf += self.rmswd(sessions = sg)['Nf'] 1779 1780 self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf) 1781 1782 avgD4x = { 1783 sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample]) 1784 for sample in self.samples 1785 } 1786 chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self]) 1787 rD4x = (chi2/self.Nf)**.5 1788 self.repeatability[f'sigma_{self._4x}'] = rD4x 1789 1790 if consolidate: 1791 self.consolidate(tables = consolidate_tables, plots = consolidate_plots) 1792 1793 1794 def standardization_error(self, session, d4x, D4x, t = 0): 1795 ''' 1796 Compute standardization error for a given session and 1797 (δ47, Δ47) composition. 1798 ''' 1799 a = self.sessions[session]['a'] 1800 b = self.sessions[session]['b'] 1801 c = self.sessions[session]['c'] 1802 a2 = self.sessions[session]['a2'] 1803 b2 = self.sessions[session]['b2'] 1804 c2 = self.sessions[session]['c2'] 1805 CM = self.sessions[session]['CM'] 1806 1807 x, y = D4x, d4x 1808 z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t 1809# x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t) 1810 dxdy = -(b+b2*t) / (a+a2*t) 1811 dxdz = 1. / (a+a2*t) 1812 dxda = -x / (a+a2*t) 1813 dxdb = -y / (a+a2*t) 1814 dxdc = -1. 
/ (a+a2*t) 1815 dxda2 = -x * t / (a+a2*t) 1816 dxdb2 = -y * t / (a+a2*t) 1817 dxdc2 = -t / (a+a2*t) 1818 V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2]) 1819 sx = (V @ CM @ V.T) ** .5 1820 return sx 1821 1822 1823 @make_verbal 1824 def summary(self, 1825 dir = 'output', 1826 filename = None, 1827 save_to_file = True, 1828 print_out = True, 1829 ): 1830 ''' 1831 Print out and/or save to disk a summary of the standardization results. 1832 1833 **Parameters** 1834 1835 + `dir`: the directory in which to save the table 1836 + `filename`: the name of the csv file to write to 1837 + `save_to_file`: whether to save the table to disk 1838 + `print_out`: whether to print out the table 1839 ''' 1840 1841 out = [] 1842 out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]] 1843 out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]] 1844 out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]] 1845 out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]] 1846 out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]] 1847 out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]] 1848 out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]] 1849 out += [['Model degrees of freedom', f"{self.Nf}"]] 1850 out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]] 1851 out += [['Standardization method', self.standardization_method]] 1852 1853 if save_to_file: 1854 if not os.path.exists(dir): 1855 os.makedirs(dir) 1856 if filename is None: 1857 filename = f'D{self._4x}_summary.csv' 1858 with open(f'{dir}/{filename}', 'w') as fid: 1859 fid.write(make_csv(out)) 1860 if print_out: 1861 self.msg('\n' + pretty_table(out, header = 0)) 1862 1863 1864 @make_verbal 1865 def table_of_sessions(self, 1866 dir = 'output', 1867 filename = None, 1868 save_to_file = True, 1869 print_out = True, 1870 output = None, 1871 ): 1872 ''' 1873 Print out and/or save to disk a table of sessions.
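For example (assuming `mydata` is a standardized `D4xdata` object):

```py
tbl = mydata.table_of_sessions(save_to_file = False, print_out = True, output = 'pretty')
```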
1874 1875 **Parameters** 1876 1877 + `dir`: the directory in which to save the table 1878 + `filename`: the name of the csv file to write to 1879 + `save_to_file`: whether to save the table to disk 1880 + `print_out`: whether to print out the table 1881 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 1882 if set to `'raw'`: return a list of lists of strings 1883 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 1884 ''' 1885 include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions]) 1886 include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions]) 1887 include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions]) 1888 1889 out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']] 1890 if include_a2: 1891 out[-1] += ['a2 ± SE'] 1892 if include_b2: 1893 out[-1] += ['b2 ± SE'] 1894 if include_c2: 1895 out[-1] += ['c2 ± SE'] 1896 for session in self.sessions: 1897 out += [[ 1898 session, 1899 f"{self.sessions[session]['Na']}", 1900 f"{self.sessions[session]['Nu']}", 1901 f"{self.sessions[session]['d13Cwg_VPDB']:.3f}", 1902 f"{self.sessions[session]['d18Owg_VSMOW']:.3f}", 1903 f"{self.sessions[session]['r_d13C_VPDB']:.4f}", 1904 f"{self.sessions[session]['r_d18O_VSMOW']:.4f}", 1905 f"{self.sessions[session][f'r_D{self._4x}']:.4f}", 1906 f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}", 1907 f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}", 1908 f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}", 1909 ]] 1910 if include_a2: 1911 if self.sessions[session]['scrambling_drift']: 1912 out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"] 1913 else: 1914 out[-1] += [''] 1915 if include_b2: 1916 if self.sessions[session]['slope_drift']: 1917 out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"] 1918 else: 1919 out[-1] += [''] 1920 if include_c2: 1921 if self.sessions[session]['wg_drift']: 1922 out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"] 1923 else: 1924 out[-1] += [''] 1925 1926 if save_to_file: 1927 if not os.path.exists(dir): 1928 os.makedirs(dir) 1929 if filename is None: 1930 filename = f'D{self._4x}_sessions.csv' 1931 with open(f'{dir}/{filename}', 'w') as fid: 1932 fid.write(make_csv(out)) 1933 if print_out: 1934 self.msg('\n' + pretty_table(out)) 1935 if output == 'raw': 1936 return out 1937 elif output == 'pretty': 1938 return pretty_table(out) 1939 1940 1941 @make_verbal 1942 def table_of_analyses( 1943 self, 1944 dir = 'output', 1945 filename = None, 1946 save_to_file = True, 1947 print_out = True, 1948 output = None, 1949 ): 1950 ''' 1951 Print out and/or save to disk a table of analyses.
1952 1953 **Parameters** 1954 1955 + `dir`: the directory in which to save the table 1956 + `filename`: the name of the csv file to write to 1957 + `save_to_file`: whether to save the table to disk 1958 + `print_out`: whether to print out the table 1959 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 1960 if set to `'raw'`: return a list of lists of strings 1961 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 1962 ''' 1963 1964 out = [['UID','Session','Sample']] 1965 extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}] 1966 for f in extra_fields: 1967 out[-1] += [f[0]] 1968 out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}'] 1969 for r in self: 1970 out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]] 1971 for f in extra_fields: 1972 out[-1] += [f"{r[f[0]]:{f[1]}}"] 1973 out[-1] += [ 1974 f"{r['d13Cwg_VPDB']:.3f}", 1975 f"{r['d18Owg_VSMOW']:.3f}", 1976 f"{r['d45']:.6f}", 1977 f"{r['d46']:.6f}", 1978 f"{r['d47']:.6f}", 1979 f"{r['d48']:.6f}", 1980 f"{r['d49']:.6f}", 1981 f"{r['d13C_VPDB']:.6f}", 1982 f"{r['d18O_VSMOW']:.6f}", 1983 f"{r['D47raw']:.6f}", 1984 f"{r['D48raw']:.6f}", 1985 f"{r['D49raw']:.6f}", 1986 f"{r[f'D{self._4x}']:.6f}" 1987 ] 1988 if save_to_file: 1989 if not os.path.exists(dir): 1990 os.makedirs(dir) 1991 if filename is None: 1992 filename = f'D{self._4x}_analyses.csv' 1993 with open(f'{dir}/{filename}', 'w') as fid: 1994 fid.write(make_csv(out)) 1995 if print_out: 1996 self.msg('\n' + pretty_table(out)) 1997 return pretty_table(out) if output == 'pretty' else out 1998 1999 @make_verbal 2000 def covar_table( 2001 self, 2002 correl = False, 2003 dir = 'output', 2004 filename = None, 2005 save_to_file = True, 2006 print_out = True, 2007 output = None, 2008 ): 2009 ''' 2010 Print out, save to disk and/or return the variance-covariance matrix of Δ4x 2011 for all unknown samples. 2012 2013 **Parameters** 2014 2015 + `dir`: the directory in which to save the csv 2016 + `filename`: the name of the csv file to write to 2017 + `save_to_file`: whether to save the csv 2018 + `print_out`: whether to print out the matrix 2019 + `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`); 2020 if set to `'raw'`: return a list of lists of strings 2021 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 2022 ''' 2023 samples = sorted([u for u in self.unknowns]) 2024 out = [[''] + samples] 2025 for s1 in samples: 2026 out.append([s1]) 2027 for s2 in samples: 2028 if correl: 2029 out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}') 2030 else: 2031 out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}') 2032 2033 if save_to_file: 2034 if not os.path.exists(dir): 2035 os.makedirs(dir) 2036 if filename is None: 2037 if correl: 2038 filename = f'D{self._4x}_correl.csv' 2039 else: 2040 filename = f'D{self._4x}_covar.csv' 2041 with open(f'{dir}/{filename}', 'w') as fid: 2042 fid.write(make_csv(out)) 2043 if print_out: 2044 self.msg('\n'+pretty_table(out)) 2045 if output == 'raw': 2046 return out 2047 elif output == 'pretty': 2048 return pretty_table(out) 2049 2050 @make_verbal 2051 def table_of_samples( 2052 self, 2053 dir = 'output', 2054 filename = None, 2055 save_to_file = True, 2056 print_out = True, 2057 output = None, 2058 ): 2059 ''' 2060 Print out, save to disk and/or return a table of samples.
2061 2062 **Parameters** 2063 2064 + `dir`: the directory in which to save the csv 2065 + `filename`: the name of the csv file to write to 2066 + `save_to_file`: whether to save the csv 2067 + `print_out`: whether to print out the table 2068 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 2069 if set to `'raw'`: return a list of list of strings 2070 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 2071 ''' 2072 2073 out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']] 2074 for sample in self.anchors: 2075 out += [[ 2076 f"{sample}", 2077 f"{self.samples[sample]['N']}", 2078 f"{self.samples[sample]['d13C_VPDB']:.2f}", 2079 f"{self.samples[sample]['d18O_VSMOW']:.2f}", 2080 f"{self.samples[sample][f'D{self._4x}']:.4f}",'','', 2081 f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', '' 2082 ]] 2083 for sample in self.unknowns: 2084 out += [[ 2085 f"{sample}", 2086 f"{self.samples[sample]['N']}", 2087 f"{self.samples[sample]['d13C_VPDB']:.2f}", 2088 f"{self.samples[sample]['d18O_VSMOW']:.2f}", 2089 f"{self.samples[sample][f'D{self._4x}']:.4f}", 2090 f"{self.samples[sample][f'SE_D{self._4x}']:.4f}", 2091 f"± {self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}", 2092 f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', 2093 f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else '' 2094 ]] 2095 if save_to_file: 2096 if not os.path.exists(dir): 2097 os.makedirs(dir) 2098 if filename is None: 2099 filename = f'D{self._4x}_samples.csv' 2100 with open(f'{dir}/{filename}', 'w') as fid: 2101 fid.write(make_csv(out)) 2102 if print_out: 2103 self.msg('\n'+pretty_table(out)) 2104 if output == 'raw': 2105 return out 2106 elif output == 'pretty': 2107 return pretty_table(out) 2108 2109 2110 def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100): 2111 ''' 2112 Generate session plots and save them to disk. 2113 2114 **Parameters** 2115 2116 + `dir`: the directory in which to save the plots 2117 + `figsize`: the width and height (in inches) of each plot 2118 + `filetype`: 'pdf' or 'png' 2119 + `dpi`: resolution for PNG output 2120 ''' 2121 if not os.path.exists(dir): 2122 os.makedirs(dir) 2123 2124 for session in self.sessions: 2125 sp = self.plot_single_session(session, xylimits = 'constant') 2126 ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {})) 2127 ppl.close(sp.fig) 2128 2129 2130 2131 @make_verbal 2132 def consolidate_samples(self): 2133 ''' 2134 Compile various statistics for each sample. 
2135 2136 For each anchor sample: 2137 2138 + `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x` 2139 + `SE_D47` or `SE_D48`: set to zero by definition 2140 2141 For each unknown sample: 2142 2143 + `D47` or `D48`: the standardized Δ4x value for this unknown 2144 + `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown 2145 2146 For each anchor and unknown: 2147 2148 + `N`: the total number of analyses of this sample 2149 + `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample 2150 + `d13C_VPDB`: the average δ13C_VPDB value for this sample 2151 + `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2) 2152 + `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal 2153 variance, indicating whether the Δ4x repeatability of this sample differs significantly from 2154 that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`. 2155 ''' 2156 D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']] 2157 for sample in self.samples: 2158 self.samples[sample]['N'] = len(self.samples[sample]['data']) 2159 if self.samples[sample]['N'] > 1: 2160 self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']]) 2161 2162 self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']]) 2163 self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']]) 2164 2165 D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']] 2166 if len(D4x_pop) > 2: 2167 self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1] 2168 2169 if self.standardization_method == 'pooled': 2170 for sample in self.anchors: 2171 self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample] 2172 self.samples[sample][f'SE_D{self._4x}'] = 0. 2173 for sample in self.unknowns: 2174 self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}'] 2175 try: 2176 self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5 2177 except ValueError: 2178 # when `sample` is constrained by self.standardize(constraints = {...}), 2179 # it is no longer listed in self.standardization.var_names. 2180 # Temporary fix: define SE as zero for now 2181 self.samples[sample][f'SE_D{self._4x}'] = 0. 2182 2183 elif self.standardization_method == 'indep_sessions': 2184 for sample in self.anchors: 2185 self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample] 2186 self.samples[sample][f'SE_D{self._4x}'] = 0. 2187 for sample in self.unknowns: 2188 self.msg(f'Consolidating sample {sample}') 2189 self.unknowns[sample][f'session_D{self._4x}'] = {} 2190 session_avg = [] 2191 for session in self.sessions: 2192 sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample] 2193 if sdata: 2194 self.msg(f'{sample} found in session {session}') 2195 avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata]) 2196 avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata]) 2197 # !!
TODO: sigma_s below does not account for temporal changes in standardization error 2198 sigma_s = self.standardization_error(session, avg_d4x, avg_D4x) 2199 sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5 2200 session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5]) 2201 self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1] 2202 self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg)) 2203 weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']} 2204 wsum = sum([weights[s] for s in weights]) 2205 for s in weights: 2206 self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum] 2207 2208 for r in self: 2209 r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'] 2210 2211 2212 2213 def consolidate_sessions(self): 2214 ''' 2215 Compute various statistics for each session. 2216 2217 + `Na`: Number of anchor analyses in the session 2218 + `Nu`: Number of unknown analyses in the session 2219 + `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session 2220 + `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session 2221 + `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session 2222 + `a`: scrambling factor 2223 + `b`: compositional slope 2224 + `c`: WG offset 2225 + `SE_a`: Model standard error of `a` 2226 + `SE_b`: Model standard error of `b` 2227 + `SE_c`: Model standard error of `c` 2228 + `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`) 2229 + `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`) 2230 + `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`) 2231 + `a2`: scrambling factor drift 2232 + `b2`: compositional slope drift 2233 + `c2`: WG offset drift 2234 + `Np`: Number of standardization parameters to fit 2235 + `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`) 2236 + `d13Cwg_VPDB`: δ13C_VPDB of WG 2237 + `d18Owg_VSMOW`: δ18O_VSMOW of WG 2238 ''' 2239 for session in self.sessions: 2240 if 'd13Cwg_VPDB' not in self.sessions[session]: 2241 self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB'] 2242 if 'd18Owg_VSMOW' not in self.sessions[session]: 2243 self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW'] 2244 self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]) 2245 self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]) 2246 2247 self.msg(f'Computing repeatabilities for session {session}') 2248 self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session]) 2249 self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session]) 2250 self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session]) 2251 2252 if self.standardization_method == 'pooled': 2253 for session in self.sessions: 2254 2255 # different (better?)
computation of D4x repeatability for each session: 2256 sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']] 2257 self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5 2258 2259 self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}'] 2260 i = self.standardization.var_names.index(f'a_{pf(session)}') 2261 self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5 2262 2263 self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}'] 2264 i = self.standardization.var_names.index(f'b_{pf(session)}') 2265 self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5 2266 2267 self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}'] 2268 i = self.standardization.var_names.index(f'c_{pf(session)}') 2269 self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5 2270 2271 self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}'] 2272 if self.sessions[session]['scrambling_drift']: 2273 i = self.standardization.var_names.index(f'a2_{pf(session)}') 2274 self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5 2275 else: 2276 self.sessions[session]['SE_a2'] = 0. 2277 2278 self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}'] 2279 if self.sessions[session]['slope_drift']: 2280 i = self.standardization.var_names.index(f'b2_{pf(session)}') 2281 self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5 2282 else: 2283 self.sessions[session]['SE_b2'] = 0. 2284 2285 self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}'] 2286 if self.sessions[session]['wg_drift']: 2287 i = self.standardization.var_names.index(f'c2_{pf(session)}') 2288 self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5 2289 else: 2290 self.sessions[session]['SE_c2'] = 0. 
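# Assemble the 6x6 covariance matrix of (a, b, c, a2, b2, c2) from the pooled
# fit; drift parameters excluded from the fit keep zero (co)variances.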
2291 2292 i = self.standardization.var_names.index(f'a_{pf(session)}') 2293 j = self.standardization.var_names.index(f'b_{pf(session)}') 2294 k = self.standardization.var_names.index(f'c_{pf(session)}') 2295 CM = np.zeros((6,6)) 2296 CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]] 2297 try: 2298 i2 = self.standardization.var_names.index(f'a2_{pf(session)}') 2299 CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]] 2300 CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2] 2301 try: 2302 j2 = self.standardization.var_names.index(f'b2_{pf(session)}') 2303 CM[3,4] = self.standardization.covar[i2,j2] 2304 CM[4,3] = self.standardization.covar[j2,i2] 2305 except ValueError: 2306 pass 2307 try: 2308 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2309 CM[3,5] = self.standardization.covar[i2,k2] 2310 CM[5,3] = self.standardization.covar[k2,i2] 2311 except ValueError: 2312 pass 2313 except ValueError: 2314 pass 2315 try: 2316 j2 = self.standardization.var_names.index(f'b2_{pf(session)}') 2317 CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]] 2318 CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2] 2319 try: 2320 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2321 CM[4,5] = self.standardization.covar[j2,k2] 2322 CM[5,4] = self.standardization.covar[k2,j2] 2323 except ValueError: 2324 pass 2325 except ValueError: 2326 pass 2327 try: 2328 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2329 CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]] 2330 CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2] 2331 except ValueError: 2332 pass 2333 2334 self.sessions[session]['CM'] = CM 2335 2336 elif self.standardization_method == 'indep_sessions': 2337 pass # Not implemented yet 2338 2339 2340 @make_verbal 2341 def repeatabilities(self): 2342 ''' 2343 Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x 2344 (for all samples, for anchors, and for unknowns). 2345 ''' 2346 self.msg('Computing repeatabilities for all sessions') 2347 2348 self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors') 2349 self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors') 2350 self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors') 2351 self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns') 2352 self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples') 2353 2354 2355 @make_verbal 2356 def consolidate(self, tables = True, plots = True): 2357 ''' 2358 Collect information about samples, sessions and repeatabilities. 2359 ''' 2360 self.consolidate_samples() 2361 self.consolidate_sessions() 2362 self.repeatabilities() 2363 2364 if tables: 2365 self.summary() 2366 self.table_of_sessions() 2367 self.table_of_analyses() 2368 self.table_of_samples() 2369 2370 if plots: 2371 self.plot_sessions() 2372 2373 2374 @make_verbal 2375 def rmswd(self, 2376 samples = 'all samples', 2377 sessions = 'all sessions', 2378 ): 2379 ''' 2380 Compute the χ2, the root mean squared weighted deviation 2381 (i.e. the square root of the reduced χ2), and the corresponding degrees of freedom of the 2382 Δ4x values for samples in `samples` and sessions in `sessions`. 2383 2384 Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
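It can also be called directly; for example (assuming `mydata` was standardized with `method='indep_sessions'`):

```py
stats = mydata.rmswd(samples = 'anchors')
print(stats['rmswd'], stats['chisq'], stats['Nf'])
```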
2385 ''' 2386 if samples == 'all samples': 2387 mysamples = [k for k in self.samples] 2388 elif samples == 'anchors': 2389 mysamples = [k for k in self.anchors] 2390 elif samples == 'unknowns': 2391 mysamples = [k for k in self.unknowns] 2392 else: 2393 mysamples = samples 2394 2395 if sessions == 'all sessions': 2396 sessions = [k for k in self.sessions] 2397 2398 chisq, Nf = 0, 0 2399 for sample in mysamples : 2400 G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2401 if len(G) > 1 : 2402 X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G]) 2403 Nf += (len(G) - 1) 2404 chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G]) 2405 r = (chisq / Nf)**.5 if Nf > 0 else 0 2406 self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.') 2407 return {'rmswd': r, 'chisq': chisq, 'Nf': Nf} 2408 2409 2410 @make_verbal 2411 def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'): 2412 ''' 2413 Compute the repeatability of `[r[key] for r in self]` 2414 ''' 2415 2416 if samples == 'all samples': 2417 mysamples = [k for k in self.samples] 2418 elif samples == 'anchors': 2419 mysamples = [k for k in self.anchors] 2420 elif samples == 'unknowns': 2421 mysamples = [k for k in self.unknowns] 2422 else: 2423 mysamples = samples 2424 2425 if sessions == 'all sessions': 2426 sessions = [k for k in self.sessions] 2427 2428 if key in ['D47', 'D48']: 2429 # Full disclosure: the definition of Nf is tricky/debatable 2430 G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions] 2431 chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum() 2432 Nf = len(G) 2433# print(f'len(G) = {Nf}') 2434 Nf -= len([s for s in mysamples if s in self.unknowns]) 2435# print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider') 2436 for session in sessions: 2437 Np = len([ 2438 _ for _ in self.standardization.params 2439 if ( 2440 self.standardization.params[_].expr is not None 2441 and ( 2442 (_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session)) 2443 or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session)) 2444 ) 2445 ) 2446 ]) 2447# print(f'session {session}: {Np} parameters to consider') 2448 Na = len({ 2449 r['Sample'] for r in self.sessions[session]['data'] 2450 if r['Sample'] in self.anchors and r['Sample'] in mysamples 2451 }) 2452# print(f'session {session}: {Na} different anchors in that session') 2453 Nf -= min(Np, Na) 2454# print(f'Nf = {Nf}') 2455 2456# for sample in mysamples : 2457# X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2458# if len(X) > 1 : 2459# chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ]) 2460# if sample in self.unknowns: 2461# Nf += len(X) - 1 2462# else: 2463# Nf += len(X) 2464# if samples in ['anchors', 'all samples']: 2465# Nf -= sum([self.sessions[s]['Np'] for s in sessions]) 2466 r = (chisq / Nf)**.5 if Nf > 0 else 0 2467 2468 else: # if key not in ['D47', 'D48'] 2469 chisq, Nf = 0, 0 2470 for sample in mysamples : 2471 X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2472 if len(X) > 1 : 2473 Nf += len(X) - 1 2474 chisq += np.sum([ (x-np.mean(X))**2 for x in X ]) 2475 r = (chisq / Nf)**.5 if Nf > 0 else 0 2476 2477 self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.') 2478 return r 2479 2480 def sample_average(self, samples, weights = 'equal', normalize = True): 2481 ''' 2482 Weighted average Δ4x value of a group of samples, 
accounting for covariance. 2483 2484 Returns the weighed average Δ4x value and associated SE 2485 of a group of samples. Weights are equal by default. If `normalize` is 2486 true, `weights` will be rescaled so that their sum equals 1. 2487 2488 **Examples** 2489 2490 ```python 2491 self.sample_average(['X','Y'], [1, 2]) 2492 ``` 2493 2494 returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3, 2495 where Δ4x(X) and Δ4x(Y) are the average Δ4x 2496 values of samples X and Y, respectively. 2497 2498 ```python 2499 self.sample_average(['X','Y'], [1, -1], normalize = False) 2500 ``` 2501 2502 returns the value and SE of the difference Δ4x(X) - Δ4x(Y). 2503 ''' 2504 if weights == 'equal': 2505 weights = [1/len(samples)] * len(samples) 2506 2507 if normalize: 2508 s = sum(weights) 2509 if s: 2510 weights = [w/s for w in weights] 2511 2512 try: 2513# indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples] 2514# C = self.standardization.covar[indices,:][:,indices] 2515 C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples]) 2516 X = [self.samples[sample][f'D{self._4x}'] for sample in samples] 2517 return correlated_sum(X, C, weights) 2518 except ValueError: 2519 return (0., 0.) 2520 2521 2522 def sample_D4x_covar(self, sample1, sample2 = None): 2523 ''' 2524 Covariance between Δ4x values of samples 2525 2526 Returns the error covariance between the average Δ4x values of two 2527 samples. If if only `sample_1` is specified, or if `sample_1 == sample_2`), 2528 returns the Δ4x variance for that sample. 2529 ''' 2530 if sample2 is None: 2531 sample2 = sample1 2532 if self.standardization_method == 'pooled': 2533 i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}') 2534 j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}') 2535 return self.standardization.covar[i, j] 2536 elif self.standardization_method == 'indep_sessions': 2537 if sample1 == sample2: 2538 return self.samples[sample1][f'SE_D{self._4x}']**2 2539 else: 2540 c = 0 2541 for session in self.sessions: 2542 sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1] 2543 sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2] 2544 if sdata1 and sdata2: 2545 a = self.sessions[session]['a'] 2546 # !! TODO: CM below does not account for temporal changes in standardization parameters 2547 CM = self.sessions[session]['CM'][:3,:3] 2548 avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1]) 2549 avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1]) 2550 avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2]) 2551 avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2]) 2552 c += ( 2553 self.unknowns[sample1][f'session_D{self._4x}'][session][2] 2554 * self.unknowns[sample2][f'session_D{self._4x}'][session][2] 2555 * np.array([[avg_D4x_1, avg_d4x_1, 1]]) 2556 @ CM 2557 @ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T 2558 ) / a**2 2559 return float(c) 2560 2561 def sample_D4x_correl(self, sample1, sample2 = None): 2562 ''' 2563 Correlation between Δ4x errors of samples 2564 2565 Returns the error correlation between the average Δ4x values of two samples. 2566 ''' 2567 if sample2 is None or sample2 == sample1: 2568 return 1. 
2569 return ( 2570 self.sample_D4x_covar(sample1, sample2) 2571 / self.unknowns[sample1][f'SE_D{self._4x}'] 2572 / self.unknowns[sample2][f'SE_D{self._4x}'] 2573 ) 2574 2575 def plot_single_session(self, 2576 session, 2577 kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4), 2578 kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4), 2579 kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75), 2580 kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75), 2581 kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75), 2582 xylimits = 'free', # | 'constant' 2583 x_label = None, 2584 y_label = None, 2585 error_contour_interval = 'auto', 2586 fig = 'new', 2587 ): 2588 ''' 2589 Generate plot for a single session 2590 ''' 2591 if x_label is None: 2592 x_label = f'δ$_{{{self._4x}}}$ (‰)' 2593 if y_label is None: 2594 y_label = f'Δ$_{{{self._4x}}}$ (‰)' 2595 2596 out = _SessionPlot() 2597 anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]] 2598 unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]] 2599 anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors] 2600 anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors] 2601 unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns] 2602 unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns] 2603 anchor_avg = (np.array([ np.array([ 2604 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2605 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2606 ]) for sample in anchors]).T, 2607 np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T) 2608 unknown_avg = (np.array([ np.array([ 2609 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2610 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2611 ]) for sample in unknowns]).T, 2612 np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T) 2613 2614 2615 if fig == 'new': 2616 out.fig = ppl.figure(figsize = (6,6)) 2617 ppl.subplots_adjust(.1,.1,.9,.9) 2618 2619 out.anchor_analyses, = ppl.plot( 2620 anchors_d, 2621 anchors_D, 2622 **kw_plot_anchors) 2623 out.unknown_analyses, = ppl.plot( 2624 unknowns_d, 2625 unknowns_D, 2626 **kw_plot_unknowns) 2627 out.anchor_avg = ppl.plot( 2628 *anchor_avg, 2629 **kw_plot_anchor_avg) 2630 out.unknown_avg = ppl.plot( 2631 *unknown_avg, 2632 **kw_plot_unknown_avg) 2633 if xylimits == 'constant': 2634 x = [r[f'd{self._4x}'] for r in self] 2635 y = [r[f'D{self._4x}'] for r in self] 2636 x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y) 2637 w, h = x2-x1, y2-y1 2638 x1 -= w/20 2639 x2 += w/20 2640 y1 -= h/20 2641 y2 += h/20 2642 ppl.axis([x1, x2, y1, y2]) 2643 elif xylimits == 'free': 2644 x1, x2, y1, y2 = ppl.axis() 2645 else: 2646 x1, x2, y1, y2 = ppl.axis(xylimits) 2647 2648 if error_contour_interval != 'none': 2649 xi, yi = np.linspace(x1, x2), np.linspace(y1, y2) 2650 XI,YI = np.meshgrid(xi, yi) 2651 SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi]) 2652 if 
error_contour_interval == 'auto': 2653 rng = np.max(SI) - np.min(SI) 2654 if rng <= 0.01: 2655 cinterval = 0.001 2656 elif rng <= 0.03: 2657 cinterval = 0.004 2658 elif rng <= 0.1: 2659 cinterval = 0.01 2660 elif rng <= 0.3: 2661 cinterval = 0.03 2662 elif rng <= 1.: 2663 cinterval = 0.1 2664 else: 2665 cinterval = 0.5 2666 else: 2667 cinterval = error_contour_interval 2668 2669 cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval) 2670 out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error) 2671 out.clabel = ppl.clabel(out.contour) 2672 contour = (XI, YI, SI, cval, cinterval) 2673 2674 if fig == None: 2675 return { 2676 'anchors':anchors, 2677 'unknowns':unknowns, 2678 'anchors_d':anchors_d, 2679 'anchors_D':anchors_D, 2680 'unknowns_d':unknowns_d, 2681 'unknowns_D':unknowns_D, 2682 'anchor_avg':anchor_avg, 2683 'unknown_avg':unknown_avg, 2684 'contour':contour, 2685 } 2686 2687 ppl.xlabel(x_label) 2688 ppl.ylabel(y_label) 2689 ppl.title(session, weight = 'bold') 2690 ppl.grid(alpha = .2) 2691 out.ax = ppl.gca() 2692 2693 return out 2694 2695 def plot_residuals( 2696 self, 2697 kde = False, 2698 hist = False, 2699 binwidth = 2/3, 2700 dir = 'output', 2701 filename = None, 2702 highlight = [], 2703 colors = None, 2704 figsize = None, 2705 dpi = 100, 2706 yspan = None, 2707 ): 2708 ''' 2709 Plot residuals of each analysis as a function of time (actually, as a function of 2710 the order of analyses in the `D4xdata` object) 2711 2712 + `kde`: whether to add a kernel density estimate of residuals 2713 + `hist`: whether to add a histogram of residuals (incompatible with `kde`) 2714 + `histbins`: specify bin edges for the histogram 2715 + `dir`: the directory in which to save the plot 2716 + `highlight`: a list of samples to highlight 2717 + `colors`: a dict of `{<sample>: <color>}` for all samples 2718 + `figsize`: (width, height) of figure 2719 + `dpi`: resolution for PNG output 2720 + `yspan`: factor controlling the range of y values shown in plot 2721 (by default: `yspan = 1.5 if kde else 1.0`) 2722 ''' 2723 2724 from matplotlib import ticker 2725 2726 if yspan is None: 2727 if kde: 2728 yspan = 1.5 2729 else: 2730 yspan = 1.0 2731 2732 # Layout 2733 fig = ppl.figure(figsize = (8,4) if figsize is None else figsize) 2734 if hist or kde: 2735 ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72) 2736 ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15) 2737 else: 2738 ppl.subplots_adjust(.08,.05,.78,.8) 2739 ax1 = ppl.subplot(111) 2740 2741 # Colors 2742 N = len(self.anchors) 2743 if colors is None: 2744 if len(highlight) > 0: 2745 Nh = len(highlight) 2746 if Nh == 1: 2747 colors = {highlight[0]: (0,0,0)} 2748 elif Nh == 3: 2749 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])} 2750 elif Nh == 4: 2751 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2752 else: 2753 colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)} 2754 else: 2755 if N == 3: 2756 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])} 2757 elif N == 4: 2758 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2759 else: 2760 colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)} 2761 2762 ppl.sca(ax1) 2763 2764 ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75) 2765 2766 ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$')) 2767 2768 session = 
self[0]['Session'] 2769 x1 = 0 2770# ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self]) 2771 x_sessions = {} 2772 one_or_more_singlets = False 2773 one_or_more_multiplets = False 2774 multiplets = set() 2775 for k,r in enumerate(self): 2776 if r['Session'] != session: 2777 x2 = k-1 2778 x_sessions[session] = (x1+x2)/2 2779 ppl.axvline(k - 0.5, color = 'k', lw = .5) 2780 session = r['Session'] 2781 x1 = k 2782 singlet = len(self.samples[r['Sample']]['data']) == 1 2783 if not singlet: 2784 multiplets.add(r['Sample']) 2785 if r['Sample'] in self.unknowns: 2786 if singlet: 2787 one_or_more_singlets = True 2788 else: 2789 one_or_more_multiplets = True 2790 kw = dict( 2791 marker = 'x' if singlet else '+', 2792 ms = 4 if singlet else 5, 2793 ls = 'None', 2794 mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0), 2795 mew = 1, 2796 alpha = 0.2 if singlet else 1, 2797 ) 2798 if highlight and r['Sample'] not in highlight: 2799 kw['alpha'] = 0.2 2800 ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw) 2801 x2 = k 2802 x_sessions[session] = (x1+x2)/2 2803 2804 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1) 2805 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1) 2806 if not (hist or kde): 2807 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center') 2808 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f" 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center') 2809 2810 xmin, xmax, ymin, ymax = ppl.axis() 2811 if yspan != 1: 2812 ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2 2813 for s in x_sessions: 2814 ppl.text( 2815 x_sessions[s], 2816 ymax +1, 2817 s, 2818 va = 'bottom', 2819 **( 2820 dict(ha = 'center') 2821 if len(self.sessions[s]['data']) > (0.15 * len(self)) 2822 else dict(ha = 'left', rotation = 45) 2823 ) 2824 ) 2825 2826 if hist or kde: 2827 ppl.sca(ax2) 2828 2829 for s in colors: 2830 kw['marker'] = '+' 2831 kw['ms'] = 5 2832 kw['mec'] = colors[s] 2833 kw['label'] = s 2834 kw['alpha'] = 1 2835 ppl.plot([], [], **kw) 2836 2837 kw['mec'] = (0,0,0) 2838 2839 if one_or_more_singlets: 2840 kw['marker'] = 'x' 2841 kw['ms'] = 4 2842 kw['alpha'] = .2 2843 kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other' 2844 ppl.plot([], [], **kw) 2845 2846 if one_or_more_multiplets: 2847 kw['marker'] = '+' 2848 kw['ms'] = 4 2849 kw['alpha'] = 1 2850 kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other' 2851 ppl.plot([], [], **kw) 2852 2853 if hist or kde: 2854 leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9) 2855 else: 2856 leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5) 2857 leg.set_zorder(-1000) 2858 2859 ppl.sca(ax1) 2860 2861 ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)') 2862 ppl.xticks([]) 2863 ppl.axis([-1, len(self), None, None]) 2864 2865 if hist or kde: 2866 ppl.sca(ax2) 2867 X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors]) 2868 2869 if kde: 2870 from scipy.stats import 
gaussian_kde 2871 yi = np.linspace(ymin, ymax, 201) 2872 xi = gaussian_kde(X).evaluate(yi) 2873 ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1)) 2874# ppl.plot(xi, yi, 'k-', lw = 1) 2875 elif hist: 2876 ppl.hist( 2877 X, 2878 orientation = 'horizontal', 2879 histtype = 'stepfilled', 2880 ec = [.4]*3, 2881 fc = [.25]*3, 2882 alpha = .25, 2883 bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)), 2884 ) 2885 ppl.text(0, 0, 2886 f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", 2887 size = 7.5, 2888 alpha = 1, 2889 va = 'center', 2890 ha = 'left', 2891 ) 2892 2893 ppl.axis([0, None, ymin, ymax]) 2894 ppl.xticks([]) 2895 ppl.yticks([]) 2896# ax2.spines['left'].set_visible(False) 2897 ax2.spines['right'].set_visible(False) 2898 ax2.spines['top'].set_visible(False) 2899 ax2.spines['bottom'].set_visible(False) 2900 2901 ax1.axis([None, None, ymin, ymax]) 2902 2903 if not os.path.exists(dir): 2904 os.makedirs(dir) 2905 if filename is None: 2906 return fig 2907 elif filename == '': 2908 filename = f'D{self._4x}_residuals.pdf' 2909 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 2910 ppl.close(fig) 2911 2912 2913 def simulate(self, *args, **kwargs): 2914 ''' 2915 Legacy function with warning message pointing to `virtual_data()` 2916 ''' 2917 raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()') 2918 2919 def plot_distribution_of_analyses( 2920 self, 2921 dir = 'output', 2922 filename = None, 2923 vs_time = False, 2924 figsize = (6,4), 2925 subplots_adjust = (0.02, 0.13, 0.85, 0.8), 2926 output = None, 2927 dpi = 100, 2928 ): 2929 ''' 2930 Plot temporal distribution of all analyses in the data set. 2931 2932 **Parameters** 2933 2934 + `dir`: the directory in which to save the plot 2935 + `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially. 
2936 + `dpi`: resolution for PNG output 2937 + `figsize`: (width, height) of figure 2938 + `dpi`: resolution for PNG output 2939 ''' 2940 2941 asamples = [s for s in self.anchors] 2942 usamples = [s for s in self.unknowns] 2943 if output is None or output == 'fig': 2944 fig = ppl.figure(figsize = figsize) 2945 ppl.subplots_adjust(*subplots_adjust) 2946 Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)]) 2947 Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)]) 2948 Xmax += (Xmax-Xmin)/40 2949 Xmin -= (Xmax-Xmin)/41 2950 for k, s in enumerate(asamples + usamples): 2951 if vs_time: 2952 X = [r['TimeTag'] for r in self if r['Sample'] == s] 2953 else: 2954 X = [x for x,r in enumerate(self) if r['Sample'] == s] 2955 Y = [-k for x in X] 2956 ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75) 2957 ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25) 2958 ppl.text(Xmax, -k, f' {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r') 2959 ppl.axis([Xmin, Xmax, -k-1, 1]) 2960 ppl.xlabel('\ntime') 2961 ppl.gca().annotate('', 2962 xy = (0.6, -0.02), 2963 xycoords = 'axes fraction', 2964 xytext = (.4, -0.02), 2965 arrowprops = dict(arrowstyle = "->", color = 'k'), 2966 ) 2967 2968 2969 x2 = -1 2970 for session in self.sessions: 2971 x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session]) 2972 if vs_time: 2973 ppl.axvline(x1, color = 'k', lw = .75) 2974 if x2 > -1: 2975 if not vs_time: 2976 ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5) 2977 x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session]) 2978# from xlrd import xldate_as_datetime 2979# print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0)) 2980 if vs_time: 2981 ppl.axvline(x2, color = 'k', lw = .75) 2982 ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15) 2983 ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8) 2984 2985 ppl.xticks([]) 2986 ppl.yticks([]) 2987 2988 if output is None: 2989 if not os.path.exists(dir): 2990 os.makedirs(dir) 2991 if filename == None: 2992 filename = f'D{self._4x}_distribution_of_analyses.pdf' 2993 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 2994 ppl.close(fig) 2995 elif output == 'ax': 2996 return ppl.gca() 2997 elif output == 'fig': 2998 return fig 2999 3000 3001 def plot_bulk_compositions( 3002 self, 3003 samples = None, 3004 dir = 'output/bulk_compositions', 3005 figsize = (6,6), 3006 subplots_adjust = (0.15, 0.12, 0.95, 0.92), 3007 show = False, 3008 sample_color = (0,.5,1), 3009 analysis_color = (.7,.7,.7), 3010 labeldist = 0.3, 3011 radius = 0.05, 3012 ): 3013 ''' 3014 Plot δ13C_VBDP vs δ18O_VSMOW (of CO2) for all analyses. 3015 3016 By default, creates a directory `./output/bulk_compositions` where plots for 3017 each sample are saved. Another plot named `__all__.pdf` shows all analyses together. 3018 3019 3020 **Parameters** 3021 3022 + `samples`: Only these samples are processed (by default: all samples). 3023 + `dir`: where to save the plots 3024 + `figsize`: (width, height) of figure 3025 + `subplots_adjust`: passed to `subplots_adjust()` 3026 + `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples, 3027 allowing for interactive visualization/exploration in (δ13C, δ18O) space. 
3028 + `sample_color`: color used for replicate markers/labels 3029 + `analysis_color`: color used for sample markers/labels 3030 + `labeldist`: distance (in inches) from replicate markers to replicate labels 3031 + `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`. 3032 ''' 3033 3034 from matplotlib.patches import Ellipse 3035 3036 if samples is None: 3037 samples = [_ for _ in self.samples] 3038 3039 saved = {} 3040 3041 for s in samples: 3042 3043 fig = ppl.figure(figsize = figsize) 3044 fig.subplots_adjust(*subplots_adjust) 3045 ax = ppl.subplot(111) 3046 ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)') 3047 ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)') 3048 ppl.title(s) 3049 3050 3051 XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']]) 3052 UID = [_['UID'] for _ in self.samples[s]['data']] 3053 XY0 = XY.mean(0) 3054 3055 for xy in XY: 3056 ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color) 3057 3058 ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color) 3059 ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color) 3060 ppl.text(*XY0, f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold') 3061 saved[s] = [XY, XY0] 3062 3063 x1, x2, y1, y2 = ppl.axis() 3064 x0, dx = (x1+x2)/2, (x2-x1)/2 3065 y0, dy = (y1+y2)/2, (y2-y1)/2 3066 dx, dy = [max(max(dx, dy), radius)]*2 3067 3068 ppl.axis([ 3069 x0 - 1.2*dx, 3070 x0 + 1.2*dx, 3071 y0 - 1.2*dy, 3072 y0 + 1.2*dy, 3073 ]) 3074 3075 XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0)) 3076 3077 for xy, uid in zip(XY, UID): 3078 3079 xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy)) 3080 vector_in_display_space = xy_in_display_space - XY0_in_display_space 3081 3082 if (vector_in_display_space**2).sum() > 0: 3083 3084 unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5 3085 label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist 3086 label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space 3087 label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space)) 3088 3089 ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color) 3090 3091 else: 3092 3093 ppl.text(*xy, f'{uid} ', va = 'center', ha = 'right', color = analysis_color) 3094 3095 if radius: 3096 ax.add_artist(Ellipse( 3097 xy = XY0, 3098 width = radius*2, 3099 height = radius*2, 3100 ls = (0, (2,2)), 3101 lw = .7, 3102 ec = analysis_color, 3103 fc = 'None', 3104 )) 3105 ppl.text( 3106 XY0[0], 3107 XY0[1]-radius, 3108 f'\n± {radius*1e3:.0f} ppm', 3109 color = analysis_color, 3110 va = 'top', 3111 ha = 'center', 3112 linespacing = 0.4, 3113 size = 8, 3114 ) 3115 3116 if not os.path.exists(dir): 3117 os.makedirs(dir) 3118 fig.savefig(f'{dir}/{s}.pdf') 3119 ppl.close(fig) 3120 3121 fig = ppl.figure(figsize = figsize) 3122 fig.subplots_adjust(*subplots_adjust) 3123 ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)') 3124 ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)') 3125 3126 for s in saved: 3127 for xy in saved[s][0]: 3128 ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color) 3129 ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color) 3130 ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color) 3131 ppl.text(*saved[s][1], f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold') 
3132 3133 x1, x2, y1, y2 = ppl.axis() 3134 ppl.axis([ 3135 x1 - (x2-x1)/10, 3136 x2 + (x2-x1)/10, 3137 y1 - (y2-y1)/10, 3138 y2 + (y2-y1)/10, 3139 ]) 3140 3141 3142 if not os.path.exists(dir): 3143 os.makedirs(dir) 3144 fig.savefig(f'{dir}/__all__.pdf') 3145 if show: 3146 ppl.show() 3147 ppl.close(fig) 3148 3149 3150 def _save_D4x_correl( 3151 self, 3152 samples = None, 3153 dir = 'output', 3154 filename = None, 3155 D4x_precision = 4, 3156 correl_precision = 4, 3157 ): 3158 ''' 3159 Save D4x values along with their SE and correlation matrix. 3160 3161 **Parameters** 3162 3163 + `samples`: Only these samples are output (by default: all samples). 3164 + `dir`: the directory in which to save the faile (by defaut: `output`) 3165 + `filename`: the name to the csv file to write to (by default: `D4x_correl.csv`) 3166 + `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4) 3167 + `correl_precision`: the precision to use when writing correlation factor values (by default: 4) 3168 ''' 3169 if samples is None: 3170 samples = sorted([s for s in self.unknowns]) 3171 3172 out = [['Sample']] + [[s] for s in samples] 3173 out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl'] 3174 for k,s in enumerate(samples): 3175 out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.4f}', f'{self.samples[s][f"SE_D{self._4x}"]:.4f}'] 3176 for s2 in samples: 3177 out[k+1] += [f'{self.sample_D4x_correl(s,s2):.4f}'] 3178 3179 if not os.path.exists(dir): 3180 os.makedirs(dir) 3181 if filename is None: 3182 filename = f'D{self._4x}_correl.csv' 3183 with open(f'{dir}/{filename}', 'w') as fid: 3184 fid.write(make_csv(out)) 3185 3186 3187 3188 3189class D47data(D4xdata): 3190 ''' 3191 Store and process data for a large set of Δ47 analyses, 3192 usually comprising more than one analytical session. 3193 ''' 3194 3195 Nominal_D4x = { 3196 'ETH-1': 0.2052, 3197 'ETH-2': 0.2085, 3198 'ETH-3': 0.6132, 3199 'ETH-4': 0.4511, 3200 'IAEA-C1': 0.3018, 3201 'IAEA-C2': 0.6409, 3202 'MERCK': 0.5135, 3203 } # I-CDES (Bernasconi et al., 2021) 3204 ''' 3205 Nominal Δ47 values assigned to the Δ47 anchor samples, used by 3206 `D47data.standardize()` to normalize unknown samples to an absolute Δ47 3207 reference frame. 3208 3209 By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)): 3210 ```py 3211 { 3212 'ETH-1' : 0.2052, 3213 'ETH-2' : 0.2085, 3214 'ETH-3' : 0.6132, 3215 'ETH-4' : 0.4511, 3216 'IAEA-C1' : 0.3018, 3217 'IAEA-C2' : 0.6409, 3218 'MERCK' : 0.5135, 3219 } 3220 ``` 3221 ''' 3222 3223 3224 @property 3225 def Nominal_D47(self): 3226 return self.Nominal_D4x 3227 3228 3229 @Nominal_D47.setter 3230 def Nominal_D47(self, new): 3231 self.Nominal_D4x = dict(**new) 3232 self.refresh() 3233 3234 3235 def __init__(self, l = [], **kwargs): 3236 ''' 3237 **Parameters:** same as `D4xdata.__init__()` 3238 ''' 3239 D4xdata.__init__(self, l = l, mass = '47', **kwargs) 3240 3241 3242 def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'): 3243 ''' 3244 Find all samples for which `Teq` is specified, compute equilibrium Δ47 3245 value for that temperature, and add treat these samples as additional anchors. 3246 3247 **Parameters** 3248 3249 + `fCo2eqD47`: Which CO2 equilibrium law to use 3250 (`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127); 3251 `wang`: [Wang et al. (2019)](https://doi.org/10.1016/j.gca.2004.05.039)). 
3252 + `priority`: if `replace`: forget old anchors and only use the new ones; 3253 if `new`: keep pre-existing anchors but update them in case of conflict 3254 between old and new Δ47 values; 3255 if `old`: keep pre-existing anchors but preserve their original Δ47 3256 values in case of conflict. 3257 ''' 3258 f = { 3259 'petersen': fCO2eqD47_Petersen, 3260 'wang': fCO2eqD47_Wang, 3261 }[fCo2eqD47] 3262 foo = {} 3263 for r in self: 3264 if 'Teq' in r: 3265 if r['Sample'] in foo: 3266 assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.' 3267 else: 3268 foo[r['Sample']] = f(r['Teq']) 3269 else: 3270 assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.' 3271 3272 if priority == 'replace': 3273 self.Nominal_D47 = {} 3274 for s in foo: 3275 if priority != 'old' or s not in self.Nominal_D47: 3276 self.Nominal_D47[s] = foo[s] 3277 3278 def save_D47_correl(self, *args, **kwargs): 3279 return self._save_D4x_correl(*args, **kwargs) 3280 3281 save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47') 3282 3283 3284class D48data(D4xdata): 3285 ''' 3286 Store and process data for a large set of Δ48 analyses, 3287 usually comprising more than one analytical session. 3288 ''' 3289 3290 Nominal_D4x = { 3291 'ETH-1': 0.138, 3292 'ETH-2': 0.138, 3293 'ETH-3': 0.270, 3294 'ETH-4': 0.223, 3295 'GU-1': -0.419, 3296 } # (Fiebig et al., 2019, 2021) 3297 ''' 3298 Nominal Δ48 values assigned to the Δ48 anchor samples, used by 3299 `D48data.standardize()` to normalize unknown samples to an absolute Δ48 3300 reference frame. 3301 3302 By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019), 3303 [Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)): 3304 3305 ```py 3306 { 3307 'ETH-1' : 0.138, 3308 'ETH-2' : 0.138, 3309 'ETH-3' : 0.270, 3310 'ETH-4' : 0.223, 3311 'GU-1' : -0.419, 3312 } 3313 ``` 3314 ''' 3315 3316 3317 @property 3318 def Nominal_D48(self): 3319 return self.Nominal_D4x 3320 3321 3322 @Nominal_D48.setter 3323 def Nominal_D48(self, new): 3324 self.Nominal_D4x = dict(**new) 3325 self.refresh() 3326 3327 3328 def __init__(self, l = [], **kwargs): 3329 ''' 3330 **Parameters:** same as `D4xdata.__init__()` 3331 ''' 3332 D4xdata.__init__(self, l = l, mass = '48', **kwargs) 3333 3334 def save_D48_correl(self, *args, **kwargs): 3335 return self._save_D4x_correl(*args, **kwargs) 3336 3337 save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48') 3338 3339 3340class D49data(D4xdata): 3341 ''' 3342 Store and process data for a large set of Δ49 analyses, 3343 usually comprising more than one analytical session. 3344 ''' 3345 3346 Nominal_D4x = {"1000C": 0.0, "25C": 2.228} # Wang 2004 3347 ''' 3348 Nominal Δ49 values assigned to the Δ49 anchor samples, used by 3349 `D49data.standardize()` to normalize unknown samples to an absolute Δ49 3350 reference frame. 3351 3352 By default equal to (after [Wang et al. 
(2004)](https://doi.org/10.1016/j.gca.2004.05.039)): 3353 3354 ```py 3355 { 3356 "1000C": 0.0, 3357 "25C": 2.228 3358 } 3359 ``` 3360 ''' 3361 3362 @property 3363 def Nominal_D49(self): 3364 return self.Nominal_D4x 3365 3366 @Nominal_D49.setter 3367 def Nominal_D49(self, new): 3368 self.Nominal_D4x = dict(**new) 3369 self.refresh() 3370 3371 def __init__(self, l=[], **kwargs): 3372 ''' 3373 **Parameters:** same as `D4xdata.__init__()` 3374 ''' 3375 D4xdata.__init__(self, l=l, mass='49', **kwargs) 3376 3377 def save_D49_correl(self, *args, **kwargs): 3378 return self._save_D4x_correl(*args, **kwargs) 3379 3380 save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49') 3381 3382class _SessionPlot(): 3383 ''' 3384 Simple placeholder class 3385 ''' 3386 def __init__(self): 3387 pass 3388 3389_app = typer.Typer( 3390 add_completion = False, 3391 context_settings={'help_option_names': ['-h', '--help']}, 3392 rich_markup_mode = 'rich', 3393 ) 3394 3395@_app.command() 3396def _cli( 3397 rawdata: Annotated[str, typer.Argument(help = "Specify the path of a rawdata input file")], 3398 exclude: Annotated[str, typer.Option('--exclude', '-e', help = 'The path of a file specifying UIDs and/or Samples to exclude')] = 'none', 3399 anchors: Annotated[str, typer.Option('--anchors', '-a', help = 'The path of a file specifying custom anchors')] = 'none', 3400 output_dir: Annotated[str, typer.Option('--output-dir', '-o', help = 'Specify the output directory')] = 'output', 3401 run_D48: Annotated[bool, typer.Option('--D48', help = 'Also standardize D48')] = False, 3402 ): 3403 """ 3404 Process raw D47 data and return standardized results. 3405 3406 See [b]https://mdaeron.github.io/D47crunch/#3-command-line-interface-cli[/b] for more details. 3407 3408 Reads raw data from an input file, optionally excluding some samples and/or analyses, thean standardizes 3409 the data based either on the default [b]d13C_VPDB[/b], [b]d18O_VPDB[/b], [b]D47[/b], and [b]D48[/b] anchors or on different 3410 user-specified anchors. A new directory (named `output` by default) is created to store the results and 3411 the following sequence is applied: 3412 3413 * [b]D47data.wg()[/b] 3414 * [b]D47data.crunch()[/b] 3415 * [b]D47data.standardize()[/b] 3416 * [b]D47data.summary()[/b] 3417 * [b]D47data.table_of_samples()[/b] 3418 * [b]D47data.table_of_sessions()[/b] 3419 * [b]D47data.plot_sessions()[/b] 3420 * [b]D47data.plot_residuals()[/b] 3421 * [b]D47data.table_of_analyses()[/b] 3422 * [b]D47data.plot_distribution_of_analyses()[/b] 3423 * [b]D47data.plot_bulk_compositions()[/b] 3424 * [b]D47data.save_D47_correl()[/b] 3425 3426 Optionally, also apply similar methods for [b]]D48[/b]. 3427 3428 [b]Example CSV file for --anchors option:[/b] 3429 [i] 3430 Sample, d13C_VPDB, d18O_VPDB, D47, D48 3431 ETH-1, 2.02, -2.19, 0.2052, 0.138 3432 ETH-2, -10.17, -18.69, 0.2085, 0.138 3433 ETH-3, 1.71, -1.78, 0.6132, 0.270 3434 ETH-4, , , 0.4511, 0.223 3435 [/i] 3436 Except for [i]Sample[/i], none of the columns above are mandatory. 3437 3438 [b]Example CSV file for --exclude option:[/b] 3439 [i] 3440 Sample, UID 3441 FOO-1, 3442 BAR-2, 3443 , A04 3444 , A17 3445 , A88 3446 [/i] 3447 This will exclude all analyses of samples [i]FOO-1[/i] and [i]BAR-2[/i], 3448 and the analyses with UIDs [i]A04[/i], [i]A17[/i], and [i]A88[/i]. 3449 Neither column is mandatory. 
3450 """ 3451 3452 data = D47data() 3453 data.read(rawdata) 3454 3455 if exclude != 'none': 3456 exclude = read_csv(exclude) 3457 exclude_uid = {r['UID'] for r in exclude if 'UID' in r} 3458 exclude_sample = {r['Sample'] for r in exclude if 'Sample' in r} 3459 else: 3460 exclude_uid = [] 3461 exclude_sample = [] 3462 3463 data = D47data([r for r in data if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample]) 3464 3465 if anchors != 'none': 3466 anchors = read_csv(anchors) 3467 if len([_ for _ in anchors if 'd13C_VPDB' in _]): 3468 data.Nominal_d13C_VPDB = { 3469 _['Sample']: _['d13C_VPDB'] 3470 for _ in anchors 3471 if 'd13C_VPDB' in _ 3472 } 3473 if len([_ for _ in anchors if 'd18O_VPDB' in _]): 3474 data.Nominal_d18O_VPDB = { 3475 _['Sample']: _['d18O_VPDB'] 3476 for _ in anchors 3477 if 'd18O_VPDB' in _ 3478 } 3479 if len([_ for _ in anchors if 'D47' in _]): 3480 data.Nominal_D4x = { 3481 _['Sample']: _['D47'] 3482 for _ in anchors 3483 if 'D47' in _ 3484 } 3485 3486 data.refresh() 3487 data.wg() 3488 data.crunch() 3489 data.standardize() 3490 data.summary(dir = output_dir) 3491 data.plot_residuals(dir = output_dir, filename = 'D47_residuals.pdf', kde = True) 3492 data.plot_bulk_compositions(dir = output_dir + '/bulk_compositions') 3493 data.plot_sessions(dir = output_dir) 3494 data.save_D47_correl(dir = output_dir) 3495 3496 if not run_D48: 3497 data.table_of_samples(dir = output_dir) 3498 data.table_of_analyses(dir = output_dir) 3499 data.table_of_sessions(dir = output_dir) 3500 3501 3502 if run_D48: 3503 data2 = D48data() 3504 print(rawdata) 3505 data2.read(rawdata) 3506 3507 data2 = D48data([r for r in data2 if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample]) 3508 3509 if anchors != 'none': 3510 if len([_ for _ in anchors if 'd13C_VPDB' in _]): 3511 data2.Nominal_d13C_VPDB = { 3512 _['Sample']: _['d13C_VPDB'] 3513 for _ in anchors 3514 if 'd13C_VPDB' in _ 3515 } 3516 if len([_ for _ in anchors if 'd18O_VPDB' in _]): 3517 data2.Nominal_d18O_VPDB = { 3518 _['Sample']: _['d18O_VPDB'] 3519 for _ in anchors 3520 if 'd18O_VPDB' in _ 3521 } 3522 if len([_ for _ in anchors if 'D48' in _]): 3523 data2.Nominal_D4x = { 3524 _['Sample']: _['D48'] 3525 for _ in anchors 3526 if 'D48' in _ 3527 } 3528 3529 data2.refresh() 3530 data2.wg() 3531 data2.crunch() 3532 data2.standardize() 3533 data2.summary(dir = output_dir) 3534 data2.plot_sessions(dir = output_dir) 3535 data2.plot_residuals(dir = output_dir, filename = 'D48_residuals.pdf', kde = True) 3536 data2.plot_distribution_of_analyses(dir = output_dir) 3537 data2.save_D48_correl(dir = output_dir) 3538 3539 table_of_analyses(data, data2, dir = output_dir) 3540 table_of_samples(data, data2, dir = output_dir) 3541 table_of_sessions(data, data2, dir = output_dir) 3542 3543def __cli(): 3544 _app()
```py
def fCO2eqD47_Petersen(T):
	'''
	CO2 equilibrium Δ47 value as a function of T (in degrees C)
	according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
	'''
	return float(_fCO2eqD47_Petersen(T))
```
```py
def fCO2eqD47_Wang(T):
	'''
	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
	according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
	(supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
	'''
	return float(_fCO2eqD47_Wang(T))
```
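Since both laws are exposed as plain functions of temperature, comparing them is straightforward. A minimal sketch (the printed numbers are whatever the two calibrations return; none are hard-coded here):

```py
from D47crunch import fCO2eqD47_Petersen, fCO2eqD47_Wang

# Equilibrium Δ47 of CO2 at 25 °C according to each calibration:
for f in (fCO2eqD47_Petersen, fCO2eqD47_Wang):
	print(f.__name__, f'{f(25):.4f} ‰')
```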
````py
def make_csv(x, hsep = ',', vsep = '\n'):
	'''
	Formats a list of lists of strings as a CSV

	**Parameters**

	+ `x`: the list of lists of strings to format
	+ `hsep`: the field separator (`,` by default)
	+ `vsep`: the line-ending convention to use (`\\n` by default)

	**Example**

	```py
	print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
	```

	outputs:

	```py
	a,b,c
	d,e,f
	```
	'''
	return vsep.join([hsep.join(l) for l in x])
````
```py
def pf(txt):
	'''
	Modify string `txt` to follow `lmfit.Parameter()` naming rules.
	'''
	return txt.replace('-','_').replace('.','_').replace(' ','_')
```
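A quick illustration with a hypothetical session name:

```py
from D47crunch import pf

# '-', '.', and ' ' are not valid in lmfit parameter names,
# so pf() maps each of them to an underscore:
print(pf('Session 01.b')) # prints: Session_01_b
```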
```py
def smart_type(x):
	'''
	Tries to convert string `x` to a float if it includes a decimal point, or
	to an integer if it does not. If both attempts fail, return the original
	string unchanged.
	'''
	try:
		y = float(x)
	except ValueError:
		return x
	if '.' not in x:
		return int(y)
	return y
```
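For example:

```py
from D47crunch import smart_type

print(smart_type('ETH-1')) # ETH-1 (not a number: returned unchanged)
print(smart_type('42'))    # 42 (no decimal point: converted to int)
print(smart_type('42.5'))  # 42.5 (decimal point: converted to float)
```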
````py
def pretty_table(x, header = 1, hsep = ' ', vsep = None, align = '<'):
	'''
	Reads a list of lists of strings and outputs an ascii table

	**Parameters**

	+ `x`: a list of lists of strings
	+ `header`: the number of lines to treat as header lines
	+ `hsep`: the horizontal separator between columns
	+ `vsep`: the character to use as vertical separator
	+ `align`: string of left (`<`) or right (`>`) alignment characters.

	**Example**

	```py
	print(pretty_table([
		['A', 'B', 'C'],
		['1', '1.9999', 'foo'],
		['10', 'x', 'bar'],
	]))
	```
	yields:
	```
	—— —————— ———
	A  B      C
	—— —————— ———
	1  1.9999 foo
	10 x      bar
	—— —————— ———
	```

	To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`:

	```py
	D47crunch_defaults.PRETTY_TABLE_VSEP = '='
	print(pretty_table([
		['A', 'B', 'C'],
		['1', '1.9999', 'foo'],
		['10', 'x', 'bar'],
	]))
	```
	yields:
	```
	== ====== ===
	A  B      C
	== ====== ===
	1  1.9999 foo
	10 x      bar
	== ====== ===
	```
	'''

	if vsep is None:
		vsep = D47crunch_defaults.PRETTY_TABLE_VSEP

	txt = []
	widths = [np.max([len(e) for e in c]) for c in zip(*x)]

	if len(widths) > len(align):
		align += '>' * (len(widths)-len(align))
	sepline = hsep.join([vsep*w for w in widths])
	txt += [sepline]
	for k,l in enumerate(x):
		if k and k == header:
			txt += [sepline]
		txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
	txt += [sepline]
	txt += ['']
	return '\n'.join(txt)
````
````py
def transpose_table(x):
	'''
	Transpose a list of lists

	**Parameters**

	+ `x`: a list of lists

	**Example**

	```py
	x = [[1, 2], [3, 4]]
	print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
	```
	'''
	return [[e for e in c] for c in zip(*x)]
````
````py
def w_avg(X, sX):
	'''
	Compute variance-weighted average

	Returns the value and SE of the weighted average of the elements of `X`,
	with relative weights equal to their inverse variances (`1/sX**2`).

	**Parameters**

	+ `X`: array-like of elements to average
	+ `sX`: array-like of the corresponding SE values

	**Tip**

	If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
	they may be rearranged using `zip()`:

	```python
	foo = [(0, 1), (1, 0.5), (2, 0.5)]
	print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
	```
	'''
	X = [ x for x in X ]
	sX = [ sx for sx in sX ]
	W = [ sx**-2 for sx in sX ]
	W = [ w/sum(W) for w in W ]
	Xavg = sum([ w*x for w,x in zip(W,X) ])
	sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5
	return Xavg, sXavg
````
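To make the docstring tip concrete: with weights proportional to `1/sX**2`, the three values above carry normalized weights 1/9, 4/9 and 4/9, so the weighted average is 0(1/9) + 1(4/9) + 2(4/9) = 4/3 ≈ 1.3333, and its SE is ((1/9)**2 * 1 + (4/9)**2 * 0.25 + (4/9)**2 * 0.25)**0.5 = 1/3, matching the printed output. Equivalently, the SE of a variance-weighted average reduces to (sum of all `1/sx**2`)**-0.5, here (1 + 4 + 4)**-0.5 = 1/3.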
```py
def read_csv(filename, sep = ''):
	'''
	Read contents of `filename` in csv format and return a list of dictionaries.

	In the csv string, spaces before and after field separators (`','` by default)
	are optional.

	**Parameters**

	+ `filename`: the csv file to read
	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
	whichever appears most often in the contents of `filename`.
	'''
	with open(filename) as fid:
		txt = fid.read()

	if sep == '':
		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
	return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]]
```
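Because fields are typed with `smart_type()` and empty fields are skipped, the returned dictionaries only contain the columns that are actually filled in. A minimal sketch, assuming a hypothetical file `anchors.csv` following the `--anchors` layout shown in the CLI help text above:

```py
from D47crunch import read_csv

# Assuming 'anchors.csv' contains:
#
# Sample, d13C_VPDB, d18O_VPDB, D47
# ETH-1, 2.02, -2.19, 0.2052
# ETH-4, , , 0.4511

anchors = read_csv('anchors.csv')
print(anchors[0]) # {'Sample': 'ETH-1', 'd13C_VPDB': 2.02, 'd18O_VPDB': -2.19, 'D47': 0.2052}
print(anchors[1]) # {'Sample': 'ETH-4', 'D47': 0.4511} (empty fields are simply dropped)
```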
```py
def simulate_single_analysis(
	sample = 'MYSAMPLE',
	d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
	d13C_VPDB = None, d18O_VPDB = None,
	D47 = None, D48 = None, D49 = 0., D17O = 0.,
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	Nominal_D47 = None,
	Nominal_D48 = None,
	Nominal_d13C_VPDB = None,
	Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	):
	'''
	Compute working-gas delta values for a single analysis, assuming a stochastic working
	gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

	**Parameters**

	+ `sample`: sample name
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
	(respectively –4 and +26 ‰ by default)
	+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
	of the carbonate sample
	+ `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and
	Δ48 values if `D47` or `D48` are not specified
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and
	δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
	correction parameters (by default equal to the `D4xdata` default values)

	Returns a dictionary with fields
	`['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
	'''

	if Nominal_d13C_VPDB is None:
		Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB

	if Nominal_d18O_VPDB is None:
		Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB

	if ALPHA_18O_ACID_REACTION is None:
		ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION

	if R13_VPDB is None:
		R13_VPDB = D4xdata().R13_VPDB

	if R17_VSMOW is None:
		R17_VSMOW = D4xdata().R17_VSMOW

	if R18_VSMOW is None:
		R18_VSMOW = D4xdata().R18_VSMOW

	if LAMBDA_17 is None:
		LAMBDA_17 = D4xdata().LAMBDA_17

	if R18_VPDB is None:
		R18_VPDB = D4xdata().R18_VPDB

	R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17

	if Nominal_D47 is None:
		Nominal_D47 = D47data().Nominal_D47

	if Nominal_D48 is None:
		Nominal_D48 = D48data().Nominal_D48

	if d13C_VPDB is None:
		if sample in Nominal_d13C_VPDB:
			d13C_VPDB = Nominal_d13C_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")

	if d18O_VPDB is None:
		if sample in Nominal_d18O_VPDB:
			d18O_VPDB = Nominal_d18O_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")

	if D47 is None:
		if sample in Nominal_D47:
			D47 = Nominal_D47[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")

	if D48 is None:
		if sample in Nominal_D48:
			D48 = Nominal_D48[sample]
		else:
			raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")

	X = D4xdata()
	X.R13_VPDB = R13_VPDB
	X.R17_VSMOW = R17_VSMOW
	X.R18_VSMOW = R18_VSMOW
	X.LAMBDA_17 = LAMBDA_17
	X.R18_VPDB = R18_VPDB
	X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17

	R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000),
		R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
		)
	R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O=D17O, D47=D47, D48=D48, D49=D49,
		)
	R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O=D17O,
		)

	d45 = 1000 * (R45/R45wg - 1)
	d46 = 1000 * (R46/R46wg - 1)
	d47 = 1000 * (R47/R47wg - 1)
	d48 = 1000 * (R48/R48wg - 1)
	d49 = 1000 * (R49/R49wg - 1)

	for k in range(3): # dumb iteration to adjust for small changes in d47
		R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
		R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch
		d47 = 1000 * (R47raw/R47wg - 1)
		d48 = 1000 * (R48raw/R48wg - 1)

	return dict(
		Sample = sample,
		D17O = D17O,
		d13Cwg_VPDB = d13Cwg_VPDB,
		d18Owg_VSMOW = d18Owg_VSMOW,
		d45 = d45,
		d46 = d46,
		d47 = d47,
		d48 = d48,
		d49 = d49,
		)
```
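A minimal sketch of calling this function directly (the sample name `FOO` and its compositions are arbitrary values chosen for illustration):

```py
from D47crunch import simulate_single_analysis

# A 'perfect' analysis of anchor ETH-1, using the default nominal values:
a = simulate_single_analysis(sample = 'ETH-1')

# An analysis of an arbitrary unknown, with explicit bulk and clumped compositions:
b = simulate_single_analysis(
	sample = 'FOO',
	d13C_VPDB = -5., d18O_VPDB = -10.,
	D47 = 0.3, D48 = 0.15,
	)

print(a['d47'], b['d47']) # noise-free working-gas deltas
```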
````py
def virtual_data(
	samples = [],
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	rd45 = 0.020, rd46 = 0.060,
	rD47 = 0.015, rD48 = 0.045,
	d13Cwg_VPDB = None, d18Owg_VSMOW = None,
	session = None,
	Nominal_D47 = None, Nominal_D48 = None,
	Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	seed = 0,
	shuffle = True,
	):
	'''
	Return list with simulated analyses from a single session.

	**Parameters**

	+ `samples`: a list of entries; each entry is a dictionary with the following fields:
		* `Sample`: the name of the sample
		* `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
		* `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
		* `N`: how many analyses to generate for this sample
	+ `a47`: scrambling factor for Δ47
	+ `b47`: compositional nonlinearity for Δ47
	+ `c47`: working gas offset for Δ47
	+ `a48`: scrambling factor for Δ48
	+ `b48`: compositional nonlinearity for Δ48
	+ `c48`: working gas offset for Δ48
	+ `rd45`: analytical repeatability of δ45
	+ `rd46`: analytical repeatability of δ46
	+ `rD47`: analytical repeatability of Δ47
	+ `rD48`: analytical repeatability of Δ48
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
	(by default equal to the `simulate_single_analysis` default values)
	+ `session`: name of the session (no name by default)
	+ `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and Δ48 values
	if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and
	δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
	(by default equal to the `simulate_single_analysis` defaults)
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
	(by default equal to the `simulate_single_analysis` defaults)
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
	correction parameters (by default equal to the `simulate_single_analysis` default)
	+ `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
	+ `shuffle`: randomly reorder the sequence of analyses

	Here is an example of using this method to generate an arbitrary combination of
	anchors and unknowns for a bunch of sessions:

	```py
	.. include:: ../../code_examples/virtual_data/example.py
	```

	This should output something like:

	```
	.. include:: ../../code_examples/virtual_data/output.txt
	```
	'''

	kwargs = locals().copy()

	from numpy import random as nprandom
	if seed:
		nprandom.seed(seed)
		rng = nprandom.default_rng(seed)
	else:
		rng = nprandom.default_rng()

	N = sum([s['N'] for s in samples])
	errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors45 *= rd45 / stdev(errors45) # scale errors to rd45
	errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors46 *= rd46 / stdev(errors46) # scale errors to rd46
	errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors47 *= rD47 / stdev(errors47) # scale errors to rD47
	errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors48 *= rD48 / stdev(errors48) # scale errors to rD48

	k = 0
	out = []
	for s in samples:
		kw = {}
		kw['sample'] = s['Sample']
		kw = {
			**kw,
			**{var: kwargs[var]
				for var in [
					'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
					'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
					'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
					'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
					]
				if kwargs[var] is not None},
			**{var: s[var]
				for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
				if var in s},
			}

		sN = s['N']
		while sN:
			out.append(simulate_single_analysis(**kw))
			out[-1]['d45'] += errors45[k]
			out[-1]['d46'] += errors46[k]
			out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
			out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
			sN -= 1
			k += 1

	if session is not None:
		for r in out:
			r['Session'] = session

	if shuffle:
		nprandom.shuffle(out)

	return out
````
Here is an example of using this function to generate an arbitrary combination of anchors and unknowns for several sessions:
from D47crunch import virtual_data, D47data

args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 3,
            d13C_VPDB = -15., d18O_VPDB = -2.,
            D47 = 0.6, D48 = 0.2),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)

D = D47data(session1 + session2 + session3 + session4)

D.crunch()
D.standardize()

D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)
This should output something like:
[table_of_sessions]
—————————— —— —— ——————————— ———————————— —————— —————— —————— ————————————— ————————————— ——————————————
Session Na Nu d13Cwg_VPDB d18Owg_VSMOW r_d13C r_d18O r_D47 a ± SE 1e3 x b ± SE c ± SE
—————————— —— —— ——————————— ———————————— —————— —————— —————— ————————————— ————————————— ——————————————
Session_01 9 6 -4.000 26.000 0.0205 0.0633 0.0075 1.015 ± 0.015 0.427 ± 0.232 -0.909 ± 0.006
Session_02 9 6 -4.000 26.000 0.0210 0.0882 0.0082 0.990 ± 0.015 0.484 ± 0.232 -0.905 ± 0.006
Session_03 9 6 -4.000 26.000 0.0186 0.0505 0.0091 0.997 ± 0.015 0.167 ± 0.233 -0.901 ± 0.006
Session_04 9 6 -4.000 26.000 0.0192 0.0467 0.0070 1.017 ± 0.015 0.229 ± 0.232 -0.910 ± 0.006
—————————— —— —— ——————————— ———————————— —————— —————— —————— ————————————— ————————————— ——————————————
[table_of_samples]
—————— —— ————————— —————————— —————— —————— ———————— —————— ————————
Sample N d13C_VPDB d18O_VSMOW D47 SE 95% CL SD p_Levene
—————— —— ————————— —————————— —————— —————— ———————— —————— ————————
ETH-1 12 2.02 37.01 0.2052 0.0083
ETH-2 12 -10.17 19.88 0.2085 0.0090
ETH-3 12 1.71 37.46 0.6132 0.0083
BAR 12 -15.02 37.22 0.6057 0.0042 ± 0.0085 0.0088 0.753
FOO 12 -5.00 28.89 0.3024 0.0031 ± 0.0062 0.0070 0.497
—————— —— ————————— —————————— —————— —————— ———————— —————— ————————
[table_of_analyses]
——— —————————— —————— ——————————— ———————————— ————————— ————————— —————————— —————————— —————————— —————————— —————————— ————————— ————————— ————————— ————————
UID Session Sample d13Cwg_VPDB d18Owg_VSMOW d45 d46 d47 d48 d49 d13C_VPDB d18O_VSMOW D47raw D48raw D49raw D47
——— —————————— —————— ——————————— ———————————— ————————— ————————— —————————— —————————— —————————— —————————— —————————— ————————— ————————— ————————— ————————
1 Session_01 ETH-1 -4.000 26.000 5.995601 10.755323 16.116087 21.285428 27.780042 1.998631 36.986704 -0.696924 -0.333640 0.008600 0.201787
2 Session_01 FOO -4.000 26.000 -0.838118 2.819853 1.310384 5.326005 4.665655 -5.004629 28.895933 -0.593755 -0.319861 0.014956 0.309692
3 Session_01 ETH-3 -4.000 26.000 5.727341 11.211663 16.713472 22.364770 28.306614 1.695479 37.453503 -0.278056 -0.180158 -0.082015 0.614365
4 Session_01 BAR -4.000 26.000 -9.959983 10.926995 0.053806 21.724901 10.707292 -15.041279 37.199026 -0.300066 -0.243252 -0.029371 0.599675
5 Session_01 ETH-1 -4.000 26.000 6.010276 10.840276 16.207960 21.475150 27.780042 2.011176 37.073454 -0.704188 -0.315986 -0.172089 0.194589
6 Session_01 ETH-1 -4.000 26.000 6.049381 10.706856 16.135579 21.196941 27.780042 2.057827 36.937067 -0.685751 -0.324384 0.045870 0.212791
7 Session_01 ETH-2 -4.000 26.000 -5.974124 -5.955517 -12.668784 -12.208184 -18.023381 -10.163274 19.943159 -0.694902 -0.336672 -0.063946 0.215880
8 Session_01 ETH-3 -4.000 26.000 5.755174 11.255104 16.792797 22.451660 28.306614 1.723596 37.497816 -0.270825 -0.181089 -0.195908 0.621458
9 Session_01 FOO -4.000 26.000 -0.848028 2.874679 1.346196 5.439150 4.665655 -5.017230 28.951964 -0.601502 -0.316664 -0.081898 0.302042
10 Session_01 BAR -4.000 26.000 -9.915975 10.968470 0.153453 21.749385 10.707292 -14.995822 37.241294 -0.286638 -0.301325 -0.157376 0.612868
11 Session_01 BAR -4.000 26.000 -9.920507 10.903408 0.065076 21.704075 10.707292 -14.998270 37.174839 -0.307018 -0.216978 -0.026076 0.592818
12 Session_01 FOO -4.000 26.000 -0.876454 2.906764 1.341194 5.490264 4.665655 -5.048760 28.984806 -0.608593 -0.329808 -0.114437 0.295055
13 Session_01 ETH-2 -4.000 26.000 -5.982229 -6.110437 -12.827036 -12.492272 -18.023381 -10.166188 19.784916 -0.693555 -0.312598 0.251040 0.217274
14 Session_01 ETH-2 -4.000 26.000 -5.991278 -5.995054 -12.741562 -12.184075 -18.023381 -10.180122 19.902809 -0.711697 -0.232746 0.032602 0.199357
15 Session_01 ETH-3 -4.000 26.000 5.734896 11.229855 16.740410 22.402091 28.306614 1.702875 37.472070 -0.276998 -0.179635 -0.125368 0.615396
16 Session_02 ETH-3 -4.000 26.000 5.716356 11.091821 16.582487 22.123857 28.306614 1.692901 37.370126 -0.279100 -0.178789 0.162540 0.624067
17 Session_02 ETH-2 -4.000 26.000 -5.950370 -5.959974 -12.650784 -12.197864 -18.023381 -10.143809 19.897777 -0.696916 -0.317263 -0.080604 0.216441
18 Session_02 BAR -4.000 26.000 -9.957566 10.903888 0.031785 21.739434 10.707292 -15.048386 37.213724 -0.302139 -0.183327 0.012926 0.608897
19 Session_02 ETH-1 -4.000 26.000 6.030532 10.851030 16.245571 21.457100 27.780042 2.037466 37.122284 -0.698413 -0.354920 -0.214443 0.200795
20 Session_02 FOO -4.000 26.000 -0.819742 2.826793 1.317044 5.330616 4.665655 -4.986618 28.903335 -0.612871 -0.329113 -0.018244 0.294481
21 Session_02 BAR -4.000 26.000 -9.936020 10.862339 0.024660 21.563307 10.707292 -15.023836 37.171034 -0.291333 -0.273498 0.070452 0.619812
22 Session_02 ETH-3 -4.000 26.000 5.719281 11.207303 16.681693 22.370886 28.306614 1.691780 37.488633 -0.296801 -0.165556 -0.065004 0.606143
23 Session_02 ETH-1 -4.000 26.000 5.993918 10.617469 15.991900 21.070358 27.780042 2.006934 36.882679 -0.683329 -0.271476 0.278458 0.216152
24 Session_02 ETH-2 -4.000 26.000 -5.982371 -6.036210 -12.762399 -12.309944 -18.023381 -10.175178 19.819614 -0.701348 -0.277354 0.104418 0.212021
25 Session_02 ETH-1 -4.000 26.000 6.019963 10.773112 16.163825 21.331060 27.780042 2.029040 37.042346 -0.692234 -0.324161 -0.051788 0.207075
26 Session_02 BAR -4.000 26.000 -9.963888 10.865863 -0.023549 21.615868 10.707292 -15.053743 37.174715 -0.313906 -0.229031 0.093637 0.597041
27 Session_02 FOO -4.000 26.000 -0.835046 2.870518 1.355370 5.487896 4.665655 -5.004585 28.948243 -0.601666 -0.259900 -0.087592 0.305777
28 Session_02 FOO -4.000 26.000 -0.848415 2.849823 1.308081 5.427767 4.665655 -5.018107 28.927036 -0.614791 -0.278426 -0.032784 0.292547
29 Session_02 ETH-3 -4.000 26.000 5.757137 11.232751 16.744567 22.398244 28.306614 1.731295 37.514660 -0.298533 -0.189123 -0.154557 0.604363
30 Session_02 ETH-2 -4.000 26.000 -5.993476 -5.944866 -12.696865 -12.149754 -18.023381 -10.190430 19.913381 -0.713779 -0.298963 -0.064251 0.199436
31 Session_03 ETH-3 -4.000 26.000 5.718991 11.146227 16.640814 22.243185 28.306614 1.689442 37.449023 -0.277332 -0.169668 0.053997 0.623187
32 Session_03 ETH-2 -4.000 26.000 -5.997147 -5.905858 -12.655382 -12.081612 -18.023381 -10.165400 19.891551 -0.706536 -0.308464 -0.137414 0.197550
33 Session_03 ETH-1 -4.000 26.000 6.040566 10.786620 16.205283 21.374963 27.780042 2.045244 37.077432 -0.685706 -0.307909 -0.099869 0.213609
34 Session_03 ETH-1 -4.000 26.000 5.994622 10.743980 16.116098 21.243734 27.780042 1.997857 37.033567 -0.684883 -0.352014 0.031692 0.214449
35 Session_03 ETH-3 -4.000 26.000 5.748546 11.079879 16.580826 22.120063 28.306614 1.723364 37.380534 -0.302133 -0.158882 0.151641 0.598318
36 Session_03 ETH-2 -4.000 26.000 -6.000290 -5.947172 -12.697463 -12.164602 -18.023381 -10.167221 19.848953 -0.705037 -0.309350 -0.052386 0.199061
37 Session_03 FOO -4.000 26.000 -0.800284 2.851299 1.376828 5.379547 4.665655 -4.951581 28.910199 -0.597293 -0.329315 -0.087015 0.304784
38 Session_03 FOO -4.000 26.000 -0.873798 2.820799 1.272165 5.370745 4.665655 -5.028782 28.878917 -0.596008 -0.277258 0.051165 0.306090
39 Session_03 ETH-2 -4.000 26.000 -6.008525 -5.909707 -12.647727 -12.075913 -18.023381 -10.177379 19.887608 -0.683183 -0.294956 -0.117608 0.220975
40 Session_03 BAR -4.000 26.000 -9.928709 10.989665 0.148059 21.852677 10.707292 -14.976237 37.324152 -0.299358 -0.242185 -0.184835 0.603855
41 Session_03 ETH-1 -4.000 26.000 6.004078 10.683951 16.045192 21.214355 27.780042 2.010134 36.971642 -0.705956 -0.262026 0.138399 0.193323
42 Session_03 BAR -4.000 26.000 -9.957114 10.898997 0.044946 21.602296 10.707292 -15.003175 37.230716 -0.284699 -0.307849 0.021944 0.618578
43 Session_03 BAR -4.000 26.000 -9.952115 11.034508 0.169809 21.885915 10.707292 -15.002819 37.370451 -0.296804 -0.298351 -0.246731 0.606414
44 Session_03 FOO -4.000 26.000 -0.823857 2.761300 1.258060 5.239992 4.665655 -4.973383 28.817444 -0.603327 -0.288652 0.114488 0.298751
45 Session_03 ETH-3 -4.000 26.000 5.753467 11.206589 16.719131 22.373244 28.306614 1.723960 37.511190 -0.294350 -0.161838 -0.099835 0.606103
46 Session_04 FOO -4.000 26.000 -0.791191 2.708220 1.256167 5.145784 4.665655 -4.960004 28.750896 -0.586913 -0.276505 0.183674 0.317065
47 Session_04 ETH-1 -4.000 26.000 6.017312 10.735930 16.123043 21.270597 27.780042 2.005824 36.995214 -0.693479 -0.309795 0.023309 0.208980
48 Session_04 ETH-2 -4.000 26.000 -5.986501 -5.915157 -12.656583 -12.060382 -18.023381 -10.182247 19.889836 -0.709603 -0.268277 -0.130450 0.199604
49 Session_04 BAR -4.000 26.000 -9.951025 10.951923 0.089386 21.738926 10.707292 -15.031949 37.254709 -0.298065 -0.278834 -0.087463 0.601230
50 Session_04 ETH-2 -4.000 26.000 -5.966627 -5.893789 -12.597717 -12.120719 -18.023381 -10.161842 19.911776 -0.691757 -0.372308 -0.193986 0.217132
51 Session_04 ETH-1 -4.000 26.000 6.029937 10.766997 16.151273 21.345479 27.780042 2.018148 37.027152 -0.708855 -0.297953 -0.050465 0.193862
52 Session_04 FOO -4.000 26.000 -0.853969 2.805035 1.267571 5.353907 4.665655 -5.030523 28.850660 -0.605611 -0.262571 0.060903 0.298685
53 Session_04 ETH-3 -4.000 26.000 5.798016 11.254135 16.832228 22.432473 28.306614 1.752928 37.528936 -0.275047 -0.197935 -0.239408 0.620088
54 Session_04 ETH-1 -4.000 26.000 6.023822 10.730714 16.121184 21.235757 27.780042 2.012958 36.989833 -0.696908 -0.333582 0.026555 0.205610
55 Session_04 ETH-2 -4.000 26.000 -5.973623 -5.975018 -12.694278 -12.194472 -18.023381 -10.166297 19.828211 -0.701951 -0.283570 -0.025935 0.207135
56 Session_04 ETH-3 -4.000 26.000 5.739420 11.128582 16.641344 22.166106 28.306614 1.695046 37.399884 -0.280608 -0.210162 0.066645 0.614665
57 Session_04 BAR -4.000 26.000 -9.931741 10.819830 -0.023748 21.529372 10.707292 -15.006533 37.118743 -0.302866 -0.222623 0.148462 0.596536
58 Session_04 FOO -4.000 26.000 -0.848192 2.777763 1.251297 5.280272 4.665655 -5.023358 28.822585 -0.601094 -0.281419 0.108186 0.303128
59 Session_04 ETH-3 -4.000 26.000 5.751908 11.207110 16.726741 22.380392 28.306614 1.705481 37.480657 -0.285776 -0.155878 -0.099197 0.609567
60 Session_04 BAR -4.000 26.000 -9.926078 10.884823 0.060864 21.650722 10.707292 -15.002880 37.185606 -0.287358 -0.232425 0.016044 0.611760
——— —————————— —————— ——————————— ———————————— ————————— ————————— —————————— —————————— —————————— —————————— —————————— ————————— ————————— ————————— ————————
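Because each call above uses a different non-zero `seed`, the four sessions differ from one another while the simulation as a whole remains repeatable. Here is a minimal sketch of this behavior (the sample name and `N` value are arbitrary, and it assumes that `simulate_single_analysis()` itself is deterministic, as in the listing above):

from D47crunch import virtual_data

run1 = virtual_data(samples = [dict(Sample = 'ETH-1', N = 2)], seed = 42, shuffle = False)
run2 = virtual_data(samples = [dict(Sample = 'ETH-1', N = 2)], seed = 42, shuffle = False)

# Identical non-zero seeds yield identical simulated analyses:
assert [r['d47'] for r in run1] == [r['d47'] for r in run2]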
def table_of_samples(
    data47 = None,
    data48 = None,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a combined table of samples
    for a pair of `D47data` and `D48data` objects.

    **Parameters**

    + `data47`: `D47data` instance
    + `data48`: `D48data` instance
    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
      if set to `'raw'`: return a list of list of strings
      (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    if data47 is None:
        if data48 is None:
            raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
        else:
            return data48.table_of_samples(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
    else:
        if data48 is None:
            return data47.table_of_samples(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
        else:
            out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
            out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
            out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:])

            if save_to_file:
                if not os.path.exists(dir):
                    os.makedirs(dir)
                if filename is None:
                    filename = 'D47D48_samples.csv'
                with open(f'{dir}/{filename}', 'w') as fid:
                    fid.write(make_csv(out))
            if print_out:
                print('\n' + pretty_table(out))
            if output == 'raw':
                return out
            elif output == 'pretty':
                return pretty_table(out)
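As a minimal usage sketch for the combined Δ47 + Δ48 case handled by the last branch above (it assumes that the default Δ48 anchors defined by `D48data` include the ETH standards, and reuses `virtual_data()`, documented above, to simulate the analyses):

from D47crunch import virtual_data, D47data, D48data, table_of_samples

data = virtual_data(
    samples = [
        dict(Sample = 'ETH-1', N = 4),
        dict(Sample = 'ETH-2', N = 4),
        dict(Sample = 'ETH-3', N = 4),
        dict(Sample = 'FOO', N = 4,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        ], session = 'Session_01', seed = 123)

# Standardize the same analyses once for Δ47 and once for Δ48:
D47 = D47data(data)
D47.crunch()
D47.standardize()

D48 = D48data(data)
D48.crunch()
D48.standardize()

# Print out (without saving) a combined Δ47 + Δ48 sample table:
table_of_samples(data47 = D47, data48 = D48, save_to_file = False)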
def table_of_sessions(
    data47 = None,
    data48 = None,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a combined table of sessions
    for a pair of `D47data` and `D48data` objects.
    ***Only applicable if the sessions in `data47` and those in `data48`
    consist of the exact same sets of analyses.***

    **Parameters**

    + `data47`: `D47data` instance
    + `data48`: `D48data` instance
    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
      if set to `'raw'`: return a list of list of strings
      (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    if data47 is None:
        if data48 is None:
            raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
        else:
            return data48.table_of_sessions(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
    else:
        if data48 is None:
            return data47.table_of_sessions(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
        else:
            out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
            out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
            for k,x in enumerate(out47[0]):
                if k > 7:
                    out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
                    out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
            out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])

            if save_to_file:
                if not os.path.exists(dir):
                    os.makedirs(dir)
                if filename is None:
                    filename = 'D47D48_sessions.csv'
                with open(f'{dir}/{filename}', 'w') as fid:
                    fid.write(make_csv(out))
            if print_out:
                print('\n' + pretty_table(out))
            if output == 'raw':
                return out
            elif output == 'pretty':
                return pretty_table(out)
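Continuing the sketch above: both objects hold the exact same analyses, so their sessions match and the combined table applies. As implemented in the header-renaming loop above, the shared session columns are printed once, while the session-level standardization parameters are disambiguated with `_47` and `_48` suffixes (e.g., `a_47 ± SE` vs `a_48 ± SE`):

from D47crunch import table_of_sessions

table_of_sessions(data47 = D47, data48 = D48, save_to_file = False)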
def table_of_analyses(
    data47 = None,
    data48 = None,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a combined table of analyses
    for a pair of `D47data` and `D48data` objects.

    If the sessions in `data47` and those in `data48` do not consist of
    the exact same sets of analyses, the table will have two columns
    `Session_47` and `Session_48` instead of a single `Session` column.

    **Parameters**

    + `data47`: `D47data` instance
    + `data48`: `D48data` instance
    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
      if set to `'raw'`: return a list of list of strings
      (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    if data47 is None:
        if data48 is None:
            raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
        else:
            return data48.table_of_analyses(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
    else:
        if data48 is None:
            return data47.table_of_analyses(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
        else:
            out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
            out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')

            if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
                out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
            else:
                out47[0][1] = 'Session_47'
                out48[0][1] = 'Session_48'
                out47 = transpose_table(out47)
                out48 = transpose_table(out48)
                out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])

            if save_to_file:
                if not os.path.exists(dir):
                    os.makedirs(dir)
                if filename is None:
                    filename = 'D47D48_analyses.csv'
                with open(f'{dir}/{filename}', 'w') as fid:
                    fid.write(make_csv(out))
            if print_out:
                print('\n' + pretty_table(out))
            if output == 'raw':
                return out
            elif output == 'pretty':
                return pretty_table(out)
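And likewise for the combined table of analyses. Continuing the same sketch, the two objects contain identical sessions, so a single `Session` column is kept:

from D47crunch import table_of_analyses

table_of_analyses(data47 = D47, data48 = D48, save_to_file = False)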
class D4xdata(list):
    '''
    Store and process data for a large set of Δ47 and/or Δ48
    analyses, usually comprising more than one analytical session.
    '''

    ### 17O CORRECTION PARAMETERS
    R13_VPDB = 0.01118 # (Chang & Li, 1990)
    '''
    Absolute (13C/12C) ratio of VPDB.
    By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
    '''

    R18_VSMOW = 0.0020052 # (Baertschi, 1976)
    '''
    Absolute (18O/16O) ratio of VSMOW.
    By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
    '''

    LAMBDA_17 = 0.528 # (Barkan & Luz, 2005)
    '''
    Mass-dependent exponent for triple oxygen isotopes.
    By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
    '''

    R17_VSMOW = 0.00038475 # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
    '''
    Absolute (17O/16O) ratio of VSMOW.
    By default equal to 0.00038475
    ([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
    rescaled to `R13_VPDB`)
    '''

    R18_VPDB = R18_VSMOW * 1.03092
    '''
    Absolute (18O/16O) ratio of VPDB.
    By definition equal to `R18_VSMOW * 1.03092`.
    '''

    R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
    '''
    Absolute (17O/16O) ratio of VPDB.
    By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
    '''

    LEVENE_REF_SAMPLE = 'ETH-3'
    '''
    After the Δ4x standardization step, each sample is tested to
    assess whether the Δ4x variance within all analyses for that
    sample differs significantly from that observed for a given reference
    sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
    which yields a p-value corresponding to the null hypothesis that the
    underlying variances are equal).

    `LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
    sample should be used as a reference for this test.
    '''

    ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6) # (Kim et al., 2007, calcite)
    '''
    Specifies the 18O/16O fractionation factor generally applicable
    to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
    `D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`.

    By default equal to 1.008129 (calcite reacted at 90 °C,
    [Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
    '''

    Nominal_d13C_VPDB = {
        'ETH-1': 2.02,
        'ETH-2': -10.17,
        'ETH-3': 1.71,
        } # (Bernasconi et al., 2018)
    '''
    Nominal δ13C_VPDB values assigned to carbonate standards, used by
    `D4xdata.standardize_d13C()`.

    By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
    [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
    '''

    Nominal_d18O_VPDB = {
        'ETH-1': -2.19,
        'ETH-2': -18.69,
        'ETH-3': -1.78,
        } # (Bernasconi et al., 2018)
    '''
    Nominal δ18O_VPDB values assigned to carbonate standards, used by
    `D4xdata.standardize_d18O()`.

    By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
    [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
    '''

    d13C_STANDARDIZATION_METHOD = '2pt'
    '''
    Method by which to standardize δ13C values:

    + `none`: do not apply any δ13C standardization.
    + `'1pt'`: within each session, offset all initial δ13C values so as to
    minimize the difference between final δ13C_VPDB values and
    `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
    + `'2pt'`: within each session, apply an affine transformation to all δ13C
    values so as to minimize the difference between final δ13C_VPDB
    values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
    is defined).
    '''

    d18O_STANDARDIZATION_METHOD = '2pt'
    '''
    Method by which to standardize δ18O values:

    + `none`: do not apply any δ18O standardization.
    + `'1pt'`: within each session, offset all initial δ18O values so as to
    minimize the difference between final δ18O_VPDB values and
    `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
    + `'2pt'`: within each session, apply an affine transformation to all δ18O
    values so as to minimize the difference between final δ18O_VPDB
    values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
    is defined).
    '''

    def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
        '''
        **Parameters**

        + `l`: a list of dictionaries, with each dictionary including at least the keys
        `Sample`, `d45`, `d46`, and `d47` or `d48`.
        + `mass`: `'47'` or `'48'`
        + `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
        + `session`: define session name for analyses without a `Session` key
        + `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.

        Returns a `D4xdata` object derived from `list`.
        '''
        self._4x = mass
        self.verbose = verbose
        self.prefix = 'D4xdata'
        self.logfile = logfile
        list.__init__(self, l)
        self.Nf = None
        self.repeatability = {}
        self.refresh(session = session)


    def make_verbal(oldfun):
        '''
        Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
        '''
        @wraps(oldfun)
        def newfun(*args, verbose = '', **kwargs):
            myself = args[0]
            oldprefix = myself.prefix
            myself.prefix = oldfun.__name__
            if verbose != '':
                oldverbose = myself.verbose
                myself.verbose = verbose
            out = oldfun(*args, **kwargs)
            myself.prefix = oldprefix
            if verbose != '':
                myself.verbose = oldverbose
            return out
        return newfun


    def msg(self, txt):
        '''
        Log a message to `self.logfile`, and print it out if `verbose = True`
        '''
        self.log(txt)
        if self.verbose:
            print(f'{f"[{self.prefix}]":<16} {txt}')


    def vmsg(self, txt):
        '''
        Log a message to `self.logfile` and print it out
        '''
        self.log(txt)
        print(txt)


    def log(self, *txts):
        '''
        Log a message to `self.logfile`
        '''
        if self.logfile:
            with open(self.logfile, 'a') as fid:
                for txt in txts:
                    fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')


    def refresh(self, session = 'mySession'):
        '''
        Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
        '''
        self.fill_in_missing_info(session = session)
        self.refresh_sessions()
        self.refresh_samples()


    def refresh_sessions(self):
        '''
        Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
        to `False` for all sessions.
        '''
        self.sessions = {
            s: {'data': [r for r in self if r['Session'] == s]}
            for s in sorted({r['Session'] for r in self})
            }
        for s in self.sessions:
            self.sessions[s]['scrambling_drift'] = False
            self.sessions[s]['slope_drift'] = False
            self.sessions[s]['wg_drift'] = False
            self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
            self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD


    def refresh_samples(self):
        '''
        Define `self.samples`, `self.anchors`, and `self.unknowns`.
        '''
        self.samples = {
            s: {'data': [r for r in self if r['Sample'] == s]}
            for s in sorted({r['Sample'] for r in self})
            }
        self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
        self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}


    def read(self, filename, sep = '', session = ''):
        '''
        Read file in csv format to load data into a `D47data` object.

        In the csv file, spaces before and after field separators (`','` by default)
        are optional. Each line corresponds to a single analysis.

        The required fields are:

        + `UID`: a unique identifier
        + `Session`: an identifier for the analytical session
        + `Sample`: a sample identifier
        + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

        Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
        VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
        and `d49` are optional, and set to NaN by default.

        **Parameters**

        + `filename`: the path of the file to read
        + `sep`: csv separator delimiting the fields
        + `session`: set `Session` field to this string for all analyses
        '''
        with open(filename) as fid:
            self.input(fid.read(), sep = sep, session = session)


    def input(self, txt, sep = '', session = ''):
        '''
        Read `txt` string in csv format to load analysis data into a `D47data` object.

        In the csv string, spaces before and after field separators (`','` by default)
        are optional. Each line corresponds to a single analysis.

        The required fields are:

        + `UID`: a unique identifier
        + `Session`: an identifier for the analytical session
        + `Sample`: a sample identifier
        + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

        Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
        VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
        and `d49` are optional, and set to NaN by default.

        **Parameters**

        + `txt`: the csv string to read
        + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
        whichever appears most often in `txt`.
        + `session`: set `Session` field to this string for all analyses
        '''
        if sep == '':
            sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
        txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
        data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]

        if session != '':
            for r in data:
                r['Session'] = session

        self += data
        self.refresh()


    @make_verbal
    def wg(self, samples = None, a18_acid = None):
        '''
        Compute bulk composition of the working gas for each session based on
        the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
        `self.Nominal_d18O_VPDB`.
        '''

        self.msg('Computing WG composition:')

        if a18_acid is None:
            a18_acid = self.ALPHA_18O_ACID_REACTION
        if samples is None:
            samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]

        assert a18_acid, 'Acid fractionation factor should not be zero.'

        samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
        R45R46_standards = {}
        for sample in samples:
            d13C_vpdb = self.Nominal_d13C_VPDB[sample]
            d18O_vpdb = self.Nominal_d18O_VPDB[sample]
            R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
            R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
            R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid

            C12_s = 1 / (1 + R13_s)
            C13_s = R13_s / (1 + R13_s)
            C16_s = 1 / (1 + R17_s + R18_s)
            C17_s = R17_s / (1 + R17_s + R18_s)
            C18_s = R18_s / (1 + R17_s + R18_s)

            C626_s = C12_s * C16_s ** 2
            C627_s = 2 * C12_s * C16_s * C17_s
            C628_s = 2 * C12_s * C16_s * C18_s
            C636_s = C13_s * C16_s ** 2
            C637_s = 2 * C13_s * C16_s * C17_s
            C727_s = C12_s * C17_s ** 2

            R45_s = (C627_s + C636_s) / C626_s
            R46_s = (C628_s + C637_s + C727_s) / C626_s
            R45R46_standards[sample] = (R45_s, R46_s)

        for s in self.sessions:
            db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
            assert db, f'No sample from {samples} found in session "{s}".'
            # dbsamples = sorted({r['Sample'] for r in db})

            X = [r['d45'] for r in db]
            Y = [R45R46_standards[r['Sample']][0] for r in db]
            x1, x2 = np.min(X), np.max(X)

            if x1 < x2:
                wgcoord = x1 / (x1 - x2)
            else:
                wgcoord = 999

            if wgcoord < -.5 or wgcoord > 1.5:
                # unreasonable to extrapolate to d45 = 0
                R45_wg = np.mean([y / (1 + x / 1000) for x,y in zip(X,Y)])
            else:
                # d45 = 0 is reasonably well bracketed
                R45_wg = np.polyfit(X, Y, 1)[1]

            X = [r['d46'] for r in db]
            Y = [R45R46_standards[r['Sample']][1] for r in db]
            x1, x2 = np.min(X), np.max(X)

            if x1 < x2:
                wgcoord = x1 / (x1 - x2)
            else:
                wgcoord = 999

            if wgcoord < -.5 or wgcoord > 1.5:
                # unreasonable to extrapolate to d46 = 0
                R46_wg = np.mean([y / (1 + x / 1000) for x,y in zip(X,Y)])
            else:
                # d46 = 0 is reasonably well bracketed
                R46_wg = np.polyfit(X, Y, 1)[1]

            d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)

            self.msg(f'Session {s} WG: δ13C_VPDB = {d13Cwg_VPDB:.3f} δ18O_VSMOW = {d18Owg_VSMOW:.3f}')

            self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
            self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
            for r in self.sessions[s]['data']:
                r['d13Cwg_VPDB'] = d13Cwg_VPDB
                r['d18Owg_VSMOW'] = d18Owg_VSMOW


    def compute_bulk_delta(self, R45, R46, D17O = 0):
        '''
        Compute δ13C_VPDB and δ18O_VSMOW,
        by solving the generalized form of equation (17) from
        [Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
        assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
        solving the corresponding second-order Taylor polynomial.
        (Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
        '''

        K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17

        A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
        B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
        C = 2 * self.R18_VSMOW
        D = -R46

        aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
        bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
        cc = A + B + C + D

        d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)

        R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
        R17 = K * R18 ** self.LAMBDA_17
        R13 = R45 - 2 * R17

        d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)

        return d13C_VPDB, d18O_VSMOW


    @make_verbal
    def crunch(self, verbose = ''):
        '''
        Compute bulk composition and raw clumped isotope anomalies for all analyses.
        '''
        for r in self:
            self.compute_bulk_and_clumping_deltas(r)
        self.standardize_d13C()
        self.standardize_d18O()
        self.msg(f"Crunched {len(self)} analyses.")


    def fill_in_missing_info(self, session = 'mySession'):
        '''
        Fill in optional fields with default values
        '''
        for i,r in enumerate(self):
            if 'D17O' not in r:
                r['D17O'] = 0.
            if 'UID' not in r:
                r['UID'] = f'{i+1}'
            if 'Session' not in r:
                r['Session'] = session
            for k in ['d47', 'd48', 'd49']:
                if k not in r:
                    r[k] = np.nan


    def standardize_d13C(self):
        '''
        Perform δ13C standardization within each session `s` according to
        `self.sessions[s]['d13C_standardization_method']`, which is defined by default
        by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
        may be redefined arbitrarily at a later stage.
        '''
        for s in self.sessions:
            if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
                XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
                X,Y = zip(*XY)
                if self.sessions[s]['d13C_standardization_method'] == '1pt':
                    offset = np.mean(Y) - np.mean(X)
                    for r in self.sessions[s]['data']:
                        r['d13C_VPDB'] += offset
                elif self.sessions[s]['d13C_standardization_method'] == '2pt':
                    a,b = np.polyfit(X, Y, 1)
                    for r in self.sessions[s]['data']:
                        r['d13C_VPDB'] = a * r['d13C_VPDB'] + b


    def standardize_d18O(self):
        '''
        Perform δ18O standardization within each session `s` according to
        `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
        which is defined by default by `D47data.refresh_sessions()` as equal to
        `self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
        '''
        for s in self.sessions:
            if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
                XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
                X,Y = zip(*XY)
                Y = [(1000 + y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
                if self.sessions[s]['d18O_standardization_method'] == '1pt':
                    offset = np.mean(Y) - np.mean(X)
                    for r in self.sessions[s]['data']:
                        r['d18O_VSMOW'] += offset
                elif self.sessions[s]['d18O_standardization_method'] == '2pt':
                    a,b = np.polyfit(X, Y, 1)
                    for r in self.sessions[s]['data']:
                        r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b


    def compute_bulk_and_clumping_deltas(self, r):
        '''
        Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
        '''

        # Compute working gas R13, R18, and isobar ratios
        R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
        R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
        R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)

        # Compute analyte isobar ratios
        R45 = (1 + r['d45'] / 1000) * R45_wg
        R46 = (1 + r['d46'] / 1000) * R46_wg
        R47 = (1 + r['d47'] / 1000) * R47_wg
        R48 = (1 + r['d48'] / 1000) * R48_wg
        R49 = (1 + r['d49'] / 1000) * R49_wg

        r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
        R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
        R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW

        # Compute stochastic isobar ratios of the analyte
        R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
            R13, R18, D17O = r['D17O']
            )

        # Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
        # and raise a warning if the corresponding anomalies exceed 0.05 ppm.
        if (R45 / R45stoch - 1) > 5e-8:
            self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
        if (R46 / R46stoch - 1) > 5e-8:
            self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')

        # Compute raw clumped isotope anomalies
        r['D47raw'] = 1000 * (R47 / R47stoch - 1)
        r['D48raw'] = 1000 * (R48 / R48stoch - 1)
        r['D49raw'] = 1000 * (R49 / R49stoch - 1)


    def compute_isobar_ratios(self, R13, R18, D17O = 0, D47 = 0, D48 = 0, D49 = 0):
        '''
        Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
        optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
        anomalies (`D47`, `D48`, `D49`), all expressed in permil.
        '''

        # Compute R17
        R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17

        # Compute isotope concentrations
        C12 = (1 + R13) ** -1
        C13 = C12 * R13
        C16 = (1 + R17 + R18) ** -1
        C17 = C16 * R17
        C18 = C16 * R18

        # Compute stochastic isotopologue concentrations
        C626 = C16 * C12 * C16
        C627 = C16 * C12 * C17 * 2
        C628 = C16 * C12 * C18 * 2
        C636 = C16 * C13 * C16
        C637 = C16 * C13 * C17 * 2
        C638 = C16 * C13 * C18 * 2
        C727 = C17 * C12 * C17
        C728 = C17 * C12 * C18 * 2
        C737 = C17 * C13 * C17
        C738 = C17 * C13 * C18 * 2
        C828 = C18 * C12 * C18
        C838 = C18 * C13 * C18

        # Compute stochastic isobar ratios
        R45 = (C636 + C627) / C626
        R46 = (C628 + C637 + C727) / C626
        R47 = (C638 + C728 + C737) / C626
        R48 = (C738 + C828) / C626
        R49 = C838 / C626

        # Account for stochastic anomalies
        R47 *= 1 + D47 / 1000
        R48 *= 1 + D48 / 1000
        R49 *= 1 + D49 / 1000

        # Return isobar ratios
        return R45, R46, R47, R48, R49


    def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
        '''
        Split unknown samples by UID (treat all analyses as different samples)
        or by session (treat analyses of a given sample in different sessions as
        different samples).

        **Parameters**

        + `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
        + `grouping`: `by_uid` | `by_session`
        '''
        if samples_to_split == 'all':
            samples_to_split = [s for s in self.unknowns]
        gkeys = {'by_uid': 'UID', 'by_session': 'Session'}
        self.grouping = grouping.lower()
        if self.grouping in gkeys:
            gkey = gkeys[self.grouping]
            for r in self:
                if r['Sample'] in samples_to_split:
                    r['Sample_original'] = r['Sample']
                    r['Sample'] = f"{r['Sample']}__{r[gkey]}"
                elif r['Sample'] in self.unknowns:
                    r['Sample_original'] = r['Sample']
        self.refresh_samples()


    def unsplit_samples(self, tables = False):
        '''
        Reverse the effects of `D47data.split_samples()`.

        This should only be used after `D4xdata.standardize()` with `method='pooled'`.

        After `D4xdata.standardize()` with `method='indep_sessions'`, one should
        probably use `D4xdata.combine_samples()` instead to reverse the effects of
        `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
        effects of `D47data.split_samples()` with `grouping='by_session'` (because in
        that case session-averaged Δ4x values are statistically independent).
        '''
        unknowns_old = sorted({s for s in self.unknowns})
        CM_old = self.standardization.covar[:,:]
        VD_old = self.standardization.params.valuesdict().copy()
        vars_old = self.standardization.var_names

        unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})

        Ns = len(vars_old) - len(unknowns_old)
        vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
        VD_new = {k: VD_old[k] for k in vars_old[:Ns]}

        W = np.zeros((len(vars_new), len(vars_old)))
        W[:Ns,:Ns] = np.eye(Ns)
        for u in unknowns_new:
            splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
            if self.grouping == 'by_session':
                weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
            elif self.grouping == 'by_uid':
                weights = [1 for s in splits]
            sw = sum(weights)
            weights = [w/sw for w in weights]
            W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]

        CM_new = W @ CM_old @ W.T
        V = W @ np.array([[VD_old[k]] for k in vars_old])
        VD_new = {k: v[0] for k,v in zip(vars_new, V)}

        self.standardization.covar = CM_new
        self.standardization.params.valuesdict = lambda : VD_new
        self.standardization.var_names = vars_new

        for r in self:
            if r['Sample'] in self.unknowns:
                r['Sample_split'] = r['Sample']
                r['Sample'] = r['Sample_original']

        self.refresh_samples()
        self.consolidate_samples()
        self.repeatabilities()

        if tables:
            self.table_of_analyses()
            self.table_of_samples()


    def assign_timestamps(self):
        '''
        Assign a time field `t` of type `float` to each analysis.

        If `TimeTag` is one of the data fields, `t` is equal within a given session
        to `TimeTag` minus the mean value of `TimeTag` for that session.
        Otherwise, `TimeTag` is by default equal to the index of each analysis
        in the dataset and `t` is defined as above.
        '''
        for session in self.sessions:
            sdata = self.sessions[session]['data']
            try:
                t0 = np.mean([r['TimeTag'] for r in sdata])
                for r in sdata:
                    r['t'] = r['TimeTag'] - t0
            except KeyError:
                t0 = (len(sdata) - 1) / 2
                for t,r in enumerate(sdata):
                    r['t'] = t - t0


    def report(self):
        '''
        Prints a report on the standardization fit.
        Only applicable after `D4xdata.standardize(method='pooled')`.
        '''
        report_fit(self.standardization)


    def combine_samples(self, sample_groups):
        '''
        Combine analyses of different samples to compute weighted average Δ4x
        and new error (co)variances corresponding to the groups defined by the `sample_groups`
        dictionary.

        Caution: samples are weighted by number of replicate analyses, which is a
        reasonable default behavior but is not always optimal (e.g., in the case of strongly
        correlated analytical errors for one or more samples).

        Returns a tuple of:

        + the list of group names
        + an array of the corresponding Δ4x values
        + the corresponding (co)variance matrix

        **Parameters**

        + `sample_groups`: a dictionary of the form:
        ```py
        {'group1': ['sample_1', 'sample_2'],
         'group2': ['sample_3', 'sample_4', 'sample_5']}
        ```
        '''

        samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
        groups = sorted(sample_groups.keys())
        group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
        D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
        CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
        W = np.array([
            [self.samples[i]['N'] / group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
            for j in groups])
        D4x_new = W @ D4x_old
        CM_new = W @ CM_old @ W.T

        return groups, D4x_new[:,0], CM_new


    @make_verbal
    def standardize(self,
        method = 'pooled',
        weighted_sessions = [],
        consolidate = True,
        consolidate_tables = False,
        consolidate_plots = False,
        constraints = {},
        ):
        '''
        Compute absolute Δ4x values for all replicate analyses and for sample averages.
        If `method` argument is set to `'pooled'`, the standardization processes all sessions
        in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
        i.e. that their true Δ4x value does not change between sessions
        ([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
        `'indep_sessions'`, the standardization processes each session independently, based only
        on anchor analyses.
        '''

        self.standardization_method = method
        self.assign_timestamps()

        if method == 'pooled':
            if weighted_sessions:
                for session_group in weighted_sessions:
                    if self._4x == '47':
                        X = D47data([r for r in self if r['Session'] in session_group])
                    elif self._4x == '48':
                        X = D48data([r for r in self if r['Session'] in session_group])
                    X.Nominal_D4x = self.Nominal_D4x.copy()
                    X.refresh()
                    result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
                    w = np.sqrt(result.redchi)
                    self.msg(f'Session group {session_group} MRSWD = {w:.4f}')
                    for r in X:
                        r[f'wD{self._4x}raw'] *= w
            else:
                self.msg(f'All D{self._4x}raw weights set to 1 ‰')
                for r in self:
                    r[f'wD{self._4x}raw'] = 1.

            params = Parameters()
            for k,session in enumerate(self.sessions):
                self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
                self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
                self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
                s = pf(session)
                params.add(f'a_{s}', value = 0.9)
                params.add(f'b_{s}', value = 0.)
                params.add(f'c_{s}', value = -0.9)
                params.add(f'a2_{s}', value = 0.,
                    # vary = self.sessions[session]['scrambling_drift'],
                    )
                params.add(f'b2_{s}', value = 0.,
                    # vary = self.sessions[session]['slope_drift'],
                    )
                params.add(f'c2_{s}', value = 0.,
                    # vary = self.sessions[session]['wg_drift'],
                    )
                if not self.sessions[session]['scrambling_drift']:
                    params[f'a2_{s}'].expr = '0'
                if not self.sessions[session]['slope_drift']:
                    params[f'b2_{s}'].expr = '0'
                if not self.sessions[session]['wg_drift']:
                    params[f'c2_{s}'].expr = '0'

            for sample in self.unknowns:
                params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)

            for k in constraints:
                params[k].expr = constraints[k]

            def residuals(p):
                R = []
                for r in self:
                    session = pf(r['Session'])
                    sample = pf(r['Sample'])
                    if r['Sample'] in self.Nominal_D4x:
                        R += [ (
                            r[f'D{self._4x}raw'] - (
                                p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
                                + p[f'b_{session}'] * r[f'd{self._4x}']
                                + p[f'c_{session}']
                                + r['t'] * (
                                    p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
                                    + p[f'b2_{session}'] * r[f'd{self._4x}']
                                    + p[f'c2_{session}']
                                    )
                                )
                            ) / r[f'wD{self._4x}raw'] ]
                    else:
                        R += [ (
                            r[f'D{self._4x}raw'] - (
                                p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
                                + p[f'b_{session}'] * r[f'd{self._4x}']
                                + p[f'c_{session}']
                                + r['t'] * (
                                    p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
                                    + p[f'b2_{session}'] * r[f'd{self._4x}']
                                    + p[f'c2_{session}']
                                    )
                                )
                            ) / r[f'wD{self._4x}raw'] ]
                return R

            M = Minimizer(residuals, params)
            result = M.least_squares()
            self.Nf = result.nfree
            self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
            new_names, new_covar, new_se = _fullcovar(result)[:3]
            result.var_names = new_names
            result.covar = new_covar

            for r in self:
                s = pf(r["Session"])
                a = result.params.valuesdict()[f'a_{s}']
                b = result.params.valuesdict()[f'b_{s}']
                c = result.params.valuesdict()[f'c_{s}']
                a2 = result.params.valuesdict()[f'a2_{s}']
                b2 = result.params.valuesdict()[f'b2_{s}']
                c2 = result.params.valuesdict()[f'c2_{s}']
                r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])

            self.standardization = result

            for session in self.sessions:
                self.sessions[session]['Np'] = 3
                for k in ['scrambling', 'slope', 'wg']:
                    if self.sessions[session][f'{k}_drift']:
                        self.sessions[session]['Np'] += 1

            if consolidate:
                self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
            return result


        elif method == 'indep_sessions':

            if weighted_sessions:
                for session_group in weighted_sessions:
                    X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
                    X.Nominal_D4x = self.Nominal_D4x.copy()
                    X.refresh()
                    # This is only done to assign r['wD47raw'] for r in X:
                    X.standardize(method = method, weighted_sessions = [], consolidate = False)
                    self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
            else:
                self.msg('All weights set to 1 ‰')
                for r in self:
                    r[f'wD{self._4x}raw'] = 1

            for session in self.sessions:
                s = self.sessions[session]
                p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
                p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
                s['Np'] = sum(p_active)
                sdata = s['data']

                A = np.array([
                    [
                        self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
                        r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
                        1 / r[f'wD{self._4x}raw'],
                        self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
                        r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
                        r['t'] / r[f'wD{self._4x}raw']
                        ]
                    for r in sdata if r['Sample'] in self.anchors
                    ])[:,p_active] # only keep columns for the active parameters
                Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
                s['Na'] = Y.size
                CM = linalg.inv(A.T @ A)
                bf = (CM @ A.T @ Y).T[0,:]
                k = 0
                for n,a in zip(p_names, p_active):
                    if a:
                        s[n] = bf[k]
                        # self.msg(f'{n} = {bf[k]}')
                        k += 1
                    else:
                        s[n] = 0.
                        # self.msg(f'{n} = 0.0')

                for r in sdata:
                    a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
                    r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
                    r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])

                s['CM'] = np.zeros((6,6))
                i = 0
                k_active = [j for j,a in enumerate(p_active) if a]
                for j,a in enumerate(p_active):
                    if a:
                        s['CM'][j,k_active] = CM[i,:]
                        i += 1

            if not weighted_sessions:
                w = self.rmswd()['rmswd']
                for r in self:
                    r[f'wD{self._4x}'] *= w
                    r[f'wD{self._4x}raw'] *= w
                for session in self.sessions:
                    self.sessions[session]['CM'] *= w**2

            for session in self.sessions:
                s = self.sessions[session]
                s['SE_a'] = s['CM'][0,0]**.5
                s['SE_b'] = s['CM'][1,1]**.5
                s['SE_c'] = s['CM'][2,2]**.5
                s['SE_a2'] = s['CM'][3,3]**.5
                s['SE_b2'] = s['CM'][4,4]**.5
                s['SE_c2'] = s['CM'][5,5]**.5

            if not weighted_sessions:
                self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
            else:
                self.Nf = 0
                for sg in weighted_sessions:
                    self.Nf += self.rmswd(sessions = sg)['Nf']

            self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)

            avgD4x = {
                sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
                for sample in self.samples
                }
            chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
            rD4x = (chi2 / self.Nf)**.5
            self.repeatability[f'sigma_{self._4x}'] = rD4x

            if consolidate:
                self.consolidate(tables = consolidate_tables, plots = consolidate_plots)


    def standardization_error(self, session, d4x, D4x, t = 0):
        '''
        Compute standardization error for a given session and
        (δ47, Δ47) composition.
        '''
        a = self.sessions[session]['a']
        b = self.sessions[session]['b']
        c = self.sessions[session]['c']
        a2 = self.sessions[session]['a2']
        b2 = self.sessions[session]['b2']
        c2 = self.sessions[session]['c2']
        CM = self.sessions[session]['CM']

        x, y = D4x, d4x
        z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
        # x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
        dxdy = -(b + b2*t) / (a + a2*t)
        dxdz = 1. / (a + a2*t)
        dxda = -x / (a + a2*t)
        dxdb = -y / (a + a2*t)
        dxdc = -1. / (a + a2*t)
        dxda2 = -x * t / (a + a2*t)
        dxdb2 = -y * t / (a + a2*t)
        dxdc2 = -t / (a + a2*t)
        V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
        sx = (V @ CM @ V.T) ** .5
        return sx


    @make_verbal
    def summary(self,
        dir = 'output',
        filename = None,
        save_to_file = True,
        print_out = True,
        ):
        '''
        Print out and/or save to disk a summary of the standardization results.

        **Parameters**

        + `dir`: the directory in which to save the table
        + `filename`: the name of the csv file to write to
        + `save_to_file`: whether to save the table to disk
        + `print_out`: whether to print out the table
        '''

        out = []
        out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
        out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
        out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
        out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
        out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
        out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
        out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
        out += [['Model degrees of freedom', f"{self.Nf}"]]
        out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
        out += [['Standardization method', self.standardization_method]]

        if save_to_file:
            if not os.path.exists(dir):
                os.makedirs(dir)
            if filename is None:
                filename = f'D{self._4x}_summary.csv'
            with open(f'{dir}/{filename}', 'w') as fid:
                fid.write(make_csv(out))
        if print_out:
            self.msg('\n' + pretty_table(out, header = 0))


    @make_verbal
    def table_of_sessions(self,
        dir = 'output',
        filename = None,
        save_to_file = True,
        print_out = True,
        output = None,
        ):
        '''
        Print out and/or save to disk a table of sessions.

        **Parameters**

        + `dir`: the directory in which to save the table
        + `filename`: the name of the csv file to write to
        + `save_to_file`: whether to save the table to disk
        + `print_out`: whether to print out the table
        + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
        if set to `'raw'`: return a list of list of strings
        (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
        '''
        include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
        include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
        include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])

        out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
        if include_a2:
            out[-1] += ['a2 ± SE']
        if include_b2:
            out[-1] += ['b2 ± SE']
        if include_c2:
            out[-1] += ['c2 ± SE']
        for session in self.sessions:
            out += [[
                session,
                f"{self.sessions[session]['Na']}",
                f"{self.sessions[session]['Nu']}",
                f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
                f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
                f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
                f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
                f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
                f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
                f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
                f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
                ]]
            if include_a2:
                if self.sessions[session]['scrambling_drift']:
                    out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
                else:
                    out[-1] += ['']
            if include_b2:
                if self.sessions[session]['slope_drift']:
                    out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
                else:
                    out[-1] += ['']
            if include_c2:
                if self.sessions[session]['wg_drift']:
                    out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
                else:
                    out[-1] += ['']

        if save_to_file:
            if not os.path.exists(dir):
                os.makedirs(dir)
            if filename is None:
                filename = f'D{self._4x}_sessions.csv'
            with open(f'{dir}/{filename}', 'w') as fid:
                fid.write(make_csv(out))
        if print_out:
            self.msg('\n' + pretty_table(out))
        if output == 'raw':
            return out
        elif output == 'pretty':
            return pretty_table(out)


    @make_verbal
    def table_of_analyses(
        self,
        dir = 'output',
        filename = None,
        save_to_file = True,
        print_out = True,
        output = None,
        ):
        '''
        Print out and/or save to disk a table of analyses.
1953 1954 **Parameters** 1955 1956 + `dir`: the directory in which to save the table 1957 + `filename`: the name to the csv file to write to 1958 + `save_to_file`: whether to save the table to disk 1959 + `print_out`: whether to print out the table 1960 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 1961 if set to `'raw'`: return a list of list of strings 1962 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 1963 ''' 1964 1965 out = [['UID','Session','Sample']] 1966 extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}] 1967 for f in extra_fields: 1968 out[-1] += [f[0]] 1969 out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}'] 1970 for r in self: 1971 out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]] 1972 for f in extra_fields: 1973 out[-1] += [f"{r[f[0]]:{f[1]}}"] 1974 out[-1] += [ 1975 f"{r['d13Cwg_VPDB']:.3f}", 1976 f"{r['d18Owg_VSMOW']:.3f}", 1977 f"{r['d45']:.6f}", 1978 f"{r['d46']:.6f}", 1979 f"{r['d47']:.6f}", 1980 f"{r['d48']:.6f}", 1981 f"{r['d49']:.6f}", 1982 f"{r['d13C_VPDB']:.6f}", 1983 f"{r['d18O_VSMOW']:.6f}", 1984 f"{r['D47raw']:.6f}", 1985 f"{r['D48raw']:.6f}", 1986 f"{r['D49raw']:.6f}", 1987 f"{r[f'D{self._4x}']:.6f}" 1988 ] 1989 if save_to_file: 1990 if not os.path.exists(dir): 1991 os.makedirs(dir) 1992 if filename is None: 1993 filename = f'D{self._4x}_analyses.csv' 1994 with open(f'{dir}/{filename}', 'w') as fid: 1995 fid.write(make_csv(out)) 1996 if print_out: 1997 self.msg('\n' + pretty_table(out)) 1998 return out 1999 2000 @make_verbal 2001 def covar_table( 2002 self, 2003 correl = False, 2004 dir = 'output', 2005 filename = None, 2006 save_to_file = True, 2007 print_out = True, 2008 output = None, 2009 ): 2010 ''' 2011 Print out, save to disk and/or return the variance-covariance matrix of D4x 2012 for all unknown samples. 2013 2014 **Parameters** 2015 2016 + `dir`: the directory in which to save the csv 2017 + `filename`: the name of the csv file to write to 2018 + `save_to_file`: whether to save the csv 2019 + `print_out`: whether to print out the matrix 2020 + `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`); 2021 if set to `'raw'`: return a list of list of strings 2022 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 2023 ''' 2024 samples = sorted([u for u in self.unknowns]) 2025 out = [[''] + samples] 2026 for s1 in samples: 2027 out.append([s1]) 2028 for s2 in samples: 2029 if correl: 2030 out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}') 2031 else: 2032 out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}') 2033 2034 if save_to_file: 2035 if not os.path.exists(dir): 2036 os.makedirs(dir) 2037 if filename is None: 2038 if correl: 2039 filename = f'D{self._4x}_correl.csv' 2040 else: 2041 filename = f'D{self._4x}_covar.csv' 2042 with open(f'{dir}/{filename}', 'w') as fid: 2043 fid.write(make_csv(out)) 2044 if print_out: 2045 self.msg('\n'+pretty_table(out)) 2046 if output == 'raw': 2047 return out 2048 elif output == 'pretty': 2049 return pretty_table(out) 2050 2051 @make_verbal 2052 def table_of_samples( 2053 self, 2054 dir = 'output', 2055 filename = None, 2056 save_to_file = True, 2057 print_out = True, 2058 output = None, 2059 ): 2060 ''' 2061 Print out, save to disk and/or return a table of samples. 
2062 2063 **Parameters** 2064 2065 + `dir`: the directory in which to save the csv 2066 + `filename`: the name of the csv file to write to 2067 + `save_to_file`: whether to save the csv 2068 + `print_out`: whether to print out the table 2069 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 2070 if set to `'raw'`: return a list of list of strings 2071 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 2072 ''' 2073 2074 out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']] 2075 for sample in self.anchors: 2076 out += [[ 2077 f"{sample}", 2078 f"{self.samples[sample]['N']}", 2079 f"{self.samples[sample]['d13C_VPDB']:.2f}", 2080 f"{self.samples[sample]['d18O_VSMOW']:.2f}", 2081 f"{self.samples[sample][f'D{self._4x}']:.4f}",'','', 2082 f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', '' 2083 ]] 2084 for sample in self.unknowns: 2085 out += [[ 2086 f"{sample}", 2087 f"{self.samples[sample]['N']}", 2088 f"{self.samples[sample]['d13C_VPDB']:.2f}", 2089 f"{self.samples[sample]['d18O_VSMOW']:.2f}", 2090 f"{self.samples[sample][f'D{self._4x}']:.4f}", 2091 f"{self.samples[sample][f'SE_D{self._4x}']:.4f}", 2092 f"± {self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}", 2093 f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', 2094 f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else '' 2095 ]] 2096 if save_to_file: 2097 if not os.path.exists(dir): 2098 os.makedirs(dir) 2099 if filename is None: 2100 filename = f'D{self._4x}_samples.csv' 2101 with open(f'{dir}/{filename}', 'w') as fid: 2102 fid.write(make_csv(out)) 2103 if print_out: 2104 self.msg('\n'+pretty_table(out)) 2105 if output == 'raw': 2106 return out 2107 elif output == 'pretty': 2108 return pretty_table(out) 2109 2110 2111 def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100): 2112 ''' 2113 Generate session plots and save them to disk. 2114 2115 **Parameters** 2116 2117 + `dir`: the directory in which to save the plots 2118 + `figsize`: the width and height (in inches) of each plot 2119 + `filetype`: 'pdf' or 'png' 2120 + `dpi`: resolution for PNG output 2121 ''' 2122 if not os.path.exists(dir): 2123 os.makedirs(dir) 2124 2125 for session in self.sessions: 2126 sp = self.plot_single_session(session, xylimits = 'constant') 2127 ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {})) 2128 ppl.close(sp.fig) 2129 2130 2131 2132 @make_verbal 2133 def consolidate_samples(self): 2134 ''' 2135 Compile various statistics for each sample. 

    For each anchor sample:

    + `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
    + `SE_D47` or `SE_D48`: set to zero by definition

    For each unknown sample:

    + `D47` or `D48`: the standardized Δ4x value for this unknown
    + `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown

    For each anchor and unknown:

    + `N`: the total number of analyses of this sample
    + `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
    + `d13C_VPDB`: the average δ13C_VPDB value for this sample
    + `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
    + `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
    variance, indicating whether the Δ4x repeatability of this sample differs significantly from
    that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
    '''
    D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
    for sample in self.samples:
        self.samples[sample]['N'] = len(self.samples[sample]['data'])
        if self.samples[sample]['N'] > 1:
            self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])

        self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
        self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])

        D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
        if len(D4x_pop) > 2:
            self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]

    if self.standardization_method == 'pooled':
        for sample in self.anchors:
            self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
            self.samples[sample][f'SE_D{self._4x}'] = 0.
        for sample in self.unknowns:
            self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
            try:
                self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
            except ValueError:
                # when `sample` is constrained by self.standardize(constraints = {...}),
                # it is no longer listed in self.standardization.var_names.
                # Temporary fix: define SE as zero for now
                self.samples[sample][f'SE_D{self._4x}'] = 0.

    elif self.standardization_method == 'indep_sessions':
        for sample in self.anchors:
            self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
            self.samples[sample][f'SE_D{self._4x}'] = 0.
        for sample in self.unknowns:
            self.msg(f'Consolidating sample {sample}')
            self.unknowns[sample][f'session_D{self._4x}'] = {}
            session_avg = []
            for session in self.sessions:
                sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
                if sdata:
                    self.msg(f'{sample} found in session {session}')
                    avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
                    avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
                    # !! TODO: sigma_s below does not account for temporal changes in standardization error
                    sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
                    sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
                    session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
                    self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
            self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
            weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
            wsum = sum([weights[s] for s in weights])
            for s in weights:
                self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]

    for r in self:
        r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']


def consolidate_sessions(self):
    '''
    Compute various statistics for each session.

    + `Na`: Number of anchor analyses in the session
    + `Nu`: Number of unknown analyses in the session
    + `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
    + `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
    + `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
    + `a`: scrambling factor
    + `b`: compositional slope
    + `c`: WG offset
    + `SE_a`: Model standard error of `a`
    + `SE_b`: Model standard error of `b`
    + `SE_c`: Model standard error of `c`
    + `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
    + `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
    + `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
    + `a2`: scrambling factor drift
    + `b2`: compositional slope drift
    + `c2`: WG offset drift
    + `Np`: Number of standardization parameters to fit
    + `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
    + `d13Cwg_VPDB`: δ13C_VPDB of WG
    + `d18Owg_VSMOW`: δ18O_VSMOW of WG
    '''
    for session in self.sessions:
        if 'd13Cwg_VPDB' not in self.sessions[session]:
            self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
        if 'd18Owg_VSMOW' not in self.sessions[session]:
            self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
        self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
        self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])

        self.msg(f'Computing repeatabilities for session {session}')
        self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
        self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
        self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])

    if self.standardization_method == 'pooled':
        for session in self.sessions:

            # different (better?)
computation of D4x repeatability for each session: 2257 sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']] 2258 self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5 2259 2260 self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}'] 2261 i = self.standardization.var_names.index(f'a_{pf(session)}') 2262 self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5 2263 2264 self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}'] 2265 i = self.standardization.var_names.index(f'b_{pf(session)}') 2266 self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5 2267 2268 self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}'] 2269 i = self.standardization.var_names.index(f'c_{pf(session)}') 2270 self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5 2271 2272 self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}'] 2273 if self.sessions[session]['scrambling_drift']: 2274 i = self.standardization.var_names.index(f'a2_{pf(session)}') 2275 self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5 2276 else: 2277 self.sessions[session]['SE_a2'] = 0. 2278 2279 self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}'] 2280 if self.sessions[session]['slope_drift']: 2281 i = self.standardization.var_names.index(f'b2_{pf(session)}') 2282 self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5 2283 else: 2284 self.sessions[session]['SE_b2'] = 0. 2285 2286 self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}'] 2287 if self.sessions[session]['wg_drift']: 2288 i = self.standardization.var_names.index(f'c2_{pf(session)}') 2289 self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5 2290 else: 2291 self.sessions[session]['SE_c2'] = 0. 
2292 2293 i = self.standardization.var_names.index(f'a_{pf(session)}') 2294 j = self.standardization.var_names.index(f'b_{pf(session)}') 2295 k = self.standardization.var_names.index(f'c_{pf(session)}') 2296 CM = np.zeros((6,6)) 2297 CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]] 2298 try: 2299 i2 = self.standardization.var_names.index(f'a2_{pf(session)}') 2300 CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]] 2301 CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2] 2302 try: 2303 j2 = self.standardization.var_names.index(f'b2_{pf(session)}') 2304 CM[3,4] = self.standardization.covar[i2,j2] 2305 CM[4,3] = self.standardization.covar[j2,i2] 2306 except ValueError: 2307 pass 2308 try: 2309 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2310 CM[3,5] = self.standardization.covar[i2,k2] 2311 CM[5,3] = self.standardization.covar[k2,i2] 2312 except ValueError: 2313 pass 2314 except ValueError: 2315 pass 2316 try: 2317 j2 = self.standardization.var_names.index(f'b2_{pf(session)}') 2318 CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]] 2319 CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2] 2320 try: 2321 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2322 CM[4,5] = self.standardization.covar[j2,k2] 2323 CM[5,4] = self.standardization.covar[k2,j2] 2324 except ValueError: 2325 pass 2326 except ValueError: 2327 pass 2328 try: 2329 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2330 CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]] 2331 CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2] 2332 except ValueError: 2333 pass 2334 2335 self.sessions[session]['CM'] = CM 2336 2337 elif self.standardization_method == 'indep_sessions': 2338 pass # Not implemented yet 2339 2340 2341 @make_verbal 2342 def repeatabilities(self): 2343 ''' 2344 Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x 2345 (for all samples, for anchors, and for unknowns). 2346 ''' 2347 self.msg('Computing reproducibilities for all sessions') 2348 2349 self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors') 2350 self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors') 2351 self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors') 2352 self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns') 2353 self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples') 2354 2355 2356 @make_verbal 2357 def consolidate(self, tables = True, plots = True): 2358 ''' 2359 Collect information about samples, sessions and repeatabilities. 2360 ''' 2361 self.consolidate_samples() 2362 self.consolidate_sessions() 2363 self.repeatabilities() 2364 2365 if tables: 2366 self.summary() 2367 self.table_of_sessions() 2368 self.table_of_analyses() 2369 self.table_of_samples() 2370 2371 if plots: 2372 self.plot_sessions() 2373 2374 2375 @make_verbal 2376 def rmswd(self, 2377 samples = 'all samples', 2378 sessions = 'all sessions', 2379 ): 2380 ''' 2381 Compute the χ2, root mean squared weighted deviation 2382 (i.e. reduced χ2), and corresponding degrees of freedom of the 2383 Δ4x values for samples in `samples` and sessions in `sessions`. 2384 2385 Only used in `D4xdata.standardize()` with `method='indep_sessions'`. 
2386 ''' 2387 if samples == 'all samples': 2388 mysamples = [k for k in self.samples] 2389 elif samples == 'anchors': 2390 mysamples = [k for k in self.anchors] 2391 elif samples == 'unknowns': 2392 mysamples = [k for k in self.unknowns] 2393 else: 2394 mysamples = samples 2395 2396 if sessions == 'all sessions': 2397 sessions = [k for k in self.sessions] 2398 2399 chisq, Nf = 0, 0 2400 for sample in mysamples : 2401 G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2402 if len(G) > 1 : 2403 X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G]) 2404 Nf += (len(G) - 1) 2405 chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G]) 2406 r = (chisq / Nf)**.5 if Nf > 0 else 0 2407 self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.') 2408 return {'rmswd': r, 'chisq': chisq, 'Nf': Nf} 2409 2410 2411 @make_verbal 2412 def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'): 2413 ''' 2414 Compute the repeatability of `[r[key] for r in self]` 2415 ''' 2416 2417 if samples == 'all samples': 2418 mysamples = [k for k in self.samples] 2419 elif samples == 'anchors': 2420 mysamples = [k for k in self.anchors] 2421 elif samples == 'unknowns': 2422 mysamples = [k for k in self.unknowns] 2423 else: 2424 mysamples = samples 2425 2426 if sessions == 'all sessions': 2427 sessions = [k for k in self.sessions] 2428 2429 if key in ['D47', 'D48']: 2430 # Full disclosure: the definition of Nf is tricky/debatable 2431 G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions] 2432 chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum() 2433 Nf = len(G) 2434# print(f'len(G) = {Nf}') 2435 Nf -= len([s for s in mysamples if s in self.unknowns]) 2436# print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider') 2437 for session in sessions: 2438 Np = len([ 2439 _ for _ in self.standardization.params 2440 if ( 2441 self.standardization.params[_].expr is not None 2442 and ( 2443 (_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session)) 2444 or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session)) 2445 ) 2446 ) 2447 ]) 2448# print(f'session {session}: {Np} parameters to consider') 2449 Na = len({ 2450 r['Sample'] for r in self.sessions[session]['data'] 2451 if r['Sample'] in self.anchors and r['Sample'] in mysamples 2452 }) 2453# print(f'session {session}: {Na} different anchors in that session') 2454 Nf -= min(Np, Na) 2455# print(f'Nf = {Nf}') 2456 2457# for sample in mysamples : 2458# X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2459# if len(X) > 1 : 2460# chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ]) 2461# if sample in self.unknowns: 2462# Nf += len(X) - 1 2463# else: 2464# Nf += len(X) 2465# if samples in ['anchors', 'all samples']: 2466# Nf -= sum([self.sessions[s]['Np'] for s in sessions]) 2467 r = (chisq / Nf)**.5 if Nf > 0 else 0 2468 2469 else: # if key not in ['D47', 'D48'] 2470 chisq, Nf = 0, 0 2471 for sample in mysamples : 2472 X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2473 if len(X) > 1 : 2474 Nf += len(X) - 1 2475 chisq += np.sum([ (x-np.mean(X))**2 for x in X ]) 2476 r = (chisq / Nf)**.5 if Nf > 0 else 0 2477 2478 self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.') 2479 return r 2480 2481 def sample_average(self, samples, weights = 'equal', normalize = True): 2482 ''' 2483 Weighted average Δ4x value of a group of samples, 
accounting for covariance.

    Returns the weighted average Δ4x value and associated SE
    of a group of samples. Weights are equal by default. If `normalize` is
    true, `weights` will be rescaled so that their sum equals 1.

    **Examples**

    ```python
    self.sample_average(['X','Y'], [1, 2])
    ```

    returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
    where Δ4x(X) and Δ4x(Y) are the average Δ4x
    values of samples X and Y, respectively.

    ```python
    self.sample_average(['X','Y'], [1, -1], normalize = False)
    ```

    returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
    '''
    if weights == 'equal':
        weights = [1/len(samples)] * len(samples)

    if normalize:
        s = sum(weights)
        if s:
            weights = [w/s for w in weights]

    try:
#		indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
#		C = self.standardization.covar[indices,:][:,indices]
        C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
        X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
        return correlated_sum(X, C, weights)
    except ValueError:
        return (0., 0.)


def sample_D4x_covar(self, sample1, sample2 = None):
    '''
    Covariance between Δ4x values of samples

    Returns the error covariance between the average Δ4x values of two
    samples. If only `sample1` is specified, or if `sample1 == sample2`,
    returns the Δ4x variance for that sample.
    '''
    if sample2 is None:
        sample2 = sample1
    if self.standardization_method == 'pooled':
        i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
        j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
        return self.standardization.covar[i, j]
    elif self.standardization_method == 'indep_sessions':
        if sample1 == sample2:
            return self.samples[sample1][f'SE_D{self._4x}']**2
        else:
            c = 0
            for session in self.sessions:
                sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
                sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
                if sdata1 and sdata2:
                    a = self.sessions[session]['a']
                    # !! TODO: CM below does not account for temporal changes in standardization parameters
                    CM = self.sessions[session]['CM'][:3,:3]
                    avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
                    avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
                    avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
                    avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
                    c += (
                        self.unknowns[sample1][f'session_D{self._4x}'][session][2]
                        * self.unknowns[sample2][f'session_D{self._4x}'][session][2]
                        * np.array([[avg_D4x_1, avg_d4x_1, 1]])
                        @ CM
                        @ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
                        ) / a**2
            return float(c)


def sample_D4x_correl(self, sample1, sample2 = None):
    '''
    Correlation between Δ4x errors of samples

    Returns the error correlation between the average Δ4x values of two samples.
    '''
    if sample2 is None or sample2 == sample1:
        return 1.
2570 return ( 2571 self.sample_D4x_covar(sample1, sample2) 2572 / self.unknowns[sample1][f'SE_D{self._4x}'] 2573 / self.unknowns[sample2][f'SE_D{self._4x}'] 2574 ) 2575 2576 def plot_single_session(self, 2577 session, 2578 kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4), 2579 kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4), 2580 kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75), 2581 kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75), 2582 kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75), 2583 xylimits = 'free', # | 'constant' 2584 x_label = None, 2585 y_label = None, 2586 error_contour_interval = 'auto', 2587 fig = 'new', 2588 ): 2589 ''' 2590 Generate plot for a single session 2591 ''' 2592 if x_label is None: 2593 x_label = f'δ$_{{{self._4x}}}$ (‰)' 2594 if y_label is None: 2595 y_label = f'Δ$_{{{self._4x}}}$ (‰)' 2596 2597 out = _SessionPlot() 2598 anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]] 2599 unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]] 2600 anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors] 2601 anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors] 2602 unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns] 2603 unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns] 2604 anchor_avg = (np.array([ np.array([ 2605 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2606 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2607 ]) for sample in anchors]).T, 2608 np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T) 2609 unknown_avg = (np.array([ np.array([ 2610 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2611 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2612 ]) for sample in unknowns]).T, 2613 np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T) 2614 2615 2616 if fig == 'new': 2617 out.fig = ppl.figure(figsize = (6,6)) 2618 ppl.subplots_adjust(.1,.1,.9,.9) 2619 2620 out.anchor_analyses, = ppl.plot( 2621 anchors_d, 2622 anchors_D, 2623 **kw_plot_anchors) 2624 out.unknown_analyses, = ppl.plot( 2625 unknowns_d, 2626 unknowns_D, 2627 **kw_plot_unknowns) 2628 out.anchor_avg = ppl.plot( 2629 *anchor_avg, 2630 **kw_plot_anchor_avg) 2631 out.unknown_avg = ppl.plot( 2632 *unknown_avg, 2633 **kw_plot_unknown_avg) 2634 if xylimits == 'constant': 2635 x = [r[f'd{self._4x}'] for r in self] 2636 y = [r[f'D{self._4x}'] for r in self] 2637 x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y) 2638 w, h = x2-x1, y2-y1 2639 x1 -= w/20 2640 x2 += w/20 2641 y1 -= h/20 2642 y2 += h/20 2643 ppl.axis([x1, x2, y1, y2]) 2644 elif xylimits == 'free': 2645 x1, x2, y1, y2 = ppl.axis() 2646 else: 2647 x1, x2, y1, y2 = ppl.axis(xylimits) 2648 2649 if error_contour_interval != 'none': 2650 xi, yi = np.linspace(x1, x2), np.linspace(y1, y2) 2651 XI,YI = np.meshgrid(xi, yi) 2652 SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi]) 2653 if 
error_contour_interval == 'auto': 2654 rng = np.max(SI) - np.min(SI) 2655 if rng <= 0.01: 2656 cinterval = 0.001 2657 elif rng <= 0.03: 2658 cinterval = 0.004 2659 elif rng <= 0.1: 2660 cinterval = 0.01 2661 elif rng <= 0.3: 2662 cinterval = 0.03 2663 elif rng <= 1.: 2664 cinterval = 0.1 2665 else: 2666 cinterval = 0.5 2667 else: 2668 cinterval = error_contour_interval 2669 2670 cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval) 2671 out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error) 2672 out.clabel = ppl.clabel(out.contour) 2673 contour = (XI, YI, SI, cval, cinterval) 2674 2675 if fig == None: 2676 return { 2677 'anchors':anchors, 2678 'unknowns':unknowns, 2679 'anchors_d':anchors_d, 2680 'anchors_D':anchors_D, 2681 'unknowns_d':unknowns_d, 2682 'unknowns_D':unknowns_D, 2683 'anchor_avg':anchor_avg, 2684 'unknown_avg':unknown_avg, 2685 'contour':contour, 2686 } 2687 2688 ppl.xlabel(x_label) 2689 ppl.ylabel(y_label) 2690 ppl.title(session, weight = 'bold') 2691 ppl.grid(alpha = .2) 2692 out.ax = ppl.gca() 2693 2694 return out 2695 2696 def plot_residuals( 2697 self, 2698 kde = False, 2699 hist = False, 2700 binwidth = 2/3, 2701 dir = 'output', 2702 filename = None, 2703 highlight = [], 2704 colors = None, 2705 figsize = None, 2706 dpi = 100, 2707 yspan = None, 2708 ): 2709 ''' 2710 Plot residuals of each analysis as a function of time (actually, as a function of 2711 the order of analyses in the `D4xdata` object) 2712 2713 + `kde`: whether to add a kernel density estimate of residuals 2714 + `hist`: whether to add a histogram of residuals (incompatible with `kde`) 2715 + `histbins`: specify bin edges for the histogram 2716 + `dir`: the directory in which to save the plot 2717 + `highlight`: a list of samples to highlight 2718 + `colors`: a dict of `{<sample>: <color>}` for all samples 2719 + `figsize`: (width, height) of figure 2720 + `dpi`: resolution for PNG output 2721 + `yspan`: factor controlling the range of y values shown in plot 2722 (by default: `yspan = 1.5 if kde else 1.0`) 2723 ''' 2724 2725 from matplotlib import ticker 2726 2727 if yspan is None: 2728 if kde: 2729 yspan = 1.5 2730 else: 2731 yspan = 1.0 2732 2733 # Layout 2734 fig = ppl.figure(figsize = (8,4) if figsize is None else figsize) 2735 if hist or kde: 2736 ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72) 2737 ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15) 2738 else: 2739 ppl.subplots_adjust(.08,.05,.78,.8) 2740 ax1 = ppl.subplot(111) 2741 2742 # Colors 2743 N = len(self.anchors) 2744 if colors is None: 2745 if len(highlight) > 0: 2746 Nh = len(highlight) 2747 if Nh == 1: 2748 colors = {highlight[0]: (0,0,0)} 2749 elif Nh == 3: 2750 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])} 2751 elif Nh == 4: 2752 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2753 else: 2754 colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)} 2755 else: 2756 if N == 3: 2757 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])} 2758 elif N == 4: 2759 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2760 else: 2761 colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)} 2762 2763 ppl.sca(ax1) 2764 2765 ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75) 2766 2767 ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$')) 2768 2769 session = 
self[0]['Session'] 2770 x1 = 0 2771# ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self]) 2772 x_sessions = {} 2773 one_or_more_singlets = False 2774 one_or_more_multiplets = False 2775 multiplets = set() 2776 for k,r in enumerate(self): 2777 if r['Session'] != session: 2778 x2 = k-1 2779 x_sessions[session] = (x1+x2)/2 2780 ppl.axvline(k - 0.5, color = 'k', lw = .5) 2781 session = r['Session'] 2782 x1 = k 2783 singlet = len(self.samples[r['Sample']]['data']) == 1 2784 if not singlet: 2785 multiplets.add(r['Sample']) 2786 if r['Sample'] in self.unknowns: 2787 if singlet: 2788 one_or_more_singlets = True 2789 else: 2790 one_or_more_multiplets = True 2791 kw = dict( 2792 marker = 'x' if singlet else '+', 2793 ms = 4 if singlet else 5, 2794 ls = 'None', 2795 mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0), 2796 mew = 1, 2797 alpha = 0.2 if singlet else 1, 2798 ) 2799 if highlight and r['Sample'] not in highlight: 2800 kw['alpha'] = 0.2 2801 ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw) 2802 x2 = k 2803 x_sessions[session] = (x1+x2)/2 2804 2805 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1) 2806 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1) 2807 if not (hist or kde): 2808 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center') 2809 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f" 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center') 2810 2811 xmin, xmax, ymin, ymax = ppl.axis() 2812 if yspan != 1: 2813 ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2 2814 for s in x_sessions: 2815 ppl.text( 2816 x_sessions[s], 2817 ymax +1, 2818 s, 2819 va = 'bottom', 2820 **( 2821 dict(ha = 'center') 2822 if len(self.sessions[s]['data']) > (0.15 * len(self)) 2823 else dict(ha = 'left', rotation = 45) 2824 ) 2825 ) 2826 2827 if hist or kde: 2828 ppl.sca(ax2) 2829 2830 for s in colors: 2831 kw['marker'] = '+' 2832 kw['ms'] = 5 2833 kw['mec'] = colors[s] 2834 kw['label'] = s 2835 kw['alpha'] = 1 2836 ppl.plot([], [], **kw) 2837 2838 kw['mec'] = (0,0,0) 2839 2840 if one_or_more_singlets: 2841 kw['marker'] = 'x' 2842 kw['ms'] = 4 2843 kw['alpha'] = .2 2844 kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other' 2845 ppl.plot([], [], **kw) 2846 2847 if one_or_more_multiplets: 2848 kw['marker'] = '+' 2849 kw['ms'] = 4 2850 kw['alpha'] = 1 2851 kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other' 2852 ppl.plot([], [], **kw) 2853 2854 if hist or kde: 2855 leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9) 2856 else: 2857 leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5) 2858 leg.set_zorder(-1000) 2859 2860 ppl.sca(ax1) 2861 2862 ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)') 2863 ppl.xticks([]) 2864 ppl.axis([-1, len(self), None, None]) 2865 2866 if hist or kde: 2867 ppl.sca(ax2) 2868 X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors]) 2869 2870 if kde: 2871 from scipy.stats import 
gaussian_kde 2872 yi = np.linspace(ymin, ymax, 201) 2873 xi = gaussian_kde(X).evaluate(yi) 2874 ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1)) 2875# ppl.plot(xi, yi, 'k-', lw = 1) 2876 elif hist: 2877 ppl.hist( 2878 X, 2879 orientation = 'horizontal', 2880 histtype = 'stepfilled', 2881 ec = [.4]*3, 2882 fc = [.25]*3, 2883 alpha = .25, 2884 bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)), 2885 ) 2886 ppl.text(0, 0, 2887 f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", 2888 size = 7.5, 2889 alpha = 1, 2890 va = 'center', 2891 ha = 'left', 2892 ) 2893 2894 ppl.axis([0, None, ymin, ymax]) 2895 ppl.xticks([]) 2896 ppl.yticks([]) 2897# ax2.spines['left'].set_visible(False) 2898 ax2.spines['right'].set_visible(False) 2899 ax2.spines['top'].set_visible(False) 2900 ax2.spines['bottom'].set_visible(False) 2901 2902 ax1.axis([None, None, ymin, ymax]) 2903 2904 if not os.path.exists(dir): 2905 os.makedirs(dir) 2906 if filename is None: 2907 return fig 2908 elif filename == '': 2909 filename = f'D{self._4x}_residuals.pdf' 2910 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 2911 ppl.close(fig) 2912 2913 2914 def simulate(self, *args, **kwargs): 2915 ''' 2916 Legacy function with warning message pointing to `virtual_data()` 2917 ''' 2918 raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()') 2919 2920 def plot_distribution_of_analyses( 2921 self, 2922 dir = 'output', 2923 filename = None, 2924 vs_time = False, 2925 figsize = (6,4), 2926 subplots_adjust = (0.02, 0.13, 0.85, 0.8), 2927 output = None, 2928 dpi = 100, 2929 ): 2930 ''' 2931 Plot temporal distribution of all analyses in the data set. 2932 2933 **Parameters** 2934 2935 + `dir`: the directory in which to save the plot 2936 + `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially. 
    + `figsize`: (width, height) of figure
    + `dpi`: resolution for PNG output
    '''

    asamples = [s for s in self.anchors]
    usamples = [s for s in self.unknowns]
    if output is None or output == 'fig':
        fig = ppl.figure(figsize = figsize)
        ppl.subplots_adjust(*subplots_adjust)
    Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
    Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
    Xmax += (Xmax-Xmin)/40
    Xmin -= (Xmax-Xmin)/41
    for k, s in enumerate(asamples + usamples):
        if vs_time:
            X = [r['TimeTag'] for r in self if r['Sample'] == s]
        else:
            X = [x for x,r in enumerate(self) if r['Sample'] == s]
        Y = [-k for x in X]
        ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
        ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
        ppl.text(Xmax, -k, f' {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
    ppl.axis([Xmin, Xmax, -k-1, 1])
    ppl.xlabel('\ntime')
    ppl.gca().annotate('',
        xy = (0.6, -0.02),
        xycoords = 'axes fraction',
        xytext = (.4, -0.02),
        arrowprops = dict(arrowstyle = "->", color = 'k'),
        )

    x2 = -1
    for session in self.sessions:
        x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
        if vs_time:
            ppl.axvline(x1, color = 'k', lw = .75)
        if x2 > -1:
            if not vs_time:
                ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
        x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
#		from xlrd import xldate_as_datetime
#		print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
        if vs_time:
            ppl.axvline(x2, color = 'k', lw = .75)
            ppl.axvspan(x1, x2, color = 'k', zorder = -100, alpha = .15)
        ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)

    ppl.xticks([])
    ppl.yticks([])

    if output is None:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename == None:
            filename = f'D{self._4x}_distribution_of_analyses.pdf'
        ppl.savefig(f'{dir}/{filename}', dpi = dpi)
        ppl.close(fig)
    elif output == 'ax':
        return ppl.gca()
    elif output == 'fig':
        return fig


def plot_bulk_compositions(
    self,
    samples = None,
    dir = 'output/bulk_compositions',
    figsize = (6,6),
    subplots_adjust = (0.15, 0.12, 0.95, 0.92),
    show = False,
    sample_color = (0,.5,1),
    analysis_color = (.7,.7,.7),
    labeldist = 0.3,
    radius = 0.05,
    ):
    '''
    Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

    By default, creates a directory `./output/bulk_compositions` where plots for
    each sample are saved. Another plot named `__all__.pdf` shows all analyses together.

    **Parameters**

    + `samples`: Only these samples are processed (by default: all samples).
    + `dir`: where to save the plots
    + `figsize`: (width, height) of figure
    + `subplots_adjust`: passed to `subplots_adjust()`
    + `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
    allowing for interactive visualization/exploration in (δ13C, δ18O) space.
    + `sample_color`: color used for sample markers/labels
    + `analysis_color`: color used for replicate markers/labels
    + `labeldist`: distance (in inches) from replicate markers to replicate labels
    + `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
    '''

    from matplotlib.patches import Ellipse

    if samples is None:
        samples = [_ for _ in self.samples]

    saved = {}

    for s in samples:

        fig = ppl.figure(figsize = figsize)
        fig.subplots_adjust(*subplots_adjust)
        ax = ppl.subplot(111)
        ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
        ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
        ppl.title(s)

        XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
        UID = [_['UID'] for _ in self.samples[s]['data']]
        XY0 = XY.mean(0)

        for xy in XY:
            ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)

        ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
        ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
        ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
        saved[s] = [XY, XY0]

        x1, x2, y1, y2 = ppl.axis()
        x0, dx = (x1+x2)/2, (x2-x1)/2
        y0, dy = (y1+y2)/2, (y2-y1)/2
        dx, dy = [max(max(dx, dy), radius)]*2

        ppl.axis([
            x0 - 1.2*dx,
            x0 + 1.2*dx,
            y0 - 1.2*dy,
            y0 + 1.2*dy,
            ])

        XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))

        for xy, uid in zip(XY, UID):

            xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
            vector_in_display_space = xy_in_display_space - XY0_in_display_space

            if (vector_in_display_space**2).sum() > 0:

                unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
                label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
                label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
                label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))

                ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)

            else:

                ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)

        if radius:
            ax.add_artist(Ellipse(
                xy = XY0,
                width = radius*2,
                height = radius*2,
                ls = (0, (2,2)),
                lw = .7,
                ec = analysis_color,
                fc = 'None',
                ))
            ppl.text(
                XY0[0],
                XY0[1]-radius,
                f'\n± {radius*1e3:.0f} ppm',
                color = analysis_color,
                va = 'top',
                ha = 'center',
                linespacing = 0.4,
                size = 8,
                )

        if not os.path.exists(dir):
            os.makedirs(dir)
        fig.savefig(f'{dir}/{s}.pdf')
        ppl.close(fig)

    fig = ppl.figure(figsize = figsize)
    fig.subplots_adjust(*subplots_adjust)
    ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
    ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')

    for s in saved:
        for xy in saved[s][0]:
            ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
        ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
        ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
        ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
    x1, x2, y1, y2 = ppl.axis()
    ppl.axis([
        x1 - (x2-x1)/10,
        x2 + (x2-x1)/10,
        y1 - (y2-y1)/10,
        y2 + (y2-y1)/10,
        ])

    if not os.path.exists(dir):
        os.makedirs(dir)
    fig.savefig(f'{dir}/__all__.pdf')
    if show:
        ppl.show()
    ppl.close(fig)


def _save_D4x_correl(
    self,
    samples = None,
    dir = 'output',
    filename = None,
    D4x_precision = 4,
    correl_precision = 4,
    ):
    '''
    Save D4x values along with their SE and correlation matrix.

    **Parameters**

    + `samples`: Only these samples are output (by default: all samples).
    + `dir`: the directory in which to save the file (by default: `output`)
    + `filename`: the name of the csv file to write to (by default: `D4x_correl.csv`)
    + `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4)
    + `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
    '''
    if samples is None:
        samples = sorted([s for s in self.unknowns])

    out = [['Sample']] + [[s] for s in samples]
    out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl']
    for k,s in enumerate(samples):
        out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.{D4x_precision}f}', f'{self.samples[s][f"SE_D{self._4x}"]:.{D4x_precision}f}']
        for s2 in samples:
            out[k+1] += [f'{self.sample_D4x_correl(s,s2):.{correl_precision}f}']

    if not os.path.exists(dir):
        os.makedirs(dir)
    if filename is None:
        filename = f'D{self._4x}_correl.csv'
    with open(f'{dir}/{filename}', 'w') as fid:
        fid.write(make_csv(out))
Store and process data for a large set of Δ47 and/or Δ48 analyses, usually comprising more than one analytical session.
def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
    '''
    **Parameters**

    + `l`: a list of dictionaries, with each dictionary including at least the keys
    `Sample`, `d45`, `d46`, and `d47` or `d48`.
    + `mass`: `'47'` or `'48'`
    + `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
    + `session`: define session name for analyses without a `Session` key
    + `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.

    Returns a `D4xdata` object derived from `list`.
    '''
    self._4x = mass
    self.verbose = verbose
    self.prefix = 'D4xdata'
    self.logfile = logfile
    list.__init__(self, l)
    self.Nf = None
    self.repeatability = {}
    self.refresh(session = session)
**Parameters**

+ `l`: a list of dictionaries, with each dictionary including at least the keys `Sample`, `d45`, `d46`, and `d47` or `d48`.
+ `mass`: `'47'` or `'48'`
+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
+ `session`: define session name for analyses without a `Session` key
+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.

Returns a `D4xdata` object derived from `list`.
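Since a `D4xdata` object is just a `list` of analysis dictionaries, it can also be built directly in Python rather than from a csv file. A minimal sketch, using hypothetical values and the Δ47-specific subclass `D47data`:

import D47crunch

# each dictionary describes one analysis
mydata = D47crunch.D47data([
    {'UID': 'A01', 'Session': 'Session01', 'Sample': 'ETH-1',      'd45': 5.795, 'd46': 11.628, 'd47': 16.894},
    {'UID': 'A02', 'Session': 'Session01', 'Sample': 'MYSAMPLE-1', 'd45': 6.219, 'd46': 11.491, 'd47': 17.277},
], verbose = True)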
`R18_VSMOW`: Absolute (18O/16O) ratio of VSMOW. By default equal to 0.0020052 (Baertschi, 1976).

`LAMBDA_17`: Mass-dependent exponent for triple oxygen isotopes. By default equal to 0.528 (Barkan & Luz, 2005).

`R17_VSMOW`: Absolute (17O/16O) ratio of VSMOW. By default equal to 0.00038475 (Assonov & Brenninkmeijer, 2003, rescaled to `R13_VPDB`).

`R18_VPDB`: Absolute (18O/16O) ratio of VPDB. By definition equal to `R18_VSMOW * 1.03092`.

`R17_VPDB`: Absolute (17O/16O) ratio of VPDB. By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
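These isotopic parameters are class-level attributes, so they can be overridden either for a single object or for all subsequently created objects. A sketch (the values shown are simply the defaults, repeated for illustration):

import D47crunch

mydata = D47crunch.D47data()
mydata.R18_VSMOW = 0.0020052         # override for this object only
D47crunch.D47data.LAMBDA_17 = 0.528  # override for all future D47data objects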
After the Δ4x standardization step, each sample is tested to assess whether the Δ4x variance within all analyses for that sample differs significantly from that observed for a given reference sample (using Levene's test, which yields a p-value corresponding to the null hypothesis that the underlying variances are equal). `LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which sample should be used as a reference for this test.
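Switching the reference sample is a simple attribute assignment; a sketch, assuming the dataset contains several ETH-1 replicates:

mydata.LEVENE_REF_SAMPLE = 'ETH-1'  # compare each sample's Δ47 variance to that of ETH-1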
`ALPHA_18O_ACID_REACTION`: Specifies the 18O/16O fractionation factor generally applicable to acid reactions in the dataset. Currently used by `D4xdata.wg()`, `D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`. By default equal to 1.008129 (calcite reacted at 90 °C, Kim et al., 2007).
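For carbonates reacted under different conditions, this factor may be overridden before computing working-gas and sample compositions. A sketch (the numerical value below is a placeholder for illustration, not a recommended value):

mydata = D47crunch.D47data()
mydata.ALPHA_18O_ACID_REACTION = 1.0087  # placeholder acid fractionation factor
mydata.read('rawdata.csv')
mydata.wg()
mydata.crunch()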
`Nominal_d13C_VPDB`: Nominal δ13C_VPDB values assigned to carbonate standards, used by `D4xdata.standardize_d13C()`. By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after Bernasconi et al. (2018).

`Nominal_d18O_VPDB`: Nominal δ18O_VPDB values assigned to carbonate standards, used by `D4xdata.standardize_d18O()`. By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after Bernasconi et al. (2018).
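A sketch of adding a hypothetical in-house reference material (`MYREF`, with made-up values) alongside the ETH standards; copying the default dictionary first avoids mutating the class-level dictionary shared by all objects:

mydata = D47crunch.D47data()
mydata.Nominal_d13C_VPDB = {**D47crunch.D47data.Nominal_d13C_VPDB, 'MYREF': 1.23}
mydata.Nominal_d18O_VPDB = {**D47crunch.D47data.Nominal_d18O_VPDB, 'MYREF': -5.67}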
`d13C_STANDARDIZATION_METHOD`: Method by which to standardize δ13C values:

+ `none`: do not apply any δ13C standardization.
+ `'1pt'`: within each session, offset all initial δ13C values so as to minimize the difference between final δ13C_VPDB values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
+ `'2pt'`: within each session, apply an affine transformation to all δ13C values so as to minimize the difference between final δ13C_VPDB values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
`d18O_STANDARDIZATION_METHOD`: Method by which to standardize δ18O values:

+ `none`: do not apply any δ18O standardization.
+ `'1pt'`: within each session, offset all initial δ18O values so as to minimize the difference between final δ18O_VPDB values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
+ `'2pt'`: within each session, apply an affine transformation to all δ18O values so as to minimize the difference between final δ18O_VPDB values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
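Both methods are selected by attribute assignment before loading the data (they are copied into each session's settings when sessions are refreshed); a sketch:

mydata = D47crunch.D47data()
mydata.d13C_STANDARDIZATION_METHOD = '1pt'   # offset-only δ13C correction
mydata.d18O_STANDARDIZATION_METHOD = 'none'  # leave δ18O values uncorrected
mydata.read('rawdata.csv')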
def make_verbal(oldfun):
    '''
    Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
    '''
    @wraps(oldfun)
    def newfun(*args, verbose = '', **kwargs):
        myself = args[0]
        oldprefix = myself.prefix
        myself.prefix = oldfun.__name__
        if verbose != '':
            oldverbose = myself.verbose
            myself.verbose = verbose
        out = oldfun(*args, **kwargs)
        myself.prefix = oldprefix
        if verbose != '':
            myself.verbose = oldverbose
        return out
    return newfun
Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
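In practice this means that any method decorated with `make_verbal` accepts an extra `verbose` keyword overriding `self.verbose` for that call only, e.g. (after standardization):

mydata.table_of_samples(verbose = True, save_to_file = False)  # print this table even if mydata.verbose is False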
def msg(self, txt):
    '''
    Log a message to `self.logfile`, and print it out if `verbose = True`
    '''
    self.log(txt)
    if self.verbose:
        print(f'{f"[{self.prefix}]":<16} {txt}')
Log a message to `self.logfile`, and print it out if `verbose = True`.
def vmsg(self, txt):
    '''
    Log a message to `self.logfile` and print it out
    '''
    self.log(txt)
    print(txt)
Log a message to `self.logfile` and print it out.
def log(self, *txts):
    '''
    Log a message to `self.logfile`
    '''
    if self.logfile:
        with open(self.logfile, 'a') as fid:
            for txt in txts:
                fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
Log a message to `self.logfile`.
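A sketch of enabling file logging (hypothetical log path); subsequent method calls append timestamped messages, and arbitrary notes may be written with `log()`:

mydata = D47crunch.D47data(logfile = 'crunch.log')
mydata.read('rawdata.csv')
mydata.log('finished loading raw data')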
def refresh(self, session = 'mySession'):
    '''
    Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
    '''
    self.fill_in_missing_info(session = session)
    self.refresh_sessions()
    self.refresh_samples()
Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
```py
def refresh_sessions(self):
    '''
    Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
    to `False` for all sessions.
    '''
    self.sessions = {
        s: {'data': [r for r in self if r['Session'] == s]}
        for s in sorted({r['Session'] for r in self})
        }
    for s in self.sessions:
        self.sessions[s]['scrambling_drift'] = False
        self.sessions[s]['slope_drift'] = False
        self.sessions[s]['wg_drift'] = False
        self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
        self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
```
Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift` to `False` for all sessions.
```py
def refresh_samples(self):
    '''
    Define `self.samples`, `self.anchors`, and `self.unknowns`.
    '''
    self.samples = {
        s: {'data': [r for r in self if r['Sample'] == s]}
        for s in sorted({r['Sample'] for r in self})
        }
    self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
    self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
```
Define `self.samples`, `self.anchors`, and `self.unknowns`.
```py
def read(self, filename, sep = '', session = ''):
    '''
    Read file in csv format to load data into a `D47data` object.

    In the csv file, spaces before and after field separators (`','` by default)
    are optional. Each line corresponds to a single analysis.

    The required fields are:

    + `UID`: a unique identifier
    + `Session`: an identifier for the analytical session
    + `Sample`: a sample identifier
    + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

    Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
    VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas
    `d47`, `d48` and `d49` are optional, and set to NaN by default.

    **Parameters**

    + `filename`: the path of the file to read
    + `sep`: csv separator delimiting the fields
    + `session`: set `Session` field to this string for all analyses
    '''
    with open(filename) as fid:
        self.input(fid.read(), sep = sep, session = session)
```
Read file in csv format to load data into a `D47data` object.

In the csv file, spaces before and after field separators (`','` by default) are optional. Each line corresponds to a single analysis.

The required fields are:

- `UID`: a unique identifier
- `Session`: an identifier for the analytical session
- `Sample`: a sample identifier
- `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` and `d49` are optional, and set to NaN by default.

**Parameters**

- `filename`: the path of the file to read
- `sep`: csv separator delimiting the fields
- `session`: set `Session` field to this string for all analyses
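For example, a minimal sketch (the file name and session label are hypothetical):

```py
mydata = D47crunch.D47data()
# Assign all analyses from this file to a single session:
mydata.read('mydata.csv', session = 'Session01')
```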
```py
def input(self, txt, sep = '', session = ''):
    '''
    Read `txt` string in csv format to load analysis data into a `D47data` object.

    In the csv string, spaces before and after field separators (`','` by default)
    are optional. Each line corresponds to a single analysis.

    The required fields are:

    + `UID`: a unique identifier
    + `Session`: an identifier for the analytical session
    + `Sample`: a sample identifier
    + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

    Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
    VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas
    `d47`, `d48` and `d49` are optional, and set to NaN by default.

    **Parameters**

    + `txt`: the csv string to read
    + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
    whichever appears most often in `txt`.
    + `session`: set `Session` field to this string for all analyses
    '''
    if sep == '':
        sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
    txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
    data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]

    if session != '':
        for r in data:
            r['Session'] = session

    self += data
    self.refresh()
```
Read `txt` string in csv format to load analysis data into a `D47data` object.

In the csv string, spaces before and after field separators (`','` by default) are optional. Each line corresponds to a single analysis.

The required fields are:

- `UID`: a unique identifier
- `Session`: an identifier for the analytical session
- `Sample`: a sample identifier
- `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` and `d49` are optional, and set to NaN by default.

**Parameters**

- `txt`: the csv string to read
- `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, whichever appears most often in `txt`.
- `session`: set `Session` field to this string for all analyses
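A minimal sketch of loading analyses directly from a csv string (the numerical values are purely illustrative):

```py
mydata = D47crunch.D47data()
mydata.input('''UID, Sample, d45, d46, d47
A01, ETH-1, 5.795, 11.628, 16.894
A02, ETH-2, -6.059, -4.817, -11.635''', session = 'Session01')
```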
```py
@make_verbal
def wg(self, samples = None, a18_acid = None):
    '''
    Compute bulk composition of the working gas for each session based on
    the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
    `self.Nominal_d18O_VPDB`.
    '''

    self.msg('Computing WG composition:')

    if a18_acid is None:
        a18_acid = self.ALPHA_18O_ACID_REACTION
    if samples is None:
        samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]

    assert a18_acid, f'Acid fractionation factor should not be zero.'

    samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
    R45R46_standards = {}
    for sample in samples:
        d13C_vpdb = self.Nominal_d13C_VPDB[sample]
        d18O_vpdb = self.Nominal_d18O_VPDB[sample]
        R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
        R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
        R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid

        C12_s = 1 / (1 + R13_s)
        C13_s = R13_s / (1 + R13_s)
        C16_s = 1 / (1 + R17_s + R18_s)
        C17_s = R17_s / (1 + R17_s + R18_s)
        C18_s = R18_s / (1 + R17_s + R18_s)

        C626_s = C12_s * C16_s ** 2
        C627_s = 2 * C12_s * C16_s * C17_s
        C628_s = 2 * C12_s * C16_s * C18_s
        C636_s = C13_s * C16_s ** 2
        C637_s = 2 * C13_s * C16_s * C17_s
        C727_s = C12_s * C17_s ** 2

        R45_s = (C627_s + C636_s) / C626_s
        R46_s = (C628_s + C637_s + C727_s) / C626_s
        R45R46_standards[sample] = (R45_s, R46_s)

    for s in self.sessions:
        db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
        assert db, f'No sample from {samples} found in session "{s}".'
#       dbsamples = sorted({r['Sample'] for r in db})

        X = [r['d45'] for r in db]
        Y = [R45R46_standards[r['Sample']][0] for r in db]
        x1, x2 = np.min(X), np.max(X)

        if x1 < x2:
            wgcoord = x1/(x1-x2)
        else:
            wgcoord = 999

        if wgcoord < -.5 or wgcoord > 1.5:
            # unreasonable to extrapolate to d45 = 0
            R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
        else:
            # d45 = 0 is reasonably well bracketed
            R45_wg = np.polyfit(X, Y, 1)[1]

        X = [r['d46'] for r in db]
        Y = [R45R46_standards[r['Sample']][1] for r in db]
        x1, x2 = np.min(X), np.max(X)

        if x1 < x2:
            wgcoord = x1/(x1-x2)
        else:
            wgcoord = 999

        if wgcoord < -.5 or wgcoord > 1.5:
            # unreasonable to extrapolate to d46 = 0
            R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
        else:
            # d46 = 0 is reasonably well bracketed
            R46_wg = np.polyfit(X, Y, 1)[1]

        d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)

        self.msg(f'Session {s} WG: δ13C_VPDB = {d13Cwg_VPDB:.3f} δ18O_VSMOW = {d18Owg_VSMOW:.3f}')

        self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
        self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
        for r in self.sessions[s]['data']:
            r['d13Cwg_VPDB'] = d13Cwg_VPDB
            r['d18Owg_VSMOW'] = d18Owg_VSMOW
```
Compute bulk composition of the working gas for each session based on the carbonate standards defined in both `self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.
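A minimal sketch, assuming the data have already been read in; the `a18_acid` value shown here is hypothetical:

```py
# Compute WG composition from ETH-1 and ETH-2 analyses only,
# with a custom acid fractionation factor:
mydata.wg(samples = ['ETH-1', 'ETH-2'], a18_acid = 1.00813)
```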
```py
def compute_bulk_delta(self, R45, R46, D17O = 0):
    '''
    Compute δ13C_VPDB and δ18O_VSMOW,
    by solving the generalized form of equation (17) from
    [Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
    assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
    solving the corresponding second-order Taylor polynomial.
    (Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
    '''

    K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17

    A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
    B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
    C = 2 * self.R18_VSMOW
    D = -R46

    aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
    bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
    cc = A + B + C + D

    d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)

    R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
    R17 = K * R18 ** self.LAMBDA_17
    R13 = R45 - 2 * R17

    d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)

    return d13C_VPDB, d18O_VSMOW
```
Compute δ13C_VPDB and δ18O_VSMOW, by solving the generalized form of equation (17) from Brand et al. (2010), assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and solving the corresponding second-order Taylor polynomial. (Appendix A of Daëron et al., 2016)
```py
@make_verbal
def crunch(self, verbose = ''):
    '''
    Compute bulk composition and raw clumped isotope anomalies for all analyses.
    '''
    for r in self:
        self.compute_bulk_and_clumping_deltas(r)
    self.standardize_d13C()
    self.standardize_d18O()
    self.msg(f"Crunched {len(self)} analyses.")
```
Compute bulk composition and raw clumped isotope anomalies for all analyses.
```py
def fill_in_missing_info(self, session = 'mySession'):
    '''
    Fill in optional fields with default values
    '''
    for i,r in enumerate(self):
        if 'D17O' not in r:
            r['D17O'] = 0.
        if 'UID' not in r:
            r['UID'] = f'{i+1}'
        if 'Session' not in r:
            r['Session'] = session
        for k in ['d47', 'd48', 'd49']:
            if k not in r:
                r[k] = np.nan
```
Fill in optional fields with default values
```py
def standardize_d13C(self):
    '''
    Perform δ13C standardization within each session `s` according to
    `self.sessions[s]['d13C_standardization_method']`, which is defined by default
    by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`,
    but may be redefined arbitrarily at a later stage.
    '''
    for s in self.sessions:
        if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
            XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
            X,Y = zip(*XY)
            if self.sessions[s]['d13C_standardization_method'] == '1pt':
                offset = np.mean(Y) - np.mean(X)
                for r in self.sessions[s]['data']:
                    r['d13C_VPDB'] += offset
            elif self.sessions[s]['d13C_standardization_method'] == '2pt':
                a,b = np.polyfit(X,Y,1)
                for r in self.sessions[s]['data']:
                    r['d13C_VPDB'] = a * r['d13C_VPDB'] + b
```
Perform δ13C standardization within each session `s` according to `self.sessions[s]['d13C_standardization_method']`, which is defined by default by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
```py
def standardize_d18O(self):
    '''
    Perform δ18O standardization within each session `s` according to
    `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
    which is defined by default by `D47data.refresh_sessions()` as equal to
    `self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
    '''
    for s in self.sessions:
        if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
            XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
            X,Y = zip(*XY)
            Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
            if self.sessions[s]['d18O_standardization_method'] == '1pt':
                offset = np.mean(Y) - np.mean(X)
                for r in self.sessions[s]['data']:
                    r['d18O_VSMOW'] += offset
            elif self.sessions[s]['d18O_standardization_method'] == '2pt':
                a,b = np.polyfit(X,Y,1)
                for r in self.sessions[s]['data']:
                    r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
```
Perform δ18O standardization within each session `s` according to `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`, which is defined by default by `D47data.refresh_sessions()` as equal to `self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
```py
def compute_bulk_and_clumping_deltas(self, r):
    '''
    Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
    '''

    # Compute working gas R13, R18, and isobar ratios
    R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
    R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
    R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)

    # Compute analyte isobar ratios
    R45 = (1 + r['d45'] / 1000) * R45_wg
    R46 = (1 + r['d46'] / 1000) * R46_wg
    R47 = (1 + r['d47'] / 1000) * R47_wg
    R48 = (1 + r['d48'] / 1000) * R48_wg
    R49 = (1 + r['d49'] / 1000) * R49_wg

    r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
    R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
    R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW

    # Compute stochastic isobar ratios of the analyte
    R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
        R13, R18, D17O = r['D17O']
        )

    # Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
    # and raise a warning if the corresponding anomalies exceed 0.02 ppm.
    if (R45 / R45stoch - 1) > 5e-8:
        self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
    if (R46 / R46stoch - 1) > 5e-8:
        self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')

    # Compute raw clumped isotope anomalies
    r['D47raw'] = 1000 * (R47 / R47stoch - 1)
    r['D48raw'] = 1000 * (R48 / R48stoch - 1)
    r['D49raw'] = 1000 * (R49 / R49stoch - 1)
```
Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
```py
def compute_isobar_ratios(self, R13, R18, D17O = 0, D47 = 0, D48 = 0, D49 = 0):
    '''
    Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
    optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
    anomalies (`D47`, `D48`, `D49`), all expressed in permil.
    '''

    # Compute R17
    R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17

    # Compute isotope concentrations
    C12 = (1 + R13) ** -1
    C13 = C12 * R13
    C16 = (1 + R17 + R18) ** -1
    C17 = C16 * R17
    C18 = C16 * R18

    # Compute stochastic isotopologue concentrations
    C626 = C16 * C12 * C16
    C627 = C16 * C12 * C17 * 2
    C628 = C16 * C12 * C18 * 2
    C636 = C16 * C13 * C16
    C637 = C16 * C13 * C17 * 2
    C638 = C16 * C13 * C18 * 2
    C727 = C17 * C12 * C17
    C728 = C17 * C12 * C18 * 2
    C737 = C17 * C13 * C17
    C738 = C17 * C13 * C18 * 2
    C828 = C18 * C12 * C18
    C838 = C18 * C13 * C18

    # Compute stochastic isobar ratios
    R45 = (C636 + C627) / C626
    R46 = (C628 + C637 + C727) / C626
    R47 = (C638 + C728 + C737) / C626
    R48 = (C738 + C828) / C626
    R49 = C838 / C626

    # Account for stochastic anomalies
    R47 *= 1 + D47 / 1000
    R48 *= 1 + D48 / 1000
    R49 *= 1 + D49 / 1000

    # Return isobar ratios
    return R45, R46, R47, R48, R49
```
Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`, optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope anomalies (`D47`, `D48`, `D49`), all expressed in permil.
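As an illustration, a minimal sketch computing stochastic isobar ratios for a hypothetical sample with VPDB-like carbon and VSMOW-like oxygen, using the instance's own reference ratios:

```py
# Stochastic isobar ratios (no Δ17O or clumping anomalies):
R45, R46, R47, R48, R49 = mydata.compute_isobar_ratios(
    R13 = mydata.R13_VPDB,
    R18 = mydata.R18_VSMOW,
    )
```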
```py
def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
    '''
    Split unknown samples by UID (treat all analyses as different samples)
    or by session (treat analyses of a given sample in different sessions as
    different samples).

    **Parameters**

    + `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
    + `grouping`: `by_uid` | `by_session`
    '''
    if samples_to_split == 'all':
        samples_to_split = [s for s in self.unknowns]
    gkeys = {'by_uid': 'UID', 'by_session': 'Session'}
    self.grouping = grouping.lower()
    if self.grouping in gkeys:
        gkey = gkeys[self.grouping]
    for r in self:
        if r['Sample'] in samples_to_split:
            r['Sample_original'] = r['Sample']
            r['Sample'] = f"{r['Sample']}__{r[gkey]}"
        elif r['Sample'] in self.unknowns:
            r['Sample_original'] = r['Sample']
    self.refresh_samples()
```
Split unknown samples by UID (treat all analyses as different samples) or by session (treat analyses of a given sample in different sessions as different samples). See the sketch after the parameter list.

**Parameters**

- `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
- `grouping`: `by_uid` | `by_session`
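A minimal sketch, assuming an unknown sample named `'MYSAMPLE'` (hypothetical) analyzed in several sessions:

```py
# Treat each session's analyses of MYSAMPLE as a separate sample:
mydata.split_samples(['MYSAMPLE'], grouping = 'by_session')
mydata.standardize()
# Then merge the split samples back into one (see unsplit_samples() below):
mydata.unsplit_samples()
```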
```py
def unsplit_samples(self, tables = False):
    '''
    Reverse the effects of `D47data.split_samples()`.

    This should only be used after `D4xdata.standardize()` with `method='pooled'`.

    After `D4xdata.standardize()` with `method='indep_sessions'`, one should
    probably use `D4xdata.combine_samples()` instead to reverse the effects of
    `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
    effects of `D47data.split_samples()` with `grouping='by_sessions'` (because in
    that case session-averaged Δ4x values are statistically independent).
    '''
    unknowns_old = sorted({s for s in self.unknowns})
    CM_old = self.standardization.covar[:,:]
    VD_old = self.standardization.params.valuesdict().copy()
    vars_old = self.standardization.var_names

    unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})

    Ns = len(vars_old) - len(unknowns_old)
    vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
    VD_new = {k: VD_old[k] for k in vars_old[:Ns]}

    W = np.zeros((len(vars_new), len(vars_old)))
    W[:Ns,:Ns] = np.eye(Ns)
    for u in unknowns_new:
        splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
        if self.grouping == 'by_session':
            weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
        elif self.grouping == 'by_uid':
            weights = [1 for s in splits]
        sw = sum(weights)
        weights = [w/sw for w in weights]
        W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]

    CM_new = W @ CM_old @ W.T
    V = W @ np.array([[VD_old[k]] for k in vars_old])
    VD_new = {k:v[0] for k,v in zip(vars_new, V)}

    self.standardization.covar = CM_new
    self.standardization.params.valuesdict = lambda : VD_new
    self.standardization.var_names = vars_new

    for r in self:
        if r['Sample'] in self.unknowns:
            r['Sample_split'] = r['Sample']
            r['Sample'] = r['Sample_original']

    self.refresh_samples()
    self.consolidate_samples()
    self.repeatabilities()

    if tables:
        self.table_of_analyses()
        self.table_of_samples()
```
Reverse the effects of `D47data.split_samples()`.

This should only be used after `D4xdata.standardize()` with `method='pooled'`.

After `D4xdata.standardize()` with `method='indep_sessions'`, one should probably use `D4xdata.combine_samples()` instead to reverse the effects of `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the effects of `D47data.split_samples()` with `grouping='by_sessions'` (because in that case session-averaged Δ4x values are statistically independent).
```py
def assign_timestamps(self):
    '''
    Assign a time field `t` of type `float` to each analysis.

    If `TimeTag` is one of the data fields, `t` is equal within a given session
    to `TimeTag` minus the mean value of `TimeTag` for that session.
    Otherwise, `TimeTag` is by default equal to the index of each analysis
    in the dataset and `t` is defined as above.
    '''
    for session in self.sessions:
        sdata = self.sessions[session]['data']
        try:
            t0 = np.mean([r['TimeTag'] for r in sdata])
            for r in sdata:
                r['t'] = r['TimeTag'] - t0
        except KeyError:
            t0 = (len(sdata)-1)/2
            for t,r in enumerate(sdata):
                r['t'] = t - t0
```
Assign a time field `t` of type `float` to each analysis.

If `TimeTag` is one of the data fields, `t` is equal within a given session to `TimeTag` minus the mean value of `TimeTag` for that session. Otherwise, `TimeTag` is by default equal to the index of each analysis in the dataset and `t` is defined as above.
```py
def report(self):
    '''
    Prints a report on the standardization fit.
    Only applicable after `D4xdata.standardize(method='pooled')`.
    '''
    report_fit(self.standardization)
```
Prints a report on the standardization fit. Only applicable after `D4xdata.standardize(method='pooled')`.
````py
def combine_samples(self, sample_groups):
    '''
    Combine analyses of different samples to compute weighted average Δ4x
    and new error (co)variances corresponding to the groups defined by the
    `sample_groups` dictionary.

    Caution: samples are weighted by number of replicate analyses, which is a
    reasonable default behavior but is not always optimal (e.g., in the case of
    strongly correlated analytical errors for one or more samples).

    Returns a tuple of:

    + the list of group names
    + an array of the corresponding Δ4x values
    + the corresponding (co)variance matrix

    **Parameters**

    + `sample_groups`: a dictionary of the form:
    ```py
    {'group1': ['sample_1', 'sample_2'],
     'group2': ['sample_3', 'sample_4', 'sample_5']}
    ```
    '''

    samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
    groups = sorted(sample_groups.keys())
    group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
    D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
    CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
    W = np.array([
        [self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
        for j in groups])
    D4x_new = W @ D4x_old
    CM_new = W @ CM_old @ W.T

    return groups, D4x_new[:,0], CM_new
````
Combine analyses of different samples to compute weighted average Δ4x and new error (co)variances corresponding to the groups defined by the `sample_groups` dictionary.

Caution: samples are weighted by number of replicate analyses, which is a reasonable default behavior but is not always optimal (e.g., in the case of strongly correlated analytical errors for one or more samples).

Returns a tuple of:

- the list of group names
- an array of the corresponding Δ4x values
- the corresponding (co)variance matrix

**Parameters**

- `sample_groups`: a dictionary of the form:

```py
{'group1': ['sample_1', 'sample_2'],
 'group2': ['sample_3', 'sample_4', 'sample_5']}
```
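A minimal sketch (the group and sample names are hypothetical):

```py
groups, D47_combined, covmat = mydata.combine_samples({
    'groupA': ['SAMPLE-1', 'SAMPLE-2'],
    'groupB': ['SAMPLE-3'],
    })
```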
```py
@make_verbal
def standardize(self,
    method = 'pooled',
    weighted_sessions = [],
    consolidate = True,
    consolidate_tables = False,
    consolidate_plots = False,
    constraints = {},
    ):
    '''
    Compute absolute Δ4x values for all replicate analyses and for sample averages.
    If `method` argument is set to `'pooled'`, the standardization processes all sessions
    in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
    i.e. that their true Δ4x value does not change between sessions
    ([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
    `'indep_sessions'`, the standardization processes each session independently, based only
    on anchor analyses.
    '''

    self.standardization_method = method
    self.assign_timestamps()

    if method == 'pooled':
        if weighted_sessions:
            for session_group in weighted_sessions:
                if self._4x == '47':
                    X = D47data([r for r in self if r['Session'] in session_group])
                elif self._4x == '48':
                    X = D48data([r for r in self if r['Session'] in session_group])
                X.Nominal_D4x = self.Nominal_D4x.copy()
                X.refresh()
                result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
                w = np.sqrt(result.redchi)
                self.msg(f'Session group {session_group} MRSWD = {w:.4f}')
                for r in X:
                    r[f'wD{self._4x}raw'] *= w
        else:
            self.msg(f'All D{self._4x}raw weights set to 1 ‰')
            for r in self:
                r[f'wD{self._4x}raw'] = 1.

        params = Parameters()
        for k,session in enumerate(self.sessions):
            self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
            self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
            self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
            s = pf(session)
            params.add(f'a_{s}', value = 0.9)
            params.add(f'b_{s}', value = 0.)
            params.add(f'c_{s}', value = -0.9)
            params.add(f'a2_{s}', value = 0.,
#               vary = self.sessions[session]['scrambling_drift'],
                )
            params.add(f'b2_{s}', value = 0.,
#               vary = self.sessions[session]['slope_drift'],
                )
            params.add(f'c2_{s}', value = 0.,
#               vary = self.sessions[session]['wg_drift'],
                )
            if not self.sessions[session]['scrambling_drift']:
                params[f'a2_{s}'].expr = '0'
            if not self.sessions[session]['slope_drift']:
                params[f'b2_{s}'].expr = '0'
            if not self.sessions[session]['wg_drift']:
                params[f'c2_{s}'].expr = '0'

        for sample in self.unknowns:
            params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)

        for k in constraints:
            params[k].expr = constraints[k]

        def residuals(p):
            R = []
            for r in self:
                session = pf(r['Session'])
                sample = pf(r['Sample'])
                if r['Sample'] in self.Nominal_D4x:
                    R += [ (
                        r[f'D{self._4x}raw'] - (
                            p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
                            + p[f'b_{session}'] * r[f'd{self._4x}']
                            + p[f'c_{session}']
                            + r['t'] * (
                                p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
                                + p[f'b2_{session}'] * r[f'd{self._4x}']
                                + p[f'c2_{session}']
                                )
                            )
                        ) / r[f'wD{self._4x}raw'] ]
                else:
                    R += [ (
                        r[f'D{self._4x}raw'] - (
                            p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
                            + p[f'b_{session}'] * r[f'd{self._4x}']
                            + p[f'c_{session}']
                            + r['t'] * (
                                p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
                                + p[f'b2_{session}'] * r[f'd{self._4x}']
                                + p[f'c2_{session}']
                                )
                            )
                        ) / r[f'wD{self._4x}raw'] ]
            return R

        M = Minimizer(residuals, params)
        result = M.least_squares()
        self.Nf = result.nfree
        self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
        new_names, new_covar, new_se = _fullcovar(result)[:3]
        result.var_names = new_names
        result.covar = new_covar

        for r in self:
            s = pf(r["Session"])
            a = result.params.valuesdict()[f'a_{s}']
            b = result.params.valuesdict()[f'b_{s}']
            c = result.params.valuesdict()[f'c_{s}']
            a2 = result.params.valuesdict()[f'a2_{s}']
            b2 = result.params.valuesdict()[f'b2_{s}']
            c2 = result.params.valuesdict()[f'c2_{s}']
            r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])

        self.standardization = result

        for session in self.sessions:
            self.sessions[session]['Np'] = 3
            for k in ['scrambling', 'slope', 'wg']:
                if self.sessions[session][f'{k}_drift']:
                    self.sessions[session]['Np'] += 1

        if consolidate:
            self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
        return result

    elif method == 'indep_sessions':

        if weighted_sessions:
            for session_group in weighted_sessions:
                X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
                X.Nominal_D4x = self.Nominal_D4x.copy()
                X.refresh()
                # This is only done to assign r['wD47raw'] for r in X:
                X.standardize(method = method, weighted_sessions = [], consolidate = False)
                self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
        else:
            self.msg('All weights set to 1 ‰')
            for r in self:
                r[f'wD{self._4x}raw'] = 1

        for session in self.sessions:
            s = self.sessions[session]
            p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
            p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
            s['Np'] = sum(p_active)
            sdata = s['data']

            A = np.array([
                [
                    self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
                    r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
                    1 / r[f'wD{self._4x}raw'],
                    self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
                    r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
                    r['t'] / r[f'wD{self._4x}raw']
                    ]
                for r in sdata if r['Sample'] in self.anchors
                ])[:,p_active] # only keep columns for the active parameters
            Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
            s['Na'] = Y.size
            CM = linalg.inv(A.T @ A)
            bf = (CM @ A.T @ Y).T[0,:]
            k = 0
            for n,a in zip(p_names, p_active):
                if a:
                    s[n] = bf[k]
#                   self.msg(f'{n} = {bf[k]}')
                    k += 1
                else:
                    s[n] = 0.
#                   self.msg(f'{n} = 0.0')

            for r in sdata:
                a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
                r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
                r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])

            s['CM'] = np.zeros((6,6))
            i = 0
            k_active = [j for j,a in enumerate(p_active) if a]
            for j,a in enumerate(p_active):
                if a:
                    s['CM'][j,k_active] = CM[i,:]
                    i += 1

        if not weighted_sessions:
            w = self.rmswd()['rmswd']
            for r in self:
                r[f'wD{self._4x}'] *= w
                r[f'wD{self._4x}raw'] *= w
            for session in self.sessions:
                self.sessions[session]['CM'] *= w**2

        for session in self.sessions:
            s = self.sessions[session]
            s['SE_a'] = s['CM'][0,0]**.5
            s['SE_b'] = s['CM'][1,1]**.5
            s['SE_c'] = s['CM'][2,2]**.5
            s['SE_a2'] = s['CM'][3,3]**.5
            s['SE_b2'] = s['CM'][4,4]**.5
            s['SE_c2'] = s['CM'][5,5]**.5

        if not weighted_sessions:
            self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
        else:
            self.Nf = 0
            for sg in weighted_sessions:
                self.Nf += self.rmswd(sessions = sg)['Nf']

        self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)

        avgD4x = {
            sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
            for sample in self.samples
            }
        chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
        rD4x = (chi2/self.Nf)**.5
        self.repeatability[f'sigma_{self._4x}'] = rD4x

        if consolidate:
            self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
```
Compute absolute Δ4x values for all replicate analyses and for sample averages. If the `method` argument is set to `'pooled'`, the standardization processes all sessions in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous, i.e. that their true Δ4x value does not change between sessions (Daëron, 2021). If the `method` argument is set to `'indep_sessions'`, the standardization processes each session independently, based only on anchor analyses.
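A minimal sketch of both approaches:

```py
# Default: pooled standardization across all sessions
mydata.standardize()

# Alternatively, standardize each session independently,
# based only on anchor analyses:
mydata.standardize(method = 'indep_sessions')
```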
```py
def standardization_error(self, session, d4x, D4x, t = 0):
    '''
    Compute standardization error for a given session and
    (δ47, Δ47) composition.
    '''
    a = self.sessions[session]['a']
    b = self.sessions[session]['b']
    c = self.sessions[session]['c']
    a2 = self.sessions[session]['a2']
    b2 = self.sessions[session]['b2']
    c2 = self.sessions[session]['c2']
    CM = self.sessions[session]['CM']

    x, y = D4x, d4x
    z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
#   x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
    dxdy = -(b+b2*t) / (a+a2*t)
    dxdz = 1. / (a+a2*t)
    dxda = -x / (a+a2*t)
    dxdb = -y / (a+a2*t)
    dxdc = -1. / (a+a2*t)
    dxda2 = -x * a2 / (a+a2*t)
    dxdb2 = -y * t / (a+a2*t)
    dxdc2 = -t / (a+a2*t)
    V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
    sx = (V @ CM @ V.T) ** .5
    return sx
```
Compute standardization error for a given session and (δ47, Δ47) composition.
```py
@make_verbal
def summary(self,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    ):
    '''
    Print out and/or save to disk a summary of the standardization results.

    **Parameters**

    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    '''

    out = []
    out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
    out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
    out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
    out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
    out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
    out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
    out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
    out += [['Model degrees of freedom', f"{self.Nf}"]]
    out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
    out += [['Standardization method', self.standardization_method]]

    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_summary.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n' + pretty_table(out, header = 0))
```
Print out and/or save to disk a summary of the standardization results.

**Parameters**

- `dir`: the directory in which to save the table
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the table to disk
- `print_out`: whether to print out the table
```py
@make_verbal
def table_of_sessions(self,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out and/or save to disk a table of sessions.

    **Parameters**

    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
    if set to `'raw'`: return a list of lists of strings
    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
    include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
    include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])

    out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
    if include_a2:
        out[-1] += ['a2 ± SE']
    if include_b2:
        out[-1] += ['b2 ± SE']
    if include_c2:
        out[-1] += ['c2 ± SE']
    for session in self.sessions:
        out += [[
            session,
            f"{self.sessions[session]['Na']}",
            f"{self.sessions[session]['Nu']}",
            f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
            f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
            f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
            f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
            f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
            f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
            f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
            f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
            ]]
        if include_a2:
            if self.sessions[session]['scrambling_drift']:
                out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
            else:
                out[-1] += ['']
        if include_b2:
            if self.sessions[session]['slope_drift']:
                out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
            else:
                out[-1] += ['']
        if include_c2:
            if self.sessions[session]['wg_drift']:
                out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
            else:
                out[-1] += ['']

    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_sessions.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n' + pretty_table(out))
    if output == 'raw':
        return out
    elif output == 'pretty':
        return pretty_table(out)
```
Print out and/or save to disk a table of sessions.

**Parameters**

- `dir`: the directory in which to save the table
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the table to disk
- `print_out`: whether to print out the table
- `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
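A minimal sketch: retrieve the session table as a list of lists, without writing it to disk or printing it out:

```py
sessions_table = mydata.table_of_sessions(
    save_to_file = False,
    print_out = False,
    output = 'raw',
    )
```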
```py
@make_verbal
def table_of_analyses(
    self,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out and/or save to disk a table of analyses.

    **Parameters**

    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
    if set to `'raw'`: return a list of lists of strings
    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''

    out = [['UID','Session','Sample']]
    extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
    for f in extra_fields:
        out[-1] += [f[0]]
    out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
    for r in self:
        out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
        for f in extra_fields:
            out[-1] += [f"{r[f[0]]:{f[1]}}"]
        out[-1] += [
            f"{r['d13Cwg_VPDB']:.3f}",
            f"{r['d18Owg_VSMOW']:.3f}",
            f"{r['d45']:.6f}",
            f"{r['d46']:.6f}",
            f"{r['d47']:.6f}",
            f"{r['d48']:.6f}",
            f"{r['d49']:.6f}",
            f"{r['d13C_VPDB']:.6f}",
            f"{r['d18O_VSMOW']:.6f}",
            f"{r['D47raw']:.6f}",
            f"{r['D48raw']:.6f}",
            f"{r['D49raw']:.6f}",
            f"{r[f'D{self._4x}']:.6f}"
            ]
    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_analyses.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n' + pretty_table(out))
    return out
```
Print out and/or save to disk a table of analyses.

**Parameters**

- `dir`: the directory in which to save the table
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the table to disk
- `print_out`: whether to print out the table
- `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
```py
@make_verbal
def covar_table(
    self,
    correl = False,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return the variance-covariance matrix of D4x
    for all unknown samples.

    **Parameters**

    + `correl`: if `True`, return correlations instead of covariances
    + `dir`: the directory in which to save the csv
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the csv
    + `print_out`: whether to print out the matrix
    + `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
    if set to `'raw'`: return a list of lists of strings
    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    samples = sorted([u for u in self.unknowns])
    out = [[''] + samples]
    for s1 in samples:
        out.append([s1])
        for s2 in samples:
            if correl:
                out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
            else:
                out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')

    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            if correl:
                filename = f'D{self._4x}_correl.csv'
            else:
                filename = f'D{self._4x}_covar.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n'+pretty_table(out))
    if output == 'raw':
        return out
    elif output == 'pretty':
        return pretty_table(out)
```
Print out, save to disk and/or return the variance-covariance matrix of D4x for all unknown samples.

**Parameters**

- `correl`: if `True`, return correlations instead of covariances
- `dir`: the directory in which to save the csv
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the csv
- `print_out`: whether to print out the matrix
- `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
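A minimal sketch: retrieve the correlation matrix of Δ4x values for all unknowns, without writing it to disk:

```py
correl_matrix = mydata.covar_table(
    correl = True,
    save_to_file = False,
    print_out = False,
    output = 'raw',
    )
```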
```py
@make_verbal
def table_of_samples(
    self,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a table of samples.

    **Parameters**

    + `dir`: the directory in which to save the csv
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the csv
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
    if set to `'raw'`: return a list of lists of strings
    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''

    out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
    for sample in self.anchors:
        out += [[
            f"{sample}",
            f"{self.samples[sample]['N']}",
            f"{self.samples[sample]['d13C_VPDB']:.2f}",
            f"{self.samples[sample]['d18O_VSMOW']:.2f}",
            f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
            f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
            ]]
    for sample in self.unknowns:
        out += [[
            f"{sample}",
            f"{self.samples[sample]['N']}",
            f"{self.samples[sample]['d13C_VPDB']:.2f}",
            f"{self.samples[sample]['d18O_VSMOW']:.2f}",
            f"{self.samples[sample][f'D{self._4x}']:.4f}",
            f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
            f"± {self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
            f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
            f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
            ]]
    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_samples.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n'+pretty_table(out))
    if output == 'raw':
        return out
    elif output == 'pretty':
        return pretty_table(out)
```
Print out, save to disk and/or return a table of samples.

**Parameters**

- `dir`: the directory in which to save the csv
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the csv
- `print_out`: whether to print out the table
- `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
```py
def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
    '''
    Generate session plots and save them to disk.

    **Parameters**

    + `dir`: the directory in which to save the plots
    + `figsize`: the width and height (in inches) of each plot
    + `filetype`: 'pdf' or 'png'
    + `dpi`: resolution for PNG output
    '''
    if not os.path.exists(dir):
        os.makedirs(dir)

    for session in self.sessions:
        sp = self.plot_single_session(session, xylimits = 'constant')
        ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
        ppl.close(sp.fig)
```
Generate session plots and save them to disk.

**Parameters**

- `dir`: the directory in which to save the plots
- `figsize`: the width and height (in inches) of each plot
- `filetype`: `'pdf'` or `'png'`
- `dpi`: resolution for PNG output
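A minimal sketch (the output directory is hypothetical):

```py
# Save one PNG plot per session at 200 dpi:
mydata.plot_sessions(dir = 'session_plots', filetype = 'png', dpi = 200)
```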
```py
@make_verbal
def consolidate_samples(self):
    '''
    Compile various statistics for each sample.

    For each anchor sample:

    + `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
    + `SE_D47` or `SE_D48`: set to zero by definition

    For each unknown sample:

    + `D47` or `D48`: the standardized Δ4x value for this unknown
    + `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown

    For each anchor and unknown:

    + `N`: the total number of analyses of this sample
    + `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
    + `d13C_VPDB`: the average δ13C_VPDB value for this sample
    + `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
    + `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
    variance, indicating whether the Δ4x repeatability of this sample differs significantly from
    that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
    '''
    D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
    for sample in self.samples:
        self.samples[sample]['N'] = len(self.samples[sample]['data'])
        if self.samples[sample]['N'] > 1:
            self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])

        self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
        self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])

        D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
        if len(D4x_pop) > 2:
            self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]

    if self.standardization_method == 'pooled':
        for sample in self.anchors:
            self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
            self.samples[sample][f'SE_D{self._4x}'] = 0.
        for sample in self.unknowns:
            self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
            try:
                self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
            except ValueError:
                # when `sample` is constrained by self.standardize(constraints = {...}),
                # it is no longer listed in self.standardization.var_names.
                # Temporary fix: define SE as zero for now
                self.samples[sample][f'SE_D4{self._4x}'] = 0.

    elif self.standardization_method == 'indep_sessions':
        for sample in self.anchors:
            self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
            self.samples[sample][f'SE_D{self._4x}'] = 0.
        for sample in self.unknowns:
            self.msg(f'Consolidating sample {sample}')
            self.unknowns[sample][f'session_D{self._4x}'] = {}
            session_avg = []
            for session in self.sessions:
                sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
                if sdata:
                    self.msg(f'{sample} found in session {session}')
                    avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
                    avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
                    # !! TODO: sigma_s below does not account for temporal changes in standardization error
                    sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
                    sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
                    session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
                    self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
            self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
            weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
            wsum = sum([weights[s] for s in weights])
            for s in weights:
                self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]

    for r in self:
        r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
```
Compile various statistics for each sample.

For each anchor sample:

- `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
- `SE_D47` or `SE_D48`: set to zero by definition

For each unknown sample:

- `D47` or `D48`: the standardized Δ4x value for this unknown
- `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown

For each anchor and unknown:

- `N`: the total number of analyses of this sample
- `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
- `d13C_VPDB`: the average δ13C_VPDB value for this sample
- `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
- `p_Levene`: the p-value from a Levene test of equal variance, indicating whether the Δ4x repeatability of this sample differs significantly from that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
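After standardization, these statistics may be read directly from `self.samples`; a minimal sketch (the sample name is hypothetical):

```py
s = mydata.samples['MYSAMPLE']
print(s['N'], s['D47'], s['SE_D47'])
```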
```py
def consolidate_sessions(self):
    '''
    Compute various statistics for each session.

    + `Na`: Number of anchor analyses in the session
    + `Nu`: Number of unknown analyses in the session
    + `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
    + `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
    + `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
    + `a`: scrambling factor
    + `b`: compositional slope
    + `c`: WG offset
    + `SE_a`: Model standard error of `a`
    + `SE_b`: Model standard error of `b`
    + `SE_c`: Model standard error of `c`
    + `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
    + `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
    + `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
    + `a2`: scrambling factor drift
    + `b2`: compositional slope drift
    + `c2`: WG offset drift
    + `Np`: Number of standardization parameters to fit
    + `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
    + `d13Cwg_VPDB`: δ13C_VPDB of WG
    + `d18Owg_VSMOW`: δ18O_VSMOW of WG
    '''
    for session in self.sessions:
        if 'd13Cwg_VPDB' not in self.sessions[session]:
            self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
        if 'd18Owg_VSMOW' not in self.sessions[session]:
            self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
        self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
        self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])

        self.msg(f'Computing repeatabilities for session {session}')
        self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
        self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
        self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])

    if self.standardization_method == 'pooled':
        for session in self.sessions:

            # different (better?) computation of D4x repeatability for each session:
            sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
            self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5

            self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
            i = self.standardization.var_names.index(f'a_{pf(session)}')
            self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5

            self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
            i = self.standardization.var_names.index(f'b_{pf(session)}')
            self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5

            self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
            i = self.standardization.var_names.index(f'c_{pf(session)}')
            self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5

            self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
            if self.sessions[session]['scrambling_drift']:
                i = self.standardization.var_names.index(f'a2_{pf(session)}')
                self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
            else:
                self.sessions[session]['SE_a2'] = 0.

            self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
            if self.sessions[session]['slope_drift']:
                i = self.standardization.var_names.index(f'b2_{pf(session)}')
                self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
            else:
                self.sessions[session]['SE_b2'] = 0.

            self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
            if self.sessions[session]['wg_drift']:
                i = self.standardization.var_names.index(f'c2_{pf(session)}')
                self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
            else:
                self.sessions[session]['SE_c2'] = 0.

            i = self.standardization.var_names.index(f'a_{pf(session)}')
            j = self.standardization.var_names.index(f'b_{pf(session)}')
            k = self.standardization.var_names.index(f'c_{pf(session)}')
            CM = np.zeros((6,6))
            CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
            try:
                i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
                CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
                CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
                try:
                    j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
                    CM[3,4] = self.standardization.covar[i2,j2]
                    CM[4,3] = self.standardization.covar[j2,i2]
                except ValueError:
                    pass
                try:
                    k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                    CM[3,5] = self.standardization.covar[i2,k2]
                    CM[5,3] = self.standardization.covar[k2,i2]
                except ValueError:
                    pass
            except ValueError:
                pass
            try:
                j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
                CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
                CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
                try:
                    k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                    CM[4,5] = self.standardization.covar[j2,k2]
                    CM[5,4] = self.standardization.covar[k2,j2]
                except ValueError:
                    pass
            except ValueError:
                pass
            try:
                k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
                CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
            except ValueError:
                pass

            self.sessions[session]['CM'] = CM

    elif self.standardization_method == 'indep_sessions':
        pass # Not implemented yet
```
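Once `standardize()` has run, these per-session statistics can be read directly from the `sessions` dictionary. Below is a minimal usage sketch for a `D47data` object processed as in the tutorial; the session name `'mySession'` is a placeholder, to be replaced by one of the actual keys of `mydata.sessions`:

```py
# Placeholder session name; substitute a real key of mydata.sessions:
s = mydata.sessions['mySession']
print(f"a = {s['a']:.4f} (SE = {s['SE_a']:.4f})")  # scrambling factor
print(f"b = {s['b']:.2e} (SE = {s['SE_b']:.2e})")  # compositional slope
print(f"c = {s['c']:.4f} (SE = {s['SE_c']:.4f})")  # WG offset
print(f"Na = {s['Na']}, Nu = {s['Nu']}")           # anchor/unknown analysis counts
```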
```python
@make_verbal
def repeatabilities(self):
    '''
    Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
    (for all samples, for anchors, and for unknowns).
    '''
    self.msg('Computing repeatabilities for all sessions')

    self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
    self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
    self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
    self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
    self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
```
```python
@make_verbal
def consolidate(self, tables = True, plots = True):
    '''
    Collect information about samples, sessions and repeatabilities.
    '''
    self.consolidate_samples()
    self.consolidate_sessions()
    self.repeatabilities()

    if tables:
        self.summary()
        self.table_of_sessions()
        self.table_of_analyses()
        self.table_of_samples()

    if plots:
        self.plot_sessions()
```
```python
@make_verbal
def rmswd(self,
    samples = 'all samples',
    sessions = 'all sessions',
    ):
    '''
    Compute the χ2, root mean squared weighted deviation
    (i.e. reduced χ2), and corresponding degrees of freedom of the
    Δ4x values for samples in `samples` and sessions in `sessions`.

    Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
    '''
    if samples == 'all samples':
        mysamples = [k for k in self.samples]
    elif samples == 'anchors':
        mysamples = [k for k in self.anchors]
    elif samples == 'unknowns':
        mysamples = [k for k in self.unknowns]
    else:
        mysamples = samples

    if sessions == 'all sessions':
        sessions = [k for k in self.sessions]

    chisq, Nf = 0, 0
    for sample in mysamples:
        G = [r for r in self if r['Sample'] == sample and r['Session'] in sessions]
        if len(G) > 1:
            X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
            Nf += (len(G) - 1)
            chisq += np.sum([((r[f'D{self._4x}'] - X) / r[f'wD{self._4x}'])**2 for r in G])
    r = (chisq / Nf)**.5 if Nf > 0 else 0
    self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
    return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
```
```python
@make_verbal
def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
    '''
    Compute the repeatability of `[r[key] for r in self]`
    '''

    if samples == 'all samples':
        mysamples = [k for k in self.samples]
    elif samples == 'anchors':
        mysamples = [k for k in self.anchors]
    elif samples == 'unknowns':
        mysamples = [k for k in self.unknowns]
    else:
        mysamples = samples

    if sessions == 'all sessions':
        sessions = [k for k in self.sessions]

    if key in ['D47', 'D48']:
        # Full disclosure: the definition of Nf is tricky/debatable
        G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
        chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
        Nf = len(G)
        Nf -= len([s for s in mysamples if s in self.unknowns])
        for session in sessions:
            Np = len([
                _ for _ in self.standardization.params
                if (
                    self.standardization.params[_].expr is not None
                    and (
                        (_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
                        or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
                    )
                )
            ])
            Na = len({
                r['Sample'] for r in self.sessions[session]['data']
                if r['Sample'] in self.anchors and r['Sample'] in mysamples
            })
            Nf -= min(Np, Na)
        r = (chisq / Nf)**.5 if Nf > 0 else 0

    else: # if key not in ['D47', 'D48']
        chisq, Nf = 0, 0
        for sample in mysamples:
            X = [r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions]
            if len(X) > 1:
                Nf += len(X) - 1
                chisq += np.sum([(x - np.mean(X))**2 for x in X])
        r = (chisq / Nf)**.5 if Nf > 0 else 0

    self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
    return r
```
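For instance, to compare the Δ47 repeatability of anchors with that of unknowns in a fully processed `D47data` object such as the tutorial's `mydata` (a usage sketch, not output from the library itself):

```py
r_anchors = mydata.compute_r('D47', samples = 'anchors')
r_unknowns = mydata.compute_r('D47', samples = 'unknowns')
print(f'r(D47) = {1000*r_anchors:.1f} ppm (anchors) vs {1000*r_unknowns:.1f} ppm (unknowns)')
```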
````python
def sample_average(self, samples, weights = 'equal', normalize = True):
    '''
    Weighted average Δ4x value of a group of samples, accounting for covariance.

    Returns the weighted average Δ4x value and associated SE
    of a group of samples. Weights are equal by default. If `normalize` is
    true, `weights` will be rescaled so that their sum equals 1.

    **Examples**

    ```python
    self.sample_average(['X','Y'], [1, 2])
    ```

    returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
    where Δ4x(X) and Δ4x(Y) are the average Δ4x
    values of samples X and Y, respectively.

    ```python
    self.sample_average(['X','Y'], [1, -1], normalize = False)
    ```

    returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
    '''
    if weights == 'equal':
        weights = [1/len(samples)] * len(samples)

    if normalize:
        s = sum(weights)
        if s:
            weights = [w/s for w in weights]

    try:
        C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
        X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
        return correlated_sum(X, C, weights)
    except ValueError:
        return (0., 0.)
````
```python
def sample_D4x_covar(self, sample1, sample2 = None):
    '''
    Covariance between Δ4x values of samples

    Returns the error covariance between the average Δ4x values of two
    samples. If only `sample1` is specified, or if `sample1 == sample2`,
    returns the Δ4x variance for that sample.
    '''
    if sample2 is None:
        sample2 = sample1
    if self.standardization_method == 'pooled':
        i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
        j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
        return self.standardization.covar[i, j]
    elif self.standardization_method == 'indep_sessions':
        if sample1 == sample2:
            return self.samples[sample1][f'SE_D{self._4x}']**2
        else:
            c = 0
            for session in self.sessions:
                sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
                sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
                if sdata1 and sdata2:
                    a = self.sessions[session]['a']
                    # !! TODO: CM below does not account for temporal changes in standardization parameters
                    CM = self.sessions[session]['CM'][:3,:3]
                    avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
                    avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
                    avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
                    avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
                    c += (
                        self.unknowns[sample1][f'session_D{self._4x}'][session][2]
                        * self.unknowns[sample2][f'session_D{self._4x}'][session][2]
                        * np.array([[avg_D4x_1, avg_d4x_1, 1]])
                        @ CM
                        @ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
                        ) / a**2
            return float(c)
```
```python
def sample_D4x_correl(self, sample1, sample2 = None):
    '''
    Correlation between Δ4x errors of samples

    Returns the error correlation between the average Δ4x values of two samples.
    '''
    if sample2 is None or sample2 == sample1:
        return 1.
    return (
        self.sample_D4x_covar(sample1, sample2)
        / self.unknowns[sample1][f'SE_D{self._4x}']
        / self.unknowns[sample2][f'SE_D{self._4x}']
        )
```
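The three methods above combine naturally when comparing two unknowns. A sketch using the hypothetical sample names from the tutorial's `rawdata.csv`:

```py
# Variance, covariance and correlation of the standardized D47 values:
var1 = mydata.sample_D4x_covar('MYSAMPLE-1')
cov12 = mydata.sample_D4x_covar('MYSAMPLE-1', 'MYSAMPLE-2')
rho12 = mydata.sample_D4x_correl('MYSAMPLE-1', 'MYSAMPLE-2')

# Difference between the two samples, with SE accounting for their covariance:
diff, SE_diff = mydata.sample_average(['MYSAMPLE-1', 'MYSAMPLE-2'], [1, -1], normalize = False)
```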
```python
def plot_single_session(self,
    session,
    kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
    kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
    kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
    kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
    kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
    xylimits = 'free', # | 'constant'
    x_label = None,
    y_label = None,
    error_contour_interval = 'auto',
    fig = 'new',
    ):
    '''
    Generate plot for a single session
    '''
    if x_label is None:
        x_label = f'δ$_{{{self._4x}}}$ (‰)'
    if y_label is None:
        y_label = f'Δ$_{{{self._4x}}}$ (‰)'

    out = _SessionPlot()
    anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
    unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
    anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
    anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
    unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
    unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
    anchor_avg = (np.array([np.array([
        np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
        np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
        ]) for sample in anchors]).T,
        np.array([np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
    unknown_avg = (np.array([np.array([
        np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
        np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
        ]) for sample in unknowns]).T,
        np.array([np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)

    if fig == 'new':
        out.fig = ppl.figure(figsize = (6,6))
        ppl.subplots_adjust(.1, .1, .9, .9)

    out.anchor_analyses, = ppl.plot(anchors_d, anchors_D, **kw_plot_anchors)
    out.unknown_analyses, = ppl.plot(unknowns_d, unknowns_D, **kw_plot_unknowns)
    out.anchor_avg = ppl.plot(*anchor_avg, **kw_plot_anchor_avg)
    out.unknown_avg = ppl.plot(*unknown_avg, **kw_plot_unknown_avg)
    if xylimits == 'constant':
        x = [r[f'd{self._4x}'] for r in self]
        y = [r[f'D{self._4x}'] for r in self]
        x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
        w, h = x2-x1, y2-y1
        x1 -= w/20
        x2 += w/20
        y1 -= h/20
        y2 += h/20
        ppl.axis([x1, x2, y1, y2])
    elif xylimits == 'free':
        x1, x2, y1, y2 = ppl.axis()
    else:
        x1, x2, y1, y2 = ppl.axis(xylimits)

    if error_contour_interval != 'none':
        xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
        XI, YI = np.meshgrid(xi, yi)
        SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
        if error_contour_interval == 'auto':
            rng = np.max(SI) - np.min(SI)
            if rng <= 0.01:
                cinterval = 0.001
            elif rng <= 0.03:
                cinterval = 0.004
            elif rng <= 0.1:
                cinterval = 0.01
            elif rng <= 0.3:
                cinterval = 0.03
            elif rng <= 1.:
                cinterval = 0.1
            else:
                cinterval = 0.5
        else:
            cinterval = error_contour_interval

        cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
        out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
        out.clabel = ppl.clabel(out.contour)
        contour = (XI, YI, SI, cval, cinterval)

    if fig == None:
        return {
            'anchors': anchors,
            'unknowns': unknowns,
            'anchors_d': anchors_d,
            'anchors_D': anchors_D,
            'unknowns_d': unknowns_d,
            'unknowns_D': unknowns_D,
            'anchor_avg': anchor_avg,
            'unknown_avg': unknown_avg,
            'contour': contour,
            }

    ppl.xlabel(x_label)
    ppl.ylabel(y_label)
    ppl.title(session, weight = 'bold')
    ppl.grid(alpha = .2)
    out.ax = ppl.gca()

    return out
```
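A usage sketch (the session name is a placeholder; with the default `fig = 'new'` the method opens a new figure, available as the `fig` attribute of the returned object):

```py
from matplotlib import pyplot as ppl

out = mydata.plot_single_session('mySession')  # placeholder session name
out.fig.savefig('mySession_D47_plot.pdf')
ppl.close(out.fig)
```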
```python
def plot_residuals(
    self,
    kde = False,
    hist = False,
    binwidth = 2/3,
    dir = 'output',
    filename = None,
    highlight = [],
    colors = None,
    figsize = None,
    dpi = 100,
    yspan = None,
    ):
    '''
    Plot residuals of each analysis as a function of time (actually, as a function of
    the order of analyses in the `D4xdata` object)

    + `kde`: whether to add a kernel density estimate of residuals
    + `hist`: whether to add a histogram of residuals (incompatible with `kde`)
    + `binwidth`: bin width of the histogram
    + `dir`: the directory in which to save the plot
    + `highlight`: a list of samples to highlight
    + `colors`: a dict of `{<sample>: <color>}` for all samples
    + `figsize`: (width, height) of figure
    + `dpi`: resolution for PNG output
    + `yspan`: factor controlling the range of y values shown in plot
      (by default: `yspan = 1.5 if kde else 1.0`)
    '''

    from matplotlib import ticker

    if yspan is None:
        if kde:
            yspan = 1.5
        else:
            yspan = 1.0

    # Layout
    fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
    if hist or kde:
        ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
        ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
    else:
        ppl.subplots_adjust(.08, .05, .78, .8)
        ax1 = ppl.subplot(111)

    # Colors
    N = len(self.anchors)
    if colors is None:
        if len(highlight) > 0:
            Nh = len(highlight)
            if Nh == 1:
                colors = {highlight[0]: (0,0,0)}
            elif Nh == 3:
                colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
            elif Nh == 4:
                colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
            else:
                colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
        else:
            if N == 3:
                colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
            elif N == 4:
                colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
            else:
                colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}

    ppl.sca(ax1)

    ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)

    ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))

    session = self[0]['Session']
    x1 = 0
    x_sessions = {}
    one_or_more_singlets = False
    one_or_more_multiplets = False
    multiplets = set()
    for k,r in enumerate(self):
        if r['Session'] != session:
            x2 = k-1
            x_sessions[session] = (x1+x2)/2
            ppl.axvline(k - 0.5, color = 'k', lw = .5)
            session = r['Session']
            x1 = k
        singlet = len(self.samples[r['Sample']]['data']) == 1
        if not singlet:
            multiplets.add(r['Sample'])
        if r['Sample'] in self.unknowns:
            if singlet:
                one_or_more_singlets = True
            else:
                one_or_more_multiplets = True
        kw = dict(
            marker = 'x' if singlet else '+',
            ms = 4 if singlet else 5,
            ls = 'None',
            mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
            mew = 1,
            alpha = 0.2 if singlet else 1,
            )
        if highlight and r['Sample'] not in highlight:
            kw['alpha'] = 0.2
        ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
    x2 = k
    x_sessions[session] = (x1+x2)/2

    ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
    ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
    if not (hist or kde):
        ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
        ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f" 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')

    xmin, xmax, ymin, ymax = ppl.axis()
    if yspan != 1:
        ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
    for s in x_sessions:
        ppl.text(
            x_sessions[s],
            ymax + 1,
            s,
            va = 'bottom',
            **(
                dict(ha = 'center')
                if len(self.sessions[s]['data']) > (0.15 * len(self))
                else dict(ha = 'left', rotation = 45)
                )
            )

    if hist or kde:
        ppl.sca(ax2)

    for s in colors:
        kw['marker'] = '+'
        kw['ms'] = 5
        kw['mec'] = colors[s]
        kw['label'] = s
        kw['alpha'] = 1
        ppl.plot([], [], **kw)

    kw['mec'] = (0,0,0)

    if one_or_more_singlets:
        kw['marker'] = 'x'
        kw['ms'] = 4
        kw['alpha'] = .2
        kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
        ppl.plot([], [], **kw)

    if one_or_more_multiplets:
        kw['marker'] = '+'
        kw['ms'] = 4
        kw['alpha'] = 1
        kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
        ppl.plot([], [], **kw)

    if hist or kde:
        leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform = fig.transFigure, borderaxespad = 1.5, fontsize = 9)
    else:
        leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform = fig.transFigure, borderaxespad = 1.5)
    leg.set_zorder(-1000)

    ppl.sca(ax1)

    ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
    ppl.xticks([])
    ppl.axis([-1, len(self), None, None])

    if hist or kde:
        ppl.sca(ax2)
        X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])

        if kde:
            from scipy.stats import gaussian_kde
            yi = np.linspace(ymin, ymax, 201)
            xi = gaussian_kde(X).evaluate(yi)
            ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
        elif hist:
            ppl.hist(
                X,
                orientation = 'horizontal',
                histtype = 'stepfilled',
                ec = [.4]*3,
                fc = [.25]*3,
                alpha = .25,
                bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
                )
            ppl.text(0, 0,
                f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
                size = 7.5,
                alpha = 1,
                va = 'center',
                ha = 'left',
                )

        ppl.axis([0, None, ymin, ymax])
        ppl.xticks([])
        ppl.yticks([])
        ax2.spines['right'].set_visible(False)
        ax2.spines['top'].set_visible(False)
        ax2.spines['bottom'].set_visible(False)

    ax1.axis([None, None, ymin, ymax])

    if not os.path.exists(dir):
        os.makedirs(dir)
    if filename is None:
        return fig
    elif filename == '':
        filename = f'D{self._4x}_residuals.pdf'
    ppl.savefig(f'{dir}/{filename}', dpi = dpi)
    ppl.close(fig)
```
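For example, to save a residual plot with a kernel density estimate under the default name (`D47_residuals.pdf` in `./output`), pass an empty `filename`; leaving `filename = None` (the default) returns the figure instead of saving it:

```py
mydata.plot_residuals(kde = True, filename = '')
```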
```python
def simulate(self, *args, **kwargs):
    '''
    Legacy function with warning message pointing to `virtual_data()`
    '''
    raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
```
```python
def plot_distribution_of_analyses(
    self,
    dir = 'output',
    filename = None,
    vs_time = False,
    figsize = (6,4),
    subplots_adjust = (0.02, 0.13, 0.85, 0.8),
    output = None,
    dpi = 100,
    ):
    '''
    Plot temporal distribution of all analyses in the data set.

    **Parameters**

    + `dir`: the directory in which to save the plot
    + `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
    + `figsize`: (width, height) of figure
    + `dpi`: resolution for PNG output
    '''

    asamples = [s for s in self.anchors]
    usamples = [s for s in self.unknowns]
    if output is None or output == 'fig':
        fig = ppl.figure(figsize = figsize)
        ppl.subplots_adjust(*subplots_adjust)
    Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
    Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
    Xmax += (Xmax-Xmin)/40
    Xmin -= (Xmax-Xmin)/41
    for k, s in enumerate(asamples + usamples):
        if vs_time:
            X = [r['TimeTag'] for r in self if r['Sample'] == s]
        else:
            X = [x for x,r in enumerate(self) if r['Sample'] == s]
        Y = [-k for x in X]
        ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
        ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
        ppl.text(Xmax, -k, f' {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
    ppl.axis([Xmin, Xmax, -k-1, 1])
    ppl.xlabel('\ntime')
    ppl.gca().annotate('',
        xy = (0.6, -0.02),
        xycoords = 'axes fraction',
        xytext = (.4, -0.02),
        arrowprops = dict(arrowstyle = "->", color = 'k'),
        )

    x2 = -1
    for session in self.sessions:
        x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
        if vs_time:
            ppl.axvline(x1, color = 'k', lw = .75)
        if x2 > -1:
            if not vs_time:
                ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
        x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
        if vs_time:
            ppl.axvline(x2, color = 'k', lw = .75)
            ppl.axvspan(x1, x2, color = 'k', zorder = -100, alpha = .15)
        ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)

    ppl.xticks([])
    ppl.yticks([])

    if output is None:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename == None:
            filename = f'D{self._4x}_distribution_of_analyses.pdf'
        ppl.savefig(f'{dir}/{filename}', dpi = dpi)
        ppl.close(fig)
    elif output == 'ax':
        return ppl.gca()
    elif output == 'fig':
        return fig
```
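For example (the `vs_time` variant assumes each analysis carries a `TimeTag` field, e.g. as assigned by `assign_timestamps()`):

```py
# Save the sequential plot under the default name in ./output:
mydata.plot_distribution_of_analyses()

# Plot against TimeTag values instead:
mydata.plot_distribution_of_analyses(vs_time = True, filename = 'analyses_vs_time.pdf')
```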
```python
def plot_bulk_compositions(
    self,
    samples = None,
    dir = 'output/bulk_compositions',
    figsize = (6,6),
    subplots_adjust = (0.15, 0.12, 0.95, 0.92),
    show = False,
    sample_color = (0,.5,1),
    analysis_color = (.7,.7,.7),
    labeldist = 0.3,
    radius = 0.05,
    ):
    '''
    Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

    By default, creates a directory `./output/bulk_compositions` where plots for
    each sample are saved. Another plot named `__all__.pdf` shows all analyses together.

    **Parameters**

    + `samples`: Only these samples are processed (by default: all samples).
    + `dir`: where to save the plots
    + `figsize`: (width, height) of figure
    + `subplots_adjust`: passed to `subplots_adjust()`
    + `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
      allowing for interactive visualization/exploration in (δ13C, δ18O) space.
    + `sample_color`: color used for sample markers/labels
    + `analysis_color`: color used for replicate markers/labels
    + `labeldist`: distance (in inches) from replicate markers to replicate labels
    + `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
    '''

    from matplotlib.patches import Ellipse

    if samples is None:
        samples = [_ for _ in self.samples]

    saved = {}

    for s in samples:

        fig = ppl.figure(figsize = figsize)
        fig.subplots_adjust(*subplots_adjust)
        ax = ppl.subplot(111)
        ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
        ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
        ppl.title(s)

        XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
        UID = [_['UID'] for _ in self.samples[s]['data']]
        XY0 = XY.mean(0)

        for xy in XY:
            ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)

        ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
        ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
        ppl.text(*XY0, f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
        saved[s] = [XY, XY0]

        x1, x2, y1, y2 = ppl.axis()
        x0, dx = (x1+x2)/2, (x2-x1)/2
        y0, dy = (y1+y2)/2, (y2-y1)/2
        dx, dy = [max(max(dx, dy), radius)]*2

        ppl.axis([
            x0 - 1.2*dx,
            x0 + 1.2*dx,
            y0 - 1.2*dy,
            y0 + 1.2*dy,
            ])

        XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))

        for xy, uid in zip(XY, UID):

            xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
            vector_in_display_space = xy_in_display_space - XY0_in_display_space

            if (vector_in_display_space**2).sum() > 0:

                unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
                label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
                label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
                label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))

                ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)

            else:

                ppl.text(*xy, f'{uid}  ', va = 'center', ha = 'right', color = analysis_color)

        if radius:
            ax.add_artist(Ellipse(
                xy = XY0,
                width = radius*2,
                height = radius*2,
                ls = (0, (2,2)),
                lw = .7,
                ec = analysis_color,
                fc = 'None',
                ))
            ppl.text(
                XY0[0],
                XY0[1]-radius,
                f'\n± {radius*1e3:.0f} ppm',
                color = analysis_color,
                va = 'top',
                ha = 'center',
                linespacing = 0.4,
                size = 8,
                )

        if not os.path.exists(dir):
            os.makedirs(dir)
        fig.savefig(f'{dir}/{s}.pdf')
        ppl.close(fig)

    fig = ppl.figure(figsize = figsize)
    fig.subplots_adjust(*subplots_adjust)
    ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
    ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')

    for s in saved:
        for xy in saved[s][0]:
            ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
        ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
        ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
        ppl.text(*saved[s][1], f'  {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')

    x1, x2, y1, y2 = ppl.axis()
    ppl.axis([
        x1 - (x2-x1)/10,
        x2 + (x2-x1)/10,
        y1 - (y2-y1)/10,
        y2 + (y2-y1)/10,
        ])

    if not os.path.exists(dir):
        os.makedirs(dir)
    fig.savefig(f'{dir}/__all__.pdf')
    if show:
        ppl.show()
    ppl.close(fig)
```
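A usage sketch, restricting the plots to the two hypothetical unknowns from the tutorial and opening the interactive summary plot in addition to saving the pdf files:

```py
mydata.plot_bulk_compositions(samples = ['MYSAMPLE-1', 'MYSAMPLE-2'], show = True)
```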
````python
class D47data(D4xdata):
    '''
    Store and process data for a large set of Δ47 analyses,
    usually comprising more than one analytical session.
    '''

    Nominal_D4x = {
        'ETH-1':   0.2052,
        'ETH-2':   0.2085,
        'ETH-3':   0.6132,
        'ETH-4':   0.4511,
        'IAEA-C1': 0.3018,
        'IAEA-C2': 0.6409,
        'MERCK':   0.5135,
        } # I-CDES (Bernasconi et al., 2021)
    '''
    Nominal Δ47 values assigned to the Δ47 anchor samples, used by
    `D47data.standardize()` to normalize unknown samples to an absolute Δ47
    reference frame.

    By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):

    ```py
    {
        'ETH-1'   : 0.2052,
        'ETH-2'   : 0.2085,
        'ETH-3'   : 0.6132,
        'ETH-4'   : 0.4511,
        'IAEA-C1' : 0.3018,
        'IAEA-C2' : 0.6409,
        'MERCK'   : 0.5135,
    }
    ```
    '''

    @property
    def Nominal_D47(self):
        return self.Nominal_D4x

    @Nominal_D47.setter
    def Nominal_D47(self, new):
        self.Nominal_D4x = dict(**new)
        self.refresh()

    def __init__(self, l = [], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l = l, mass = '47', **kwargs)

    def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
        '''
        Find all samples for which `Teq` is specified, compute equilibrium Δ47
        value for that temperature, and treat these samples as additional anchors.

        **Parameters**

        + `fCo2eqD47`: Which CO2 equilibrium law to use
          (`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
          `wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
        + `priority`: if `replace`: forget old anchors and only use the new ones;
          if `new`: keep pre-existing anchors but update them in case of conflict
          between old and new Δ47 values;
          if `old`: keep pre-existing anchors but preserve their original Δ47
          values in case of conflict.
        '''
        f = {
            'petersen': fCO2eqD47_Petersen,
            'wang': fCO2eqD47_Wang,
            }[fCo2eqD47]
        foo = {}
        for r in self:
            if 'Teq' in r:
                if r['Sample'] in foo:
                    assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
                else:
                    foo[r['Sample']] = f(r['Teq'])
            else:
                assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'

        if priority == 'replace':
            self.Nominal_D47 = {}
        for s in foo:
            if priority != 'old' or s not in self.Nominal_D47:
                self.Nominal_D47[s] = foo[s]

    def save_D47_correl(self, *args, **kwargs):
        return self._save_D4x_correl(*args, **kwargs)

    save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47')
````
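Because `Nominal_D47` has a setter which refreshes the data set, anchor values may be redefined before standardization. A sketch restricting standardization to the three ETH anchors, with values unchanged from the I-CDES defaults:

```py
import D47crunch

mydata = D47crunch.D47data()
mydata.Nominal_D47 = {
    'ETH-1': 0.2052,
    'ETH-2': 0.2085,
    'ETH-3': 0.6132,
    }
```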
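For instance, if some analyses of an equilibrated-gas sample include a `Teq` field (its equilibration temperature), the following sketch promotes that sample to an additional anchor without discarding the pre-existing carbonate anchors:

```py
# Assumes one or more analyses in mydata carry a 'Teq' field:
mydata.D47fromTeq(fCo2eqD47 = 'petersen', priority = 'new')
```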
`save_D47_correl()`: Save D47 values along with their SE and correlation matrix.

**Parameters**

+ `samples`: Only these samples are output (by default: all samples).
+ `dir`: the directory in which to save the file (by default: `output`)
+ `filename`: the name of the csv file to write to (by default: `D47_correl.csv`)
+ `D47_precision`: the precision to use when writing `D47` and `D47_SE` values (by default: 4)
+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
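For example:

```py
mydata.save_D47_correl()                                # writes output/D47_correl.csv
mydata.save_D47_correl(filename = 'my_D47_correl.csv')  # custom file name
```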
````python
class D48data(D4xdata):
    '''
    Store and process data for a large set of Δ48 analyses,
    usually comprising more than one analytical session.
    '''

    Nominal_D4x = {
        'ETH-1': 0.138,
        'ETH-2': 0.138,
        'ETH-3': 0.270,
        'ETH-4': 0.223,
        'GU-1': -0.419,
        } # (Fiebig et al., 2019, 2021)
    '''
    Nominal Δ48 values assigned to the Δ48 anchor samples, used by
    `D48data.standardize()` to normalize unknown samples to an absolute Δ48
    reference frame.

    By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
    [Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)):

    ```py
    {
        'ETH-1' : 0.138,
        'ETH-2' : 0.138,
        'ETH-3' : 0.270,
        'ETH-4' : 0.223,
        'GU-1'  : -0.419,
    }
    ```
    '''

    @property
    def Nominal_D48(self):
        return self.Nominal_D4x

    @Nominal_D48.setter
    def Nominal_D48(self, new):
        self.Nominal_D4x = dict(**new)
        self.refresh()

    def __init__(self, l = [], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l = l, mass = '48', **kwargs)

    def save_D48_correl(self, *args, **kwargs):
        return self._save_D4x_correl(*args, **kwargs)

    save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48')
````
`save_D48_correl()`: Save D48 values along with their SE and correlation matrix.

**Parameters**

+ `samples`: Only these samples are output (by default: all samples).
+ `dir`: the directory in which to save the file (by default: `output`)
+ `filename`: the name of the csv file to write to (by default: `D48_correl.csv`)
+ `D48_precision`: the precision to use when writing `D48` and `D48_SE` values (by default: 4)
+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
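Since `D48data` inherits the full `D4xdata` interface, Δ48 values can be processed from the same raw data file as Δ47 (the tutorial's `rawdata.csv` already includes d48 columns), following the same steps as in the tutorial. A sketch:

```py
import D47crunch

mydata48 = D47crunch.D48data()
mydata48.read('rawdata.csv')
mydata48.wg()
mydata48.crunch()
mydata48.standardize()
mydata48.save_D48_correl()
```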
````python
class D49data(D4xdata):
    '''
    Store and process data for a large set of Δ49 analyses,
    usually comprising more than one analytical session.
    '''

    Nominal_D4x = {"1000C": 0.0, "25C": 2.228} # Wang 2004
    '''
    Nominal Δ49 values assigned to the Δ49 anchor samples, used by
    `D49data.standardize()` to normalize unknown samples to an absolute Δ49
    reference frame.

    By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)):

    ```py
    {
        "1000C": 0.0,
        "25C": 2.228
    }
    ```
    '''

    @property
    def Nominal_D49(self):
        return self.Nominal_D4x

    @Nominal_D49.setter
    def Nominal_D49(self, new):
        self.Nominal_D4x = dict(**new)
        self.refresh()

    def __init__(self, l=[], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l=l, mass='49', **kwargs)

    def save_D49_correl(self, *args, **kwargs):
        return self._save_D4x_correl(*args, **kwargs)

    save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49')
````
`save_D49_correl()`: Save D49 values along with their SE and correlation matrix.

**Parameters**

+ `samples`: Only these samples are output (by default: all samples).
+ `dir`: the directory in which to save the file (by default: `output`)
+ `filename`: the name of the csv file to write to (by default: `D49_correl.csv`)
+ `D49_precision`: the precision to use when writing `D49` and `D49_SE` values (by default: 4)
+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)