D47crunch
Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements
Process and standardize carbonate and/or CO2 clumped-isotope analyses, from low-level data out of a dual-inlet mass spectrometer to final, “absolute” Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates (Daëron, 2021).
The **tutorial** section takes you through a series of simple steps to import/process data and print out the results. The **how-to** section provides instructions applicable to various specific tasks.
1. Tutorial
1.1 Installation
The easy option is to use `pip`; open a shell terminal and simply type:

```
python -m pip install D47crunch
```
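To check that the installation worked, you can print out the installed version (a quick sanity check; `__version__` is defined at the top of the module):

```py
import D47crunch
print(D47crunch.__version__)
```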
For those wishing to experiment with the bleeding-edge development version, this can be done through the following steps:

- Download the `dev` branch source code here and rename it to `D47crunch.py`.
- Do any of the following:
  - copy `D47crunch.py` to somewhere in your Python path
  - copy `D47crunch.py` to a working directory (`import D47crunch` will only work if called within that directory)
  - copy `D47crunch.py` to any other location (e.g., `/foo/bar`) and then use the following code snippet in your own code to import `D47crunch`:

```py
import sys
sys.path.append('/foo/bar')
import D47crunch
```
Documentation for the development version can be downloaded here (save the HTML file and open it locally).
1.2 Usage
Start by creating a file named `rawdata.csv` with the following contents:
```
UID, Sample, d45, d46, d47, d48, d49
A01, ETH-1, 5.79502, 11.62767, 16.89351, 24.56708, 0.79486
A02, MYSAMPLE-1, 6.21907, 11.49107, 17.27749, 24.58270, 1.56318
A03, ETH-2, -6.05868, -4.81718, -11.63506, -10.32578, 0.61352
A04, MYSAMPLE-2, -3.86184, 4.94184, 0.60612, 10.52732, 0.57118
A05, ETH-3, 5.54365, 12.05228, 17.40555, 25.96919, 0.74608
A06, ETH-2, -6.06706, -4.87710, -11.69927, -10.64421, 1.61234
A07, ETH-1, 5.78821, 11.55910, 16.80191, 24.56423, 1.47963
A08, MYSAMPLE-2, -3.87692, 4.86889, 0.52185, 10.40390, 1.07032
```
Then instantiate a `D47data` object, which will store and process these data:
```py
import D47crunch
mydata = D47crunch.D47data()
```
For now, this object is empty:
```py
>>> print(mydata)
[]
```
To load the analyses saved in `rawdata.csv` into our `D47data` object and process the data:
```py
mydata.read('rawdata.csv')
# compute δ13C, δ18O of working gas:
mydata.wg()
# compute δ13C, δ18O, raw Δ47 values for each analysis:
mydata.crunch()
# compute absolute Δ47 values for each analysis
# as well as average Δ47 values for each sample:
mydata.standardize()
```
We can now print a summary of the data processing:
```py
>>> mydata.summary(verbose = True, save_to_file = False)
[summary]
––––––––––––––––––––––––––––––– –––––––––
N samples (anchors + unknowns) 5 (3 + 2)
N analyses (anchors + unknowns) 8 (5 + 3)
Repeatability of δ13C_VPDB 4.2 ppm
Repeatability of δ18O_VSMOW 47.5 ppm
Repeatability of Δ47 (anchors) 13.4 ppm
Repeatability of Δ47 (unknowns) 2.5 ppm
Repeatability of Δ47 (all) 9.6 ppm
Model degrees of freedom 3
Student's 95% t-factor 3.18
Standardization method pooled
––––––––––––––––––––––––––––––– –––––––––
```
This tells us that our data set contains 5 different samples: 3 anchors (ETH-1, ETH-2, ETH-3) and 2 unknowns (MYSAMPLE-1, MYSAMPLE-2). The total number of analyses is 8, with 5 anchor analyses and 3 unknown analyses. We get an estimate of the analytical repeatability (i.e. the overall, pooled standard deviation) for δ13C, δ18O and Δ47, as well as the number of degrees of freedom (here, 3) that these estimated standard deviations are based on, along with the corresponding Student's t-factor (here, 3.18) for 95 % confidence limits. Finally, the summary indicates that we used a “pooled” standardization approach (see Daëron, 2021).
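As a sanity check, the t-factor reported above may be recomputed from the listed degrees of freedom using scipy, which `D47crunch` itself imports for this purpose:

```py
from scipy.stats import t as tstudent

# two-sided 95 % Student's t-factor for 3 degrees of freedom:
print(tstudent.ppf(1 - 0.05/2, 3))  # ≈ 3.18
```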
To see the actual results:
```py
>>> mydata.table_of_samples(verbose = True, save_to_file = False)
[table_of_samples]
–––––––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
Sample N d13C_VPDB d18O_VSMOW D47 SE 95% CL SD p_Levene
–––––––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
ETH-1 2 2.01 37.01 0.2052 0.0131
ETH-2 2 -10.17 19.88 0.2085 0.0026
ETH-3 1 1.73 37.49 0.6132
MYSAMPLE-1 1 2.48 36.90 0.2996 0.0091 ± 0.0291
MYSAMPLE-2 2 -8.17 30.05 0.6600 0.0115 ± 0.0366 0.0025
–––––––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
```
This table lists, for each sample, the number of analytical replicates, average δ13C and δ18O values (for the analyte CO2, not for the carbonate itself), the average Δ47 value, and the SD of Δ47 across all replicates of this sample. For unknown samples, the SE and 95 % confidence limits for the mean Δ47 are also listed. These 95 % CL take into account the number of degrees of freedom of the regression model, so that in large data sets the 95 % CL will tend to 1.96 times the SE, but in this small data set the applicable t-factor is much larger.
We can also generate a table of all analyses in the data set (again, note that `d18O_VSMOW` is the composition of the CO2 analyte):
```py
>>> mydata.table_of_analyses(verbose = True, save_to_file = False)
[table_of_analyses]
––– ––––––––– –––––––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––– –––––––––– –––––––––– ––––––––– ––––––––– –––––––––– ––––––––
UID Session Sample d13Cwg_VPDB d18Owg_VSMOW d45 d46 d47 d48 d49 d13C_VPDB d18O_VSMOW D47raw D48raw D49raw D47
––– ––––––––– –––––––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––– –––––––––– –––––––––– ––––––––– ––––––––– –––––––––– ––––––––
A01 mySession ETH-1 -3.807 24.921 5.795020 11.627670 16.893510 24.567080 0.794860 2.014086 37.041843 -0.574686 1.149684 -27.690250 0.214454
A02 mySession MYSAMPLE-1 -3.807 24.921 6.219070 11.491070 17.277490 24.582700 1.563180 2.476827 36.898281 -0.499264 1.435380 -27.122614 0.299589
A03 mySession ETH-2 -3.807 24.921 -6.058680 -4.817180 -11.635060 -10.325780 0.613520 -10.166796 19.907706 -0.685979 -0.721617 16.716901 0.206693
A04 mySession MYSAMPLE-2 -3.807 24.921 -3.861840 4.941840 0.606120 10.527320 0.571180 -8.159927 30.087230 -0.248531 0.613099 -4.979413 0.658270
A05 mySession ETH-3 -3.807 24.921 5.543650 12.052280 17.405550 25.969190 0.746080 1.727029 37.485567 -0.226150 1.678699 -28.280301 0.613200
A06 mySession ETH-2 -3.807 24.921 -6.067060 -4.877100 -11.699270 -10.644210 1.612340 -10.173599 19.845192 -0.683054 -0.922832 17.861363 0.210328
A07 mySession ETH-1 -3.807 24.921 5.788210 11.559100 16.801910 24.564230 1.479630 2.009281 36.970298 -0.591129 1.282632 -26.888335 0.195926
A08 mySession MYSAMPLE-2 -3.807 24.921 -3.876920 4.868890 0.521850 10.403900 1.070320 -8.173486 30.011134 -0.245768 0.636159 -4.324964 0.661803
––– ––––––––– –––––––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––– –––––––––– –––––––––– ––––––––– ––––––––– –––––––––– ––––––––
```
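These methods can also write their output to CSV files instead of printing it out; a minimal sketch, assuming the methods' default output directory (`output`) and file names:

```py
# write the summary and tables to csv files in the 'output' directory:
mydata.summary(verbose = False, save_to_file = True)
mydata.table_of_samples(verbose = False, save_to_file = True)
mydata.table_of_analyses(verbose = False, save_to_file = True)
```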
2. How-to
2.1 Simulate a virtual data set to play with
It is sometimes convenient to quickly build a virtual data set of analyses, for instance to assess the final analytical precision achievable for a given combination of anchor and unknown analyses (see also Fig. 6 of Daëron, 2021).
This can be achieved with `virtual_data()`. The example below creates a data set with four sessions, each of which comprises three analyses of anchor ETH-1, three of ETH-2, three of ETH-3, and three analyses each of two unknown samples named `FOO` and `BAR` with arbitrarily defined isotopic compositions. Analytical repeatabilities for Δ47 and Δ48 are also specified arbitrarily. See the `virtual_data()` documentation for additional configuration parameters.
```py
from D47crunch import virtual_data, D47data
args = dict(
samples = [
dict(Sample = 'ETH-1', N = 3),
dict(Sample = 'ETH-2', N = 3),
dict(Sample = 'ETH-3', N = 3),
dict(Sample = 'FOO', N = 3,
d13C_VPDB = -5., d18O_VPDB = -10.,
D47 = 0.3, D48 = 0.15),
dict(Sample = 'BAR', N = 3,
d13C_VPDB = -15., d18O_VPDB = -2.,
D47 = 0.6, D48 = 0.2),
], rD47 = 0.010, rD48 = 0.030)
session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)
D = D47data(session1 + session2 + session3 + session4)
D.crunch()
D.standardize()
D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)
```
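Since the point of such simulations is usually to see how the final precision responds to different designs, the same pipeline can simply be re-run with modified parameters. A sketch reusing only the `virtual_data()` arguments shown above (the seeds here are arbitrary):

```py
# same anchors and unknowns, but twice as many unknown replicates:
args['samples'][3]['N'] = 6  # FOO
args['samples'][4]['N'] = 6  # BAR

sessions = [
    virtual_data(session = f'Session_{k+1:02.0f}', **args, seed = 987 + k)
    for k in range(4)]

D2 = D47data([r for s in sessions for r in s])
D2.crunch()
D2.standardize()

# compare the 95 % CL of FOO and BAR with those of the previous run:
D2.table_of_samples(verbose = True, save_to_file = False)
```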
2.2 Control data quality
`D47crunch` offers several tools to visualize processed data. The examples below use the same virtual data set, generated with:
```py
from D47crunch import *
from random import shuffle
# generate virtual data:
args = dict(
samples = [
dict(Sample = 'ETH-1', N = 8),
dict(Sample = 'ETH-2', N = 8),
dict(Sample = 'ETH-3', N = 8),
dict(Sample = 'FOO', N = 4,
d13C_VPDB = -5., d18O_VPDB = -10.,
D47 = 0.3, D48 = 0.15),
dict(Sample = 'BAR', N = 4,
d13C_VPDB = -15., d18O_VPDB = -15.,
D47 = 0.5, D48 = 0.2),
])
sessions = [
virtual_data(session = f'Session_{k+1:02.0f}', seed = 123456+k, **args)
for k in range(10)]
# shuffle the data:
data = [r for s in sessions for r in s]
shuffle(data)
data = sorted(data, key = lambda r: r['Session'])
# create D47data instance:
data47 = D47data(data)
# process D47data instance:
data47.crunch()
data47.standardize()
```
2.2.1 Plotting the distribution of analyses through time
```py
data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf')
```
The plot above shows the succession of analyses as if they were all distributed at regular time intervals. See `D4xdata.plot_distribution_of_analyses()` for how to plot analyses as a function of “true” time (based on the `TimeTag` for each analysis).
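For instance, something along the following lines should work, assuming each analysis carries a numeric `TimeTag` field and that `plot_distribution_of_analyses()` accepts a `vs_time` keyword (both taken from that method's documentation rather than demonstrated here):

```py
# assign hypothetical acquisition times (e.g., hours elapsed) to each analysis:
for k, r in enumerate(data47):
    r['TimeTag'] = 1.5 * k

# plot against "true" time rather than at regular intervals:
data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf', vs_time = True)
```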
2.2.2 Generating session plots
```py
data47.plot_sessions()
```
Below is one of the resulting session plots. Each cross marker is an analysis. Anchors are shown in red and unknowns in blue. Short horizontal lines show the nominal Δ47 value for anchors (in red) or the average Δ47 value for unknowns (in blue; overall average across all sessions). Curved grey contours correspond to Δ47 standardization errors within this session.
2.2.3 Plotting Δ47 or Δ48 residuals
```py
data47.plot_residuals(filename = 'residuals.pdf', kde = True)
```
Again, note that this plot only shows the succession of analyses as if they were all distributed at regular time intervals.
2.2.4 Checking δ13C and δ18O dispersion
```py
mydata = D47data(virtual_data(
session = 'mysession',
samples = [
dict(Sample = 'ETH-1', N = 4),
dict(Sample = 'ETH-2', N = 4),
dict(Sample = 'ETH-3', N = 4),
dict(Sample = 'MYSAMPLE', N = 8, D47 = 0.6, D48 = 0.1, d13C_VPDB = -4.0, d18O_VPDB = -12.0),
], seed = 123))
mydata.refresh()
mydata.wg()
mydata.crunch()
mydata.plot_bulk_compositions()
```
`D4xdata.plot_bulk_compositions()` produces a series of plots, one for each sample, and an additional plot with all samples together. For example, here is the plot for sample `MYSAMPLE`:
2.3 Use a different set of anchors, change anchor nominal values, and/or change oxygen-17 correction parameters
Nominal values for various carbonate standards are defined in four places:

- `D4xdata.Nominal_d13C_VPDB`
- `D4xdata.Nominal_d18O_VPDB`
- `D47data.Nominal_D4x` (also accessible through `D47data.Nominal_D47`)
- `D48data.Nominal_D4x` (also accessible through `D48data.Nominal_D48`)
17O correction parameters are defined by:

- `D4xdata.R13_VPDB`
- `D4xdata.R18_VSMOW`
- `D4xdata.R18_VPDB`
- `D4xdata.LAMBDA_17`
- `D4xdata.R17_VSMOW`
- `D4xdata.R17_VPDB`
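All of these are plain class attributes, so the current defaults are easy to inspect before changing anything:

```py
from D47crunch import D4xdata, D47data

print(D4xdata.R13_VPDB)     # 0.01118 (Chang & Li, 1990)
print(D4xdata.LAMBDA_17)    # 0.528 (Barkan & Luz, 2005)
print(D47data.Nominal_D47)  # nominal Δ47 values of the default anchors
```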
When creating a new instance of `D47data` or `D48data`, the current values of these variables are copied as properties of the new object. Applying custom values for, e.g., `R17_VSMOW` and `Nominal_D47` can thus be done in several ways:
Option 1: by redefining `D4xdata.R17_VSMOW` and `D47data.Nominal_D47` *before* creating a `D47data` object:
```py
from D47crunch import D4xdata, D47data
# redefine R17_VSMOW:
D4xdata.R17_VSMOW = 0.00037 # new value
# redefine R17_VPDB for consistency:
D4xdata.R17_VPDB = D4xdata.R17_VSMOW * (D4xdata.R18_VPDB/D4xdata.R18_VSMOW) ** D4xdata.LAMBDA_17
# edit Nominal_D47 to only include ETH-1/2/3:
D47data.Nominal_D4x = {
a: D47data.Nominal_D4x[a]
for a in ['ETH-1', 'ETH-2', 'ETH-3']
}
# redefine ETH-3:
D47data.Nominal_D4x['ETH-3'] = 0.600
# only now create D47data object:
mydata = D47data()
# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)
# NB: mydata.Nominal_D47 is just an alias for mydata.Nominal_D4x
# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}
```
Option 2: by redefining `R17_VSMOW` and `Nominal_D47` *after* creating a `D47data` object:
```py
from D47crunch import D47data
# first create D47data object:
mydata = D47data()
# redefine R17_VSMOW:
mydata.R17_VSMOW = 0.00037 # new value
# redefine R17_VPDB for consistency:
mydata.R17_VPDB = mydata.R17_VSMOW * (mydata.R18_VPDB/mydata.R18_VSMOW) ** mydata.LAMBDA_17
# edit Nominal_D47 to only include ETH-1/2/3:
mydata.Nominal_D47 = {
a: mydata.Nominal_D47[a]
for a in ['ETH-1', 'ETH-2', 'ETH-3']
}
# redefine ETH-3:
mydata.Nominal_D47['ETH-3'] = 0.600
# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)
# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}
```
The two options above are equivalent, but the latter provides a simple way to compare different data processing choices:
```py
from D47crunch import D47data
# create two D47data objects:
foo = D47data()
bar = D47data()
# modify foo in various ways:
foo.LAMBDA_17 = 0.52
foo.R17_VSMOW = 0.00037 # new value
foo.R17_VPDB = foo.R17_VSMOW * (foo.R18_VPDB/foo.R18_VSMOW) ** foo.LAMBDA_17
foo.Nominal_D47 = {
'ETH-1': foo.Nominal_D47['ETH-1'],
'ETH-2': foo.Nominal_D47['ETH-2'],
'IAEA-C2': foo.Nominal_D47['IAEA-C2'],
'INLAB_REF_MATERIAL': 0.666,
}
# now import the same raw data into foo and bar:
foo.read('rawdata.csv')
foo.wg() # compute δ13C, δ18O of working gas
foo.crunch() # compute all δ13C, δ18O and raw Δ47 values
foo.standardize() # compute absolute Δ47 values
bar.read('rawdata.csv')
bar.wg() # compute δ13C, δ18O of working gas
bar.crunch() # compute all δ13C, δ18O and raw Δ47 values
bar.standardize() # compute absolute Δ47 values
# and compare the final results:
foo.table_of_samples(verbose = True, save_to_file = False)
bar.table_of_samples(verbose = True, save_to_file = False)
```
2.4 Process paired Δ47 and Δ48 values
Purely in terms of data processing, it is not obvious why Δ47 and Δ48 data should not be handled separately. For now, `D47crunch` uses two independent classes, `D47data` and `D48data`, which crunch numbers and deal with standardization in very similar ways. The following example demonstrates how to print out combined outputs for a `D47data` object and a `D48data` object.
```py
from D47crunch import *
# generate virtual data:
args = dict(
samples = [
dict(Sample = 'ETH-1', N = 3),
dict(Sample = 'ETH-2', N = 3),
dict(Sample = 'ETH-3', N = 3),
dict(Sample = 'FOO', N = 3,
d13C_VPDB = -5., d18O_VPDB = -10.,
D47 = 0.3, D48 = 0.15),
], rD47 = 0.010, rD48 = 0.030)
session1 = virtual_data(session = 'Session_01', **args)
session2 = virtual_data(session = 'Session_02', **args)
# create D47data instance:
data47 = D47data(session1 + session2)
# process D47data instance:
data47.crunch()
data47.standardize()
# create D48data instance:
data48 = D48data(data47) # alternatively: data48 = D48data(session1 + session2)
# process D48data instance:
data48.crunch()
data48.standardize()
# output combined results:
table_of_sessions(data47, data48)
table_of_samples(data47, data48)
table_of_analyses(data47, data48)
```
Expected output:
```
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––––– –––––––––––––– –––––– ––––––––––––– ––––––––––––––– ––––––––––––––
Session Na Nu d13Cwg_VPDB d18Owg_VSMOW r_d13C r_d18O r_D47 a_47 ± SE 1e3 x b_47 ± SE c_47 ± SE r_D48 a_48 ± SE 1e3 x b_48 ± SE c_48 ± SE
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––––– –––––––––––––– –––––– ––––––––––––– ––––––––––––––– ––––––––––––––
Session_01 9 3 -4.000 26.000 0.0000 0.0000 0.0098 1.021 ± 0.019 -0.398 ± 0.260 -0.903 ± 0.006 0.0486 0.540 ± 0.151 1.235 ± 0.607 -0.390 ± 0.025
Session_02 9 3 -4.000 26.000 0.0000 0.0000 0.0090 1.015 ± 0.019 0.376 ± 0.260 -0.905 ± 0.006 0.0186 1.350 ± 0.156 -0.871 ± 0.608 -0.504 ± 0.027
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––––– –––––––––––––– –––––– ––––––––––––– ––––––––––––––– ––––––––––––––
–––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– –––––––– –––––– –––––– –––––––– –––––– ––––––––
Sample N d13C_VPDB d18O_VSMOW D47 SE 95% CL SD p_Levene D48 SE 95% CL SD p_Levene
–––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– –––––––– –––––– –––––– –––––––– –––––– ––––––––
ETH-1 6 2.02 37.02 0.2052 0.0078 0.1380 0.0223
ETH-2 6 -10.17 19.88 0.2085 0.0036 0.1380 0.0482
ETH-3 6 1.71 37.45 0.6132 0.0080 0.2700 0.0176
FOO 6 -5.00 28.91 0.3026 0.0044 ± 0.0093 0.0121 0.164 0.1397 0.0121 ± 0.0255 0.0267 0.127
–––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– –––––––– –––––– –––––– –––––––– –––––– ––––––––
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– –––––––– ––––––––
UID Session Sample d13Cwg_VPDB d18Owg_VSMOW d45 d46 d47 d48 d49 d13C_VPDB d18O_VSMOW D47raw D48raw D49raw D47 D48
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– –––––––– ––––––––
1 Session_01 ETH-1 -4.000 26.000 6.018962 10.747026 16.120787 21.286237 27.780042 2.020000 37.024281 -0.708176 -0.316435 -0.000013 0.197297 0.087763
2 Session_01 ETH-1 -4.000 26.000 6.018962 10.747026 16.132240 21.307795 27.780042 2.020000 37.024281 -0.696913 -0.295333 -0.000013 0.208328 0.126791
3 Session_01 ETH-1 -4.000 26.000 6.018962 10.747026 16.132438 21.313884 27.780042 2.020000 37.024281 -0.696718 -0.289374 -0.000013 0.208519 0.137813
4 Session_01 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.700300 -12.210735 -18.023381 -10.170000 19.875825 -0.683938 -0.297902 -0.000002 0.209785 0.198705
5 Session_01 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.707421 -12.270781 -18.023381 -10.170000 19.875825 -0.691145 -0.358673 -0.000002 0.202726 0.086308
6 Session_01 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.700061 -12.278310 -18.023381 -10.170000 19.875825 -0.683696 -0.366292 -0.000002 0.210022 0.072215
7 Session_01 ETH-3 -4.000 26.000 5.742374 11.161270 16.684379 22.225827 28.306614 1.710000 37.450394 -0.273094 -0.216392 -0.000014 0.623472 0.270873
8 Session_01 ETH-3 -4.000 26.000 5.742374 11.161270 16.660163 22.233729 28.306614 1.710000 37.450394 -0.296906 -0.208664 -0.000014 0.600150 0.285167
9 Session_01 ETH-3 -4.000 26.000 5.742374 11.161270 16.675191 22.215632 28.306614 1.710000 37.450394 -0.282128 -0.226363 -0.000014 0.614623 0.252432
10 Session_01 FOO -4.000 26.000 -0.840413 2.828738 1.328380 5.374933 4.665655 -5.000000 28.907344 -0.582131 -0.288924 -0.000006 0.314928 0.175105
11 Session_01 FOO -4.000 26.000 -0.840413 2.828738 1.302220 5.384454 4.665655 -5.000000 28.907344 -0.608241 -0.279457 -0.000006 0.289356 0.192614
12 Session_01 FOO -4.000 26.000 -0.840413 2.828738 1.322530 5.372841 4.665655 -5.000000 28.907344 -0.587970 -0.291004 -0.000006 0.309209 0.171257
13 Session_02 ETH-1 -4.000 26.000 6.018962 10.747026 16.140853 21.267202 27.780042 2.020000 37.024281 -0.688442 -0.335067 -0.000013 0.207730 0.138730
14 Session_02 ETH-1 -4.000 26.000 6.018962 10.747026 16.127087 21.256983 27.780042 2.020000 37.024281 -0.701980 -0.345071 -0.000013 0.194396 0.131311
15 Session_02 ETH-1 -4.000 26.000 6.018962 10.747026 16.148253 21.287779 27.780042 2.020000 37.024281 -0.681165 -0.314926 -0.000013 0.214898 0.153668
16 Session_02 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.715859 -12.204791 -18.023381 -10.170000 19.875825 -0.699685 -0.291887 -0.000002 0.207349 0.149128
17 Session_02 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.709763 -12.188685 -18.023381 -10.170000 19.875825 -0.693516 -0.275587 -0.000002 0.213426 0.161217
18 Session_02 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.715427 -12.253049 -18.023381 -10.170000 19.875825 -0.699249 -0.340727 -0.000002 0.207780 0.112907
19 Session_02 ETH-3 -4.000 26.000 5.742374 11.161270 16.685994 22.249463 28.306614 1.710000 37.450394 -0.271506 -0.193275 -0.000014 0.618328 0.244431
20 Session_02 ETH-3 -4.000 26.000 5.742374 11.161270 16.681351 22.298166 28.306614 1.710000 37.450394 -0.276071 -0.145641 -0.000014 0.613831 0.279758
21 Session_02 ETH-3 -4.000 26.000 5.742374 11.161270 16.676169 22.306848 28.306614 1.710000 37.450394 -0.281167 -0.137150 -0.000014 0.608813 0.286056
22 Session_02 FOO -4.000 26.000 -0.840413 2.828738 1.324359 5.339497 4.665655 -5.000000 28.907344 -0.586144 -0.324160 -0.000006 0.314015 0.136535
23 Session_02 FOO -4.000 26.000 -0.840413 2.828738 1.297658 5.325854 4.665655 -5.000000 28.907344 -0.612794 -0.337727 -0.000006 0.287767 0.126473
24 Session_02 FOO -4.000 26.000 -0.840413 2.828738 1.310185 5.339898 4.665655 -5.000000 28.907344 -0.600291 -0.323761 -0.000006 0.300082 0.136830
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– –––––––– ––––––––
```
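The module-level `table_of_sessions()`, `table_of_samples()`, and `table_of_analyses()` functions can also save these combined tables to disk; a short sketch, assuming their default `dir = 'output'` argument:

```py
# save combined csv tables instead of printing them out:
table_of_sessions(data47, data48, save_to_file = True, print_out = False)
table_of_samples(data47, data48, save_to_file = True, print_out = False)
table_of_analyses(data47, data48, save_to_file = True, print_out = False)
```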
3. Command-Line Interface (CLI)
Instead of writing Python code, you may directly use the CLI to process raw Δ47 and Δ48 data using reasonable defaults. The simplest way is to call:
```
D47crunch rawdata.csv
```
This will create a directory named `output` and populate it by calling the following methods:
- `D47data.wg()`
- `D47data.crunch()`
- `D47data.standardize()`
- `D47data.summary()`
- `D47data.table_of_samples()`
- `D47data.table_of_sessions()`
- `D47data.plot_sessions()`
- `D47data.plot_residuals()`
- `D47data.table_of_analyses()`
- `D47data.plot_distribution_of_analyses()`
- `D47data.plot_bulk_compositions()`
- `D47data.save_D47_correl()`
You may specify a custom set of anchors instead of the default ones using the `--anchors` or `-a` option:
```
D47crunch -a anchors.csv rawdata.csv
```
In this case, the `anchors.csv` file (you may use any other file name) must have the following format:
```
Sample, d13C_VPDB, d18O_VPDB, D47
ETH-1, 2.02, -2.19, 0.2052
ETH-2, -10.17, -18.69, 0.2085
ETH-3, 1.71, -1.78, 0.6132
ETH-4, , , 0.4511
```
The samples with non-empty `d13C_VPDB`, `d18O_VPDB`, and `D47` values are used to standardize δ13C, δ18O, and Δ47 values, respectively. In the example above, ETH-4 is thus used as an anchor for Δ47 but not for δ13C or δ18O.
You may also provide a list of analyses and/or samples to exclude from the input. This is done with the `--exclude` or `-e` option:
```
D47crunch -e badbatch.csv rawdata.csv
```
In this case, the `badbatch.csv` file (again, you may use a different file name) must have the following format:
```
UID, Sample
A03
A09
B06
, MYBADSAMPLE-1
, MYBADSAMPLE-2
```
This will exclude (ignore) analyses with UIDs `A03`, `A09`, and `B06`, as well as all analyses of samples `MYBADSAMPLE-1` and `MYBADSAMPLE-2`. It is possible to have an exclude file with only the `UID` column, or only the `Sample` column, or both, in any order.
The `--output-dir` or `-o` option may be used to specify a custom directory name for the output. For example, in unix-like shells the following command will create a time-stamped output directory:
```
D47crunch -o `date "+%Y-%m-%d-%Hh%M"` rawdata.csv
```
To process Δ48 as well as Δ47, just add the `--D48` option.
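These options may be combined; for instance, using only the flags documented above:

```
D47crunch --D48 -a anchors.csv -e badbatch.csv -o processed rawdata.csv
```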
API Documentation
1''' 2Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements 3 4Process and standardize carbonate and/or CO2 clumped-isotope analyses, 5from low-level data out of a dual-inlet mass spectrometer to final, “absolute” 6Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates 7([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). 8 9The **tutorial** section takes you through a series of simple steps to import/process data and print out the results. 10The **how-to** section provides instructions applicable to various specific tasks. 11 12.. include:: ../../docpages/tutorial.md 13.. include:: ../../docpages/howto.md 14.. include:: ../../docpages/cli.md 15 16<h1>API Documentation</h1> 17''' 18 19__docformat__ = "restructuredtext" 20__author__ = 'Mathieu Daëron' 21__contact__ = 'daeron@lsce.ipsl.fr' 22__copyright__ = 'Copyright (c) Mathieu Daëron' 23__license__ = 'MIT License - https://opensource.org/licenses/MIT' 24__date__ = '2024-11-17' 25__version__ = '2.4.2' 26 27import os 28import numpy as np 29import typer 30from typing_extensions import Annotated 31from statistics import stdev 32from scipy.stats import t as tstudent 33from scipy.stats import levene 34from scipy.interpolate import interp1d 35from numpy import linalg 36from lmfit import Minimizer, Parameters, report_fit 37from matplotlib import pyplot as ppl 38from datetime import datetime as dt 39from functools import wraps 40from colorsys import hls_to_rgb 41from matplotlib import rcParams 42 43typer.rich_utils.STYLE_HELPTEXT = '' 44 45rcParams['font.family'] = 'sans-serif' 46rcParams['font.sans-serif'] = 'Helvetica' 47rcParams['font.size'] = 10 48rcParams['mathtext.fontset'] = 'custom' 49rcParams['mathtext.rm'] = 'sans' 50rcParams['mathtext.bf'] = 'sans:bold' 51rcParams['mathtext.it'] = 'sans:italic' 52rcParams['mathtext.cal'] = 'sans:italic' 53rcParams['mathtext.default'] = 'rm' 54rcParams['xtick.major.size'] = 4 55rcParams['xtick.major.width'] = 1 56rcParams['ytick.major.size'] = 4 57rcParams['ytick.major.width'] = 1 58rcParams['axes.grid'] = False 59rcParams['axes.linewidth'] = 1 60rcParams['grid.linewidth'] = .75 61rcParams['grid.linestyle'] = '-' 62rcParams['grid.alpha'] = .15 63rcParams['savefig.dpi'] = 150 64 65Petersen_etal_CO2eqD47 = np.array([[-12, 1.147113572], [-11, 1.139961218], [-10, 1.132872856], [-9, 1.125847677], [-8, 1.118884889], [-7, 1.111983708], [-6, 1.105143366], [-5, 1.098363105], [-4, 1.091642182], [-3, 1.084979862], [-2, 1.078375423], [-1, 1.071828156], [0, 1.065337360], [1, 1.058902349], [2, 1.052522443], [3, 1.046196976], [4, 1.039925291], [5, 1.033706741], [6, 1.027540690], [7, 1.021426510], [8, 1.015363585], [9, 1.009351306], [10, 1.003389075], [11, 0.997476303], [12, 0.991612409], [13, 0.985796821], [14, 0.980028975], [15, 0.974308318], [16, 0.968634304], [17, 0.963006392], [18, 0.957424055], [19, 0.951886769], [20, 0.946394020], [21, 0.940945302], [22, 0.935540114], [23, 0.930177964], [24, 0.924858369], [25, 0.919580851], [26, 0.914344938], [27, 0.909150167], [28, 0.903996080], [29, 0.898882228], [30, 0.893808167], [31, 0.888773459], [32, 0.883777672], [33, 0.878820382], [34, 0.873901170], [35, 0.869019623], [36, 0.864175334], [37, 0.859367901], [38, 0.854596929], [39, 0.849862028], [40, 0.845162813], [41, 0.840498905], [42, 0.835869931], [43, 0.831275522], [44, 0.826715314], [45, 0.822188950], [46, 0.817696075], [47, 0.813236341], [48, 0.808809404], [49, 0.804414926], [50, 0.800052572], [51, 0.795722012], [52, 0.791422922], [53, 0.787154979], [54, 
0.782917869], [55, 0.778711277], [56, 0.774534898], [57, 0.770388426], [58, 0.766271562], [59, 0.762184010], [60, 0.758125479], [61, 0.754095680], [62, 0.750094329], [63, 0.746121147], [64, 0.742175856], [65, 0.738258184], [66, 0.734367860], [67, 0.730504620], [68, 0.726668201], [69, 0.722858343], [70, 0.719074792], [71, 0.715317295], [72, 0.711585602], [73, 0.707879469], [74, 0.704198652], [75, 0.700542912], [76, 0.696912012], [77, 0.693305719], [78, 0.689723802], [79, 0.686166034], [80, 0.682632189], [81, 0.679122047], [82, 0.675635387], [83, 0.672171994], [84, 0.668731654], [85, 0.665314156], [86, 0.661919291], [87, 0.658546854], [88, 0.655196641], [89, 0.651868451], [90, 0.648562087], [91, 0.645277352], [92, 0.642014054], [93, 0.638771999], [94, 0.635551001], [95, 0.632350872], [96, 0.629171428], [97, 0.626012487], [98, 0.622873870], [99, 0.619755397], [100, 0.616656895], [102, 0.610519107], [104, 0.604459143], [106, 0.598475670], [108, 0.592567388], [110, 0.586733026], [112, 0.580971342], [114, 0.575281125], [116, 0.569661187], [118, 0.564110371], [120, 0.558627545], [122, 0.553211600], [124, 0.547861454], [126, 0.542576048], [128, 0.537354347], [130, 0.532195337], [132, 0.527098028], [134, 0.522061450], [136, 0.517084654], [138, 0.512166711], [140, 0.507306712], [142, 0.502503768], [144, 0.497757006], [146, 0.493065573], [148, 0.488428634], [150, 0.483845370], [152, 0.479314980], [154, 0.474836677], [156, 0.470409692], [158, 0.466033271], [160, 0.461706674], [162, 0.457429176], [164, 0.453200067], [166, 0.449018650], [168, 0.444884242], [170, 0.440796174], [172, 0.436753787], [174, 0.432756438], [176, 0.428803494], [178, 0.424894334], [180, 0.421028350], [182, 0.417204944], [184, 0.413423530], [186, 0.409683531], [188, 0.405984383], [190, 0.402325531], [192, 0.398706429], [194, 0.395126543], [196, 0.391585347], [198, 0.388082324], [200, 0.384616967], [202, 0.381188778], [204, 0.377797268], [206, 0.374441954], [208, 0.371122364], [210, 0.367838033], [212, 0.364588505], [214, 0.361373329], [216, 0.358192065], [218, 0.355044277], [220, 0.351929540], [222, 0.348847432], [224, 0.345797540], [226, 0.342779460], [228, 0.339792789], [230, 0.336837136], [232, 0.333912113], [234, 0.331017339], [236, 0.328152439], [238, 0.325317046], [240, 0.322510795], [242, 0.319733329], [244, 0.316984297], [246, 0.314263352], [248, 0.311570153], [250, 0.308904364], [252, 0.306265654], [254, 0.303653699], [256, 0.301068176], [258, 0.298508771], [260, 0.295975171], [262, 0.293467070], [264, 0.290984167], [266, 0.288526163], [268, 0.286092765], [270, 0.283683684], [272, 0.281298636], [274, 0.278937339], [276, 0.276599517], [278, 0.274284898], [280, 0.271993211], [282, 0.269724193], [284, 0.267477582], [286, 0.265253121], [288, 0.263050554], [290, 0.260869633], [292, 0.258710110], [294, 0.256571741], [296, 0.254454286], [298, 0.252357508], [300, 0.250281174], [302, 0.248225053], [304, 0.246188917], [306, 0.244172542], [308, 0.242175707], [310, 0.240198194], [312, 0.238239786], [314, 0.236300272], [316, 0.234379441], [318, 0.232477087], [320, 0.230593005], [322, 0.228726993], [324, 0.226878853], [326, 0.225048388], [328, 0.223235405], [330, 0.221439711], [332, 0.219661118], [334, 0.217899439], [336, 0.216154491], [338, 0.214426091], [340, 0.212714060], [342, 0.211018220], [344, 0.209338398], [346, 0.207674420], [348, 0.206026115], [350, 0.204393315], [355, 0.200378063], [360, 0.196456139], [365, 0.192625077], [370, 0.188882487], [375, 0.185226048], [380, 0.181653511], [385, 0.178162694], [390, 0.174751478], [395, 
0.171417807], [400, 0.168159686], [405, 0.164975177], [410, 0.161862398], [415, 0.158819521], [420, 0.155844772], [425, 0.152936426], [430, 0.150092806], [435, 0.147312286], [440, 0.144593281], [445, 0.141934254], [450, 0.139333710], [455, 0.136790195], [460, 0.134302294], [465, 0.131868634], [470, 0.129487876], [475, 0.127158722], [480, 0.124879906], [485, 0.122650197], [490, 0.120468398], [495, 0.118333345], [500, 0.116243903], [505, 0.114198970], [510, 0.112197471], [515, 0.110238362], [520, 0.108320625], [525, 0.106443271], [530, 0.104605335], [535, 0.102805877], [540, 0.101043985], [545, 0.099318768], [550, 0.097629359], [555, 0.095974915], [560, 0.094354612], [565, 0.092767650], [570, 0.091213248], [575, 0.089690648], [580, 0.088199108], [585, 0.086737906], [590, 0.085306341], [595, 0.083903726], [600, 0.082529395], [605, 0.081182697], [610, 0.079862998], [615, 0.078569680], [620, 0.077302141], [625, 0.076059794], [630, 0.074842066], [635, 0.073648400], [640, 0.072478251], [645, 0.071331090], [650, 0.070206399], [655, 0.069103674], [660, 0.068022424], [665, 0.066962168], [670, 0.065922439], [675, 0.064902780], [680, 0.063902748], [685, 0.062921909], [690, 0.061959837], [695, 0.061016122], [700, 0.060090360], [705, 0.059182157], [710, 0.058291131], [715, 0.057416907], [720, 0.056559120], [725, 0.055717414], [730, 0.054891440], [735, 0.054080860], [740, 0.053285343], [745, 0.052504565], [750, 0.051738210], [755, 0.050985971], [760, 0.050247546], [765, 0.049522643], [770, 0.048810974], [775, 0.048112260], [780, 0.047426227], [785, 0.046752609], [790, 0.046091145], [795, 0.045441581], [800, 0.044803668], [805, 0.044177164], [810, 0.043561831], [815, 0.042957438], [820, 0.042363759], [825, 0.041780573], [830, 0.041207664], [835, 0.040644822], [840, 0.040091839], [845, 0.039548516], [850, 0.039014654], [855, 0.038490063], [860, 0.037974554], [865, 0.037467944], [870, 0.036970054], [875, 0.036480707], [880, 0.035999734], [885, 0.035526965], [890, 0.035062238], [895, 0.034605393], [900, 0.034156272], [905, 0.033714724], [910, 0.033280598], [915, 0.032853749], [920, 0.032434032], [925, 0.032021309], [930, 0.031615443], [935, 0.031216300], [940, 0.030823749], [945, 0.030437663], [950, 0.030057915], [955, 0.029684385], [960, 0.029316951], [965, 0.028955498], [970, 0.028599910], [975, 0.028250075], [980, 0.027905884], [985, 0.027567229], [990, 0.027234006], [995, 0.026906112], [1000, 0.026583445], [1005, 0.026265908], [1010, 0.025953405], [1015, 0.025645841], [1020, 0.025343124], [1025, 0.025045163], [1030, 0.024751871], [1035, 0.024463160], [1040, 0.024178947], [1045, 0.023899147], [1050, 0.023623680], [1055, 0.023352467], [1060, 0.023085429], [1065, 0.022822491], [1070, 0.022563577], [1075, 0.022308615], [1080, 0.022057533], [1085, 0.021810260], [1090, 0.021566729], [1095, 0.021326872], [1100, 0.021090622]]) 66_fCO2eqD47_Petersen = interp1d(Petersen_etal_CO2eqD47[:,0], Petersen_etal_CO2eqD47[:,1]) 67def fCO2eqD47_Petersen(T): 68 ''' 69 CO2 equilibrium Δ47 value as a function of T (in degrees C) 70 according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127). 
71 72 ''' 73 return float(_fCO2eqD47_Petersen(T)) 74 75 76Wang_etal_CO2eqD47 = np.array([[-83., 1.8954], [-73., 1.7530], [-63., 1.6261], [-53., 1.5126], [-43., 1.4104], [-33., 1.3182], [-23., 1.2345], [-13., 1.1584], [-3., 1.0888], [7., 1.0251], [17., 0.9665], [27., 0.9125], [37., 0.8626], [47., 0.8164], [57., 0.7734], [67., 0.7334], [87., 0.6612], [97., 0.6286], [107., 0.5980], [117., 0.5693], [127., 0.5423], [137., 0.5169], [147., 0.4930], [157., 0.4704], [167., 0.4491], [177., 0.4289], [187., 0.4098], [197., 0.3918], [207., 0.3747], [217., 0.3585], [227., 0.3431], [237., 0.3285], [247., 0.3147], [257., 0.3015], [267., 0.2890], [277., 0.2771], [287., 0.2657], [297., 0.2550], [307., 0.2447], [317., 0.2349], [327., 0.2256], [337., 0.2167], [347., 0.2083], [357., 0.2002], [367., 0.1925], [377., 0.1851], [387., 0.1781], [397., 0.1714], [407., 0.1650], [417., 0.1589], [427., 0.1530], [437., 0.1474], [447., 0.1421], [457., 0.1370], [467., 0.1321], [477., 0.1274], [487., 0.1229], [497., 0.1186], [507., 0.1145], [517., 0.1105], [527., 0.1068], [537., 0.1031], [547., 0.0997], [557., 0.0963], [567., 0.0931], [577., 0.0901], [587., 0.0871], [597., 0.0843], [607., 0.0816], [617., 0.0790], [627., 0.0765], [637., 0.0741], [647., 0.0718], [657., 0.0695], [667., 0.0674], [677., 0.0654], [687., 0.0634], [697., 0.0615], [707., 0.0597], [717., 0.0579], [727., 0.0562], [737., 0.0546], [747., 0.0530], [757., 0.0515], [767., 0.0500], [777., 0.0486], [787., 0.0472], [797., 0.0459], [807., 0.0447], [817., 0.0435], [827., 0.0423], [837., 0.0411], [847., 0.0400], [857., 0.0390], [867., 0.0380], [877., 0.0370], [887., 0.0360], [897., 0.0351], [907., 0.0342], [917., 0.0333], [927., 0.0325], [937., 0.0317], [947., 0.0309], [957., 0.0302], [967., 0.0294], [977., 0.0287], [987., 0.0281], [997., 0.0274], [1007., 0.0268], [1017., 0.0261], [1027., 0.0255], [1037., 0.0249], [1047., 0.0244], [1057., 0.0238], [1067., 0.0233], [1077., 0.0228], [1087., 0.0223], [1097., 0.0218]]) 77_fCO2eqD47_Wang = interp1d(Wang_etal_CO2eqD47[:,0] - 0.15, Wang_etal_CO2eqD47[:,1]) 78def fCO2eqD47_Wang(T): 79 ''' 80 CO2 equilibrium Δ47 value as a function of `T` (in degrees C) 81 according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039) 82 (supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)). 83 ''' 84 return float(_fCO2eqD47_Wang(T)) 85 86 87def correlated_sum(X, C, w = None): 88 ''' 89 Compute covariance-aware linear combinations 90 91 **Parameters** 92 93 + `X`: list or 1-D array of values to sum 94 + `C`: covariance matrix for the elements of `X` 95 + `w`: list or 1-D array of weights to apply to the elements of `X` 96 (all equal to 1 by default) 97 98 Return the sum (and its SE) of the elements of `X`, with optional weights equal 99 to the elements of `w`, accounting for covariances between the elements of `X`. 
100 ''' 101 if w is None: 102 w = [1 for x in X] 103 return np.dot(w,X), (np.dot(w,np.dot(C,w)))**.5 104 105 106def make_csv(x, hsep = ',', vsep = '\n'): 107 ''' 108 Formats a list of lists of strings as a CSV 109 110 **Parameters** 111 112 + `x`: the list of lists of strings to format 113 + `hsep`: the field separator (`,` by default) 114 + `vsep`: the line-ending convention to use (`\\n` by default) 115 116 **Example** 117 118 ```py 119 print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']])) 120 ``` 121 122 outputs: 123 124 ```py 125 a,b,c 126 d,e,f 127 ``` 128 ''' 129 return vsep.join([hsep.join(l) for l in x]) 130 131 132def pf(txt): 133 ''' 134 Modify string `txt` to follow `lmfit.Parameter()` naming rules. 135 ''' 136 return txt.replace('-','_').replace('.','_').replace(' ','_') 137 138 139def smart_type(x): 140 ''' 141 Tries to convert string `x` to a float if it includes a decimal point, or 142 to an integer if it does not. If both attempts fail, return the original 143 string unchanged. 144 ''' 145 try: 146 y = float(x) 147 except ValueError: 148 return x 149 if '.' not in x: 150 return int(y) 151 return y 152 153class _Defaults(): 154 def __init__(self): 155 pass 156 157D47crunch_defaults = _Defaults() 158D47crunch_defaults.PRETTY_TABLE_VSEP = '—' 159 160def pretty_table(x, header = 1, hsep = ' ', vsep = None, align = '<'): 161 ''' 162 Reads a list of lists of strings and outputs an ascii table 163 164 **Parameters** 165 166 + `x`: a list of lists of strings 167 + `header`: the number of lines to treat as header lines 168 + `hsep`: the horizontal separator between columns 169 + `vsep`: the character to use as vertical separator 170 + `align`: string of left (`<`) or right (`>`) alignment characters. 171 172 **Example** 173 174 ```py 175 print(pretty_table([ 176 ['A', 'B', 'C'], 177 ['1', '1.9999', 'foo'], 178 ['10', 'x', 'bar'], 179 ])) 180 ``` 181 yields: 182 ``` 183 —— —————— ——— 184 A B C 185 —— —————— ——— 186 1 1.9999 foo 187 10 x bar 188 —— —————— ——— 189 ``` 190 191 To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`: 192 193 ```py 194 D47crunch_defaults.PRETTY_TABLE_VSEP = '=' 195 print(pretty_table([ 196 ['A', 'B', 'C'], 197 ['1', '1.9999', 'foo'], 198 ['10', 'x', 'bar'], 199 ])) 200 ``` 201 yields: 202 ``` 203 == ====== === 204 A B C 205 == ====== === 206 1 1.9999 foo 207 10 x bar 208 == ====== === 209 ``` 210 ''' 211 212 if vsep is None: 213 vsep = D47crunch_defaults.PRETTY_TABLE_VSEP 214 215 txt = [] 216 widths = [np.max([len(e) for e in c]) for c in zip(*x)] 217 218 if len(widths) > len(align): 219 align += '>' * (len(widths)-len(align)) 220 sepline = hsep.join([vsep*w for w in widths]) 221 txt += [sepline] 222 for k,l in enumerate(x): 223 if k and k == header: 224 txt += [sepline] 225 txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])] 226 txt += [sepline] 227 txt += [''] 228 return '\n'.join(txt) 229 230 231def transpose_table(x): 232 ''' 233 Transpose a list if lists 234 235 **Parameters** 236 237 + `x`: a list of lists 238 239 **Example** 240 241 ```py 242 x = [[1, 2], [3, 4]] 243 print(transpose_table(x)) # yields: [[1, 3], [2, 4]] 244 ``` 245 ''' 246 return [[e for e in c] for c in zip(*x)] 247 248 249def w_avg(X, sX) : 250 ''' 251 Compute variance-weighted average 252 253 Returns the value and SE of the weighted average of the elements of `X`, 254 with relative weights equal to their inverse variances (`1/sX**2`). 
255 256 **Parameters** 257 258 + `X`: array-like of elements to average 259 + `sX`: array-like of the corresponding SE values 260 261 **Tip** 262 263 If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets, 264 they may be rearranged using `zip()`: 265 266 ```python 267 foo = [(0, 1), (1, 0.5), (2, 0.5)] 268 print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333) 269 ``` 270 ''' 271 X = [ x for x in X ] 272 sX = [ sx for sx in sX ] 273 W = [ sx**-2 for sx in sX ] 274 W = [ w/sum(W) for w in W ] 275 Xavg = sum([ w*x for w,x in zip(W,X) ]) 276 sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5 277 return Xavg, sXavg 278 279 280def read_csv(filename, sep = ''): 281 ''' 282 Read contents of `filename` in csv format and return a list of dictionaries. 283 284 In the csv string, spaces before and after field separators (`','` by default) 285 are optional. 286 287 **Parameters** 288 289 + `filename`: the csv file to read 290 + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, 291 whichever appers most often in the contents of `filename`. 292 ''' 293 with open(filename) as fid: 294 txt = fid.read() 295 296 if sep == '': 297 sep = sorted(',;\t', key = lambda x: - txt.count(x))[0] 298 txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()] 299 return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]] 300 301 302def simulate_single_analysis( 303 sample = 'MYSAMPLE', 304 d13Cwg_VPDB = -4., d18Owg_VSMOW = 26., 305 d13C_VPDB = None, d18O_VPDB = None, 306 D47 = None, D48 = None, D49 = 0., D17O = 0., 307 a47 = 1., b47 = 0., c47 = -0.9, 308 a48 = 1., b48 = 0., c48 = -0.45, 309 Nominal_D47 = None, 310 Nominal_D48 = None, 311 Nominal_d13C_VPDB = None, 312 Nominal_d18O_VPDB = None, 313 ALPHA_18O_ACID_REACTION = None, 314 R13_VPDB = None, 315 R17_VSMOW = None, 316 R18_VSMOW = None, 317 LAMBDA_17 = None, 318 R18_VPDB = None, 319 ): 320 ''' 321 Compute working-gas delta values for a single analysis, assuming a stochastic working 322 gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values). 323 324 **Parameters** 325 326 + `sample`: sample name 327 + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas 328 (respectively –4 and +26 ‰ by default) 329 + `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample 330 + `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies 331 of the carbonate sample 332 + `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and 333 Δ48 values if `D47` or `D48` are not specified 334 + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and 335 δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified 336 + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor 337 + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 338 correction parameters (by default equal to the `D4xdata` default values) 339 340 Returns a dictionary with fields 341 `['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`. 
342 ''' 343 344 if Nominal_d13C_VPDB is None: 345 Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB 346 347 if Nominal_d18O_VPDB is None: 348 Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB 349 350 if ALPHA_18O_ACID_REACTION is None: 351 ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION 352 353 if R13_VPDB is None: 354 R13_VPDB = D4xdata().R13_VPDB 355 356 if R17_VSMOW is None: 357 R17_VSMOW = D4xdata().R17_VSMOW 358 359 if R18_VSMOW is None: 360 R18_VSMOW = D4xdata().R18_VSMOW 361 362 if LAMBDA_17 is None: 363 LAMBDA_17 = D4xdata().LAMBDA_17 364 365 if R18_VPDB is None: 366 R18_VPDB = D4xdata().R18_VPDB 367 368 R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17 369 370 if Nominal_D47 is None: 371 Nominal_D47 = D47data().Nominal_D47 372 373 if Nominal_D48 is None: 374 Nominal_D48 = D48data().Nominal_D48 375 376 if d13C_VPDB is None: 377 if sample in Nominal_d13C_VPDB: 378 d13C_VPDB = Nominal_d13C_VPDB[sample] 379 else: 380 raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.") 381 382 if d18O_VPDB is None: 383 if sample in Nominal_d18O_VPDB: 384 d18O_VPDB = Nominal_d18O_VPDB[sample] 385 else: 386 raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.") 387 388 if D47 is None: 389 if sample in Nominal_D47: 390 D47 = Nominal_D47[sample] 391 else: 392 raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.") 393 394 if D48 is None: 395 if sample in Nominal_D48: 396 D48 = Nominal_D48[sample] 397 else: 398 raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.") 399 400 X = D4xdata() 401 X.R13_VPDB = R13_VPDB 402 X.R17_VSMOW = R17_VSMOW 403 X.R18_VSMOW = R18_VSMOW 404 X.LAMBDA_17 = LAMBDA_17 405 X.R18_VPDB = R18_VPDB 406 X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17 407 408 R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios( 409 R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000), 410 R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000), 411 ) 412 R45, R46, R47, R48, R49 = X.compute_isobar_ratios( 413 R13 = R13_VPDB * (1 + d13C_VPDB/1000), 414 R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION, 415 D17O=D17O, D47=D47, D48=D48, D49=D49, 416 ) 417 R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios( 418 R13 = R13_VPDB * (1 + d13C_VPDB/1000), 419 R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION, 420 D17O=D17O, 421 ) 422 423 d45 = 1000 * (R45/R45wg - 1) 424 d46 = 1000 * (R46/R46wg - 1) 425 d47 = 1000 * (R47/R47wg - 1) 426 d48 = 1000 * (R48/R48wg - 1) 427 d49 = 1000 * (R49/R49wg - 1) 428 429 for k in range(3): # dumb iteration to adjust for small changes in d47 430 R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch 431 R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch 432 d47 = 1000 * (R47raw/R47wg - 1) 433 d48 = 1000 * (R48raw/R48wg - 1) 434 435 return dict( 436 Sample = sample, 437 D17O = D17O, 438 d13Cwg_VPDB = d13Cwg_VPDB, 439 d18Owg_VSMOW = d18Owg_VSMOW, 440 d45 = d45, 441 d46 = d46, 442 d47 = d47, 443 d48 = d48, 444 d49 = d49, 445 ) 446 447 448def virtual_data( 449 samples = [], 450 a47 = 1., b47 = 0., c47 = -0.9, 451 a48 = 1., b48 = 0., c48 = -0.45, 452 rd45 = 0.020, rd46 = 0.060, 453 rD47 = 0.015, rD48 = 0.045, 454 d13Cwg_VPDB = None, d18Owg_VSMOW = None, 455 session = None, 456 Nominal_D47 = None, Nominal_D48 = None, 457 Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None, 458 ALPHA_18O_ACID_REACTION = None, 459 R13_VPDB = None, 460 
R17_VSMOW = None, 461 R18_VSMOW = None, 462 LAMBDA_17 = None, 463 R18_VPDB = None, 464 seed = 0, 465 shuffle = True, 466 ): 467 ''' 468 Return list with simulated analyses from a single session. 469 470 **Parameters** 471 472 + `samples`: a list of entries; each entry is a dictionary with the following fields: 473 * `Sample`: the name of the sample 474 * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample 475 * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample 476 * `N`: how many analyses to generate for this sample 477 + `a47`: scrambling factor for Δ47 478 + `b47`: compositional nonlinearity for Δ47 479 + `c47`: working gas offset for Δ47 480 + `a48`: scrambling factor for Δ48 481 + `b48`: compositional nonlinearity for Δ48 482 + `c48`: working gas offset for Δ48 483 + `rd45`: analytical repeatability of δ45 484 + `rd46`: analytical repeatability of δ46 485 + `rD47`: analytical repeatability of Δ47 486 + `rD48`: analytical repeatability of Δ48 487 + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas 488 (by default equal to the `simulate_single_analysis` default values) 489 + `session`: name of the session (no name by default) 490 + `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and Δ48 values 491 if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults) 492 + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and 493 δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified 494 (by default equal to the `simulate_single_analysis` defaults) 495 + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor 496 (by default equal to the `simulate_single_analysis` defaults) 497 + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 498 correction parameters (by default equal to the `simulate_single_analysis` default) 499 + `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations 500 + `shuffle`: randomly reorder the sequence of analyses 501 502 503 Here is an example of using this method to generate an arbitrary combination of 504 anchors and unknowns for a bunch of sessions: 505 506 ```py 507 .. include:: ../../code_examples/virtual_data/example.py 508 ``` 509 510 This should output something like: 511 512 ``` 513 .. 
include:: ../../code_examples/virtual_data/output.txt 514 ``` 515 ''' 516 517 kwargs = locals().copy() 518 519 from numpy import random as nprandom 520 if seed: 521 rng = nprandom.default_rng(seed) 522 else: 523 rng = nprandom.default_rng() 524 525 N = sum([s['N'] for s in samples]) 526 errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 527 errors45 *= rd45 / stdev(errors45) # scale errors to rd45 528 errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 529 errors46 *= rd46 / stdev(errors46) # scale errors to rd46 530 errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 531 errors47 *= rD47 / stdev(errors47) # scale errors to rD47 532 errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 533 errors48 *= rD48 / stdev(errors48) # scale errors to rD48 534 535 k = 0 536 out = [] 537 for s in samples: 538 kw = {} 539 kw['sample'] = s['Sample'] 540 kw = { 541 **kw, 542 **{var: kwargs[var] 543 for var in [ 544 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION', 545 'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB', 546 'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB', 547 'a47', 'b47', 'c47', 'a48', 'b48', 'c48', 548 ] 549 if kwargs[var] is not None}, 550 **{var: s[var] 551 for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O'] 552 if var in s}, 553 } 554 555 sN = s['N'] 556 while sN: 557 out.append(simulate_single_analysis(**kw)) 558 out[-1]['d45'] += errors45[k] 559 out[-1]['d46'] += errors46[k] 560 out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47 561 out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48 562 sN -= 1 563 k += 1 564 565 if session is not None: 566 for r in out: 567 r['Session'] = session 568 569 if shuffle: 570 nprandom.shuffle(out) 571 572 return out 573 574def table_of_samples( 575 data47 = None, 576 data48 = None, 577 dir = 'output', 578 filename = None, 579 save_to_file = True, 580 print_out = True, 581 output = None, 582 ): 583 ''' 584 Print out, save to disk and/or return a combined table of samples 585 for a pair of `D47data` and `D48data` objects. 
586 587 **Parameters** 588 589 + `data47`: `D47data` instance 590 + `data48`: `D48data` instance 591 + `dir`: the directory in which to save the table 592 + `filename`: the name to the csv file to write to 593 + `save_to_file`: whether to save the table to disk 594 + `print_out`: whether to print out the table 595 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 596 if set to `'raw'`: return a list of list of strings 597 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 598 ''' 599 if data47 is None: 600 if data48 is None: 601 raise TypeError("Arguments must include at least one D47data() or D48data() instance.") 602 else: 603 return data48.table_of_samples( 604 dir = dir, 605 filename = filename, 606 save_to_file = save_to_file, 607 print_out = print_out, 608 output = output 609 ) 610 else: 611 if data48 is None: 612 return data47.table_of_samples( 613 dir = dir, 614 filename = filename, 615 save_to_file = save_to_file, 616 print_out = print_out, 617 output = output 618 ) 619 else: 620 out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw') 621 out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw') 622 out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:]) 623 624 if save_to_file: 625 if not os.path.exists(dir): 626 os.makedirs(dir) 627 if filename is None: 628 filename = f'D47D48_samples.csv' 629 with open(f'{dir}/{filename}', 'w') as fid: 630 fid.write(make_csv(out)) 631 if print_out: 632 print('\n'+pretty_table(out)) 633 if output == 'raw': 634 return out 635 elif output == 'pretty': 636 return pretty_table(out) 637 638 639def table_of_sessions( 640 data47 = None, 641 data48 = None, 642 dir = 'output', 643 filename = None, 644 save_to_file = True, 645 print_out = True, 646 output = None, 647 ): 648 ''' 649 Print out, save to disk and/or return a combined table of sessions 650 for a pair of `D47data` and `D48data` objects. 
651 ***Only applicable if the sessions in `data47` and those in `data48` 652 consist of the exact same sets of analyses.*** 653 654 **Parameters** 655 656 + `data47`: `D47data` instance 657 + `data48`: `D48data` instance 658 + `dir`: the directory in which to save the table 659 + `filename`: the name to the csv file to write to 660 + `save_to_file`: whether to save the table to disk 661 + `print_out`: whether to print out the table 662 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 663 if set to `'raw'`: return a list of list of strings 664 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 665 ''' 666 if data47 is None: 667 if data48 is None: 668 raise TypeError("Arguments must include at least one D47data() or D48data() instance.") 669 else: 670 return data48.table_of_sessions( 671 dir = dir, 672 filename = filename, 673 save_to_file = save_to_file, 674 print_out = print_out, 675 output = output 676 ) 677 else: 678 if data48 is None: 679 return data47.table_of_sessions( 680 dir = dir, 681 filename = filename, 682 save_to_file = save_to_file, 683 print_out = print_out, 684 output = output 685 ) 686 else: 687 out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw') 688 out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw') 689 for k,x in enumerate(out47[0]): 690 if k>7: 691 out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47') 692 out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48') 693 out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:]) 694 695 if save_to_file: 696 if not os.path.exists(dir): 697 os.makedirs(dir) 698 if filename is None: 699 filename = f'D47D48_sessions.csv' 700 with open(f'{dir}/{filename}', 'w') as fid: 701 fid.write(make_csv(out)) 702 if print_out: 703 print('\n'+pretty_table(out)) 704 if output == 'raw': 705 return out 706 elif output == 'pretty': 707 return pretty_table(out) 708 709 710def table_of_analyses( 711 data47 = None, 712 data48 = None, 713 dir = 'output', 714 filename = None, 715 save_to_file = True, 716 print_out = True, 717 output = None, 718 ): 719 ''' 720 Print out, save to disk and/or return a combined table of analyses 721 for a pair of `D47data` and `D48data` objects. 722 723 If the sessions in `data47` and those in `data48` do not consist of 724 the exact same sets of analyses, the table will have two columns 725 `Session_47` and `Session_48` instead of a single `Session` column. 
726 727 **Parameters** 728 729 + `data47`: `D47data` instance 730 + `data48`: `D48data` instance 731 + `dir`: the directory in which to save the table 732 + `filename`: the name to the csv file to write to 733 + `save_to_file`: whether to save the table to disk 734 + `print_out`: whether to print out the table 735 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 736 if set to `'raw'`: return a list of list of strings 737 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 738 ''' 739 if data47 is None: 740 if data48 is None: 741 raise TypeError("Arguments must include at least one D47data() or D48data() instance.") 742 else: 743 return data48.table_of_analyses( 744 dir = dir, 745 filename = filename, 746 save_to_file = save_to_file, 747 print_out = print_out, 748 output = output 749 ) 750 else: 751 if data48 is None: 752 return data47.table_of_analyses( 753 dir = dir, 754 filename = filename, 755 save_to_file = save_to_file, 756 print_out = print_out, 757 output = output 758 ) 759 else: 760 out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw') 761 out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw') 762 763 if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical 764 out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:]) 765 else: 766 out47[0][1] = 'Session_47' 767 out48[0][1] = 'Session_48' 768 out47 = transpose_table(out47) 769 out48 = transpose_table(out48) 770 out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:]) 771 772 if save_to_file: 773 if not os.path.exists(dir): 774 os.makedirs(dir) 775 if filename is None: 776 filename = f'D47D48_sessions.csv' 777 with open(f'{dir}/{filename}', 'w') as fid: 778 fid.write(make_csv(out)) 779 if print_out: 780 print('\n'+pretty_table(out)) 781 if output == 'raw': 782 return out 783 elif output == 'pretty': 784 return pretty_table(out) 785 786 787def _fullcovar(minresult, epsilon = 0.01, named = False): 788 ''' 789 Construct full covariance matrix in the case of constrained parameters 790 ''' 791 792 import asteval 793 794 def f(values): 795 interp = asteval.Interpreter() 796 for n,v in zip(minresult.var_names, values): 797 interp(f'{n} = {v}') 798 for q in minresult.params: 799 if minresult.params[q].expr: 800 interp(f'{q} = {minresult.params[q].expr}') 801 return np.array([interp.symtable[q] for q in minresult.params]) 802 803 # construct Jacobian 804 J = np.zeros((minresult.nvarys, len(minresult.params))) 805 X = np.array([minresult.params[p].value for p in minresult.var_names]) 806 sX = np.array([minresult.params[p].stderr for p in minresult.var_names]) 807 808 for j in range(minresult.nvarys): 809 x1 = [_ for _ in X] 810 x1[j] += epsilon * sX[j] 811 x2 = [_ for _ in X] 812 x2[j] -= epsilon * sX[j] 813 J[j,:] = (f(x1) - f(x2)) / (2 * epsilon * sX[j]) 814 815 _names = [q for q in minresult.params] 816 _covar = J.T @ minresult.covar @ J 817 _se = np.diag(_covar)**.5 818 _correl = _covar.copy() 819 for k,s in enumerate(_se): 820 if s: 821 _correl[k,:] /= s 822 _correl[:,k] /= s 823 824 if named: 825 _covar = {i: {j:_covar[i,j] for j in minresult.params} for i in minresult.params} 826 _se = {i: _se[i] for i in minresult.params} 827 _correl = {i: {j:_correl[i,j] for j in minresult.params} for i in minresult.params} 828 829 return _names, _covar, _se, _correl 830 831 832class D4xdata(list): 833 ''' 834 Store and process data for a large set of Δ47 and/or Δ48 835 analyses, 
usually comprising more than one analytical session. 836 ''' 837 838 ### 17O CORRECTION PARAMETERS 839 R13_VPDB = 0.01118 # (Chang & Li, 1990) 840 ''' 841 Absolute (13C/12C) ratio of VPDB. 842 By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm)) 843 ''' 844 845 R18_VSMOW = 0.0020052 # (Baertschi, 1976) 846 ''' 847 Absolute (18O/16O) ratio of VSMOW. 848 By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1)) 849 ''' 850 851 LAMBDA_17 = 0.528 # (Barkan & Luz, 2005) 852 ''' 853 Mass-dependent exponent for triple oxygen isotopes. 854 By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250)) 855 ''' 856 857 R17_VSMOW = 0.00038475 # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB) 858 ''' 859 Absolute (17O/16O) ratio of VSMOW. 860 By default equal to 0.00038475 861 ([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011), 862 rescaled to `R13_VPDB`) 863 ''' 864 865 R18_VPDB = R18_VSMOW * 1.03092 866 ''' 867 Absolute (18O/16O) ratio of VPDB. 868 By definition equal to `R18_VSMOW * 1.03092`. 869 ''' 870 871 R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17 872 ''' 873 Absolute (17O/16O) ratio of VPDB. 874 By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`. 875 ''' 876 877 LEVENE_REF_SAMPLE = 'ETH-3' 878 ''' 879 After the Δ4x standardization step, each sample is tested to 880 assess whether the Δ4x variance within all analyses for that 881 sample differs significantly from that observed for a given reference 882 sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test), 883 which yields a p-value corresponding to the null hypothesis that the 884 underlying variances are equal). 885 886 `LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which 887 sample should be used as a reference for this test. 888 ''' 889 890 ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6) # (Kim et al., 2007, calcite) 891 ''' 892 Specifies the 18O/16O fractionation factor generally applicable 893 to acid reactions in the dataset. Currently used by `D4xdata.wg()`, 894 `D4xdata.standardize_d13C()`, and `D4xdata.standardize_d18O()`. 895 896 By default equal to 1.008129 (calcite reacted at 90 °C, 897 [Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)). 898 ''' 899 900 Nominal_d13C_VPDB = { 901 'ETH-1': 2.02, 902 'ETH-2': -10.17, 903 'ETH-3': 1.71, 904 } # (Bernasconi et al., 2018) 905 ''' 906 Nominal δ13C_VPDB values assigned to carbonate standards, used by 907 `D4xdata.standardize_d13C()`. 908 909 By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after 910 [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385). 911 ''' 912 913 Nominal_d18O_VPDB = { 914 'ETH-1': -2.19, 915 'ETH-2': -18.69, 916 'ETH-3': -1.78, 917 } # (Bernasconi et al., 2018) 918 ''' 919 Nominal δ18O_VPDB values assigned to carbonate standards, used by 920 `D4xdata.standardize_d18O()`. 921 922 By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after 923 [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385). 924 ''' 925 926 d13C_STANDARDIZATION_METHOD = '2pt' 927 ''' 928 Method by which to standardize δ13C values: 929 930 + `none`: do not apply any δ13C standardization.
931 + `'1pt'`: within each session, offset all initial δ13C values so as to 932 minimize the difference between final δ13C_VPDB values and 933 `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined). 934 + `'2pt'`: within each session, apply an affine transformation to all δ13C 935 values so as to minimize the difference between final δ13C_VPDB 936 values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` 937 is defined). 938 ''' 939 940 d18O_STANDARDIZATION_METHOD = '2pt' 941 ''' 942 Method by which to standardize δ18O values: 943 944 + `none`: do not apply any δ18O standardization. 945 + `'1pt'`: within each session, offset all initial δ18O values so as to 946 minimize the difference between final δ18O_VPDB values and 947 `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined). 948 + `'2pt'`: within each session, apply an affine transformation to all δ18O 949 values so as to minimize the difference between final δ18O_VPDB 950 values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` 951 is defined). 952 ''' 953 954 def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False): 955 ''' 956 **Parameters** 957 958 + `l`: a list of dictionaries, with each dictionary including at least the keys 959 `Sample`, `d45`, `d46`, and `d47` or `d48`. 960 + `mass`: `'47'` or `'48'` 961 + `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods. 962 + `session`: define session name for analyses without a `Session` key 963 + `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods. 964 965 Returns a `D4xdata` object derived from `list`. 966 ''' 967 self._4x = mass 968 self.verbose = verbose 969 self.prefix = 'D4xdata' 970 self.logfile = logfile 971 list.__init__(self, l) 972 self.Nf = None 973 self.repeatability = {} 974 self.refresh(session = session) 975 976 977 def make_verbal(oldfun): 978 ''' 979 Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`. 980 ''' 981 @wraps(oldfun) 982 def newfun(*args, verbose = '', **kwargs): 983 myself = args[0] 984 oldprefix = myself.prefix 985 myself.prefix = oldfun.__name__ 986 if verbose != '': 987 oldverbose = myself.verbose 988 myself.verbose = verbose 989 out = oldfun(*args, **kwargs) 990 myself.prefix = oldprefix 991 if verbose != '': 992 myself.verbose = oldverbose 993 return out 994 return newfun 995 996 997 def msg(self, txt): 998 ''' 999 Log a message to `self.logfile`, and print it out if `verbose = True` 1000 ''' 1001 self.log(txt) 1002 if self.verbose: 1003 print(f'{f"[{self.prefix}]":<16} {txt}') 1004 1005 1006 def vmsg(self, txt): 1007 ''' 1008 Log a message to `self.logfile` and print it out 1009 ''' 1010 self.log(txt) 1011 print(txt) 1012 1013 1014 def log(self, *txts): 1015 ''' 1016 Log a message to `self.logfile` 1017 ''' 1018 if self.logfile: 1019 with open(self.logfile, 'a') as fid: 1020 for txt in txts: 1021 fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}') 1022 1023 1024 def refresh(self, session = 'mySession'): 1025 ''' 1026 Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
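(`refresh()` is called automatically at the end of `D4xdata.input()`, so it normally only needs to be invoked explicitly after appending analyses to the list directly.)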
1027 ''' 1028 self.fill_in_missing_info(session = session) 1029 self.refresh_sessions() 1030 self.refresh_samples() 1031 1032 1033 def refresh_sessions(self): 1034 ''' 1035 Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift` 1036 to `False` for all sessions. 1037 ''' 1038 self.sessions = { 1039 s: {'data': [r for r in self if r['Session'] == s]} 1040 for s in sorted({r['Session'] for r in self}) 1041 } 1042 for s in self.sessions: 1043 self.sessions[s]['scrambling_drift'] = False 1044 self.sessions[s]['slope_drift'] = False 1045 self.sessions[s]['wg_drift'] = False 1046 self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD 1047 self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD 1048 1049 1050 def refresh_samples(self): 1051 ''' 1052 Define `self.samples`, `self.anchors`, and `self.unknowns`. 1053 ''' 1054 self.samples = { 1055 s: {'data': [r for r in self if r['Sample'] == s]} 1056 for s in sorted({r['Sample'] for r in self}) 1057 } 1058 self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x} 1059 self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x} 1060 1061 1062 def read(self, filename, sep = '', session = ''): 1063 ''' 1064 Read file in csv format to load data into a `D47data` object. 1065 1066 In the csv file, spaces before and after field separators (`','` by default) 1067 are optional. Each line corresponds to a single analysis. 1068 1069 The required fields are: 1070 1071 + `UID`: a unique identifier 1072 + `Session`: an identifier for the analytical session 1073 + `Sample`: a sample identifier 1074 + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values 1075 1076 Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to 1077 VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` 1078 and `d49` are optional, and set to NaN by default. 1079 1080 **Parameters** 1081 1082 + `filename`: the path of the file to read 1083 + `sep`: csv separator delimiting the fields 1084 + `session`: set `Session` field to this string for all analyses 1085 ''' 1086 with open(filename) as fid: 1087 self.input(fid.read(), sep = sep, session = session) 1088 1089 1090 def input(self, txt, sep = '', session = ''): 1091 ''' 1092 Read `txt` string in csv format to load analysis data into a `D47data` object. 1093 1094 In the csv string, spaces before and after field separators (`','` by default) 1095 are optional. Each line corresponds to a single analysis. 1096 1097 The required fields are: 1098 1099 + `UID`: a unique identifier 1100 + `Session`: an identifier for the analytical session 1101 + `Sample`: a sample identifier 1102 + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values 1103 1104 Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to 1105 VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` 1106 and `d49` are optional, and set to NaN by default. 1107 1108 **Parameters** 1109 1110 + `txt`: the csv string to read 1111 + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, 1112 whichever appears most often in `txt`.
1113 + `session`: set `Session` field to this string for all analyses 1114 ''' 1115 if sep == '': 1116 sep = sorted(',;\t', key = lambda x: - txt.count(x))[0] 1117 txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()] 1118 data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]] 1119 1120 if session != '': 1121 for r in data: 1122 r['Session'] = session 1123 1124 self += data 1125 self.refresh() 1126 1127 1128 @make_verbal 1129 def wg(self, samples = None, a18_acid = None): 1130 ''' 1131 Compute bulk composition of the working gas for each session based on 1132 the carbonate standards defined in both `self.Nominal_d13C_VPDB` and 1133 `self.Nominal_d18O_VPDB`. 1134 ''' 1135 1136 self.msg('Computing WG composition:') 1137 1138 if a18_acid is None: 1139 a18_acid = self.ALPHA_18O_ACID_REACTION 1140 if samples is None: 1141 samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB] 1142 1143 assert a18_acid, f'Acid fractionation factor should not be zero.' 1144 1145 samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB] 1146 R45R46_standards = {} 1147 for sample in samples: 1148 d13C_vpdb = self.Nominal_d13C_VPDB[sample] 1149 d18O_vpdb = self.Nominal_d18O_VPDB[sample] 1150 R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000) 1151 R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17 1152 R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid 1153 1154 C12_s = 1 / (1 + R13_s) 1155 C13_s = R13_s / (1 + R13_s) 1156 C16_s = 1 / (1 + R17_s + R18_s) 1157 C17_s = R17_s / (1 + R17_s + R18_s) 1158 C18_s = R18_s / (1 + R17_s + R18_s) 1159 1160 C626_s = C12_s * C16_s ** 2 1161 C627_s = 2 * C12_s * C16_s * C17_s 1162 C628_s = 2 * C12_s * C16_s * C18_s 1163 C636_s = C13_s * C16_s ** 2 1164 C637_s = 2 * C13_s * C16_s * C17_s 1165 C727_s = C12_s * C17_s ** 2 1166 1167 R45_s = (C627_s + C636_s) / C626_s 1168 R46_s = (C628_s + C637_s + C727_s) / C626_s 1169 R45R46_standards[sample] = (R45_s, R46_s) 1170 1171 for s in self.sessions: 1172 db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples] 1173 assert db, f'No sample from {samples} found in session "{s}".' 
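# For each session, the code below estimates the working-gas R45 (then, identically, R46)
# in one of two ways: if the standards' d45 values bracket d45 = 0 reasonably well,
# R45_wg is taken as the intercept of a linear fit of nominal R45 vs measured d45;
# otherwise, to avoid extrapolating far beyond the data, it falls back on averaging
# y/(1+x/1000) over all standard analyses.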
1174# dbsamples = sorted({r['Sample'] for r in db}) 1175 1176 X = [r['d45'] for r in db] 1177 Y = [R45R46_standards[r['Sample']][0] for r in db] 1178 x1, x2 = np.min(X), np.max(X) 1179 1180 if x1 < x2: 1181 wgcoord = x1/(x1-x2) 1182 else: 1183 wgcoord = 999 1184 1185 if wgcoord < -.5 or wgcoord > 1.5: 1186 # unreasonable to extrapolate to d45 = 0 1187 R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)]) 1188 else : 1189 # d45 = 0 is reasonably well bracketed 1190 R45_wg = np.polyfit(X, Y, 1)[1] 1191 1192 X = [r['d46'] for r in db] 1193 Y = [R45R46_standards[r['Sample']][1] for r in db] 1194 x1, x2 = np.min(X), np.max(X) 1195 1196 if x1 < x2: 1197 wgcoord = x1/(x1-x2) 1198 else: 1199 wgcoord = 999 1200 1201 if wgcoord < -.5 or wgcoord > 1.5: 1202 # unreasonable to extrapolate to d46 = 0 1203 R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)]) 1204 else : 1205 # d46 = 0 is reasonably well bracketed 1206 R46_wg = np.polyfit(X, Y, 1)[1] 1207 1208 d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg) 1209 1210 self.msg(f'Session {s} WG: δ13C_VPDB = {d13Cwg_VPDB:.3f} δ18O_VSMOW = {d18Owg_VSMOW:.3f}') 1211 1212 self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB 1213 self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW 1214 for r in self.sessions[s]['data']: 1215 r['d13Cwg_VPDB'] = d13Cwg_VPDB 1216 r['d18Owg_VSMOW'] = d18Owg_VSMOW 1217 1218 1219 def compute_bulk_delta(self, R45, R46, D17O = 0): 1220 ''' 1221 Compute δ13C_VPDB and δ18O_VSMOW, 1222 by solving the generalized form of equation (17) from 1223 [Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05), 1224 assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and 1225 solving the corresponding second-order Taylor polynomial. 1226 (Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014)) 1227 ''' 1228 1229 K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17 1230 1231 A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17) 1232 B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17 1233 C = 2 * self.R18_VSMOW 1234 D = -R46 1235 1236 aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2 1237 bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C 1238 cc = A + B + C + D 1239 1240 d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa) 1241 1242 R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW 1243 R17 = K * R18 ** self.LAMBDA_17 1244 R13 = R45 - 2 * R17 1245 1246 d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1) 1247 1248 return d13C_VPDB, d18O_VSMOW 1249 1250 1251 @make_verbal 1252 def crunch(self, verbose = ''): 1253 ''' 1254 Compute bulk composition and raw clumped isotope anomalies for all analyses. 1255 ''' 1256 for r in self: 1257 self.compute_bulk_and_clumping_deltas(r) 1258 self.standardize_d13C() 1259 self.standardize_d18O() 1260 self.msg(f"Crunched {len(self)} analyses.") 1261 1262 1263 def fill_in_missing_info(self, session = 'mySession'): 1264 ''' 1265 Fill in optional fields with default values 1266 ''' 1267 for i,r in enumerate(self): 1268 if 'D17O' not in r: 1269 r['D17O'] = 0. 
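# Analyses without an explicit D17O field were assigned a zero oxygen-17 anomaly
# above (cf. the `read()` docstring); missing working-gas deltas default to NaN below.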
1270 if 'UID' not in r: 1271 r['UID'] = f'{i+1}' 1272 if 'Session' not in r: 1273 r['Session'] = session 1274 for k in ['d47', 'd48', 'd49']: 1275 if k not in r: 1276 r[k] = np.nan 1277 1278 1279 def standardize_d13C(self): 1280 ''' 1281 Perform δ13C standardization within each session `s` according to 1282 `self.sessions[s]['d13C_standardization_method']`, which is defined by default 1283 by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but 1284 may be redefined arbitrarily at a later stage. 1285 ''' 1286 for s in self.sessions: 1287 if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']: 1288 XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB] 1289 X,Y = zip(*XY) 1290 if self.sessions[s]['d13C_standardization_method'] == '1pt': 1291 offset = np.mean(Y) - np.mean(X) 1292 for r in self.sessions[s]['data']: 1293 r['d13C_VPDB'] += offset 1294 elif self.sessions[s]['d13C_standardization_method'] == '2pt': 1295 a,b = np.polyfit(X,Y,1) 1296 for r in self.sessions[s]['data']: 1297 r['d13C_VPDB'] = a * r['d13C_VPDB'] + b 1298 1299 def standardize_d18O(self): 1300 ''' 1301 Perform δ18O standardization within each session `s` according to 1302 `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`, 1303 which is defined by default by `D47data.refresh_sessions()` as equal to 1304 `self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage. 1305 ''' 1306 for s in self.sessions: 1307 if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']: 1308 XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB] 1309 X,Y = zip(*XY) 1310 Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y] 1311 if self.sessions[s]['d18O_standardization_method'] == '1pt': 1312 offset = np.mean(Y) - np.mean(X) 1313 for r in self.sessions[s]['data']: 1314 r['d18O_VSMOW'] += offset 1315 elif self.sessions[s]['d18O_standardization_method'] == '2pt': 1316 a,b = np.polyfit(X,Y,1) 1317 for r in self.sessions[s]['data']: 1318 r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b 1319 1320 1321 def compute_bulk_and_clumping_deltas(self, r): 1322 ''' 1323 Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`. 1324 ''' 1325 1326 # Compute working gas R13, R18, and isobar ratios 1327 R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000) 1328 R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000) 1329 R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg) 1330 1331 # Compute analyte isobar ratios 1332 R45 = (1 + r['d45'] / 1000) * R45_wg 1333 R46 = (1 + r['d46'] / 1000) * R46_wg 1334 R47 = (1 + r['d47'] / 1000) * R47_wg 1335 R48 = (1 + r['d48'] / 1000) * R48_wg 1336 R49 = (1 + r['d49'] / 1000) * R49_wg 1337 1338 r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O']) 1339 R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB 1340 R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW 1341 1342 # Compute stochastic isobar ratios of the analyte 1343 R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios( 1344 R13, R18, D17O = r['D17O'] 1345 ) 1346 1347 # Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1, 1348 # and raise a warning if the corresponding anomalies exceed 0.05 ppm.
if (R45 / R45stoch - 1) > 5e-8: 1350 self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm') 1351 if (R46 / R46stoch - 1) > 5e-8: 1352 self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm') 1353 1354 # Compute raw clumped isotope anomalies 1355 r['D47raw'] = 1000 * (R47 / R47stoch - 1) 1356 r['D48raw'] = 1000 * (R48 / R48stoch - 1) 1357 r['D49raw'] = 1000 * (R49 / R49stoch - 1) 1358 1359 1360 def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0): 1361 ''' 1362 Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`, 1363 optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope 1364 anomalies (`D47`, `D48`, `D49`), all expressed in permil. 1365 ''' 1366 1367 # Compute R17 1368 R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17 1369 1370 # Compute isotope concentrations 1371 C12 = (1 + R13) ** -1 1372 C13 = C12 * R13 1373 C16 = (1 + R17 + R18) ** -1 1374 C17 = C16 * R17 1375 C18 = C16 * R18 1376 1377 # Compute stochastic isotopologue concentrations 1378 C626 = C16 * C12 * C16 1379 C627 = C16 * C12 * C17 * 2 1380 C628 = C16 * C12 * C18 * 2 1381 C636 = C16 * C13 * C16 1382 C637 = C16 * C13 * C17 * 2 1383 C638 = C16 * C13 * C18 * 2 1384 C727 = C17 * C12 * C17 1385 C728 = C17 * C12 * C18 * 2 1386 C737 = C17 * C13 * C17 1387 C738 = C17 * C13 * C18 * 2 1388 C828 = C18 * C12 * C18 1389 C838 = C18 * C13 * C18 1390 1391 # Compute stochastic isobar ratios 1392 R45 = (C636 + C627) / C626 1393 R46 = (C628 + C637 + C727) / C626 1394 R47 = (C638 + C728 + C737) / C626 1395 R48 = (C738 + C828) / C626 1396 R49 = C838 / C626 1397 1398 # Account for clumped-isotope anomalies (departures from the stochastic distribution) 1399 R47 *= 1 + D47 / 1000 1400 R48 *= 1 + D48 / 1000 1401 R49 *= 1 + D49 / 1000 1402 1403 # Return isobar ratios 1404 return R45, R46, R47, R48, R49 1405 1406 1407 def split_samples(self, samples_to_split = 'all', grouping = 'by_session'): 1408 ''' 1409 Split unknown samples by UID (treat all analyses as different samples) 1410 or by session (treat analyses of a given sample in different sessions as 1411 different samples). 1412 1413 **Parameters** 1414 1415 + `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']` 1416 + `grouping`: `by_uid` | `by_session` 1417 ''' 1418 if samples_to_split == 'all': 1419 samples_to_split = [s for s in self.unknowns] 1420 gkeys = {'by_uid':'UID', 'by_session':'Session'} 1421 self.grouping = grouping.lower() 1422 if self.grouping in gkeys: 1423 gkey = gkeys[self.grouping] 1424 for r in self: 1425 if r['Sample'] in samples_to_split: 1426 r['Sample_original'] = r['Sample'] 1427 r['Sample'] = f"{r['Sample']}__{r[gkey]}" 1428 elif r['Sample'] in self.unknowns: 1429 r['Sample_original'] = r['Sample'] 1430 self.refresh_samples() 1431 1432 1433 def unsplit_samples(self, tables = False): 1434 ''' 1435 Reverse the effects of `D47data.split_samples()`. 1436 1437 This should only be used after `D4xdata.standardize()` with `method='pooled'`. 1438 1439 After `D4xdata.standardize()` with `method='indep_sessions'`, one should 1440 probably use `D4xdata.combine_samples()` instead to reverse the effects of 1441 `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the 1442 effects of `D47data.split_samples()` with `grouping='by_session'` (because in 1443 that case session-averaged Δ4x values are statistically independent).
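A minimal sketch of the intended workflow (hypothetical code, assuming `mydata` is a `D47data` instance that was read and crunched beforehand):

```py
mydata.split_samples(grouping = 'by_session')
mydata.standardize(method = 'pooled')
mydata.unsplit_samples()
```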
1444 ''' 1445 unknowns_old = sorted({s for s in self.unknowns}) 1446 CM_old = self.standardization.covar[:,:] 1447 VD_old = self.standardization.params.valuesdict().copy() 1448 vars_old = self.standardization.var_names 1449 1450 unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r}) 1451 1452 Ns = len(vars_old) - len(unknowns_old) 1453 vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new] 1454 VD_new = {k: VD_old[k] for k in vars_old[:Ns]} 1455 1456 W = np.zeros((len(vars_new), len(vars_old))) 1457 W[:Ns,:Ns] = np.eye(Ns) 1458 for u in unknowns_new: 1459 splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u}) 1460 if self.grouping == 'by_session': 1461 weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits] 1462 elif self.grouping == 'by_uid': 1463 weights = [1 for s in splits] 1464 sw = sum(weights) 1465 weights = [w/sw for w in weights] 1466 W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:] 1467 1468 CM_new = W @ CM_old @ W.T 1469 V = W @ np.array([[VD_old[k]] for k in vars_old]) 1470 VD_new = {k:v[0] for k,v in zip(vars_new, V)} 1471 1472 self.standardization.covar = CM_new 1473 self.standardization.params.valuesdict = lambda : VD_new 1474 self.standardization.var_names = vars_new 1475 1476 for r in self: 1477 if r['Sample'] in self.unknowns: 1478 r['Sample_split'] = r['Sample'] 1479 r['Sample'] = r['Sample_original'] 1480 1481 self.refresh_samples() 1482 self.consolidate_samples() 1483 self.repeatabilities() 1484 1485 if tables: 1486 self.table_of_analyses() 1487 self.table_of_samples() 1488 1489 def assign_timestamps(self): 1490 ''' 1491 Assign a time field `t` of type `float` to each analysis. 1492 1493 If `TimeTag` is one of the data fields, `t` is equal within a given session 1494 to `TimeTag` minus the mean value of `TimeTag` for that session. 1495 Otherwise, `TimeTag` is by default equal to the index of each analysis 1496 in the dataset and `t` is defined as above. 1497 ''' 1498 for session in self.sessions: 1499 sdata = self.sessions[session]['data'] 1500 try: 1501 t0 = np.mean([r['TimeTag'] for r in sdata]) 1502 for r in sdata: 1503 r['t'] = r['TimeTag'] - t0 1504 except KeyError: 1505 t0 = (len(sdata)-1)/2 1506 for t,r in enumerate(sdata): 1507 r['t'] = t - t0 1508 1509 1510 def report(self): 1511 ''' 1512 Prints a report on the standardization fit. 1513 Only applicable after `D4xdata.standardize(method='pooled')`. 1514 ''' 1515 report_fit(self.standardization) 1516 1517 1518 def combine_samples(self, sample_groups): 1519 ''' 1520 Combine analyses of different samples to compute weighted average Δ4x 1521 and new error (co)variances corresponding to the groups defined by the `sample_groups` 1522 dictionary. 1523 1524 Caution: samples are weighted by number of replicate analyses, which is a 1525 reasonable default behavior but is not always optimal (e.g., in the case of strongly 1526 correlated analytical errors for one or more samples). 
1527 1528 Returns a tuple of: 1529 1530 + the list of group names 1531 + an array of the corresponding Δ4x values 1532 + the corresponding (co)variance matrix 1533 1534 **Parameters** 1535 1536 + `sample_groups`: a dictionary of the form: 1537 ```py 1538 {'group1': ['sample_1', 'sample_2'], 1539 'group2': ['sample_3', 'sample_4', 'sample_5']} 1540 ``` 1541 ''' 1542 1543 samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])] 1544 groups = sorted(sample_groups.keys()) 1545 group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups} 1546 D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples]) 1547 CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples]) 1548 W = np.array([ 1549 [self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples] 1550 for j in groups]) 1551 D4x_new = W @ D4x_old 1552 CM_new = W @ CM_old @ W.T 1553 1554 return groups, D4x_new[:,0], CM_new 1555 1556 1557 @make_verbal 1558 def standardize(self, 1559 method = 'pooled', 1560 weighted_sessions = [], 1561 consolidate = True, 1562 consolidate_tables = False, 1563 consolidate_plots = False, 1564 constraints = {}, 1565 ): 1566 ''' 1567 Compute absolute Δ4x values for all replicate analyses and for sample averages. 1568 If the `method` argument is set to `'pooled'`, the standardization processes all sessions 1569 in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous, 1570 i.e. that their true Δ4x value does not change between sessions 1571 ([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If the `method` argument is set to 1572 `'indep_sessions'`, the standardization processes each session independently, based only 1573 on anchor analyses. 1574 ''' 1575 1576 self.standardization_method = method 1577 self.assign_timestamps() 1578 1579 if method == 'pooled': 1580 if weighted_sessions: 1581 for session_group in weighted_sessions: 1582 if self._4x == '47': 1583 X = D47data([r for r in self if r['Session'] in session_group]) 1584 elif self._4x == '48': 1585 X = D48data([r for r in self if r['Session'] in session_group]) 1586 X.Nominal_D4x = self.Nominal_D4x.copy() 1587 X.refresh() 1588 result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False) 1589 w = np.sqrt(result.redchi) 1590 self.msg(f'Session group {session_group} RMSWD = {w:.4f}') 1591 for r in X: 1592 r[f'wD{self._4x}raw'] *= w 1593 else: 1594 self.msg(f'All D{self._4x}raw weights set to 1 ‰') 1595 for r in self: 1596 r[f'wD{self._4x}raw'] = 1. 1597 1598 params = Parameters() 1599 for k,session in enumerate(self.sessions): 1600 self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.") 1601 self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.") 1602 self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.") 1603 s = pf(session) 1604 params.add(f'a_{s}', value = 0.9) 1605 params.add(f'b_{s}', value = 0.)
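# (a, b and c are, respectively, each session's scrambling factor, compositional
# slope, and WG offset; the a2, b2, c2 parameters added below describe their linear
# drift with time, and are constrained to zero unless the corresponding
# scrambling_drift / slope_drift / wg_drift flag is set for that session.)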
1606 params.add(f'c_{s}', value = -0.9) 1607 params.add(f'a2_{s}', value = 0., 1608# vary = self.sessions[session]['scrambling_drift'], 1609 ) 1610 params.add(f'b2_{s}', value = 0., 1611# vary = self.sessions[session]['slope_drift'], 1612 ) 1613 params.add(f'c2_{s}', value = 0., 1614# vary = self.sessions[session]['wg_drift'], 1615 ) 1616 if not self.sessions[session]['scrambling_drift']: 1617 params[f'a2_{s}'].expr = '0' 1618 if not self.sessions[session]['slope_drift']: 1619 params[f'b2_{s}'].expr = '0' 1620 if not self.sessions[session]['wg_drift']: 1621 params[f'c2_{s}'].expr = '0' 1622 1623 for sample in self.unknowns: 1624 params.add(f'D{self._4x}_{pf(sample)}', value = 0.5) 1625 1626 for k in constraints: 1627 params[k].expr = constraints[k] 1628 1629 def residuals(p): 1630 R = [] 1631 for r in self: 1632 session = pf(r['Session']) 1633 sample = pf(r['Sample']) 1634 if r['Sample'] in self.Nominal_D4x: 1635 R += [ ( 1636 r[f'D{self._4x}raw'] - ( 1637 p[f'a_{session}'] * self.Nominal_D4x[r['Sample']] 1638 + p[f'b_{session}'] * r[f'd{self._4x}'] 1639 + p[f'c_{session}'] 1640 + r['t'] * ( 1641 p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']] 1642 + p[f'b2_{session}'] * r[f'd{self._4x}'] 1643 + p[f'c2_{session}'] 1644 ) 1645 ) 1646 ) / r[f'wD{self._4x}raw'] ] 1647 else: 1648 R += [ ( 1649 r[f'D{self._4x}raw'] - ( 1650 p[f'a_{session}'] * p[f'D{self._4x}_{sample}'] 1651 + p[f'b_{session}'] * r[f'd{self._4x}'] 1652 + p[f'c_{session}'] 1653 + r['t'] * ( 1654 p[f'a2_{session}'] * p[f'D{self._4x}_{sample}'] 1655 + p[f'b2_{session}'] * r[f'd{self._4x}'] 1656 + p[f'c2_{session}'] 1657 ) 1658 ) 1659 ) / r[f'wD{self._4x}raw'] ] 1660 return R 1661 1662 M = Minimizer(residuals, params) 1663 result = M.least_squares() 1664 self.Nf = result.nfree 1665 self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf) 1666 new_names, new_covar, new_se = _fullcovar(result)[:3] 1667 result.var_names = new_names 1668 result.covar = new_covar 1669 1670 for r in self: 1671 s = pf(r["Session"]) 1672 a = result.params.valuesdict()[f'a_{s}'] 1673 b = result.params.valuesdict()[f'b_{s}'] 1674 c = result.params.valuesdict()[f'c_{s}'] 1675 a2 = result.params.valuesdict()[f'a2_{s}'] 1676 b2 = result.params.valuesdict()[f'b2_{s}'] 1677 c2 = result.params.valuesdict()[f'c2_{s}'] 1678 r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t']) 1679 1680 1681 self.standardization = result 1682 1683 for session in self.sessions: 1684 self.sessions[session]['Np'] = 3 1685 for k in ['scrambling', 'slope', 'wg']: 1686 if self.sessions[session][f'{k}_drift']: 1687 self.sessions[session]['Np'] += 1 1688 1689 if consolidate: 1690 self.consolidate(tables = consolidate_tables, plots = consolidate_plots) 1691 return result 1692 1693 1694 elif method == 'indep_sessions': 1695 1696 if weighted_sessions: 1697 for session_group in weighted_sessions: 1698 X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x) 1699 X.Nominal_D4x = self.Nominal_D4x.copy() 1700 X.refresh() 1701 # This is only done to assign r['wD47raw'] for r in X: 1702 X.standardize(method = method, weighted_sessions = [], consolidate = False) 1703 self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}') 1704 else: 1705 self.msg('All weights set to 1 ‰') 1706 for r in self: 1707 r[f'wD{self._4x}raw'] = 1 1708 1709 for session in self.sessions: 1710 s = self.sessions[session] 1711 p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2'] 
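# Each session is standardized independently by weighted least squares: each row
# of the design matrix A below corresponds to one anchor analysis, with one column
# per parameter among (a, b, c, a2, b2, c2); columns for inactive drift parameters
# are dropped from the fit, and those parameters are set to zero afterwards.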
1712 p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']] 1713 s['Np'] = sum(p_active) 1714 sdata = s['data'] 1715 1716 A = np.array([ 1717 [ 1718 self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'], 1719 r[f'd{self._4x}'] / r[f'wD{self._4x}raw'], 1720 1 / r[f'wD{self._4x}raw'], 1721 self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'], 1722 r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'], 1723 r['t'] / r[f'wD{self._4x}raw'] 1724 ] 1725 for r in sdata if r['Sample'] in self.anchors 1726 ])[:,p_active] # only keep columns for the active parameters 1727 Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors]) 1728 s['Na'] = Y.size 1729 CM = linalg.inv(A.T @ A) 1730 bf = (CM @ A.T @ Y).T[0,:] 1731 k = 0 1732 for n,a in zip(p_names, p_active): 1733 if a: 1734 s[n] = bf[k] 1735# self.msg(f'{n} = {bf[k]}') 1736 k += 1 1737 else: 1738 s[n] = 0. 1739# self.msg(f'{n} = 0.0') 1740 1741 for r in sdata : 1742 a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2'] 1743 r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t']) 1744 r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t']) 1745 1746 s['CM'] = np.zeros((6,6)) 1747 i = 0 1748 k_active = [j for j,a in enumerate(p_active) if a] 1749 for j,a in enumerate(p_active): 1750 if a: 1751 s['CM'][j,k_active] = CM[i,:] 1752 i += 1 1753 1754 if not weighted_sessions: 1755 w = self.rmswd()['rmswd'] 1756 for r in self: 1757 r[f'wD{self._4x}'] *= w 1758 r[f'wD{self._4x}raw'] *= w 1759 for session in self.sessions: 1760 self.sessions[session]['CM'] *= w**2 1761 1762 for session in self.sessions: 1763 s = self.sessions[session] 1764 s['SE_a'] = s['CM'][0,0]**.5 1765 s['SE_b'] = s['CM'][1,1]**.5 1766 s['SE_c'] = s['CM'][2,2]**.5 1767 s['SE_a2'] = s['CM'][3,3]**.5 1768 s['SE_b2'] = s['CM'][4,4]**.5 1769 s['SE_c2'] = s['CM'][5,5]**.5 1770 1771 if not weighted_sessions: 1772 self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions]) 1773 else: 1774 self.Nf = 0 1775 for sg in weighted_sessions: 1776 self.Nf += self.rmswd(sessions = sg)['Nf'] 1777 1778 self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf) 1779 1780 avgD4x = { 1781 sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample]) 1782 for sample in self.samples 1783 } 1784 chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self]) 1785 rD4x = (chi2/self.Nf)**.5 1786 self.repeatability[f'sigma_{self._4x}'] = rD4x 1787 1788 if consolidate: 1789 self.consolidate(tables = consolidate_tables, plots = consolidate_plots) 1790 1791 1792 def standardization_error(self, session, d4x, D4x, t = 0): 1793 ''' 1794 Compute standardization error for a given session and 1795 (δ47, Δ47) composition. 1796 ''' 1797 a = self.sessions[session]['a'] 1798 b = self.sessions[session]['b'] 1799 c = self.sessions[session]['c'] 1800 a2 = self.sessions[session]['a2'] 1801 b2 = self.sessions[session]['b2'] 1802 c2 = self.sessions[session]['c2'] 1803 CM = self.sessions[session]['CM'] 1804 1805 x, y = D4x, d4x 1806 z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t 1807# x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t) 1808 dxdy = -(b+b2*t) / (a+a2*t) 1809 dxdz = 1. / (a+a2*t) 1810 dxda = -x / (a+a2*t) 1811 dxdb = -y / (a+a2*t) 1812 dxdc = -1. 
/ (a+a2*t) 1813 dxda2 = -x * t / (a+a2*t) 1814 dxdb2 = -y * t / (a+a2*t) 1815 dxdc2 = -t / (a+a2*t) 1816 V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2]) 1817 sx = (V @ CM @ V.T) ** .5 1818 return sx 1819 1820 1821 @make_verbal 1822 def summary(self, 1823 dir = 'output', 1824 filename = None, 1825 save_to_file = True, 1826 print_out = True, 1827 ): 1828 ''' 1829 Print out and/or save to disk a summary of the standardization results. 1830 1831 **Parameters** 1832 1833 + `dir`: the directory in which to save the table 1834 + `filename`: the name of the csv file to write to 1835 + `save_to_file`: whether to save the table to disk 1836 + `print_out`: whether to print out the table 1837 ''' 1838 1839 out = [] 1840 out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]] 1841 out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]] 1842 out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]] 1843 out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]] 1844 out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]] 1845 out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]] 1846 out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]] 1847 out += [['Model degrees of freedom', f"{self.Nf}"]] 1848 out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]] 1849 out += [['Standardization method', self.standardization_method]] 1850 1851 if save_to_file: 1852 if not os.path.exists(dir): 1853 os.makedirs(dir) 1854 if filename is None: 1855 filename = f'D{self._4x}_summary.csv' 1856 with open(f'{dir}/{filename}', 'w') as fid: 1857 fid.write(make_csv(out)) 1858 if print_out: 1859 self.msg('\n' + pretty_table(out, header = 0)) 1860 1861 1862 @make_verbal 1863 def table_of_sessions(self, 1864 dir = 'output', 1865 filename = None, 1866 save_to_file = True, 1867 print_out = True, 1868 output = None, 1869 ): 1870 ''' 1871 Print out and/or save to disk a table of sessions.
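For example (a minimal sketch, assuming `mydata` is a standardized `D47data` instance):

```py
mydata.table_of_sessions(save_to_file = False, output = 'pretty')
```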
1872 1873 **Parameters** 1874 1875 + `dir`: the directory in which to save the table 1876 + `filename`: the name of the csv file to write to 1877 + `save_to_file`: whether to save the table to disk 1878 + `print_out`: whether to print out the table 1879 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 1880 if set to `'raw'`: return a list of list of strings 1881 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 1882 ''' 1883 include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions]) 1884 include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions]) 1885 include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions]) 1886 1887 out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']] 1888 if include_a2: 1889 out[-1] += ['a2 ± SE'] 1890 if include_b2: 1891 out[-1] += ['b2 ± SE'] 1892 if include_c2: 1893 out[-1] += ['c2 ± SE'] 1894 for session in self.sessions: 1895 out += [[ 1896 session, 1897 f"{self.sessions[session]['Na']}", 1898 f"{self.sessions[session]['Nu']}", 1899 f"{self.sessions[session]['d13Cwg_VPDB']:.3f}", 1900 f"{self.sessions[session]['d18Owg_VSMOW']:.3f}", 1901 f"{self.sessions[session]['r_d13C_VPDB']:.4f}", 1902 f"{self.sessions[session]['r_d18O_VSMOW']:.4f}", 1903 f"{self.sessions[session][f'r_D{self._4x}']:.4f}", 1904 f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}", 1905 f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}", 1906 f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}", 1907 ]] 1908 if include_a2: 1909 if self.sessions[session]['scrambling_drift']: 1910 out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"] 1911 else: 1912 out[-1] += [''] 1913 if include_b2: 1914 if self.sessions[session]['slope_drift']: 1915 out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"] 1916 else: 1917 out[-1] += [''] 1918 if include_c2: 1919 if self.sessions[session]['wg_drift']: 1920 out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"] 1921 else: 1922 out[-1] += [''] 1923 1924 if save_to_file: 1925 if not os.path.exists(dir): 1926 os.makedirs(dir) 1927 if filename is None: 1928 filename = f'D{self._4x}_sessions.csv' 1929 with open(f'{dir}/{filename}', 'w') as fid: 1930 fid.write(make_csv(out)) 1931 if print_out: 1932 self.msg('\n' + pretty_table(out)) 1933 if output == 'raw': 1934 return out 1935 elif output == 'pretty': 1936 return pretty_table(out) 1937 1938 1939 @make_verbal 1940 def table_of_analyses( 1941 self, 1942 dir = 'output', 1943 filename = None, 1944 save_to_file = True, 1945 print_out = True, 1946 output = None, 1947 ): 1948 ''' 1949 Print out and/or save to disk a table of analyses.
1950 1951 **Parameters** 1952 1953 + `dir`: the directory in which to save the table 1954 + `filename`: the name of the csv file to write to 1955 + `save_to_file`: whether to save the table to disk 1956 + `print_out`: whether to print out the table 1957 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 1958 if set to `'raw'`: return a list of list of strings 1959 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 1960 ''' 1961 1962 out = [['UID','Session','Sample']] 1963 extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}] 1964 for f in extra_fields: 1965 out[-1] += [f[0]] 1966 out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}'] 1967 for r in self: 1968 out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]] 1969 for f in extra_fields: 1970 out[-1] += [f"{r[f[0]]:{f[1]}}"] 1971 out[-1] += [ 1972 f"{r['d13Cwg_VPDB']:.3f}", 1973 f"{r['d18Owg_VSMOW']:.3f}", 1974 f"{r['d45']:.6f}", 1975 f"{r['d46']:.6f}", 1976 f"{r['d47']:.6f}", 1977 f"{r['d48']:.6f}", 1978 f"{r['d49']:.6f}", 1979 f"{r['d13C_VPDB']:.6f}", 1980 f"{r['d18O_VSMOW']:.6f}", 1981 f"{r['D47raw']:.6f}", 1982 f"{r['D48raw']:.6f}", 1983 f"{r['D49raw']:.6f}", 1984 f"{r[f'D{self._4x}']:.6f}" 1985 ] 1986 if save_to_file: 1987 if not os.path.exists(dir): 1988 os.makedirs(dir) 1989 if filename is None: 1990 filename = f'D{self._4x}_analyses.csv' 1991 with open(f'{dir}/{filename}', 'w') as fid: 1992 fid.write(make_csv(out)) 1993 if print_out: 1994 self.msg('\n' + pretty_table(out)) 1995 return out 1996 1997 @make_verbal 1998 def covar_table( 1999 self, 2000 correl = False, 2001 dir = 'output', 2002 filename = None, 2003 save_to_file = True, 2004 print_out = True, 2005 output = None, 2006 ): 2007 ''' 2008 Print out, save to disk and/or return the variance-covariance matrix of D4x 2009 for all unknown samples. 2010 2011 **Parameters** 2012 2013 + `dir`: the directory in which to save the csv 2014 + `filename`: the name of the csv file to write to 2015 + `save_to_file`: whether to save the csv 2016 + `print_out`: whether to print out the matrix 2017 + `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`); 2018 if set to `'raw'`: return a list of list of strings 2019 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 2020 ''' 2021 samples = sorted([u for u in self.unknowns]) 2022 out = [[''] + samples] 2023 for s1 in samples: 2024 out.append([s1]) 2025 for s2 in samples: 2026 if correl: 2027 out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}') 2028 else: 2029 out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}') 2030 2031 if save_to_file: 2032 if not os.path.exists(dir): 2033 os.makedirs(dir) 2034 if filename is None: 2035 if correl: 2036 filename = f'D{self._4x}_correl.csv' 2037 else: 2038 filename = f'D{self._4x}_covar.csv' 2039 with open(f'{dir}/{filename}', 'w') as fid: 2040 fid.write(make_csv(out)) 2041 if print_out: 2042 self.msg('\n'+pretty_table(out)) 2043 if output == 'raw': 2044 return out 2045 elif output == 'pretty': 2046 return pretty_table(out) 2047 2048 @make_verbal 2049 def table_of_samples( 2050 self, 2051 dir = 'output', 2052 filename = None, 2053 save_to_file = True, 2054 print_out = True, 2055 output = None, 2056 ): 2057 ''' 2058 Print out, save to disk and/or return a table of samples.
2059 2060 **Parameters** 2061 2062 + `dir`: the directory in which to save the csv 2063 + `filename`: the name of the csv file to write to 2064 + `save_to_file`: whether to save the csv 2065 + `print_out`: whether to print out the table 2066 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 2067 if set to `'raw'`: return a list of list of strings 2068 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 2069 ''' 2070 2071 out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']] 2072 for sample in self.anchors: 2073 out += [[ 2074 f"{sample}", 2075 f"{self.samples[sample]['N']}", 2076 f"{self.samples[sample]['d13C_VPDB']:.2f}", 2077 f"{self.samples[sample]['d18O_VSMOW']:.2f}", 2078 f"{self.samples[sample][f'D{self._4x}']:.4f}",'','', 2079 f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', '' 2080 ]] 2081 for sample in self.unknowns: 2082 out += [[ 2083 f"{sample}", 2084 f"{self.samples[sample]['N']}", 2085 f"{self.samples[sample]['d13C_VPDB']:.2f}", 2086 f"{self.samples[sample]['d18O_VSMOW']:.2f}", 2087 f"{self.samples[sample][f'D{self._4x}']:.4f}", 2088 f"{self.samples[sample][f'SE_D{self._4x}']:.4f}", 2089 f"± {self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}", 2090 f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', 2091 f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else '' 2092 ]] 2093 if save_to_file: 2094 if not os.path.exists(dir): 2095 os.makedirs(dir) 2096 if filename is None: 2097 filename = f'D{self._4x}_samples.csv' 2098 with open(f'{dir}/{filename}', 'w') as fid: 2099 fid.write(make_csv(out)) 2100 if print_out: 2101 self.msg('\n'+pretty_table(out)) 2102 if output == 'raw': 2103 return out 2104 elif output == 'pretty': 2105 return pretty_table(out) 2106 2107 2108 def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100): 2109 ''' 2110 Generate session plots and save them to disk. 2111 2112 **Parameters** 2113 2114 + `dir`: the directory in which to save the plots 2115 + `figsize`: the width and height (in inches) of each plot 2116 + `filetype`: 'pdf' or 'png' 2117 + `dpi`: resolution for PNG output 2118 ''' 2119 if not os.path.exists(dir): 2120 os.makedirs(dir) 2121 2122 for session in self.sessions: 2123 sp = self.plot_single_session(session, xylimits = 'constant') 2124 ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {})) 2125 ppl.close(sp.fig) 2126 2127 2128 2129 @make_verbal 2130 def consolidate_samples(self): 2131 ''' 2132 Compile various statistics for each sample. 
2133 2134 For each anchor sample: 2135 2136 + `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x` 2137 + `SE_D47` or `SE_D48`: set to zero by definition 2138 2139 For each unknown sample: 2140 2141 + `D47` or `D48`: the standardized Δ4x value for this unknown 2142 + `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown 2143 2144 For each anchor and unknown: 2145 2146 + `N`: the total number of analyses of this sample 2147 + `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample 2148 + `d13C_VPDB`: the average δ13C_VPDB value for this sample 2149 + `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2) 2150 + `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal 2151 variance, indicating whether the Δ4x repeatability of this sample differs significantly from 2152 that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`. 2153 ''' 2154 D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']] 2155 for sample in self.samples: 2156 self.samples[sample]['N'] = len(self.samples[sample]['data']) 2157 if self.samples[sample]['N'] > 1: 2158 self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']]) 2159 2160 self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']]) 2161 self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']]) 2162 2163 D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']] 2164 if len(D4x_pop) > 2: 2165 self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1] 2166 2167 if self.standardization_method == 'pooled': 2168 for sample in self.anchors: 2169 self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample] 2170 self.samples[sample][f'SE_D{self._4x}'] = 0. 2171 for sample in self.unknowns: 2172 self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}'] 2173 try: 2174 self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5 2175 except ValueError: 2176 # when `sample` is constrained by self.standardize(constraints = {...}), 2177 # it is no longer listed in self.standardization.var_names. 2178 # Temporary fix: define SE as zero for now 2179 self.samples[sample][f'SE_D{self._4x}'] = 0. 2180 2181 elif self.standardization_method == 'indep_sessions': 2182 for sample in self.anchors: 2183 self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample] 2184 self.samples[sample][f'SE_D{self._4x}'] = 0. 2185 for sample in self.unknowns: 2186 self.msg(f'Consolidating sample {sample}') 2187 self.unknowns[sample][f'session_D{self._4x}'] = {} 2188 session_avg = [] 2189 for session in self.sessions: 2190 sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample] 2191 if sdata: 2192 self.msg(f'{sample} found in session {session}') 2193 avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata]) 2194 avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata]) 2195 # !! TODO: sigma_s below does not account for temporal changes in standardization error 2196 sigma_s = self.standardization_error(session, avg_d4x, avg_D4x) 2197 sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5 2198 session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5]) 2199 self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1] 2200 self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg)) 2201 weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']} 2202 wsum = sum([weights[s] for s in weights]) 2203 for s in weights: 2204 self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum] 2205 2206 for r in self: 2207 r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'] 2208 2209 2210 2211 def consolidate_sessions(self): 2212 ''' 2213 Compute various statistics for each session. 2214 2215 + `Na`: Number of anchor analyses in the session 2216 + `Nu`: Number of unknown analyses in the session 2217 + `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session 2218 + `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session 2219 + `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session 2220 + `a`: scrambling factor 2221 + `b`: compositional slope 2222 + `c`: WG offset 2223 + `SE_a`: Model standard error of `a` 2224 + `SE_b`: Model standard error of `b` 2225 + `SE_c`: Model standard error of `c` 2226 + `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`) 2227 + `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`) 2228 + `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`) 2229 + `a2`: scrambling factor drift 2230 + `b2`: compositional slope drift 2231 + `c2`: WG offset drift 2232 + `Np`: Number of standardization parameters to fit 2233 + `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`) 2234 + `d13Cwg_VPDB`: δ13C_VPDB of WG 2235 + `d18Owg_VSMOW`: δ18O_VSMOW of WG 2236 ''' 2237 for session in self.sessions: 2238 if 'd13Cwg_VPDB' not in self.sessions[session]: 2239 self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB'] 2240 if 'd18Owg_VSMOW' not in self.sessions[session]: 2241 self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW'] 2242 self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]) 2243 self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]) 2244 2245 self.msg(f'Computing repeatabilities for session {session}') 2246 self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session]) 2247 self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session]) 2248 self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session]) 2249 2250 if self.standardization_method == 'pooled': 2251 for session in self.sessions: 2252 2253 # different (better?)
computation of D4x repeatability for each session: 2254 sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']] 2255 self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5 2256 2257 self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}'] 2258 i = self.standardization.var_names.index(f'a_{pf(session)}') 2259 self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5 2260 2261 self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}'] 2262 i = self.standardization.var_names.index(f'b_{pf(session)}') 2263 self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5 2264 2265 self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}'] 2266 i = self.standardization.var_names.index(f'c_{pf(session)}') 2267 self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5 2268 2269 self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}'] 2270 if self.sessions[session]['scrambling_drift']: 2271 i = self.standardization.var_names.index(f'a2_{pf(session)}') 2272 self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5 2273 else: 2274 self.sessions[session]['SE_a2'] = 0. 2275 2276 self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}'] 2277 if self.sessions[session]['slope_drift']: 2278 i = self.standardization.var_names.index(f'b2_{pf(session)}') 2279 self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5 2280 else: 2281 self.sessions[session]['SE_b2'] = 0. 2282 2283 self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}'] 2284 if self.sessions[session]['wg_drift']: 2285 i = self.standardization.var_names.index(f'c2_{pf(session)}') 2286 self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5 2287 else: 2288 self.sessions[session]['SE_c2'] = 0. 
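# Rebuild each session's full 6x6 covariance matrix for (a, b, c, a2, b2, c2)
# from the pooled-fit covariances below; var_names.index() raises ValueError for
# drift parameters that were constrained to zero, in which case the corresponding
# rows and columns are simply left as zeros.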
2289 2290 i = self.standardization.var_names.index(f'a_{pf(session)}') 2291 j = self.standardization.var_names.index(f'b_{pf(session)}') 2292 k = self.standardization.var_names.index(f'c_{pf(session)}') 2293 CM = np.zeros((6,6)) 2294 CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]] 2295 try: 2296 i2 = self.standardization.var_names.index(f'a2_{pf(session)}') 2297 CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]] 2298 CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2] 2299 try: 2300 j2 = self.standardization.var_names.index(f'b2_{pf(session)}') 2301 CM[3,4] = self.standardization.covar[i2,j2] 2302 CM[4,3] = self.standardization.covar[j2,i2] 2303 except ValueError: 2304 pass 2305 try: 2306 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2307 CM[3,5] = self.standardization.covar[i2,k2] 2308 CM[5,3] = self.standardization.covar[k2,i2] 2309 except ValueError: 2310 pass 2311 except ValueError: 2312 pass 2313 try: 2314 j2 = self.standardization.var_names.index(f'b2_{pf(session)}') 2315 CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]] 2316 CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2] 2317 try: 2318 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2319 CM[4,5] = self.standardization.covar[j2,k2] 2320 CM[5,4] = self.standardization.covar[k2,j2] 2321 except ValueError: 2322 pass 2323 except ValueError: 2324 pass 2325 try: 2326 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2327 CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]] 2328 CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2] 2329 except ValueError: 2330 pass 2331 2332 self.sessions[session]['CM'] = CM 2333 2334 elif self.standardization_method == 'indep_sessions': 2335 pass # Not implemented yet 2336 2337 2338 @make_verbal 2339 def repeatabilities(self): 2340 ''' 2341 Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x 2342 (for all samples, for anchors, and for unknowns). 2343 ''' 2344 self.msg('Computing reproducibilities for all sessions') 2345 2346 self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors') 2347 self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors') 2348 self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors') 2349 self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns') 2350 self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples') 2351 2352 2353 @make_verbal 2354 def consolidate(self, tables = True, plots = True): 2355 ''' 2356 Collect information about samples, sessions and repeatabilities. 2357 ''' 2358 self.consolidate_samples() 2359 self.consolidate_sessions() 2360 self.repeatabilities() 2361 2362 if tables: 2363 self.summary() 2364 self.table_of_sessions() 2365 self.table_of_analyses() 2366 self.table_of_samples() 2367 2368 if plots: 2369 self.plot_sessions() 2370 2371 2372 @make_verbal 2373 def rmswd(self, 2374 samples = 'all samples', 2375 sessions = 'all sessions', 2376 ): 2377 ''' 2378 Compute the χ2, root mean squared weighted deviation 2379 (i.e. reduced χ2), and corresponding degrees of freedom of the 2380 Δ4x values for samples in `samples` and sessions in `sessions`. 2381 2382 Only used in `D4xdata.standardize()` with `method='indep_sessions'`. 
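Explicitly: for each sample with more than one analysis, deviations from the weighted sample average X contribute χ2 = Σ ((Δ4x − X)/wΔ4x)², Nf is the number of such analyses minus the number of such samples, and RMSWD = (χ2/Nf)^0.5.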
2383 ''' 2384 if samples == 'all samples': 2385 mysamples = [k for k in self.samples] 2386 elif samples == 'anchors': 2387 mysamples = [k for k in self.anchors] 2388 elif samples == 'unknowns': 2389 mysamples = [k for k in self.unknowns] 2390 else: 2391 mysamples = samples 2392 2393 if sessions == 'all sessions': 2394 sessions = [k for k in self.sessions] 2395 2396 chisq, Nf = 0, 0 2397 for sample in mysamples : 2398 G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2399 if len(G) > 1 : 2400 X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G]) 2401 Nf += (len(G) - 1) 2402 chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G]) 2403 r = (chisq / Nf)**.5 if Nf > 0 else 0 2404 self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.') 2405 return {'rmswd': r, 'chisq': chisq, 'Nf': Nf} 2406 2407 2408 @make_verbal 2409 def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'): 2410 ''' 2411 Compute the repeatability of `[r[key] for r in self]` 2412 ''' 2413 2414 if samples == 'all samples': 2415 mysamples = [k for k in self.samples] 2416 elif samples == 'anchors': 2417 mysamples = [k for k in self.anchors] 2418 elif samples == 'unknowns': 2419 mysamples = [k for k in self.unknowns] 2420 else: 2421 mysamples = samples 2422 2423 if sessions == 'all sessions': 2424 sessions = [k for k in self.sessions] 2425 2426 if key in ['D47', 'D48']: 2427 # Full disclosure: the definition of Nf is tricky/debatable 2428 G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions] 2429 chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum() 2430 Nf = len(G) 2431# print(f'len(G) = {Nf}') 2432 Nf -= len([s for s in mysamples if s in self.unknowns]) 2433# print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider') 2434 for session in sessions: 2435 Np = len([ 2436 _ for _ in self.standardization.params 2437 if ( 2438 self.standardization.params[_].expr is not None 2439 and ( 2440 (_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session)) 2441 or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session)) 2442 ) 2443 ) 2444 ]) 2445# print(f'session {session}: {Np} parameters to consider') 2446 Na = len({ 2447 r['Sample'] for r in self.sessions[session]['data'] 2448 if r['Sample'] in self.anchors and r['Sample'] in mysamples 2449 }) 2450# print(f'session {session}: {Na} different anchors in that session') 2451 Nf -= min(Np, Na) 2452# print(f'Nf = {Nf}') 2453 2454# for sample in mysamples : 2455# X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2456# if len(X) > 1 : 2457# chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ]) 2458# if sample in self.unknowns: 2459# Nf += len(X) - 1 2460# else: 2461# Nf += len(X) 2462# if samples in ['anchors', 'all samples']: 2463# Nf -= sum([self.sessions[s]['Np'] for s in sessions]) 2464 r = (chisq / Nf)**.5 if Nf > 0 else 0 2465 2466 else: # if key not in ['D47', 'D48'] 2467 chisq, Nf = 0, 0 2468 for sample in mysamples : 2469 X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2470 if len(X) > 1 : 2471 Nf += len(X) - 1 2472 chisq += np.sum([ (x-np.mean(X))**2 for x in X ]) 2473 r = (chisq / Nf)**.5 if Nf > 0 else 0 2474 2475 self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.') 2476 return r 2477 2478 def sample_average(self, samples, weights = 'equal', normalize = True): 2479 ''' 2480 Weighted average Δ4x value of a group of samples, 
accounting for covariance.

        Returns the weighted average Δ4x value and associated SE
        of a group of samples. Weights are equal by default. If `normalize` is
        true, `weights` will be rescaled so that their sum equals 1.

        **Examples**

        ```python
        self.sample_average(['X','Y'], [1, 2])
        ```

        returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
        where Δ4x(X) and Δ4x(Y) are the average Δ4x
        values of samples X and Y, respectively.

        ```python
        self.sample_average(['X','Y'], [1, -1], normalize = False)
        ```

        returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
        '''
        if weights == 'equal':
            weights = [1/len(samples)] * len(samples)

        if normalize:
            s = sum(weights)
            if s:
                weights = [w/s for w in weights]

        try:
#           indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
#           C = self.standardization.covar[indices,:][:,indices]
            C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
            X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
            return correlated_sum(X, C, weights)
        except ValueError:
            return (0., 0.)


    def sample_D4x_covar(self, sample1, sample2 = None):
        '''
        Covariance between Δ4x values of samples

        Returns the error covariance between the average Δ4x values of two
        samples. If only `sample1` is specified, or if `sample1 == sample2`,
        returns the Δ4x variance for that sample.
        '''
        if sample2 is None:
            sample2 = sample1
        if self.standardization_method == 'pooled':
            i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
            j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
            return self.standardization.covar[i, j]
        elif self.standardization_method == 'indep_sessions':
            if sample1 == sample2:
                return self.samples[sample1][f'SE_D{self._4x}']**2
            else:
                c = 0
                for session in self.sessions:
                    sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
                    sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
                    if sdata1 and sdata2:
                        a = self.sessions[session]['a']
                        # !! TODO: CM below does not account for temporal changes in standardization parameters
                        CM = self.sessions[session]['CM'][:3,:3]
                        avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
                        avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
                        avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
                        avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
                        c += (
                            self.unknowns[sample1][f'session_D{self._4x}'][session][2]
                            * self.unknowns[sample2][f'session_D{self._4x}'][session][2]
                            * np.array([[avg_D4x_1, avg_d4x_1, 1]])
                            @ CM
                            @ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
                            ) / a**2
                return float(c)

    def sample_D4x_correl(self, sample1, sample2 = None):
        '''
        Correlation between Δ4x errors of samples

        Returns the error correlation between the average Δ4x values of two samples.
        '''
        if sample2 is None or sample2 == sample1:
            return 1.
2567 return ( 2568 self.sample_D4x_covar(sample1, sample2) 2569 / self.unknowns[sample1][f'SE_D{self._4x}'] 2570 / self.unknowns[sample2][f'SE_D{self._4x}'] 2571 ) 2572 2573 def plot_single_session(self, 2574 session, 2575 kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4), 2576 kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4), 2577 kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75), 2578 kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75), 2579 kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75), 2580 xylimits = 'free', # | 'constant' 2581 x_label = None, 2582 y_label = None, 2583 error_contour_interval = 'auto', 2584 fig = 'new', 2585 ): 2586 ''' 2587 Generate plot for a single session 2588 ''' 2589 if x_label is None: 2590 x_label = f'δ$_{{{self._4x}}}$ (‰)' 2591 if y_label is None: 2592 y_label = f'Δ$_{{{self._4x}}}$ (‰)' 2593 2594 out = _SessionPlot() 2595 anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]] 2596 unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]] 2597 anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors] 2598 anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors] 2599 unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns] 2600 unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns] 2601 anchor_avg = (np.array([ np.array([ 2602 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2603 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2604 ]) for sample in anchors]).T, 2605 np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T) 2606 unknown_avg = (np.array([ np.array([ 2607 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2608 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2609 ]) for sample in unknowns]).T, 2610 np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T) 2611 2612 2613 if fig == 'new': 2614 out.fig = ppl.figure(figsize = (6,6)) 2615 ppl.subplots_adjust(.1,.1,.9,.9) 2616 2617 out.anchor_analyses, = ppl.plot( 2618 anchors_d, 2619 anchors_D, 2620 **kw_plot_anchors) 2621 out.unknown_analyses, = ppl.plot( 2622 unknowns_d, 2623 unknowns_D, 2624 **kw_plot_unknowns) 2625 out.anchor_avg = ppl.plot( 2626 *anchor_avg, 2627 **kw_plot_anchor_avg) 2628 out.unknown_avg = ppl.plot( 2629 *unknown_avg, 2630 **kw_plot_unknown_avg) 2631 if xylimits == 'constant': 2632 x = [r[f'd{self._4x}'] for r in self] 2633 y = [r[f'D{self._4x}'] for r in self] 2634 x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y) 2635 w, h = x2-x1, y2-y1 2636 x1 -= w/20 2637 x2 += w/20 2638 y1 -= h/20 2639 y2 += h/20 2640 ppl.axis([x1, x2, y1, y2]) 2641 elif xylimits == 'free': 2642 x1, x2, y1, y2 = ppl.axis() 2643 else: 2644 x1, x2, y1, y2 = ppl.axis(xylimits) 2645 2646 if error_contour_interval != 'none': 2647 xi, yi = np.linspace(x1, x2), np.linspace(y1, y2) 2648 XI,YI = np.meshgrid(xi, yi) 2649 SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi]) 2650 if 
error_contour_interval == 'auto':
                rng = np.max(SI) - np.min(SI)
                if rng <= 0.01:
                    cinterval = 0.001
                elif rng <= 0.03:
                    cinterval = 0.004
                elif rng <= 0.1:
                    cinterval = 0.01
                elif rng <= 0.3:
                    cinterval = 0.03
                elif rng <= 1.:
                    cinterval = 0.1
                else:
                    cinterval = 0.5
            else:
                cinterval = error_contour_interval

            cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
            out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
            out.clabel = ppl.clabel(out.contour)
            contour = (XI, YI, SI, cval, cinterval)

        if fig == None:
            return {
                'anchors':anchors,
                'unknowns':unknowns,
                'anchors_d':anchors_d,
                'anchors_D':anchors_D,
                'unknowns_d':unknowns_d,
                'unknowns_D':unknowns_D,
                'anchor_avg':anchor_avg,
                'unknown_avg':unknown_avg,
                'contour':contour,
                }

        ppl.xlabel(x_label)
        ppl.ylabel(y_label)
        ppl.title(session, weight = 'bold')
        ppl.grid(alpha = .2)
        out.ax = ppl.gca()

        return out

    def plot_residuals(
        self,
        kde = False,
        hist = False,
        binwidth = 2/3,
        dir = 'output',
        filename = None,
        highlight = [],
        colors = None,
        figsize = None,
        dpi = 100,
        yspan = None,
        ):
        '''
        Plot residuals of each analysis as a function of time (actually, as a function of
        the order of analyses in the `D4xdata` object)

        + `kde`: whether to add a kernel density estimate of residuals
        + `hist`: whether to add a histogram of residuals (incompatible with `kde`)
        + `binwidth`: width of the histogram bins, in units of Δ4x repeatability (by default: 2/3)
        + `dir`: the directory in which to save the plot
        + `highlight`: a list of samples to highlight
        + `colors`: a dict of `{<sample>: <color>}` for all samples
        + `figsize`: (width, height) of figure
        + `dpi`: resolution for PNG output
        + `yspan`: factor controlling the range of y values shown in plot
          (by default: `yspan = 1.5 if kde else 1.0`)
        '''

        from matplotlib import ticker

        if yspan is None:
            if kde:
                yspan = 1.5
            else:
                yspan = 1.0

        # Layout
        fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
        if hist or kde:
            ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
            ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
        else:
            ppl.subplots_adjust(.08,.05,.78,.8)
            ax1 = ppl.subplot(111)

        # Colors
        N = len(self.anchors)
        if colors is None:
            if len(highlight) > 0:
                Nh = len(highlight)
                if Nh == 1:
                    colors = {highlight[0]: (0,0,0)}
                elif Nh == 3:
                    colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
                elif Nh == 4:
                    colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
                else:
                    colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
            else:
                if N == 3:
                    colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
                elif N == 4:
                    colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
                else:
                    colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}

        ppl.sca(ax1)

        ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)

        ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))

        session =
self[0]['Session'] 2767 x1 = 0 2768# ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self]) 2769 x_sessions = {} 2770 one_or_more_singlets = False 2771 one_or_more_multiplets = False 2772 multiplets = set() 2773 for k,r in enumerate(self): 2774 if r['Session'] != session: 2775 x2 = k-1 2776 x_sessions[session] = (x1+x2)/2 2777 ppl.axvline(k - 0.5, color = 'k', lw = .5) 2778 session = r['Session'] 2779 x1 = k 2780 singlet = len(self.samples[r['Sample']]['data']) == 1 2781 if not singlet: 2782 multiplets.add(r['Sample']) 2783 if r['Sample'] in self.unknowns: 2784 if singlet: 2785 one_or_more_singlets = True 2786 else: 2787 one_or_more_multiplets = True 2788 kw = dict( 2789 marker = 'x' if singlet else '+', 2790 ms = 4 if singlet else 5, 2791 ls = 'None', 2792 mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0), 2793 mew = 1, 2794 alpha = 0.2 if singlet else 1, 2795 ) 2796 if highlight and r['Sample'] not in highlight: 2797 kw['alpha'] = 0.2 2798 ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw) 2799 x2 = k 2800 x_sessions[session] = (x1+x2)/2 2801 2802 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1) 2803 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1) 2804 if not (hist or kde): 2805 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center') 2806 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f" 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center') 2807 2808 xmin, xmax, ymin, ymax = ppl.axis() 2809 if yspan != 1: 2810 ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2 2811 for s in x_sessions: 2812 ppl.text( 2813 x_sessions[s], 2814 ymax +1, 2815 s, 2816 va = 'bottom', 2817 **( 2818 dict(ha = 'center') 2819 if len(self.sessions[s]['data']) > (0.15 * len(self)) 2820 else dict(ha = 'left', rotation = 45) 2821 ) 2822 ) 2823 2824 if hist or kde: 2825 ppl.sca(ax2) 2826 2827 for s in colors: 2828 kw['marker'] = '+' 2829 kw['ms'] = 5 2830 kw['mec'] = colors[s] 2831 kw['label'] = s 2832 kw['alpha'] = 1 2833 ppl.plot([], [], **kw) 2834 2835 kw['mec'] = (0,0,0) 2836 2837 if one_or_more_singlets: 2838 kw['marker'] = 'x' 2839 kw['ms'] = 4 2840 kw['alpha'] = .2 2841 kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other' 2842 ppl.plot([], [], **kw) 2843 2844 if one_or_more_multiplets: 2845 kw['marker'] = '+' 2846 kw['ms'] = 4 2847 kw['alpha'] = 1 2848 kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other' 2849 ppl.plot([], [], **kw) 2850 2851 if hist or kde: 2852 leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9) 2853 else: 2854 leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5) 2855 leg.set_zorder(-1000) 2856 2857 ppl.sca(ax1) 2858 2859 ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)') 2860 ppl.xticks([]) 2861 ppl.axis([-1, len(self), None, None]) 2862 2863 if hist or kde: 2864 ppl.sca(ax2) 2865 X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors]) 2866 2867 if kde: 2868 from scipy.stats import 
gaussian_kde 2869 yi = np.linspace(ymin, ymax, 201) 2870 xi = gaussian_kde(X).evaluate(yi) 2871 ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1)) 2872# ppl.plot(xi, yi, 'k-', lw = 1) 2873 elif hist: 2874 ppl.hist( 2875 X, 2876 orientation = 'horizontal', 2877 histtype = 'stepfilled', 2878 ec = [.4]*3, 2879 fc = [.25]*3, 2880 alpha = .25, 2881 bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)), 2882 ) 2883 ppl.text(0, 0, 2884 f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", 2885 size = 7.5, 2886 alpha = 1, 2887 va = 'center', 2888 ha = 'left', 2889 ) 2890 2891 ppl.axis([0, None, ymin, ymax]) 2892 ppl.xticks([]) 2893 ppl.yticks([]) 2894# ax2.spines['left'].set_visible(False) 2895 ax2.spines['right'].set_visible(False) 2896 ax2.spines['top'].set_visible(False) 2897 ax2.spines['bottom'].set_visible(False) 2898 2899 ax1.axis([None, None, ymin, ymax]) 2900 2901 if not os.path.exists(dir): 2902 os.makedirs(dir) 2903 if filename is None: 2904 return fig 2905 elif filename == '': 2906 filename = f'D{self._4x}_residuals.pdf' 2907 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 2908 ppl.close(fig) 2909 2910 2911 def simulate(self, *args, **kwargs): 2912 ''' 2913 Legacy function with warning message pointing to `virtual_data()` 2914 ''' 2915 raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()') 2916 2917 def plot_distribution_of_analyses( 2918 self, 2919 dir = 'output', 2920 filename = None, 2921 vs_time = False, 2922 figsize = (6,4), 2923 subplots_adjust = (0.02, 0.13, 0.85, 0.8), 2924 output = None, 2925 dpi = 100, 2926 ): 2927 ''' 2928 Plot temporal distribution of all analyses in the data set. 2929 2930 **Parameters** 2931 2932 + `dir`: the directory in which to save the plot 2933 + `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially. 
+ `figsize`: (width, height) of figure
        + `dpi`: resolution for PNG output
        + `output`: if set to `'fig'` or `'ax'`, return the figure or the current axes instead of saving a plot file
        '''

        asamples = [s for s in self.anchors]
        usamples = [s for s in self.unknowns]
        if output is None or output == 'fig':
            fig = ppl.figure(figsize = figsize)
            ppl.subplots_adjust(*subplots_adjust)
        Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
        Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
        Xmax += (Xmax-Xmin)/40
        Xmin -= (Xmax-Xmin)/41
        for k, s in enumerate(asamples + usamples):
            if vs_time:
                X = [r['TimeTag'] for r in self if r['Sample'] == s]
            else:
                X = [x for x,r in enumerate(self) if r['Sample'] == s]
            Y = [-k for x in X]
            ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
            ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
            ppl.text(Xmax, -k, f' {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
        ppl.axis([Xmin, Xmax, -k-1, 1])
        ppl.xlabel('\ntime')
        ppl.gca().annotate('',
            xy = (0.6, -0.02),
            xycoords = 'axes fraction',
            xytext = (.4, -0.02),
            arrowprops = dict(arrowstyle = "->", color = 'k'),
            )


        x2 = -1
        for session in self.sessions:
            x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
            if vs_time:
                ppl.axvline(x1, color = 'k', lw = .75)
            if x2 > -1:
                if not vs_time:
                    ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
            x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
#           from xlrd import xldate_as_datetime
#           print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0))
            if vs_time:
                ppl.axvline(x2, color = 'k', lw = .75)
                ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15)
            ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)

        ppl.xticks([])
        ppl.yticks([])

        if output is None:
            if not os.path.exists(dir):
                os.makedirs(dir)
            if filename == None:
                filename = f'D{self._4x}_distribution_of_analyses.pdf'
            ppl.savefig(f'{dir}/{filename}', dpi = dpi)
            ppl.close(fig)
        elif output == 'ax':
            return ppl.gca()
        elif output == 'fig':
            return fig


    def plot_bulk_compositions(
        self,
        samples = None,
        dir = 'output/bulk_compositions',
        figsize = (6,6),
        subplots_adjust = (0.15, 0.12, 0.95, 0.92),
        show = False,
        sample_color = (0,.5,1),
        analysis_color = (.7,.7,.7),
        labeldist = 0.3,
        radius = 0.05,
        ):
        '''
        Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

        By default, creates a directory `./output/bulk_compositions` where plots for
        each sample are saved. Another plot named `__all__.pdf` shows all analyses together.


        **Parameters**

        + `samples`: Only these samples are processed (by default: all samples).
        + `dir`: where to save the plots
        + `figsize`: (width, height) of figure
        + `subplots_adjust`: passed to `subplots_adjust()`
        + `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
          allowing for interactive visualization/exploration in (δ13C, δ18O) space.
        + `sample_color`: color used for sample markers/labels
        + `analysis_color`: color used for replicate markers/labels
        + `labeldist`: distance (in inches) from replicate markers to replicate labels
        + `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
        '''

        from matplotlib.patches import Ellipse

        if samples is None:
            samples = [_ for _ in self.samples]

        saved = {}

        for s in samples:

            fig = ppl.figure(figsize = figsize)
            fig.subplots_adjust(*subplots_adjust)
            ax = ppl.subplot(111)
            ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
            ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
            ppl.title(s)


            XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
            UID = [_['UID'] for _ in self.samples[s]['data']]
            XY0 = XY.mean(0)

            for xy in XY:
                ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)

            ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
            ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
            ppl.text(*XY0, f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
            saved[s] = [XY, XY0]

            x1, x2, y1, y2 = ppl.axis()
            x0, dx = (x1+x2)/2, (x2-x1)/2
            y0, dy = (y1+y2)/2, (y2-y1)/2
            dx, dy = [max(max(dx, dy), radius)]*2

            ppl.axis([
                x0 - 1.2*dx,
                x0 + 1.2*dx,
                y0 - 1.2*dy,
                y0 + 1.2*dy,
                ])

            XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))

            for xy, uid in zip(XY, UID):

                xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
                vector_in_display_space = xy_in_display_space - XY0_in_display_space

                if (vector_in_display_space**2).sum() > 0:

                    unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
                    label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
                    label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
                    label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))

                    ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)

                else:

                    ppl.text(*xy, f'{uid} ', va = 'center', ha = 'right', color = analysis_color)

            if radius:
                ax.add_artist(Ellipse(
                    xy = XY0,
                    width = radius*2,
                    height = radius*2,
                    ls = (0, (2,2)),
                    lw = .7,
                    ec = analysis_color,
                    fc = 'None',
                    ))
                ppl.text(
                    XY0[0],
                    XY0[1]-radius,
                    f'\n± {radius*1e3:.0f} ppm',
                    color = analysis_color,
                    va = 'top',
                    ha = 'center',
                    linespacing = 0.4,
                    size = 8,
                    )

            if not os.path.exists(dir):
                os.makedirs(dir)
            fig.savefig(f'{dir}/{s}.pdf')
            ppl.close(fig)

        fig = ppl.figure(figsize = figsize)
        fig.subplots_adjust(*subplots_adjust)
        ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
        ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')

        for s in saved:
            for xy in saved[s][0]:
                ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
            ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
            ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
            ppl.text(*saved[s][1], f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')

        x1, x2, y1, y2 = ppl.axis()
        ppl.axis([
            x1 - (x2-x1)/10,
            x2 + (x2-x1)/10,
            y1 - (y2-y1)/10,
            y2 + (y2-y1)/10,
            ])


        if not os.path.exists(dir):
            os.makedirs(dir)
        fig.savefig(f'{dir}/__all__.pdf')
        if show:
            ppl.show()
        ppl.close(fig)


    def _save_D4x_correl(
        self,
        samples = None,
        dir = 'output',
        filename = None,
        D4x_precision = 4,
        correl_precision = 4,
        ):
        '''
        Save D4x values along with their SE and correlation matrix.

        **Parameters**

        + `samples`: Only these samples are output (by default: all samples).
        + `dir`: the directory in which to save the file (by default: `output`)
        + `filename`: the name of the csv file to write to (by default: `D4x_correl.csv`)
        + `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4)
        + `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
        '''
        if samples is None:
            samples = sorted([s for s in self.unknowns])

        out = [['Sample']] + [[s] for s in samples]
        out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl']
        for k,s in enumerate(samples):
            # use the precision arguments rather than hard-coded formats:
            out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.{D4x_precision}f}', f'{self.samples[s][f"SE_D{self._4x}"]:.{D4x_precision}f}']
            for s2 in samples:
                out[k+1] += [f'{self.sample_D4x_correl(s,s2):.{correl_precision}f}']

        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_correl.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))




class D47data(D4xdata):
    '''
    Store and process data for a large set of Δ47 analyses,
    usually comprising more than one analytical session.
    '''

    Nominal_D4x = {
        'ETH-1': 0.2052,
        'ETH-2': 0.2085,
        'ETH-3': 0.6132,
        'ETH-4': 0.4511,
        'IAEA-C1': 0.3018,
        'IAEA-C2': 0.6409,
        'MERCK': 0.5135,
        } # I-CDES (Bernasconi et al., 2021)
    '''
    Nominal Δ47 values assigned to the Δ47 anchor samples, used by
    `D47data.standardize()` to normalize unknown samples to an absolute Δ47
    reference frame.

    By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
    ```py
    {
        'ETH-1'   : 0.2052,
        'ETH-2'   : 0.2085,
        'ETH-3'   : 0.6132,
        'ETH-4'   : 0.4511,
        'IAEA-C1' : 0.3018,
        'IAEA-C2' : 0.6409,
        'MERCK'   : 0.5135,
    }
    ```
    '''


    @property
    def Nominal_D47(self):
        return self.Nominal_D4x


    @Nominal_D47.setter
    def Nominal_D47(self, new):
        self.Nominal_D4x = dict(**new)
        self.refresh()


    def __init__(self, l = [], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l = l, mass = '47', **kwargs)


    def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
        '''
        Find all samples for which `Teq` is specified, compute equilibrium Δ47
        value for that temperature, and treat these samples as additional anchors.

        **Parameters**

        + `fCo2eqD47`: Which CO2 equilibrium law to use
        (`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
        `wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
3250 + `priority`: if `replace`: forget old anchors and only use the new ones; 3251 if `new`: keep pre-existing anchors but update them in case of conflict 3252 between old and new Δ47 values; 3253 if `old`: keep pre-existing anchors but preserve their original Δ47 3254 values in case of conflict. 3255 ''' 3256 f = { 3257 'petersen': fCO2eqD47_Petersen, 3258 'wang': fCO2eqD47_Wang, 3259 }[fCo2eqD47] 3260 foo = {} 3261 for r in self: 3262 if 'Teq' in r: 3263 if r['Sample'] in foo: 3264 assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.' 3265 else: 3266 foo[r['Sample']] = f(r['Teq']) 3267 else: 3268 assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.' 3269 3270 if priority == 'replace': 3271 self.Nominal_D47 = {} 3272 for s in foo: 3273 if priority != 'old' or s not in self.Nominal_D47: 3274 self.Nominal_D47[s] = foo[s] 3275 3276 def save_D47_correl(self, *args, **kwargs): 3277 return self._save_D4x_correl(*args, **kwargs) 3278 3279 save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47') 3280 3281 3282class D48data(D4xdata): 3283 ''' 3284 Store and process data for a large set of Δ48 analyses, 3285 usually comprising more than one analytical session. 3286 ''' 3287 3288 Nominal_D4x = { 3289 'ETH-1': 0.138, 3290 'ETH-2': 0.138, 3291 'ETH-3': 0.270, 3292 'ETH-4': 0.223, 3293 'GU-1': -0.419, 3294 } # (Fiebig et al., 2019, 2021) 3295 ''' 3296 Nominal Δ48 values assigned to the Δ48 anchor samples, used by 3297 `D48data.standardize()` to normalize unknown samples to an absolute Δ48 3298 reference frame. 3299 3300 By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019), 3301 [Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)): 3302 3303 ```py 3304 { 3305 'ETH-1' : 0.138, 3306 'ETH-2' : 0.138, 3307 'ETH-3' : 0.270, 3308 'ETH-4' : 0.223, 3309 'GU-1' : -0.419, 3310 } 3311 ``` 3312 ''' 3313 3314 3315 @property 3316 def Nominal_D48(self): 3317 return self.Nominal_D4x 3318 3319 3320 @Nominal_D48.setter 3321 def Nominal_D48(self, new): 3322 self.Nominal_D4x = dict(**new) 3323 self.refresh() 3324 3325 3326 def __init__(self, l = [], **kwargs): 3327 ''' 3328 **Parameters:** same as `D4xdata.__init__()` 3329 ''' 3330 D4xdata.__init__(self, l = l, mass = '48', **kwargs) 3331 3332 def save_D48_correl(self, *args, **kwargs): 3333 return self._save_D4x_correl(*args, **kwargs) 3334 3335 save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48') 3336 3337 3338class D49data(D4xdata): 3339 ''' 3340 Store and process data for a large set of Δ49 analyses, 3341 usually comprising more than one analytical session. 3342 ''' 3343 3344 Nominal_D4x = {"1000C": 0.0, "25C": 2.228} # Wang 2004 3345 ''' 3346 Nominal Δ49 values assigned to the Δ49 anchor samples, used by 3347 `D49data.standardize()` to normalize unknown samples to an absolute Δ49 3348 reference frame. 3349 3350 By default equal to (after [Wang et al. 
(2004)](https://doi.org/10.1016/j.gca.2004.05.039)):

    ```py
    {
        "1000C": 0.0,
        "25C": 2.228
    }
    ```
    '''

    @property
    def Nominal_D49(self):
        return self.Nominal_D4x

    @Nominal_D49.setter
    def Nominal_D49(self, new):
        self.Nominal_D4x = dict(**new)
        self.refresh()

    def __init__(self, l=[], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l=l, mass='49', **kwargs)

    def save_D49_correl(self, *args, **kwargs):
        return self._save_D4x_correl(*args, **kwargs)

    save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49')

class _SessionPlot():
    '''
    Simple placeholder class
    '''
    def __init__(self):
        pass

_app = typer.Typer(
    add_completion = False,
    context_settings={'help_option_names': ['-h', '--help']},
    rich_markup_mode = 'rich',
    )

@_app.command()
def _cli(
    rawdata: Annotated[str, typer.Argument(help = "Specify the path of a rawdata input file")],
    exclude: Annotated[str, typer.Option('--exclude', '-e', help = 'The path of a file specifying UIDs and/or Samples to exclude')] = 'none',
    anchors: Annotated[str, typer.Option('--anchors', '-a', help = 'The path of a file specifying custom anchors')] = 'none',
    output_dir: Annotated[str, typer.Option('--output-dir', '-o', help = 'Specify the output directory')] = 'output',
    run_D48: Annotated[bool, typer.Option('--D48', help = 'Also standardize D48')] = False,
    ):
    """
    Process raw D47 data and return standardized results.

    See [b]https://mdaeron.github.io/D47crunch/#3-command-line-interface-cli[/b] for more details.

    Reads raw data from an input file, optionally excluding some samples and/or analyses, then standardizes
    the data based either on the default [b]d13C_VPDB[/b], [b]d18O_VPDB[/b], [b]D47[/b], and [b]D48[/b] anchors or on different
    user-specified anchors. A new directory (named `output` by default) is created to store the results and
    the following sequence is applied:

    * [b]D47data.wg()[/b]
    * [b]D47data.crunch()[/b]
    * [b]D47data.standardize()[/b]
    * [b]D47data.summary()[/b]
    * [b]D47data.table_of_samples()[/b]
    * [b]D47data.table_of_sessions()[/b]
    * [b]D47data.plot_sessions()[/b]
    * [b]D47data.plot_residuals()[/b]
    * [b]D47data.table_of_analyses()[/b]
    * [b]D47data.plot_distribution_of_analyses()[/b]
    * [b]D47data.plot_bulk_compositions()[/b]
    * [b]D47data.save_D47_correl()[/b]

    Optionally, also apply similar methods for [b]D48[/b].

    [b]Example CSV file for --anchors option:[/b]
    [i]
    Sample, d13C_VPDB, d18O_VPDB, D47, D48
    ETH-1, 2.02, -2.19, 0.2052, 0.138
    ETH-2, -10.17, -18.69, 0.2085, 0.138
    ETH-3, 1.71, -1.78, 0.6132, 0.270
    ETH-4, , , 0.4511, 0.223
    [/i]
    Except for [i]Sample[/i], none of the columns above are mandatory.

    [b]Example CSV file for --exclude option:[/b]
    [i]
    Sample, UID
    FOO-1,
    BAR-2,
    , A04
    , A17
    , A88
    [/i]
    This will exclude all analyses of samples [i]FOO-1[/i] and [i]BAR-2[/i],
    and the analyses with UIDs [i]A04[/i], [i]A17[/i], and [i]A88[/i].
    Neither column is mandatory.
3448 """ 3449 3450 data = D47data() 3451 data.read(rawdata) 3452 3453 if exclude != 'none': 3454 exclude = read_csv(exclude) 3455 exclude_uid = {r['UID'] for r in exclude if 'UID' in r} 3456 exclude_sample = {r['Sample'] for r in exclude if 'Sample' in r} 3457 else: 3458 exclude_uid = [] 3459 exclude_sample = [] 3460 3461 data = D47data([r for r in data if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample]) 3462 3463 if anchors != 'none': 3464 anchors = read_csv(anchors) 3465 if len([_ for _ in anchors if 'd13C_VPDB' in _]): 3466 data.Nominal_d13C_VPDB = { 3467 _['Sample']: _['d13C_VPDB'] 3468 for _ in anchors 3469 if 'd13C_VPDB' in _ 3470 } 3471 if len([_ for _ in anchors if 'd18O_VPDB' in _]): 3472 data.Nominal_d18O_VPDB = { 3473 _['Sample']: _['d18O_VPDB'] 3474 for _ in anchors 3475 if 'd18O_VPDB' in _ 3476 } 3477 if len([_ for _ in anchors if 'D47' in _]): 3478 data.Nominal_D4x = { 3479 _['Sample']: _['D47'] 3480 for _ in anchors 3481 if 'D47' in _ 3482 } 3483 3484 data.refresh() 3485 data.wg() 3486 data.crunch() 3487 data.standardize() 3488 data.summary(dir = output_dir) 3489 data.plot_residuals(dir = output_dir, filename = 'D47_residuals.pdf', kde = True) 3490 data.plot_bulk_compositions(dir = output_dir + '/bulk_compositions') 3491 data.plot_sessions(dir = output_dir) 3492 data.save_D47_correl(dir = output_dir) 3493 3494 if not run_D48: 3495 data.table_of_samples(dir = output_dir) 3496 data.table_of_analyses(dir = output_dir) 3497 data.table_of_sessions(dir = output_dir) 3498 3499 3500 if run_D48: 3501 data2 = D48data() 3502 print(rawdata) 3503 data2.read(rawdata) 3504 3505 data2 = D48data([r for r in data2 if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample]) 3506 3507 if anchors != 'none': 3508 if len([_ for _ in anchors if 'd13C_VPDB' in _]): 3509 data2.Nominal_d13C_VPDB = { 3510 _['Sample']: _['d13C_VPDB'] 3511 for _ in anchors 3512 if 'd13C_VPDB' in _ 3513 } 3514 if len([_ for _ in anchors if 'd18O_VPDB' in _]): 3515 data2.Nominal_d18O_VPDB = { 3516 _['Sample']: _['d18O_VPDB'] 3517 for _ in anchors 3518 if 'd18O_VPDB' in _ 3519 } 3520 if len([_ for _ in anchors if 'D48' in _]): 3521 data2.Nominal_D4x = { 3522 _['Sample']: _['D48'] 3523 for _ in anchors 3524 if 'D48' in _ 3525 } 3526 3527 data2.refresh() 3528 data2.wg() 3529 data2.crunch() 3530 data2.standardize() 3531 data2.summary(dir = output_dir) 3532 data2.plot_sessions(dir = output_dir) 3533 data2.plot_residuals(dir = output_dir, filename = 'D48_residuals.pdf', kde = True) 3534 data2.plot_distribution_of_analyses(dir = output_dir) 3535 data2.save_D48_correl(dir = output_dir) 3536 3537 table_of_analyses(data, data2, dir = output_dir) 3538 table_of_samples(data, data2, dir = output_dir) 3539 table_of_sessions(data, data2, dir = output_dir) 3540 3541def __cli(): 3542 _app()
```py
def fCO2eqD47_Petersen(T):
    '''
    CO2 equilibrium Δ47 value as a function of T (in degrees C)
    according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
    '''
    return float(_fCO2eqD47_Petersen(T))
```
CO2 equilibrium Δ47 value as a function of T (in degrees C) according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
```py
def fCO2eqD47_Wang(T):
    '''
    CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
    according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
    (supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
    '''
    return float(_fCO2eqD47_Wang(T))
```
CO2 equilibrium Δ47 value as a function of `T` (in degrees C) according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039) (supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
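Both equilibrium laws take a temperature in degrees C and return a Δ47 value in ‰, so they are easy to compare directly; a minimal sketch:

```py
from D47crunch import fCO2eqD47_Petersen, fCO2eqD47_Wang

# equilibrium Δ47 of CO2 at 25 °C according to each calibration:
for f in (fCO2eqD47_Petersen, fCO2eqD47_Wang):
    print(f.__name__, f(25))
```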
````py
def make_csv(x, hsep = ',', vsep = '\n'):
    '''
    Formats a list of lists of strings as a CSV

    **Parameters**

    + `x`: the list of lists of strings to format
    + `hsep`: the field separator (`,` by default)
    + `vsep`: the line-ending convention to use (`\\n` by default)

    **Example**

    ```py
    print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
    ```

    outputs:

    ```py
    a,b,c
    d,e,f
    ```
    '''
    return vsep.join([hsep.join(l) for l in x])
````
Formats a list of lists of strings as a CSV

**Parameters**

+ `x`: the list of lists of strings to format
+ `hsep`: the field separator (`,` by default)
+ `vsep`: the line-ending convention to use (`\n` by default)

**Example**

```py
print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
```

outputs:

```
a,b,c
d,e,f
```
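Note that `make_csv()` performs no quoting or escaping of fields, so it is best suited to simple tables of numbers and plain names. A minimal sketch of the two separator arguments (the output file name is hypothetical):

```py
# semicolon-separated fields with Windows-style line endings:
with open('table.csv', 'w') as f:
    f.write(make_csv([['Sample', 'D47'], ['ETH-1', '0.2052']], hsep = ';', vsep = '\r\n'))
```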
```py
def pf(txt):
    '''
    Modify string `txt` to follow `lmfit.Parameter()` naming rules.
    '''
    return txt.replace('-','_').replace('.','_').replace(' ','_')
```
Modify string `txt` to follow `lmfit.Parameter()` naming rules.
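This matters because session and sample names are embedded in `lmfit` parameter names during standardization (e.g. `a_{pf(session)}`), and may contain spaces, dashes or dots that `lmfit` rejects. A minimal illustration:

```py
print(pf('Session 2023-01.3')) # prints: Session_2023_01_3
```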
```py
def smart_type(x):
    '''
    Tries to convert string `x` to a float if it includes a decimal point, or
    to an integer if it does not. If both attempts fail, return the original
    string unchanged.
    '''
    try:
        y = float(x)
    except ValueError:
        return x
    if '.' not in x:
        return int(y)
    return y
```
Tries to convert string `x` to a float if it includes a decimal point, or to an integer if it does not. If both attempts fail, return the original string unchanged.
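This is the conversion applied to each field when reading csv input (see `read_csv()` below). For example:

```py
print(smart_type('5.79502')) # 5.79502 (float)
print(smart_type('42'))      # 42 (int)
print(smart_type('ETH-1'))   # 'ETH-1' (returned unchanged)
```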
````py
def pretty_table(x, header = 1, hsep = ' ', vsep = None, align = '<'):
    '''
    Reads a list of lists of strings and outputs an ascii table

    **Parameters**

    + `x`: a list of lists of strings
    + `header`: the number of lines to treat as header lines
    + `hsep`: the horizontal separator between columns
    + `vsep`: the character to use as vertical separator
    + `align`: string of left (`<`) or right (`>`) alignment characters.

    **Example**

    ```py
    print(pretty_table([
        ['A', 'B', 'C'],
        ['1', '1.9999', 'foo'],
        ['10', 'x', 'bar'],
    ]))
    ```
    yields:
    ```
    —— —————— ———
    A       B   C
    —— —————— ———
    1  1.9999 foo
    10      x bar
    —— —————— ———
    ```

    To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`:

    ```py
    D47crunch_defaults.PRETTY_TABLE_VSEP = '='
    print(pretty_table([
        ['A', 'B', 'C'],
        ['1', '1.9999', 'foo'],
        ['10', 'x', 'bar'],
    ]))
    ```
    yields:
    ```
    == ====== ===
    A       B   C
    == ====== ===
    1  1.9999 foo
    10      x bar
    == ====== ===
    ```
    '''

    if vsep is None:
        vsep = D47crunch_defaults.PRETTY_TABLE_VSEP

    txt = []
    widths = [np.max([len(e) for e in c]) for c in zip(*x)]

    if len(widths) > len(align):
        align += '>' * (len(widths)-len(align))
    sepline = hsep.join([vsep*w for w in widths])
    txt += [sepline]
    for k,l in enumerate(x):
        if k and k == header:
            txt += [sepline]
        txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
    txt += [sepline]
    txt += ['']
    return '\n'.join(txt)
````
Reads a list of lists of strings and outputs an ascii table

**Parameters**

+ `x`: a list of lists of strings
+ `header`: the number of lines to treat as header lines
+ `hsep`: the horizontal separator between columns
+ `vsep`: the character to use as vertical separator
+ `align`: string of left (`<`) or right (`>`) alignment characters.

**Example**

```py
print(pretty_table([
    ['A', 'B', 'C'],
    ['1', '1.9999', 'foo'],
    ['10', 'x', 'bar'],
]))
```

yields:

```
—— —————— ———
A       B   C
—— —————— ———
1  1.9999 foo
10      x bar
—— —————— ———
```

To change the default `vsep` globally, redefine `D47crunch_defaults.PRETTY_TABLE_VSEP`:

```py
D47crunch_defaults.PRETTY_TABLE_VSEP = '='
print(pretty_table([
    ['A', 'B', 'C'],
    ['1', '1.9999', 'foo'],
    ['10', 'x', 'bar'],
]))
```

yields:

```
== ====== ===
A       B   C
== ====== ===
1  1.9999 foo
10      x bar
== ====== ===
```
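As the code above shows, when `align` is shorter than the number of columns, the missing characters default to `>` (right-aligned). A small, hypothetical example forcing left alignment in both columns:

```py
print(pretty_table(
    [['Sample', 'N'], ['ETH-1', '3'], ['MYSAMPLE-1', '12']],
    align = '<<',
))
```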
````py
def transpose_table(x):
    '''
    Transpose a list of lists

    **Parameters**

    + `x`: a list of lists

    **Example**

    ```py
    x = [[1, 2], [3, 4]]
    print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
    ```
    '''
    return [[e for e in c] for c in zip(*x)]
````
Transpose a list of lists

**Parameters**

+ `x`: a list of lists

**Example**

```py
x = [[1, 2], [3, 4]]
print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
```
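One natural use, sketched here, is flipping a table before pretty-printing it, so that what were rows become columns:

```py
table = [['Sample', 'ETH-1', 'ETH-2'], ['N', '3', '5']]
print(pretty_table(transpose_table(table), align = '<'))
```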
````py
def w_avg(X, sX):
    '''
    Compute variance-weighted average

    Returns the value and SE of the weighted average of the elements of `X`,
    with relative weights equal to their inverse variances (`1/sX**2`).

    **Parameters**

    + `X`: array-like of elements to average
    + `sX`: array-like of the corresponding SE values

    **Tip**

    If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
    they may be rearranged using `zip()`:

    ```python
    foo = [(0, 1), (1, 0.5), (2, 0.5)]
    print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
    ```
    '''
    X = [ x for x in X ]
    sX = [ sx for sx in sX ]
    W = [ sx**-2 for sx in sX ]
    W = [ w/sum(W) for w in W ]
    Xavg = sum([ w*x for w,x in zip(W,X) ])
    sXavg = sum([ w**2*sx**2 for w,sx in zip(W,sX) ])**.5
    return Xavg, sXavg
````
Compute variance-weighted average

Returns the value and SE of the weighted average of the elements of `X`, with relative weights equal to their inverse variances (`1/sX**2`).

**Parameters**

+ `X`: array-like of elements to average
+ `sX`: array-like of the corresponding SE values

**Tip**

If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets, they may be rearranged using `zip()`:

```python
foo = [(0, 1), (1, 0.5), (2, 0.5)]
print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
```
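Checking the example above by hand: with SE values of (1, 0.5, 0.5), the raw weights `1/sX**2` are (1, 4, 4), which normalize to (1/9, 4/9, 4/9):

```py
Xavg  = (1*0 + 4*1 + 4*2) / 9                                         # = 1.3333...
sXavg = ((1**2 * 1**2 + 4**2 * 0.5**2 + 4**2 * 0.5**2) / 9**2) ** 0.5 # = 0.3333...
```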
```py
def read_csv(filename, sep = ''):
    '''
    Read contents of `filename` in csv format and return a list of dictionaries.

    In the csv string, spaces before and after field separators (`','` by default)
    are optional.

    **Parameters**

    + `filename`: the csv file to read
    + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
    whichever appears most often in the contents of `filename`.
    '''
    with open(filename) as fid:
        txt = fid.read()

    if sep == '':
        sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
    txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
    return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]]
```
Read contents of `filename` in csv format and return a list of dictionaries.

In the csv string, spaces before and after field separators (`','` by default) are optional.

**Parameters**

+ `filename`: the csv file to read
+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, whichever appears most often in the contents of `filename`.
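A minimal, self-contained sketch (file name and contents are hypothetical):

```py
with open('example.csv', 'w') as f:
    f.write('Sample, N\nETH-1, 3\nMYSAMPLE-1, 12')

print(read_csv('example.csv'))
# [{'Sample': 'ETH-1', 'N': 3}, {'Sample': 'MYSAMPLE-1', 'N': 12}]
```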
303def simulate_single_analysis( 304 sample = 'MYSAMPLE', 305 d13Cwg_VPDB = -4., d18Owg_VSMOW = 26., 306 d13C_VPDB = None, d18O_VPDB = None, 307 D47 = None, D48 = None, D49 = 0., D17O = 0., 308 a47 = 1., b47 = 0., c47 = -0.9, 309 a48 = 1., b48 = 0., c48 = -0.45, 310 Nominal_D47 = None, 311 Nominal_D48 = None, 312 Nominal_d13C_VPDB = None, 313 Nominal_d18O_VPDB = None, 314 ALPHA_18O_ACID_REACTION = None, 315 R13_VPDB = None, 316 R17_VSMOW = None, 317 R18_VSMOW = None, 318 LAMBDA_17 = None, 319 R18_VPDB = None, 320 ): 321 ''' 322 Compute working-gas delta values for a single analysis, assuming a stochastic working 323 gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values). 324 325 **Parameters** 326 327 + `sample`: sample name 328 + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas 329 (respectively –4 and +26 ‰ by default) 330 + `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample 331 + `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies 332 of the carbonate sample 333 + `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and 334 Δ48 values if `D47` or `D48` are not specified 335 + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and 336 δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified 337 + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor 338 + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 339 correction parameters (by default equal to the `D4xdata` default values) 340 341 Returns a dictionary with fields 342 `['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`. 343 ''' 344 345 if Nominal_d13C_VPDB is None: 346 Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB 347 348 if Nominal_d18O_VPDB is None: 349 Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB 350 351 if ALPHA_18O_ACID_REACTION is None: 352 ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION 353 354 if R13_VPDB is None: 355 R13_VPDB = D4xdata().R13_VPDB 356 357 if R17_VSMOW is None: 358 R17_VSMOW = D4xdata().R17_VSMOW 359 360 if R18_VSMOW is None: 361 R18_VSMOW = D4xdata().R18_VSMOW 362 363 if LAMBDA_17 is None: 364 LAMBDA_17 = D4xdata().LAMBDA_17 365 366 if R18_VPDB is None: 367 R18_VPDB = D4xdata().R18_VPDB 368 369 R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17 370 371 if Nominal_D47 is None: 372 Nominal_D47 = D47data().Nominal_D47 373 374 if Nominal_D48 is None: 375 Nominal_D48 = D48data().Nominal_D48 376 377 if d13C_VPDB is None: 378 if sample in Nominal_d13C_VPDB: 379 d13C_VPDB = Nominal_d13C_VPDB[sample] 380 else: 381 raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.") 382 383 if d18O_VPDB is None: 384 if sample in Nominal_d18O_VPDB: 385 d18O_VPDB = Nominal_d18O_VPDB[sample] 386 else: 387 raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.") 388 389 if D47 is None: 390 if sample in Nominal_D47: 391 D47 = Nominal_D47[sample] 392 else: 393 raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.") 394 395 if D48 is None: 396 if sample in Nominal_D48: 397 D48 = Nominal_D48[sample] 398 else: 399 raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.") 400 401 X = D4xdata() 402 X.R13_VPDB = R13_VPDB 403 X.R17_VSMOW = R17_VSMOW 404 X.R18_VSMOW = R18_VSMOW 405 X.LAMBDA_17 = LAMBDA_17 406 X.R18_VPDB = R18_VPDB 407 X.R17_VPDB = R17_VSMOW * (R18_VPDB / 
R18_VSMOW)**LAMBDA_17 408 409 R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios( 410 R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000), 411 R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000), 412 ) 413 R45, R46, R47, R48, R49 = X.compute_isobar_ratios( 414 R13 = R13_VPDB * (1 + d13C_VPDB/1000), 415 R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION, 416 D17O=D17O, D47=D47, D48=D48, D49=D49, 417 ) 418 R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios( 419 R13 = R13_VPDB * (1 + d13C_VPDB/1000), 420 R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION, 421 D17O=D17O, 422 ) 423 424 d45 = 1000 * (R45/R45wg - 1) 425 d46 = 1000 * (R46/R46wg - 1) 426 d47 = 1000 * (R47/R47wg - 1) 427 d48 = 1000 * (R48/R48wg - 1) 428 d49 = 1000 * (R49/R49wg - 1) 429 430 for k in range(3): # dumb iteration to adjust for small changes in d47 431 R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch 432 R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch 433 d47 = 1000 * (R47raw/R47wg - 1) 434 d48 = 1000 * (R48raw/R48wg - 1) 435 436 return dict( 437 Sample = sample, 438 D17O = D17O, 439 d13Cwg_VPDB = d13Cwg_VPDB, 440 d18Owg_VSMOW = d18Owg_VSMOW, 441 d45 = d45, 442 d46 = d46, 443 d47 = d47, 444 d48 = d48, 445 d49 = d49, 446 )
Compute working-gas delta values for a single analysis, assuming a stochastic working gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

**Parameters**

+ `sample`: sample name
+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas (respectively –4 and +26 ‰ by default)
+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies of the carbonate sample
+ `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and Δ48 values if `D47` or `D48` are not specified
+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 correction parameters (by default equal to the `D4xdata` default values)

Returns a dictionary with fields `['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
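A minimal sketch of a direct call, providing explicit bulk and clumped compositions so that no nominal lookup is needed (the sample name and values below are hypothetical):

```py
from D47crunch import simulate_single_analysis

r = simulate_single_analysis(
    sample = 'MYSAMPLE',
    d13C_VPDB = 1.7, d18O_VPDB = -1.8,
    D47 = 0.6, D48 = 0.27,
)
print(r['d45'], r['d46'], r['d47']) # noise-free raw deltas for this composition
```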
449def virtual_data( 450 samples = [], 451 a47 = 1., b47 = 0., c47 = -0.9, 452 a48 = 1., b48 = 0., c48 = -0.45, 453 rd45 = 0.020, rd46 = 0.060, 454 rD47 = 0.015, rD48 = 0.045, 455 d13Cwg_VPDB = None, d18Owg_VSMOW = None, 456 session = None, 457 Nominal_D47 = None, Nominal_D48 = None, 458 Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None, 459 ALPHA_18O_ACID_REACTION = None, 460 R13_VPDB = None, 461 R17_VSMOW = None, 462 R18_VSMOW = None, 463 LAMBDA_17 = None, 464 R18_VPDB = None, 465 seed = 0, 466 shuffle = True, 467 ): 468 ''' 469 Return list with simulated analyses from a single session. 470 471 **Parameters** 472 473 + `samples`: a list of entries; each entry is a dictionary with the following fields: 474 * `Sample`: the name of the sample 475 * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample 476 * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample 477 * `N`: how many analyses to generate for this sample 478 + `a47`: scrambling factor for Δ47 479 + `b47`: compositional nonlinearity for Δ47 480 + `c47`: working gas offset for Δ47 481 + `a48`: scrambling factor for Δ48 482 + `b48`: compositional nonlinearity for Δ48 483 + `c48`: working gas offset for Δ48 484 + `rd45`: analytical repeatability of δ45 485 + `rd46`: analytical repeatability of δ46 486 + `rD47`: analytical repeatability of Δ47 487 + `rD48`: analytical repeatability of Δ48 488 + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas 489 (by default equal to the `simulate_single_analysis` default values) 490 + `session`: name of the session (no name by default) 491 + `Nominal_D47`, `Nominal_D48`: where to lookup Δ47 and Δ48 values 492 if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults) 493 + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to lookup δ13C and 494 δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified 495 (by default equal to the `simulate_single_analysis` defaults) 496 + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor 497 (by default equal to the `simulate_single_analysis` defaults) 498 + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17 499 correction parameters (by default equal to the `simulate_single_analysis` default) 500 + `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations 501 + `shuffle`: randomly reorder the sequence of analyses 502 503 504 Here is an example of using this method to generate an arbitrary combination of 505 anchors and unknowns for a bunch of sessions: 506 507 ```py 508 .. include:: ../../code_examples/virtual_data/example.py 509 ``` 510 511 This should output something like: 512 513 ``` 514 .. 
include:: ../../code_examples/virtual_data/output.txt 515 ``` 516 ''' 517 518 kwargs = locals().copy() 519 520 from numpy import random as nprandom 521 if seed: 522 rng = nprandom.default_rng(seed) 523 else: 524 rng = nprandom.default_rng() 525 526 N = sum([s['N'] for s in samples]) 527 errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 528 errors45 *= rd45 / stdev(errors45) # scale errors to rd45 529 errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 530 errors46 *= rd46 / stdev(errors46) # scale errors to rd46 531 errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 532 errors47 *= rD47 / stdev(errors47) # scale errors to rD47 533 errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors 534 errors48 *= rD48 / stdev(errors48) # scale errors to rD48 535 536 k = 0 537 out = [] 538 for s in samples: 539 kw = {} 540 kw['sample'] = s['Sample'] 541 kw = { 542 **kw, 543 **{var: kwargs[var] 544 for var in [ 545 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION', 546 'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB', 547 'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB', 548 'a47', 'b47', 'c47', 'a48', 'b48', 'c48', 549 ] 550 if kwargs[var] is not None}, 551 **{var: s[var] 552 for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O'] 553 if var in s}, 554 } 555 556 sN = s['N'] 557 while sN: 558 out.append(simulate_single_analysis(**kw)) 559 out[-1]['d45'] += errors45[k] 560 out[-1]['d46'] += errors46[k] 561 out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47 562 out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48 563 sN -= 1 564 k += 1 565 566 if session is not None: 567 for r in out: 568 r['Session'] = session 569 570 if shuffle: 571 nprandom.shuffle(out) 572 573 return out
Here is an example of using this function to generate an arbitrary combination of anchors and unknowns for several sessions:
```py
from D47crunch import virtual_data, D47data
args = dict(
samples = [
dict(Sample = 'ETH-1', N = 3),
dict(Sample = 'ETH-2', N = 3),
dict(Sample = 'ETH-3', N = 3),
dict(Sample = 'FOO', N = 3,
d13C_VPDB = -5., d18O_VPDB = -10.,
D47 = 0.3, D48 = 0.15),
dict(Sample = 'BAR', N = 3,
d13C_VPDB = -15., d18O_VPDB = -2.,
D47 = 0.6, D48 = 0.2),
], rD47 = 0.010, rD48 = 0.030)
session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)
D = D47data(session1 + session2 + session3 + session4)
D.crunch()
D.standardize()
D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)
```
This should output something like:
```
[table_of_sessions]
—————————— —— —— ——————————— ———————————— —————— —————— —————— ————————————— ————————————— ——————————————
Session Na Nu d13Cwg_VPDB d18Owg_VSMOW r_d13C r_d18O r_D47 a ± SE 1e3 x b ± SE c ± SE
—————————— —— —— ——————————— ———————————— —————— —————— —————— ————————————— ————————————— ——————————————
Session_01 9 6 -4.000 26.000 0.0205 0.0633 0.0075 1.015 ± 0.015 0.427 ± 0.232 -0.909 ± 0.006
Session_02 9 6 -4.000 26.000 0.0210 0.0882 0.0082 0.990 ± 0.015 0.484 ± 0.232 -0.905 ± 0.006
Session_03 9 6 -4.000 26.000 0.0186 0.0505 0.0091 0.997 ± 0.015 0.167 ± 0.233 -0.901 ± 0.006
Session_04 9 6 -4.000 26.000 0.0192 0.0467 0.0070 1.017 ± 0.015 0.229 ± 0.232 -0.910 ± 0.006
—————————— —— —— ——————————— ———————————— —————— —————— —————— ————————————— ————————————— ——————————————
[table_of_samples]
—————— —— ————————— —————————— —————— —————— ———————— —————— ————————
Sample N d13C_VPDB d18O_VSMOW D47 SE 95% CL SD p_Levene
—————— —— ————————— —————————— —————— —————— ———————— —————— ————————
ETH-1 12 2.02 37.01 0.2052 0.0083
ETH-2 12 -10.17 19.88 0.2085 0.0090
ETH-3 12 1.71 37.46 0.6132 0.0083
BAR 12 -15.02 37.22 0.6057 0.0042 ± 0.0085 0.0088 0.753
FOO 12 -5.00 28.89 0.3024 0.0031 ± 0.0062 0.0070 0.497
—————— —— ————————— —————————— —————— —————— ———————— —————— ————————
[table_of_analyses]
——— —————————— —————— ——————————— ———————————— ————————— ————————— —————————— —————————— —————————— —————————— —————————— ————————— ————————— ————————— ————————
UID Session Sample d13Cwg_VPDB d18Owg_VSMOW d45 d46 d47 d48 d49 d13C_VPDB d18O_VSMOW D47raw D48raw D49raw D47
——— —————————— —————— ——————————— ———————————— ————————— ————————— —————————— —————————— —————————— —————————— —————————— ————————— ————————— ————————— ————————
1 Session_01 BAR -4.000 26.000 -9.959983 10.926995 0.053806 21.724901 10.707292 -15.041279 37.199026 -0.300066 -0.243252 -0.029371 0.599675
2 Session_01 ETH-2 -4.000 26.000 -5.974124 -5.955517 -12.668784 -12.208184 -18.023381 -10.163274 19.943159 -0.694902 -0.336672 -0.063946 0.215880
3 Session_01 ETH-1 -4.000 26.000 6.049381 10.706856 16.135579 21.196941 27.780042 2.057827 36.937067 -0.685751 -0.324384 0.045870 0.212791
4 Session_01 ETH-1 -4.000 26.000 6.010276 10.840276 16.207960 21.475150 27.780042 2.011176 37.073454 -0.704188 -0.315986 -0.172089 0.194589
5 Session_01 ETH-3 -4.000 26.000 5.727341 11.211663 16.713472 22.364770 28.306614 1.695479 37.453503 -0.278056 -0.180158 -0.082015 0.614365
6 Session_01 BAR -4.000 26.000 -9.920507 10.903408 0.065076 21.704075 10.707292 -14.998270 37.174839 -0.307018 -0.216978 -0.026076 0.592818
7 Session_01 ETH-2 -4.000 26.000 -5.991278 -5.995054 -12.741562 -12.184075 -18.023381 -10.180122 19.902809 -0.711697 -0.232746 0.032602 0.199357
8 Session_01 FOO -4.000 26.000 -0.838118 2.819853 1.310384 5.326005 4.665655 -5.004629 28.895933 -0.593755 -0.319861 0.014956 0.309692
9 Session_01 ETH-1 -4.000 26.000 5.995601 10.755323 16.116087 21.285428 27.780042 1.998631 36.986704 -0.696924 -0.333640 0.008600 0.201787
10 Session_01 FOO -4.000 26.000 -0.848028 2.874679 1.346196 5.439150 4.665655 -5.017230 28.951964 -0.601502 -0.316664 -0.081898 0.302042
11 Session_01 ETH-3 -4.000 26.000 5.755174 11.255104 16.792797 22.451660 28.306614 1.723596 37.497816 -0.270825 -0.181089 -0.195908 0.621458
12 Session_01 BAR -4.000 26.000 -9.915975 10.968470 0.153453 21.749385 10.707292 -14.995822 37.241294 -0.286638 -0.301325 -0.157376 0.612868
13 Session_01 ETH-2 -4.000 26.000 -5.982229 -6.110437 -12.827036 -12.492272 -18.023381 -10.166188 19.784916 -0.693555 -0.312598 0.251040 0.217274
14 Session_01 FOO -4.000 26.000 -0.876454 2.906764 1.341194 5.490264 4.665655 -5.048760 28.984806 -0.608593 -0.329808 -0.114437 0.295055
15 Session_01 ETH-3 -4.000 26.000 5.734896 11.229855 16.740410 22.402091 28.306614 1.702875 37.472070 -0.276998 -0.179635 -0.125368 0.615396
16 Session_02 FOO -4.000 26.000 -0.835046 2.870518 1.355370 5.487896 4.665655 -5.004585 28.948243 -0.601666 -0.259900 -0.087592 0.305777
17 Session_02 ETH-1 -4.000 26.000 6.019963 10.773112 16.163825 21.331060 27.780042 2.029040 37.042346 -0.692234 -0.324161 -0.051788 0.207075
18 Session_02 ETH-3 -4.000 26.000 5.719281 11.207303 16.681693 22.370886 28.306614 1.691780 37.488633 -0.296801 -0.165556 -0.065004 0.606143
19 Session_02 ETH-2 -4.000 26.000 -5.993476 -5.944866 -12.696865 -12.149754 -18.023381 -10.190430 19.913381 -0.713779 -0.298963 -0.064251 0.199436
20 Session_02 ETH-3 -4.000 26.000 5.757137 11.232751 16.744567 22.398244 28.306614 1.731295 37.514660 -0.298533 -0.189123 -0.154557 0.604363
21 Session_02 ETH-1 -4.000 26.000 6.030532 10.851030 16.245571 21.457100 27.780042 2.037466 37.122284 -0.698413 -0.354920 -0.214443 0.200795
22 Session_02 BAR -4.000 26.000 -9.936020 10.862339 0.024660 21.563307 10.707292 -15.023836 37.171034 -0.291333 -0.273498 0.070452 0.619812
23 Session_02 ETH-2 -4.000 26.000 -5.950370 -5.959974 -12.650784 -12.197864 -18.023381 -10.143809 19.897777 -0.696916 -0.317263 -0.080604 0.216441
24 Session_02 FOO -4.000 26.000 -0.819742 2.826793 1.317044 5.330616 4.665655 -4.986618 28.903335 -0.612871 -0.329113 -0.018244 0.294481
25 Session_02 FOO -4.000 26.000 -0.848415 2.849823 1.308081 5.427767 4.665655 -5.018107 28.927036 -0.614791 -0.278426 -0.032784 0.292547
26 Session_02 BAR -4.000 26.000 -9.957566 10.903888 0.031785 21.739434 10.707292 -15.048386 37.213724 -0.302139 -0.183327 0.012926 0.608897
27 Session_02 ETH-2 -4.000 26.000 -5.982371 -6.036210 -12.762399 -12.309944 -18.023381 -10.175178 19.819614 -0.701348 -0.277354 0.104418 0.212021
28 Session_02 ETH-1 -4.000 26.000 5.993918 10.617469 15.991900 21.070358 27.780042 2.006934 36.882679 -0.683329 -0.271476 0.278458 0.216152
29 Session_02 ETH-3 -4.000 26.000 5.716356 11.091821 16.582487 22.123857 28.306614 1.692901 37.370126 -0.279100 -0.178789 0.162540 0.624067
30 Session_02 BAR -4.000 26.000 -9.963888 10.865863 -0.023549 21.615868 10.707292 -15.053743 37.174715 -0.313906 -0.229031 0.093637 0.597041
31 Session_03 ETH-1 -4.000 26.000 5.994622 10.743980 16.116098 21.243734 27.780042 1.997857 37.033567 -0.684883 -0.352014 0.031692 0.214449
32 Session_03 FOO -4.000 26.000 -0.800284 2.851299 1.376828 5.379547 4.665655 -4.951581 28.910199 -0.597293 -0.329315 -0.087015 0.304784
33 Session_03 BAR -4.000 26.000 -9.952115 11.034508 0.169809 21.885915 10.707292 -15.002819 37.370451 -0.296804 -0.298351 -0.246731 0.606414
34 Session_03 ETH-1 -4.000 26.000 6.004078 10.683951 16.045192 21.214355 27.780042 2.010134 36.971642 -0.705956 -0.262026 0.138399 0.193323
35 Session_03 ETH-1 -4.000 26.000 6.040566 10.786620 16.205283 21.374963 27.780042 2.045244 37.077432 -0.685706 -0.307909 -0.099869 0.213609
36 Session_03 FOO -4.000 26.000 -0.873798 2.820799 1.272165 5.370745 4.665655 -5.028782 28.878917 -0.596008 -0.277258 0.051165 0.306090
37 Session_03 ETH-2 -4.000 26.000 -6.000290 -5.947172 -12.697463 -12.164602 -18.023381 -10.167221 19.848953 -0.705037 -0.309350 -0.052386 0.199061
38 Session_03 ETH-3 -4.000 26.000 5.718991 11.146227 16.640814 22.243185 28.306614 1.689442 37.449023 -0.277332 -0.169668 0.053997 0.623187
39 Session_03 ETH-2 -4.000 26.000 -5.997147 -5.905858 -12.655382 -12.081612 -18.023381 -10.165400 19.891551 -0.706536 -0.308464 -0.137414 0.197550
40 Session_03 FOO -4.000 26.000 -0.823857 2.761300 1.258060 5.239992 4.665655 -4.973383 28.817444 -0.603327 -0.288652 0.114488 0.298751
41 Session_03 ETH-3 -4.000 26.000 5.748546 11.079879 16.580826 22.120063 28.306614 1.723364 37.380534 -0.302133 -0.158882 0.151641 0.598318
42 Session_03 ETH-3 -4.000 26.000 5.753467 11.206589 16.719131 22.373244 28.306614 1.723960 37.511190 -0.294350 -0.161838 -0.099835 0.606103
43 Session_03 BAR -4.000 26.000 -9.928709 10.989665 0.148059 21.852677 10.707292 -14.976237 37.324152 -0.299358 -0.242185 -0.184835 0.603855
44 Session_03 BAR -4.000 26.000 -9.957114 10.898997 0.044946 21.602296 10.707292 -15.003175 37.230716 -0.284699 -0.307849 0.021944 0.618578
45 Session_03 ETH-2 -4.000 26.000 -6.008525 -5.909707 -12.647727 -12.075913 -18.023381 -10.177379 19.887608 -0.683183 -0.294956 -0.117608 0.220975
46 Session_04 ETH-2 -4.000 26.000 -5.973623 -5.975018 -12.694278 -12.194472 -18.023381 -10.166297 19.828211 -0.701951 -0.283570 -0.025935 0.207135
47 Session_04 ETH-3 -4.000 26.000 5.739420 11.128582 16.641344 22.166106 28.306614 1.695046 37.399884 -0.280608 -0.210162 0.066645 0.614665
48 Session_04 ETH-3 -4.000 26.000 5.751908 11.207110 16.726741 22.380392 28.306614 1.705481 37.480657 -0.285776 -0.155878 -0.099197 0.609567
49 Session_04 ETH-2 -4.000 26.000 -5.966627 -5.893789 -12.597717 -12.120719 -18.023381 -10.161842 19.911776 -0.691757 -0.372308 -0.193986 0.217132
50 Session_04 ETH-1 -4.000 26.000 6.029937 10.766997 16.151273 21.345479 27.780042 2.018148 37.027152 -0.708855 -0.297953 -0.050465 0.193862
51 Session_04 ETH-2 -4.000 26.000 -5.986501 -5.915157 -12.656583 -12.060382 -18.023381 -10.182247 19.889836 -0.709603 -0.268277 -0.130450 0.199604
52 Session_04 FOO -4.000 26.000 -0.791191 2.708220 1.256167 5.145784 4.665655 -4.960004 28.750896 -0.586913 -0.276505 0.183674 0.317065
53 Session_04 BAR -4.000 26.000 -9.951025 10.951923 0.089386 21.738926 10.707292 -15.031949 37.254709 -0.298065 -0.278834 -0.087463 0.601230
54 Session_04 BAR -4.000 26.000 -9.931741 10.819830 -0.023748 21.529372 10.707292 -15.006533 37.118743 -0.302866 -0.222623 0.148462 0.596536
55 Session_04 ETH-1 -4.000 26.000 6.023822 10.730714 16.121184 21.235757 27.780042 2.012958 36.989833 -0.696908 -0.333582 0.026555 0.205610
56 Session_04 BAR -4.000 26.000 -9.926078 10.884823 0.060864 21.650722 10.707292 -15.002880 37.185606 -0.287358 -0.232425 0.016044 0.611760
57 Session_04 FOO -4.000 26.000 -0.848192 2.777763 1.251297 5.280272 4.665655 -5.023358 28.822585 -0.601094 -0.281419 0.108186 0.303128
58 Session_04 FOO -4.000 26.000 -0.853969 2.805035 1.267571 5.353907 4.665655 -5.030523 28.850660 -0.605611 -0.262571 0.060903 0.298685
59 Session_04 ETH-1 -4.000 26.000 6.017312 10.735930 16.123043 21.270597 27.780042 2.005824 36.995214 -0.693479 -0.309795 0.023309 0.208980
60 Session_04 ETH-3 -4.000 26.000 5.798016 11.254135 16.832228 22.432473 28.306614 1.752928 37.528936 -0.275047 -0.197935 -0.239408 0.620088
——— —————————— —————— ——————————— ———————————— ————————— ————————— —————————— —————————— —————————— —————————— —————————— ————————— ————————— ————————— ————————
```
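A quick sketch of how `seed` and `shuffle` interact with reproducibility (sample names below are arbitrary): with the same non-zero `seed`, two calls draw identical simulated errors, so the resulting analyses match; with the default `seed = 0`, every call draws fresh errors.

```py
from D47crunch import virtual_data

samples = [
    dict(Sample = 'ETH-1', N = 2),
    dict(Sample = 'ETH-2', N = 2),
    ]

# Identical seeds (and shuffle = False, so that the unseeded
# shuffling step does not reorder the analyses) yield identical data:
a = virtual_data(samples = samples, seed = 123, shuffle = False)
b = virtual_data(samples = samples, seed = 123, shuffle = False)
print(a[0]['d45'] == b[0]['d45']) # True
```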
```py
def table_of_samples(
    data47 = None,
    data48 = None,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a combined table of samples
    for a pair of `D47data` and `D48data` objects.

    **Parameters**

    + `data47`: `D47data` instance
    + `data48`: `D48data` instance
    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
      if set to `'raw'`: return a list of lists of strings
      (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    if data47 is None:
        if data48 is None:
            raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
        else:
            return data48.table_of_samples(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
    else:
        if data48 is None:
            return data47.table_of_samples(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
        else:
            out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
            out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
            out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:])

            if save_to_file:
                if not os.path.exists(dir):
                    os.makedirs(dir)
                if filename is None:
                    filename = 'D47D48_samples.csv'
                with open(f'{dir}/{filename}', 'w') as fid:
                    fid.write(make_csv(out))
            if print_out:
                print('\n'+pretty_table(out))
            if output == 'raw':
                return out
            elif output == 'pretty':
                return pretty_table(out)
```
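A minimal usage sketch for the combined table (assuming the dataset includes meaningful `d48` values and that the relevant anchors carry nominal Δ48 values in `D48data.Nominal_D48`; sample names below are arbitrary):

```py
from D47crunch import virtual_data, D47data, D48data, table_of_samples

rawdata = virtual_data(
    session = 'Session_01',
    samples = [
        dict(Sample = 'ETH-1', N = 4),
        dict(Sample = 'ETH-2', N = 4),
        dict(Sample = 'ETH-3', N = 4),
        dict(Sample = 'FOO', N = 4,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        ], seed = 123)

# process the same analyses once for Δ47 and once for Δ48:
data47 = D47data(rawdata)
data47.crunch()
data47.standardize()

data48 = D48data(rawdata)
data48.crunch()
data48.standardize()

# Δ47 and Δ48 sample averages side by side, printed but not saved:
table_of_samples(data47, data48, save_to_file = False, print_out = True)
```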
```py
def table_of_sessions(
    data47 = None,
    data48 = None,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a combined table of sessions
    for a pair of `D47data` and `D48data` objects.
    ***Only applicable if the sessions in `data47` and those in `data48`
    consist of the exact same sets of analyses.***

    **Parameters**

    + `data47`: `D47data` instance
    + `data48`: `D48data` instance
    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
      if set to `'raw'`: return a list of lists of strings
      (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    if data47 is None:
        if data48 is None:
            raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
        else:
            return data48.table_of_sessions(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
    else:
        if data48 is None:
            return data47.table_of_sessions(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
        else:
            out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
            out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
            for k,x in enumerate(out47[0]):
                if k>7:
                    out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
                    out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
            out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])

            if save_to_file:
                if not os.path.exists(dir):
                    os.makedirs(dir)
                if filename is None:
                    filename = 'D47D48_sessions.csv'
                with open(f'{dir}/{filename}', 'w') as fid:
                    fid.write(make_csv(out))
            if print_out:
                print('\n'+pretty_table(out))
            if output == 'raw':
                return out
            elif output == 'pretty':
                return pretty_table(out)
```
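Continuing the sketch above: because both tables share the `Session`, `Na`, `Nu`, working-gas and repeatability columns, the combined table keeps them once and disambiguates the standardization parameters as `a_47`, `b_47`, `c_47` and `a_48`, `b_48`, `c_48`:

```py
from D47crunch import table_of_sessions

# combined session table for the (data47, data48) pair defined above:
table_of_sessions(data47, data48, save_to_file = False, print_out = True)
```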
```py
def table_of_analyses(
    data47 = None,
    data48 = None,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a combined table of analyses
    for a pair of `D47data` and `D48data` objects.

    If the sessions in `data47` and those in `data48` do not consist of
    the exact same sets of analyses, the table will have two columns
    `Session_47` and `Session_48` instead of a single `Session` column.

    **Parameters**

    + `data47`: `D47data` instance
    + `data48`: `D48data` instance
    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
      if set to `'raw'`: return a list of lists of strings
      (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    if data47 is None:
        if data48 is None:
            raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
        else:
            return data48.table_of_analyses(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
    else:
        if data48 is None:
            return data47.table_of_analyses(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
        else:
            out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
            out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')

            if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
                out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
            else:
                out47[0][1] = 'Session_47'
                out48[0][1] = 'Session_48'
                out47 = transpose_table(out47)
                out48 = transpose_table(out48)
                out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])

            if save_to_file:
                if not os.path.exists(dir):
                    os.makedirs(dir)
                if filename is None:
                    filename = 'D47D48_analyses.csv'
                with open(f'{dir}/{filename}', 'w') as fid:
                    fid.write(make_csv(out))
            if print_out:
                print('\n'+pretty_table(out))
            if output == 'raw':
                return out
            elif output == 'pretty':
                return pretty_table(out)
```
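And to finish the sketch: the combined analysis table simply appends the final Δ48 column to the Δ47 analyses, keeping a single `Session` column here since both objects contain the same sessions:

```py
from D47crunch import table_of_analyses

table_of_analyses(data47, data48, save_to_file = False, print_out = True)
```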
````py
class D4xdata(list):
    '''
    Store and process data for a large set of Δ47 and/or Δ48
    analyses, usually comprising more than one analytical session.
    '''

    ### 17O CORRECTION PARAMETERS
    R13_VPDB = 0.01118 # (Chang & Li, 1990)
    '''
    Absolute (13C/12C) ratio of VPDB.
    By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
    '''

    R18_VSMOW = 0.0020052 # (Baertschi, 1976)
    '''
    Absolute (18O/16O) ratio of VSMOW.
    By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
    '''

    LAMBDA_17 = 0.528 # (Barkan & Luz, 2005)
    '''
    Mass-dependent exponent for triple oxygen isotopes.
    By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
    '''

    R17_VSMOW = 0.00038475 # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
    '''
    Absolute (17O/16O) ratio of VSMOW.
    By default equal to 0.00038475
    ([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
    rescaled to `R13_VPDB`)
    '''

    R18_VPDB = R18_VSMOW * 1.03092
    '''
    Absolute (18O/16O) ratio of VPDB.
    By definition equal to `R18_VSMOW * 1.03092`.
    '''

    R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
    '''
    Absolute (17O/16O) ratio of VPDB.
    By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
    '''

    LEVENE_REF_SAMPLE = 'ETH-3'
    '''
    After the Δ4x standardization step, each sample is tested to
    assess whether the Δ4x variance within all analyses for that
    sample differs significantly from that observed for a given reference
    sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
    which yields a p-value corresponding to the null hypothesis that the
    underlying variances are equal).

    `LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
    sample should be used as a reference for this test.
    '''

    ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6) # (Kim et al., 2007, calcite)
    '''
    Specifies the 18O/16O fractionation factor generally applicable
    to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
    `D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`.

    By default equal to 1.008129 (calcite reacted at 90 °C,
    [Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
    '''

    Nominal_d13C_VPDB = {
        'ETH-1': 2.02,
        'ETH-2': -10.17,
        'ETH-3': 1.71,
        } # (Bernasconi et al., 2018)
    '''
    Nominal δ13C_VPDB values assigned to carbonate standards, used by
    `D4xdata.standardize_d13C()`.

    By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
    [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
    '''

    Nominal_d18O_VPDB = {
        'ETH-1': -2.19,
        'ETH-2': -18.69,
        'ETH-3': -1.78,
        } # (Bernasconi et al., 2018)
    '''
    Nominal δ18O_VPDB values assigned to carbonate standards, used by
    `D4xdata.standardize_d18O()`.

    By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
    [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
    '''

    d13C_STANDARDIZATION_METHOD = '2pt'
    '''
    Method by which to standardize δ13C values:

    + `'none'`: do not apply any δ13C standardization.
    + `'1pt'`: within each session, offset all initial δ13C values so as to
      minimize the difference between final δ13C_VPDB values and
      `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
    + `'2pt'`: within each session, apply an affine transformation to all δ13C
      values so as to minimize the difference between final δ13C_VPDB
      values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
      is defined).
    '''

    d18O_STANDARDIZATION_METHOD = '2pt'
    '''
    Method by which to standardize δ18O values:

    + `'none'`: do not apply any δ18O standardization.
    + `'1pt'`: within each session, offset all initial δ18O values so as to
      minimize the difference between final δ18O_VPDB values and
      `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
    + `'2pt'`: within each session, apply an affine transformation to all δ18O
      values so as to minimize the difference between final δ18O_VPDB
      values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
      is defined).
    '''

    def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
        '''
        **Parameters**

        + `l`: a list of dictionaries, with each dictionary including at least the keys
          `Sample`, `d45`, `d46`, and `d47` or `d48`.
        + `mass`: `'47'` or `'48'`
        + `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
        + `session`: define session name for analyses without a `Session` key
        + `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.

        Returns a `D4xdata` object derived from `list`.
        '''
        self._4x = mass
        self.verbose = verbose
        self.prefix = 'D4xdata'
        self.logfile = logfile
        list.__init__(self, l)
        self.Nf = None
        self.repeatability = {}
        self.refresh(session = session)


    def make_verbal(oldfun):
        '''
        Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
        '''
        @wraps(oldfun)
        def newfun(*args, verbose = '', **kwargs):
            myself = args[0]
            oldprefix = myself.prefix
            myself.prefix = oldfun.__name__
            if verbose != '':
                oldverbose = myself.verbose
                myself.verbose = verbose
            out = oldfun(*args, **kwargs)
            myself.prefix = oldprefix
            if verbose != '':
                myself.verbose = oldverbose
            return out
        return newfun


    def msg(self, txt):
        '''
        Log a message to `self.logfile`, and print it out if `verbose = True`
        '''
        self.log(txt)
        if self.verbose:
            print(f'{f"[{self.prefix}]":<16} {txt}')


    def vmsg(self, txt):
        '''
        Log a message to `self.logfile` and print it out
        '''
        self.log(txt)
        print(txt)


    def log(self, *txts):
        '''
        Log a message to `self.logfile`
        '''
        if self.logfile:
            with open(self.logfile, 'a') as fid:
                for txt in txts:
                    fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')


    def refresh(self, session = 'mySession'):
        '''
        Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
        '''
        self.fill_in_missing_info(session = session)
        self.refresh_sessions()
        self.refresh_samples()


    def refresh_sessions(self):
        '''
        Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
        to `False` for all sessions.
        '''
        self.sessions = {
            s: {'data': [r for r in self if r['Session'] == s]}
            for s in sorted({r['Session'] for r in self})
            }
        for s in self.sessions:
            self.sessions[s]['scrambling_drift'] = False
            self.sessions[s]['slope_drift'] = False
            self.sessions[s]['wg_drift'] = False
            self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
            self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD


    def refresh_samples(self):
        '''
        Define `self.samples`, `self.anchors`, and `self.unknowns`.
        '''
        self.samples = {
            s: {'data': [r for r in self if r['Sample'] == s]}
            for s in sorted({r['Sample'] for r in self})
            }
        self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
        self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}


    def read(self, filename, sep = '', session = ''):
        '''
        Read file in csv format to load data into a `D47data` object.

        In the csv file, spaces before and after field separators (`','` by default)
        are optional. Each line corresponds to a single analysis.

        The required fields are:

        + `UID`: a unique identifier
        + `Session`: an identifier for the analytical session
        + `Sample`: a sample identifier
        + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

        Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
        VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
        and `d49` are optional, and set to NaN by default.

        **Parameters**

        + `filename`: the path of the file to read
        + `sep`: csv separator delimiting the fields
        + `session`: set `Session` field to this string for all analyses
        '''
        with open(filename) as fid:
            self.input(fid.read(), sep = sep, session = session)


    def input(self, txt, sep = '', session = ''):
        '''
        Read `txt` string in csv format to load analysis data into a `D47data` object.

        In the csv string, spaces before and after field separators (`','` by default)
        are optional. Each line corresponds to a single analysis.

        The required fields are:

        + `UID`: a unique identifier
        + `Session`: an identifier for the analytical session
        + `Sample`: a sample identifier
        + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

        Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
        VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
        and `d49` are optional, and set to NaN by default.

        **Parameters**

        + `txt`: the csv string to read
        + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
          whichever appears most often in `txt`.
        + `session`: set `Session` field to this string for all analyses
        '''
        if sep == '':
            sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
        txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
        data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]

        if session != '':
            for r in data:
                r['Session'] = session

        self += data
        self.refresh()


    @make_verbal
    def wg(self, samples = None, a18_acid = None):
        '''
        Compute bulk composition of the working gas for each session based on
        the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
        `self.Nominal_d18O_VPDB`.
        '''

        self.msg('Computing WG composition:')

        if a18_acid is None:
            a18_acid = self.ALPHA_18O_ACID_REACTION
        if samples is None:
            samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]

        assert a18_acid, 'Acid fractionation factor should not be zero.'

        samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
        R45R46_standards = {}
        for sample in samples:
            d13C_vpdb = self.Nominal_d13C_VPDB[sample]
            d18O_vpdb = self.Nominal_d18O_VPDB[sample]
            R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
            R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
            R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid

            C12_s = 1 / (1 + R13_s)
            C13_s = R13_s / (1 + R13_s)
            C16_s = 1 / (1 + R17_s + R18_s)
            C17_s = R17_s / (1 + R17_s + R18_s)
            C18_s = R18_s / (1 + R17_s + R18_s)

            C626_s = C12_s * C16_s ** 2
            C627_s = 2 * C12_s * C16_s * C17_s
            C628_s = 2 * C12_s * C16_s * C18_s
            C636_s = C13_s * C16_s ** 2
            C637_s = 2 * C13_s * C16_s * C17_s
            C727_s = C12_s * C17_s ** 2

            R45_s = (C627_s + C636_s) / C626_s
            R46_s = (C628_s + C637_s + C727_s) / C626_s
            R45R46_standards[sample] = (R45_s, R46_s)

        for s in self.sessions:
            db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
            assert db, f'No sample from {samples} found in session "{s}".'
#           dbsamples = sorted({r['Sample'] for r in db})

            X = [r['d45'] for r in db]
            Y = [R45R46_standards[r['Sample']][0] for r in db]
            x1, x2 = np.min(X), np.max(X)

            if x1 < x2:
                wgcoord = x1/(x1-x2)
            else:
                wgcoord = 999

            if wgcoord < -.5 or wgcoord > 1.5:
                # unreasonable to extrapolate to d45 = 0
                R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
            else:
                # d45 = 0 is reasonably well bracketed
                R45_wg = np.polyfit(X, Y, 1)[1]

            X = [r['d46'] for r in db]
            Y = [R45R46_standards[r['Sample']][1] for r in db]
            x1, x2 = np.min(X), np.max(X)

            if x1 < x2:
                wgcoord = x1/(x1-x2)
            else:
                wgcoord = 999

            if wgcoord < -.5 or wgcoord > 1.5:
                # unreasonable to extrapolate to d46 = 0
                R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
            else:
                # d46 = 0 is reasonably well bracketed
                R46_wg = np.polyfit(X, Y, 1)[1]

            d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)

            self.msg(f'Session {s} WG: δ13C_VPDB = {d13Cwg_VPDB:.3f} δ18O_VSMOW = {d18Owg_VSMOW:.3f}')

            self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
            self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
            for r in self.sessions[s]['data']:
                r['d13Cwg_VPDB'] = d13Cwg_VPDB
                r['d18Owg_VSMOW'] = d18Owg_VSMOW


    def compute_bulk_delta(self, R45, R46, D17O = 0):
        '''
        Compute δ13C_VPDB and δ18O_VSMOW,
        by solving the generalized form of equation (17) from
        [Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
        assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
        solving the corresponding second-order Taylor polynomial.
        (Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
        '''

        K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17

        A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
        B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
        C = 2 * self.R18_VSMOW
        D = -R46

        aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
        bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
        cc = A + B + C + D

        d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)

        R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
        R17 = K * R18 ** self.LAMBDA_17
        R13 = R45 - 2 * R17

        d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)

        return d13C_VPDB, d18O_VSMOW


    @make_verbal
    def crunch(self, verbose = ''):
        '''
        Compute bulk composition and raw clumped isotope anomalies for all analyses.
        '''
        for r in self:
            self.compute_bulk_and_clumping_deltas(r)
        self.standardize_d13C()
        self.standardize_d18O()
        self.msg(f"Crunched {len(self)} analyses.")


    def fill_in_missing_info(self, session = 'mySession'):
        '''
        Fill in optional fields with default values
        '''
        for i,r in enumerate(self):
            if 'D17O' not in r:
                r['D17O'] = 0.
            if 'UID' not in r:
                r['UID'] = f'{i+1}'
            if 'Session' not in r:
                r['Session'] = session
            for k in ['d47', 'd48', 'd49']:
                if k not in r:
                    r[k] = np.nan


    def standardize_d13C(self):
        '''
        Perform δ13C standardization within each session `s` according to
        `self.sessions[s]['d13C_standardization_method']`, which is defined by default
        by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
        may be redefined arbitrarily at a later stage.
        '''
        for s in self.sessions:
            if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
                XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
                X,Y = zip(*XY)
                if self.sessions[s]['d13C_standardization_method'] == '1pt':
                    offset = np.mean(Y) - np.mean(X)
                    for r in self.sessions[s]['data']:
                        r['d13C_VPDB'] += offset
                elif self.sessions[s]['d13C_standardization_method'] == '2pt':
                    a,b = np.polyfit(X,Y,1)
                    for r in self.sessions[s]['data']:
                        r['d13C_VPDB'] = a * r['d13C_VPDB'] + b

    def standardize_d18O(self):
        '''
        Perform δ18O standardization within each session `s` according to
        `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
        which is defined by default by `D47data.refresh_sessions()` as equal to
        `self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
        '''
        for s in self.sessions:
            if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
                XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
                X,Y = zip(*XY)
                Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
                if self.sessions[s]['d18O_standardization_method'] == '1pt':
                    offset = np.mean(Y) - np.mean(X)
                    for r in self.sessions[s]['data']:
                        r['d18O_VSMOW'] += offset
                elif self.sessions[s]['d18O_standardization_method'] == '2pt':
                    a,b = np.polyfit(X,Y,1)
                    for r in self.sessions[s]['data']:
                        r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b


    def compute_bulk_and_clumping_deltas(self, r):
        '''
        Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
        '''

        # Compute working gas R13, R18, and isobar ratios
        R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
        R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
        R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)

        # Compute analyte isobar ratios
        R45 = (1 + r['d45'] / 1000) * R45_wg
        R46 = (1 + r['d46'] / 1000) * R46_wg
        R47 = (1 + r['d47'] / 1000) * R47_wg
        R48 = (1 + r['d48'] / 1000) * R48_wg
        R49 = (1 + r['d49'] / 1000) * R49_wg

        r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
        R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
        R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW

        # Compute stochastic isobar ratios of the analyte
        R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
            R13, R18, D17O = r['D17O']
            )

        # Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
        # and raise a warning if the corresponding anomalies exceed 0.02 ppm.
        if (R45 / R45stoch - 1) > 5e-8:
            self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
        if (R46 / R46stoch - 1) > 5e-8:
            self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')

        # Compute raw clumped isotope anomalies
        r['D47raw'] = 1000 * (R47 / R47stoch - 1)
        r['D48raw'] = 1000 * (R48 / R48stoch - 1)
        r['D49raw'] = 1000 * (R49 / R49stoch - 1)


    def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
        '''
        Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
        optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
        anomalies (`D47`, `D48`, `D49`), all expressed in permil.
        '''

        # Compute R17
        R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17

        # Compute isotope concentrations
        C12 = (1 + R13) ** -1
        C13 = C12 * R13
        C16 = (1 + R17 + R18) ** -1
        C17 = C16 * R17
        C18 = C16 * R18

        # Compute stochastic isotopologue concentrations
        C626 = C16 * C12 * C16
        C627 = C16 * C12 * C17 * 2
        C628 = C16 * C12 * C18 * 2
        C636 = C16 * C13 * C16
        C637 = C16 * C13 * C17 * 2
        C638 = C16 * C13 * C18 * 2
        C727 = C17 * C12 * C17
        C728 = C17 * C12 * C18 * 2
        C737 = C17 * C13 * C17
        C738 = C17 * C13 * C18 * 2
        C828 = C18 * C12 * C18
        C838 = C18 * C13 * C18

        # Compute stochastic isobar ratios
        R45 = (C636 + C627) / C626
        R46 = (C628 + C637 + C727) / C626
        R47 = (C638 + C728 + C737) / C626
        R48 = (C738 + C828) / C626
        R49 = C838 / C626

        # Account for clumped isotope anomalies
        R47 *= 1 + D47 / 1000
        R48 *= 1 + D48 / 1000
        R49 *= 1 + D49 / 1000

        # Return isobar ratios
        return R45, R46, R47, R48, R49


    def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
        '''
        Split unknown samples by UID (treat all analyses as different samples)
        or by session (treat analyses of a given sample in different sessions as
        different samples).

        **Parameters**

        + `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
        + `grouping`: `by_uid` | `by_session`
        '''
        if samples_to_split == 'all':
            samples_to_split = [s for s in self.unknowns]
        gkeys = {'by_uid':'UID', 'by_session':'Session'}
        self.grouping = grouping.lower()
        if self.grouping in gkeys:
            gkey = gkeys[self.grouping]
        for r in self:
            if r['Sample'] in samples_to_split:
                r['Sample_original'] = r['Sample']
                r['Sample'] = f"{r['Sample']}__{r[gkey]}"
            elif r['Sample'] in self.unknowns:
                r['Sample_original'] = r['Sample']
        self.refresh_samples()


    def unsplit_samples(self, tables = False):
        '''
        Reverse the effects of `D47data.split_samples()`.

        This should only be used after `D4xdata.standardize()` with `method='pooled'`.

        After `D4xdata.standardize()` with `method='indep_sessions'`, one should
        probably use `D4xdata.combine_samples()` instead to reverse the effects of
        `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
        effects of `D47data.split_samples()` with `grouping='by_sessions'` (because in
        that case session-averaged Δ4x values are statistically independent).
        '''
        unknowns_old = sorted({s for s in self.unknowns})
        CM_old = self.standardization.covar[:,:]
        VD_old = self.standardization.params.valuesdict().copy()
        vars_old = self.standardization.var_names

        unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})

        Ns = len(vars_old) - len(unknowns_old)
        vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
        VD_new = {k: VD_old[k] for k in vars_old[:Ns]}

        W = np.zeros((len(vars_new), len(vars_old)))
        W[:Ns,:Ns] = np.eye(Ns)
        for u in unknowns_new:
            splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
            if self.grouping == 'by_session':
                weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
            elif self.grouping == 'by_uid':
                weights = [1 for s in splits]
            sw = sum(weights)
            weights = [w/sw for w in weights]
            W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]

        CM_new = W @ CM_old @ W.T
        V = W @ np.array([[VD_old[k]] for k in vars_old])
        VD_new = {k:v[0] for k,v in zip(vars_new, V)}

        self.standardization.covar = CM_new
        self.standardization.params.valuesdict = lambda: VD_new
        self.standardization.var_names = vars_new

        for r in self:
            if r['Sample'] in self.unknowns:
                r['Sample_split'] = r['Sample']
                r['Sample'] = r['Sample_original']

        self.refresh_samples()
        self.consolidate_samples()
        self.repeatabilities()

        if tables:
            self.table_of_analyses()
            self.table_of_samples()

    def assign_timestamps(self):
        '''
        Assign a time field `t` of type `float` to each analysis.

        If `TimeTag` is one of the data fields, `t` is equal within a given session
        to `TimeTag` minus the mean value of `TimeTag` for that session.
        Otherwise, `TimeTag` is by default equal to the index of each analysis
        in the dataset and `t` is defined as above.
        '''
        for session in self.sessions:
            sdata = self.sessions[session]['data']
            try:
                t0 = np.mean([r['TimeTag'] for r in sdata])
                for r in sdata:
                    r['t'] = r['TimeTag'] - t0
            except KeyError:
                t0 = (len(sdata)-1)/2
                for t,r in enumerate(sdata):
                    r['t'] = t - t0


    def report(self):
        '''
        Prints a report on the standardization fit.
        Only applicable after `D4xdata.standardize(method='pooled')`.
        '''
        report_fit(self.standardization)


    def combine_samples(self, sample_groups):
        '''
        Combine analyses of different samples to compute weighted average Δ4x
        and new error (co)variances corresponding to the groups defined by the `sample_groups`
        dictionary.

        Caution: samples are weighted by number of replicate analyses, which is a
        reasonable default behavior but is not always optimal (e.g., in the case of strongly
        correlated analytical errors for one or more samples).

        Returns a tuple of:

        + the list of group names
        + an array of the corresponding Δ4x values
        + the corresponding (co)variance matrix

        **Parameters**

        + `sample_groups`: a dictionary of the form:
        ```py
        {'group1': ['sample_1', 'sample_2'],
         'group2': ['sample_3', 'sample_4', 'sample_5']}
        ```
        '''

        samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
        groups = sorted(sample_groups.keys())
        group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
        D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
        CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
        W = np.array([
            [self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
            for j in groups])
        D4x_new = W @ D4x_old
        CM_new = W @ CM_old @ W.T

        return groups, D4x_new[:,0], CM_new


    @make_verbal
    def standardize(self,
        method = 'pooled',
        weighted_sessions = [],
        consolidate = True,
        consolidate_tables = False,
        consolidate_plots = False,
        constraints = {},
        ):
        '''
        Compute absolute Δ4x values for all replicate analyses and for sample averages.
        If `method` argument is set to `'pooled'`, the standardization processes all sessions
        in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
        i.e. that their true Δ4x value does not change between sessions
        ([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
        `'indep_sessions'`, the standardization processes each session independently, based only
        on anchor analyses.
        '''

        self.standardization_method = method
        self.assign_timestamps()

        if method == 'pooled':
            if weighted_sessions:
                for session_group in weighted_sessions:
                    if self._4x == '47':
                        X = D47data([r for r in self if r['Session'] in session_group])
                    elif self._4x == '48':
                        X = D48data([r for r in self if r['Session'] in session_group])
                    X.Nominal_D4x = self.Nominal_D4x.copy()
                    X.refresh()
                    result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
                    w = np.sqrt(result.redchi)
                    self.msg(f'Session group {session_group} MRSWD = {w:.4f}')
                    for r in X:
                        r[f'wD{self._4x}raw'] *= w
            else:
                self.msg(f'All D{self._4x}raw weights set to 1 ‰')
                for r in self:
                    r[f'wD{self._4x}raw'] = 1.

            params = Parameters()
            for k,session in enumerate(self.sessions):
                self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
                self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
                self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
                s = pf(session)
                params.add(f'a_{s}', value = 0.9)
                params.add(f'b_{s}', value = 0.)
                params.add(f'c_{s}', value = -0.9)
                params.add(f'a2_{s}', value = 0.,
#                   vary = self.sessions[session]['scrambling_drift'],
                    )
                params.add(f'b2_{s}', value = 0.,
#                   vary = self.sessions[session]['slope_drift'],
                    )
                params.add(f'c2_{s}', value = 0.,
#                   vary = self.sessions[session]['wg_drift'],
                    )
                if not self.sessions[session]['scrambling_drift']:
                    params[f'a2_{s}'].expr = '0'
                if not self.sessions[session]['slope_drift']:
                    params[f'b2_{s}'].expr = '0'
                if not self.sessions[session]['wg_drift']:
                    params[f'c2_{s}'].expr = '0'

            for sample in self.unknowns:
                params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)

            for k in constraints:
                params[k].expr = constraints[k]

            def residuals(p):
                R = []
                for r in self:
                    session = pf(r['Session'])
                    sample = pf(r['Sample'])
                    if r['Sample'] in self.Nominal_D4x:
                        R += [ (
                            r[f'D{self._4x}raw'] - (
                                p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
                                + p[f'b_{session}'] * r[f'd{self._4x}']
                                + p[f'c_{session}']
                                + r['t'] * (
                                    p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
                                    + p[f'b2_{session}'] * r[f'd{self._4x}']
                                    + p[f'c2_{session}']
                                    )
                                )
                            ) / r[f'wD{self._4x}raw'] ]
                    else:
                        R += [ (
                            r[f'D{self._4x}raw'] - (
                                p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
                                + p[f'b_{session}'] * r[f'd{self._4x}']
                                + p[f'c_{session}']
                                + r['t'] * (
                                    p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
                                    + p[f'b2_{session}'] * r[f'd{self._4x}']
                                    + p[f'c2_{session}']
                                    )
                                )
                            ) / r[f'wD{self._4x}raw'] ]
                return R

            M = Minimizer(residuals, params)
            result = M.least_squares()
            self.Nf = result.nfree
            self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
            new_names, new_covar, new_se = _fullcovar(result)[:3]
            result.var_names = new_names
            result.covar = new_covar

            for r in self:
                s = pf(r["Session"])
                a = result.params.valuesdict()[f'a_{s}']
                b = result.params.valuesdict()[f'b_{s}']
                c = result.params.valuesdict()[f'c_{s}']
                a2 = result.params.valuesdict()[f'a2_{s}']
                b2 = result.params.valuesdict()[f'b2_{s}']
                c2 = result.params.valuesdict()[f'c2_{s}']
                r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])

            self.standardization = result

            for session in self.sessions:
                self.sessions[session]['Np'] = 3
                for k in ['scrambling', 'slope', 'wg']:
                    if self.sessions[session][f'{k}_drift']:
                        self.sessions[session]['Np'] += 1

            if consolidate:
                self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
            return result


        elif method == 'indep_sessions':

            if weighted_sessions:
                for session_group in weighted_sessions:
                    X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
                    X.Nominal_D4x = self.Nominal_D4x.copy()
                    X.refresh()
                    # This is only done to assign r['wD47raw'] for r in X:
                    X.standardize(method = method, weighted_sessions = [], consolidate = False)
                    self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
            else:
                self.msg('All weights set to 1 ‰')
                for r in self:
                    r[f'wD{self._4x}raw'] = 1

            for session in self.sessions:
                s = self.sessions[session]
                p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
                p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
                s['Np'] = sum(p_active)
                sdata = s['data']

                A = np.array([
                    [
                        self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
                        r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
                        1 / r[f'wD{self._4x}raw'],
                        self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
                        r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
                        r['t'] / r[f'wD{self._4x}raw']
                        ]
                    for r in sdata if r['Sample'] in self.anchors
                    ])[:,p_active] # only keep columns for the active parameters
                Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
                s['Na'] = Y.size
                CM = linalg.inv(A.T @ A)
                bf = (CM @ A.T @ Y).T[0,:]
                k = 0
                for n,a in zip(p_names, p_active):
                    if a:
                        s[n] = bf[k]
#                       self.msg(f'{n} = {bf[k]}')
                        k += 1
                    else:
                        s[n] = 0.
#                       self.msg(f'{n} = 0.0')

                for r in sdata:
                    a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
                    r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
                    r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])

                s['CM'] = np.zeros((6,6))
                i = 0
                k_active = [j for j,a in enumerate(p_active) if a]
                for j,a in enumerate(p_active):
                    if a:
                        s['CM'][j,k_active] = CM[i,:]
                        i += 1

            if not weighted_sessions:
                w = self.rmswd()['rmswd']
                for r in self:
                    r[f'wD{self._4x}'] *= w
                    r[f'wD{self._4x}raw'] *= w
                for session in self.sessions:
                    self.sessions[session]['CM'] *= w**2

            for session in self.sessions:
                s = self.sessions[session]
                s['SE_a'] = s['CM'][0,0]**.5
                s['SE_b'] = s['CM'][1,1]**.5
                s['SE_c'] = s['CM'][2,2]**.5
                s['SE_a2'] = s['CM'][3,3]**.5
                s['SE_b2'] = s['CM'][4,4]**.5
                s['SE_c2'] = s['CM'][5,5]**.5

            if not weighted_sessions:
                self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
            else:
                self.Nf = 0
                for sg in weighted_sessions:
                    self.Nf += self.rmswd(sessions = sg)['Nf']

            self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)

            avgD4x = {
                sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
                for sample in self.samples
                }
            chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
            rD4x = (chi2/self.Nf)**.5
            self.repeatability[f'sigma_{self._4x}'] = rD4x

            if consolidate:
                self.consolidate(tables = consolidate_tables, plots = consolidate_plots)


    def standardization_error(self, session, d4x, D4x, t = 0):
        '''
        Compute standardization error for a given session and
        (δ47, Δ47) composition.
        '''
        a = self.sessions[session]['a']
        b = self.sessions[session]['b']
        c = self.sessions[session]['c']
        a2 = self.sessions[session]['a2']
        b2 = self.sessions[session]['b2']
        c2 = self.sessions[session]['c2']
        CM = self.sessions[session]['CM']

        x, y = D4x, d4x
        z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
#       x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
        dxdy = -(b+b2*t) / (a+a2*t)
        dxdz = 1. / (a+a2*t)
        dxda = -x / (a+a2*t)
        dxdb = -y / (a+a2*t)
        dxdc = -1. / (a+a2*t)
/ (a+a2*t) 1814 dxda2 = -x * a2 / (a+a2*t) 1815 dxdb2 = -y * t / (a+a2*t) 1816 dxdc2 = -t / (a+a2*t) 1817 V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2]) 1818 sx = (V @ CM @ V.T) ** .5 1819 return sx 1820 1821 1822 @make_verbal 1823 def summary(self, 1824 dir = 'output', 1825 filename = None, 1826 save_to_file = True, 1827 print_out = True, 1828 ): 1829 ''' 1830 Print out an/or save to disk a summary of the standardization results. 1831 1832 **Parameters** 1833 1834 + `dir`: the directory in which to save the table 1835 + `filename`: the name to the csv file to write to 1836 + `save_to_file`: whether to save the table to disk 1837 + `print_out`: whether to print out the table 1838 ''' 1839 1840 out = [] 1841 out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]] 1842 out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]] 1843 out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]] 1844 out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]] 1845 out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]] 1846 out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]] 1847 out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]] 1848 out += [['Model degrees of freedom', f"{self.Nf}"]] 1849 out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]] 1850 out += [['Standardization method', self.standardization_method]] 1851 1852 if save_to_file: 1853 if not os.path.exists(dir): 1854 os.makedirs(dir) 1855 if filename is None: 1856 filename = f'D{self._4x}_summary.csv' 1857 with open(f'{dir}/{filename}', 'w') as fid: 1858 fid.write(make_csv(out)) 1859 if print_out: 1860 self.msg('\n' + pretty_table(out, header = 0)) 1861 1862 1863 @make_verbal 1864 def table_of_sessions(self, 1865 dir = 'output', 1866 filename = None, 1867 save_to_file = True, 1868 print_out = True, 1869 output = None, 1870 ): 1871 ''' 1872 Print out an/or save to disk a table of sessions. 
1873 1874 **Parameters** 1875 1876 + `dir`: the directory in which to save the table 1877 + `filename`: the name to the csv file to write to 1878 + `save_to_file`: whether to save the table to disk 1879 + `print_out`: whether to print out the table 1880 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 1881 if set to `'raw'`: return a list of list of strings 1882 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 1883 ''' 1884 include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions]) 1885 include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions]) 1886 include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions]) 1887 1888 out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']] 1889 if include_a2: 1890 out[-1] += ['a2 ± SE'] 1891 if include_b2: 1892 out[-1] += ['b2 ± SE'] 1893 if include_c2: 1894 out[-1] += ['c2 ± SE'] 1895 for session in self.sessions: 1896 out += [[ 1897 session, 1898 f"{self.sessions[session]['Na']}", 1899 f"{self.sessions[session]['Nu']}", 1900 f"{self.sessions[session]['d13Cwg_VPDB']:.3f}", 1901 f"{self.sessions[session]['d18Owg_VSMOW']:.3f}", 1902 f"{self.sessions[session]['r_d13C_VPDB']:.4f}", 1903 f"{self.sessions[session]['r_d18O_VSMOW']:.4f}", 1904 f"{self.sessions[session][f'r_D{self._4x}']:.4f}", 1905 f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}", 1906 f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}", 1907 f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}", 1908 ]] 1909 if include_a2: 1910 if self.sessions[session]['scrambling_drift']: 1911 out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"] 1912 else: 1913 out[-1] += [''] 1914 if include_b2: 1915 if self.sessions[session]['slope_drift']: 1916 out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"] 1917 else: 1918 out[-1] += [''] 1919 if include_c2: 1920 if self.sessions[session]['wg_drift']: 1921 out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"] 1922 else: 1923 out[-1] += [''] 1924 1925 if save_to_file: 1926 if not os.path.exists(dir): 1927 os.makedirs(dir) 1928 if filename is None: 1929 filename = f'D{self._4x}_sessions.csv' 1930 with open(f'{dir}/{filename}', 'w') as fid: 1931 fid.write(make_csv(out)) 1932 if print_out: 1933 self.msg('\n' + pretty_table(out)) 1934 if output == 'raw': 1935 return out 1936 elif output == 'pretty': 1937 return pretty_table(out) 1938 1939 1940 @make_verbal 1941 def table_of_analyses( 1942 self, 1943 dir = 'output', 1944 filename = None, 1945 save_to_file = True, 1946 print_out = True, 1947 output = None, 1948 ): 1949 ''' 1950 Print out an/or save to disk a table of analyses. 
1951 1952 **Parameters** 1953 1954 + `dir`: the directory in which to save the table 1955 + `filename`: the name to the csv file to write to 1956 + `save_to_file`: whether to save the table to disk 1957 + `print_out`: whether to print out the table 1958 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 1959 if set to `'raw'`: return a list of list of strings 1960 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 1961 ''' 1962 1963 out = [['UID','Session','Sample']] 1964 extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}] 1965 for f in extra_fields: 1966 out[-1] += [f[0]] 1967 out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}'] 1968 for r in self: 1969 out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]] 1970 for f in extra_fields: 1971 out[-1] += [f"{r[f[0]]:{f[1]}}"] 1972 out[-1] += [ 1973 f"{r['d13Cwg_VPDB']:.3f}", 1974 f"{r['d18Owg_VSMOW']:.3f}", 1975 f"{r['d45']:.6f}", 1976 f"{r['d46']:.6f}", 1977 f"{r['d47']:.6f}", 1978 f"{r['d48']:.6f}", 1979 f"{r['d49']:.6f}", 1980 f"{r['d13C_VPDB']:.6f}", 1981 f"{r['d18O_VSMOW']:.6f}", 1982 f"{r['D47raw']:.6f}", 1983 f"{r['D48raw']:.6f}", 1984 f"{r['D49raw']:.6f}", 1985 f"{r[f'D{self._4x}']:.6f}" 1986 ] 1987 if save_to_file: 1988 if not os.path.exists(dir): 1989 os.makedirs(dir) 1990 if filename is None: 1991 filename = f'D{self._4x}_analyses.csv' 1992 with open(f'{dir}/{filename}', 'w') as fid: 1993 fid.write(make_csv(out)) 1994 if print_out: 1995 self.msg('\n' + pretty_table(out)) 1996 return out 1997 1998 @make_verbal 1999 def covar_table( 2000 self, 2001 correl = False, 2002 dir = 'output', 2003 filename = None, 2004 save_to_file = True, 2005 print_out = True, 2006 output = None, 2007 ): 2008 ''' 2009 Print out, save to disk and/or return the variance-covariance matrix of D4x 2010 for all unknown samples. 2011 2012 **Parameters** 2013 2014 + `dir`: the directory in which to save the csv 2015 + `filename`: the name of the csv file to write to 2016 + `save_to_file`: whether to save the csv 2017 + `print_out`: whether to print out the matrix 2018 + `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`); 2019 if set to `'raw'`: return a list of list of strings 2020 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 2021 ''' 2022 samples = sorted([u for u in self.unknowns]) 2023 out = [[''] + samples] 2024 for s1 in samples: 2025 out.append([s1]) 2026 for s2 in samples: 2027 if correl: 2028 out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}') 2029 else: 2030 out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}') 2031 2032 if save_to_file: 2033 if not os.path.exists(dir): 2034 os.makedirs(dir) 2035 if filename is None: 2036 if correl: 2037 filename = f'D{self._4x}_correl.csv' 2038 else: 2039 filename = f'D{self._4x}_covar.csv' 2040 with open(f'{dir}/{filename}', 'w') as fid: 2041 fid.write(make_csv(out)) 2042 if print_out: 2043 self.msg('\n'+pretty_table(out)) 2044 if output == 'raw': 2045 return out 2046 elif output == 'pretty': 2047 return pretty_table(out) 2048 2049 @make_verbal 2050 def table_of_samples( 2051 self, 2052 dir = 'output', 2053 filename = None, 2054 save_to_file = True, 2055 print_out = True, 2056 output = None, 2057 ): 2058 ''' 2059 Print out, save to disk and/or return a table of samples. 
2060 2061 **Parameters** 2062 2063 + `dir`: the directory in which to save the csv 2064 + `filename`: the name of the csv file to write to 2065 + `save_to_file`: whether to save the csv 2066 + `print_out`: whether to print out the table 2067 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 2068 if set to `'raw'`: return a list of list of strings 2069 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 2070 ''' 2071 2072 out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']] 2073 for sample in self.anchors: 2074 out += [[ 2075 f"{sample}", 2076 f"{self.samples[sample]['N']}", 2077 f"{self.samples[sample]['d13C_VPDB']:.2f}", 2078 f"{self.samples[sample]['d18O_VSMOW']:.2f}", 2079 f"{self.samples[sample][f'D{self._4x}']:.4f}",'','', 2080 f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', '' 2081 ]] 2082 for sample in self.unknowns: 2083 out += [[ 2084 f"{sample}", 2085 f"{self.samples[sample]['N']}", 2086 f"{self.samples[sample]['d13C_VPDB']:.2f}", 2087 f"{self.samples[sample]['d18O_VSMOW']:.2f}", 2088 f"{self.samples[sample][f'D{self._4x}']:.4f}", 2089 f"{self.samples[sample][f'SE_D{self._4x}']:.4f}", 2090 f"± {self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}", 2091 f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', 2092 f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else '' 2093 ]] 2094 if save_to_file: 2095 if not os.path.exists(dir): 2096 os.makedirs(dir) 2097 if filename is None: 2098 filename = f'D{self._4x}_samples.csv' 2099 with open(f'{dir}/{filename}', 'w') as fid: 2100 fid.write(make_csv(out)) 2101 if print_out: 2102 self.msg('\n'+pretty_table(out)) 2103 if output == 'raw': 2104 return out 2105 elif output == 'pretty': 2106 return pretty_table(out) 2107 2108 2109 def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100): 2110 ''' 2111 Generate session plots and save them to disk. 2112 2113 **Parameters** 2114 2115 + `dir`: the directory in which to save the plots 2116 + `figsize`: the width and height (in inches) of each plot 2117 + `filetype`: 'pdf' or 'png' 2118 + `dpi`: resolution for PNG output 2119 ''' 2120 if not os.path.exists(dir): 2121 os.makedirs(dir) 2122 2123 for session in self.sessions: 2124 sp = self.plot_single_session(session, xylimits = 'constant') 2125 ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {})) 2126 ppl.close(sp.fig) 2127 2128 2129 2130 @make_verbal 2131 def consolidate_samples(self): 2132 ''' 2133 Compile various statistics for each sample. 
2134 2135 For each anchor sample: 2136 2137 + `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x` 2138 + `SE_D47` or `SE_D48`: set to zero by definition 2139 2140 For each unknown sample: 2141 2142 + `D47` or `D48`: the standardized Δ4x value for this unknown 2143 + `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown 2144 2145 For each anchor and unknown: 2146 2147 + `N`: the total number of analyses of this sample 2148 + `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample 2149 + `d13C_VPDB`: the average δ13C_VPDB value for this sample 2150 + `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2) 2151 + `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal 2152 variance, indicating whether the Δ4x repeatability this sample differs significantly from 2153 that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`. 2154 ''' 2155 D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']] 2156 for sample in self.samples: 2157 self.samples[sample]['N'] = len(self.samples[sample]['data']) 2158 if self.samples[sample]['N'] > 1: 2159 self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']]) 2160 2161 self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']]) 2162 self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']]) 2163 2164 D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']] 2165 if len(D4x_pop) > 2: 2166 self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1] 2167 2168 if self.standardization_method == 'pooled': 2169 for sample in self.anchors: 2170 self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample] 2171 self.samples[sample][f'SE_D{self._4x}'] = 0. 2172 for sample in self.unknowns: 2173 self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}'] 2174 try: 2175 self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5 2176 except ValueError: 2177 # when `sample` is constrained by self.standardize(constraints = {...}), 2178 # it is no longer listed in self.standardization.var_names. 2179 # Temporary fix: define SE as zero for now 2180 self.samples[sample][f'SE_D4{self._4x}'] = 0. 2181 2182 elif self.standardization_method == 'indep_sessions': 2183 for sample in self.anchors: 2184 self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample] 2185 self.samples[sample][f'SE_D{self._4x}'] = 0. 2186 for sample in self.unknowns: 2187 self.msg(f'Consolidating sample {sample}') 2188 self.unknowns[sample][f'session_D{self._4x}'] = {} 2189 session_avg = [] 2190 for session in self.sessions: 2191 sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample] 2192 if sdata: 2193 self.msg(f'{sample} found in session {session}') 2194 avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata]) 2195 avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata]) 2196 # !! 
TODO: sigma_s below does not account for temporal changes in standardization error 2197 sigma_s = self.standardization_error(session, avg_d4x, avg_D4x) 2198 sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5 2199 session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5]) 2200 self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1] 2201 self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg)) 2202 weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']} 2203 wsum = sum([weights[s] for s in weights]) 2204 for s in weights: 2205 self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum] 2206 2207 for r in self: 2208 r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'] 2209 2210 2211 2212 def consolidate_sessions(self): 2213 ''' 2214 Compute various statistics for each session. 2215 2216 + `Na`: Number of anchor analyses in the session 2217 + `Nu`: Number of unknown analyses in the session 2218 + `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session 2219 + `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session 2220 + `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session 2221 + `a`: scrambling factor 2222 + `b`: compositional slope 2223 + `c`: WG offset 2224 + `SE_a`: Model stadard erorr of `a` 2225 + `SE_b`: Model stadard erorr of `b` 2226 + `SE_c`: Model stadard erorr of `c` 2227 + `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`) 2228 + `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`) 2229 + `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`) 2230 + `a2`: scrambling factor drift 2231 + `b2`: compositional slope drift 2232 + `c2`: WG offset drift 2233 + `Np`: Number of standardization parameters to fit 2234 + `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`) 2235 + `d13Cwg_VPDB`: δ13C_VPDB of WG 2236 + `d18Owg_VSMOW`: δ18O_VSMOW of WG 2237 ''' 2238 for session in self.sessions: 2239 if 'd13Cwg_VPDB' not in self.sessions[session]: 2240 self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB'] 2241 if 'd18Owg_VSMOW' not in self.sessions[session]: 2242 self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW'] 2243 self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]) 2244 self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]) 2245 2246 self.msg(f'Computing repeatabilities for session {session}') 2247 self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session]) 2248 self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session]) 2249 self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session]) 2250 2251 if self.standardization_method == 'pooled': 2252 for session in self.sessions: 2253 2254 # different (better?) 
computation of D4x repeatability for each session: 2255 sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']] 2256 self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5 2257 2258 self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}'] 2259 i = self.standardization.var_names.index(f'a_{pf(session)}') 2260 self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5 2261 2262 self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}'] 2263 i = self.standardization.var_names.index(f'b_{pf(session)}') 2264 self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5 2265 2266 self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}'] 2267 i = self.standardization.var_names.index(f'c_{pf(session)}') 2268 self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5 2269 2270 self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}'] 2271 if self.sessions[session]['scrambling_drift']: 2272 i = self.standardization.var_names.index(f'a2_{pf(session)}') 2273 self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5 2274 else: 2275 self.sessions[session]['SE_a2'] = 0. 2276 2277 self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}'] 2278 if self.sessions[session]['slope_drift']: 2279 i = self.standardization.var_names.index(f'b2_{pf(session)}') 2280 self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5 2281 else: 2282 self.sessions[session]['SE_b2'] = 0. 2283 2284 self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}'] 2285 if self.sessions[session]['wg_drift']: 2286 i = self.standardization.var_names.index(f'c2_{pf(session)}') 2287 self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5 2288 else: 2289 self.sessions[session]['SE_c2'] = 0. 
2290 2291 i = self.standardization.var_names.index(f'a_{pf(session)}') 2292 j = self.standardization.var_names.index(f'b_{pf(session)}') 2293 k = self.standardization.var_names.index(f'c_{pf(session)}') 2294 CM = np.zeros((6,6)) 2295 CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]] 2296 try: 2297 i2 = self.standardization.var_names.index(f'a2_{pf(session)}') 2298 CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]] 2299 CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2] 2300 try: 2301 j2 = self.standardization.var_names.index(f'b2_{pf(session)}') 2302 CM[3,4] = self.standardization.covar[i2,j2] 2303 CM[4,3] = self.standardization.covar[j2,i2] 2304 except ValueError: 2305 pass 2306 try: 2307 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2308 CM[3,5] = self.standardization.covar[i2,k2] 2309 CM[5,3] = self.standardization.covar[k2,i2] 2310 except ValueError: 2311 pass 2312 except ValueError: 2313 pass 2314 try: 2315 j2 = self.standardization.var_names.index(f'b2_{pf(session)}') 2316 CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]] 2317 CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2] 2318 try: 2319 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2320 CM[4,5] = self.standardization.covar[j2,k2] 2321 CM[5,4] = self.standardization.covar[k2,j2] 2322 except ValueError: 2323 pass 2324 except ValueError: 2325 pass 2326 try: 2327 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2328 CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]] 2329 CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2] 2330 except ValueError: 2331 pass 2332 2333 self.sessions[session]['CM'] = CM 2334 2335 elif self.standardization_method == 'indep_sessions': 2336 pass # Not implemented yet 2337 2338 2339 @make_verbal 2340 def repeatabilities(self): 2341 ''' 2342 Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x 2343 (for all samples, for anchors, and for unknowns). 2344 ''' 2345 self.msg('Computing reproducibilities for all sessions') 2346 2347 self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors') 2348 self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors') 2349 self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors') 2350 self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns') 2351 self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples') 2352 2353 2354 @make_verbal 2355 def consolidate(self, tables = True, plots = True): 2356 ''' 2357 Collect information about samples, sessions and repeatabilities. 2358 ''' 2359 self.consolidate_samples() 2360 self.consolidate_sessions() 2361 self.repeatabilities() 2362 2363 if tables: 2364 self.summary() 2365 self.table_of_sessions() 2366 self.table_of_analyses() 2367 self.table_of_samples() 2368 2369 if plots: 2370 self.plot_sessions() 2371 2372 2373 @make_verbal 2374 def rmswd(self, 2375 samples = 'all samples', 2376 sessions = 'all sessions', 2377 ): 2378 ''' 2379 Compute the χ2, root mean squared weighted deviation 2380 (i.e. reduced χ2), and corresponding degrees of freedom of the 2381 Δ4x values for samples in `samples` and sessions in `sessions`. 2382 2383 Only used in `D4xdata.standardize()` with `method='indep_sessions'`. 
2384 ''' 2385 if samples == 'all samples': 2386 mysamples = [k for k in self.samples] 2387 elif samples == 'anchors': 2388 mysamples = [k for k in self.anchors] 2389 elif samples == 'unknowns': 2390 mysamples = [k for k in self.unknowns] 2391 else: 2392 mysamples = samples 2393 2394 if sessions == 'all sessions': 2395 sessions = [k for k in self.sessions] 2396 2397 chisq, Nf = 0, 0 2398 for sample in mysamples : 2399 G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2400 if len(G) > 1 : 2401 X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G]) 2402 Nf += (len(G) - 1) 2403 chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G]) 2404 r = (chisq / Nf)**.5 if Nf > 0 else 0 2405 self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.') 2406 return {'rmswd': r, 'chisq': chisq, 'Nf': Nf} 2407 2408 2409 @make_verbal 2410 def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'): 2411 ''' 2412 Compute the repeatability of `[r[key] for r in self]` 2413 ''' 2414 2415 if samples == 'all samples': 2416 mysamples = [k for k in self.samples] 2417 elif samples == 'anchors': 2418 mysamples = [k for k in self.anchors] 2419 elif samples == 'unknowns': 2420 mysamples = [k for k in self.unknowns] 2421 else: 2422 mysamples = samples 2423 2424 if sessions == 'all sessions': 2425 sessions = [k for k in self.sessions] 2426 2427 if key in ['D47', 'D48']: 2428 # Full disclosure: the definition of Nf is tricky/debatable 2429 G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions] 2430 chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum() 2431 Nf = len(G) 2432# print(f'len(G) = {Nf}') 2433 Nf -= len([s for s in mysamples if s in self.unknowns]) 2434# print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider') 2435 for session in sessions: 2436 Np = len([ 2437 _ for _ in self.standardization.params 2438 if ( 2439 self.standardization.params[_].expr is not None 2440 and ( 2441 (_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session)) 2442 or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session)) 2443 ) 2444 ) 2445 ]) 2446# print(f'session {session}: {Np} parameters to consider') 2447 Na = len({ 2448 r['Sample'] for r in self.sessions[session]['data'] 2449 if r['Sample'] in self.anchors and r['Sample'] in mysamples 2450 }) 2451# print(f'session {session}: {Na} different anchors in that session') 2452 Nf -= min(Np, Na) 2453# print(f'Nf = {Nf}') 2454 2455# for sample in mysamples : 2456# X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2457# if len(X) > 1 : 2458# chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ]) 2459# if sample in self.unknowns: 2460# Nf += len(X) - 1 2461# else: 2462# Nf += len(X) 2463# if samples in ['anchors', 'all samples']: 2464# Nf -= sum([self.sessions[s]['Np'] for s in sessions]) 2465 r = (chisq / Nf)**.5 if Nf > 0 else 0 2466 2467 else: # if key not in ['D47', 'D48'] 2468 chisq, Nf = 0, 0 2469 for sample in mysamples : 2470 X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2471 if len(X) > 1 : 2472 Nf += len(X) - 1 2473 chisq += np.sum([ (x-np.mean(X))**2 for x in X ]) 2474 r = (chisq / Nf)**.5 if Nf > 0 else 0 2475 2476 self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.') 2477 return r 2478 2479 def sample_average(self, samples, weights = 'equal', normalize = True): 2480 ''' 2481 Weighted average Δ4x value of a group of samples, 
accounting for covariance. 2482 2483 Returns the weighed average Δ4x value and associated SE 2484 of a group of samples. Weights are equal by default. If `normalize` is 2485 true, `weights` will be rescaled so that their sum equals 1. 2486 2487 **Examples** 2488 2489 ```python 2490 self.sample_average(['X','Y'], [1, 2]) 2491 ``` 2492 2493 returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3, 2494 where Δ4x(X) and Δ4x(Y) are the average Δ4x 2495 values of samples X and Y, respectively. 2496 2497 ```python 2498 self.sample_average(['X','Y'], [1, -1], normalize = False) 2499 ``` 2500 2501 returns the value and SE of the difference Δ4x(X) - Δ4x(Y). 2502 ''' 2503 if weights == 'equal': 2504 weights = [1/len(samples)] * len(samples) 2505 2506 if normalize: 2507 s = sum(weights) 2508 if s: 2509 weights = [w/s for w in weights] 2510 2511 try: 2512# indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples] 2513# C = self.standardization.covar[indices,:][:,indices] 2514 C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples]) 2515 X = [self.samples[sample][f'D{self._4x}'] for sample in samples] 2516 return correlated_sum(X, C, weights) 2517 except ValueError: 2518 return (0., 0.) 2519 2520 2521 def sample_D4x_covar(self, sample1, sample2 = None): 2522 ''' 2523 Covariance between Δ4x values of samples 2524 2525 Returns the error covariance between the average Δ4x values of two 2526 samples. If if only `sample_1` is specified, or if `sample_1 == sample_2`), 2527 returns the Δ4x variance for that sample. 2528 ''' 2529 if sample2 is None: 2530 sample2 = sample1 2531 if self.standardization_method == 'pooled': 2532 i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}') 2533 j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}') 2534 return self.standardization.covar[i, j] 2535 elif self.standardization_method == 'indep_sessions': 2536 if sample1 == sample2: 2537 return self.samples[sample1][f'SE_D{self._4x}']**2 2538 else: 2539 c = 0 2540 for session in self.sessions: 2541 sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1] 2542 sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2] 2543 if sdata1 and sdata2: 2544 a = self.sessions[session]['a'] 2545 # !! TODO: CM below does not account for temporal changes in standardization parameters 2546 CM = self.sessions[session]['CM'][:3,:3] 2547 avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1]) 2548 avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1]) 2549 avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2]) 2550 avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2]) 2551 c += ( 2552 self.unknowns[sample1][f'session_D{self._4x}'][session][2] 2553 * self.unknowns[sample2][f'session_D{self._4x}'][session][2] 2554 * np.array([[avg_D4x_1, avg_d4x_1, 1]]) 2555 @ CM 2556 @ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T 2557 ) / a**2 2558 return float(c) 2559 2560 def sample_D4x_correl(self, sample1, sample2 = None): 2561 ''' 2562 Correlation between Δ4x errors of samples 2563 2564 Returns the error correlation between the average Δ4x values of two samples. 2565 ''' 2566 if sample2 is None or sample2 == sample1: 2567 return 1. 
2568 return ( 2569 self.sample_D4x_covar(sample1, sample2) 2570 / self.unknowns[sample1][f'SE_D{self._4x}'] 2571 / self.unknowns[sample2][f'SE_D{self._4x}'] 2572 ) 2573 2574 def plot_single_session(self, 2575 session, 2576 kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4), 2577 kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4), 2578 kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75), 2579 kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75), 2580 kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75), 2581 xylimits = 'free', # | 'constant' 2582 x_label = None, 2583 y_label = None, 2584 error_contour_interval = 'auto', 2585 fig = 'new', 2586 ): 2587 ''' 2588 Generate plot for a single session 2589 ''' 2590 if x_label is None: 2591 x_label = f'δ$_{{{self._4x}}}$ (‰)' 2592 if y_label is None: 2593 y_label = f'Δ$_{{{self._4x}}}$ (‰)' 2594 2595 out = _SessionPlot() 2596 anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]] 2597 unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]] 2598 anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors] 2599 anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors] 2600 unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns] 2601 unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns] 2602 anchor_avg = (np.array([ np.array([ 2603 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2604 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2605 ]) for sample in anchors]).T, 2606 np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T) 2607 unknown_avg = (np.array([ np.array([ 2608 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2609 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2610 ]) for sample in unknowns]).T, 2611 np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T) 2612 2613 2614 if fig == 'new': 2615 out.fig = ppl.figure(figsize = (6,6)) 2616 ppl.subplots_adjust(.1,.1,.9,.9) 2617 2618 out.anchor_analyses, = ppl.plot( 2619 anchors_d, 2620 anchors_D, 2621 **kw_plot_anchors) 2622 out.unknown_analyses, = ppl.plot( 2623 unknowns_d, 2624 unknowns_D, 2625 **kw_plot_unknowns) 2626 out.anchor_avg = ppl.plot( 2627 *anchor_avg, 2628 **kw_plot_anchor_avg) 2629 out.unknown_avg = ppl.plot( 2630 *unknown_avg, 2631 **kw_plot_unknown_avg) 2632 if xylimits == 'constant': 2633 x = [r[f'd{self._4x}'] for r in self] 2634 y = [r[f'D{self._4x}'] for r in self] 2635 x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y) 2636 w, h = x2-x1, y2-y1 2637 x1 -= w/20 2638 x2 += w/20 2639 y1 -= h/20 2640 y2 += h/20 2641 ppl.axis([x1, x2, y1, y2]) 2642 elif xylimits == 'free': 2643 x1, x2, y1, y2 = ppl.axis() 2644 else: 2645 x1, x2, y1, y2 = ppl.axis(xylimits) 2646 2647 if error_contour_interval != 'none': 2648 xi, yi = np.linspace(x1, x2), np.linspace(y1, y2) 2649 XI,YI = np.meshgrid(xi, yi) 2650 SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi]) 2651 if 
error_contour_interval == 'auto': 2652 rng = np.max(SI) - np.min(SI) 2653 if rng <= 0.01: 2654 cinterval = 0.001 2655 elif rng <= 0.03: 2656 cinterval = 0.004 2657 elif rng <= 0.1: 2658 cinterval = 0.01 2659 elif rng <= 0.3: 2660 cinterval = 0.03 2661 elif rng <= 1.: 2662 cinterval = 0.1 2663 else: 2664 cinterval = 0.5 2665 else: 2666 cinterval = error_contour_interval 2667 2668 cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval) 2669 out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error) 2670 out.clabel = ppl.clabel(out.contour) 2671 contour = (XI, YI, SI, cval, cinterval) 2672 2673 if fig == None: 2674 return { 2675 'anchors':anchors, 2676 'unknowns':unknowns, 2677 'anchors_d':anchors_d, 2678 'anchors_D':anchors_D, 2679 'unknowns_d':unknowns_d, 2680 'unknowns_D':unknowns_D, 2681 'anchor_avg':anchor_avg, 2682 'unknown_avg':unknown_avg, 2683 'contour':contour, 2684 } 2685 2686 ppl.xlabel(x_label) 2687 ppl.ylabel(y_label) 2688 ppl.title(session, weight = 'bold') 2689 ppl.grid(alpha = .2) 2690 out.ax = ppl.gca() 2691 2692 return out 2693 2694 def plot_residuals( 2695 self, 2696 kde = False, 2697 hist = False, 2698 binwidth = 2/3, 2699 dir = 'output', 2700 filename = None, 2701 highlight = [], 2702 colors = None, 2703 figsize = None, 2704 dpi = 100, 2705 yspan = None, 2706 ): 2707 ''' 2708 Plot residuals of each analysis as a function of time (actually, as a function of 2709 the order of analyses in the `D4xdata` object) 2710 2711 + `kde`: whether to add a kernel density estimate of residuals 2712 + `hist`: whether to add a histogram of residuals (incompatible with `kde`) 2713 + `histbins`: specify bin edges for the histogram 2714 + `dir`: the directory in which to save the plot 2715 + `highlight`: a list of samples to highlight 2716 + `colors`: a dict of `{<sample>: <color>}` for all samples 2717 + `figsize`: (width, height) of figure 2718 + `dpi`: resolution for PNG output 2719 + `yspan`: factor controlling the range of y values shown in plot 2720 (by default: `yspan = 1.5 if kde else 1.0`) 2721 ''' 2722 2723 from matplotlib import ticker 2724 2725 if yspan is None: 2726 if kde: 2727 yspan = 1.5 2728 else: 2729 yspan = 1.0 2730 2731 # Layout 2732 fig = ppl.figure(figsize = (8,4) if figsize is None else figsize) 2733 if hist or kde: 2734 ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72) 2735 ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15) 2736 else: 2737 ppl.subplots_adjust(.08,.05,.78,.8) 2738 ax1 = ppl.subplot(111) 2739 2740 # Colors 2741 N = len(self.anchors) 2742 if colors is None: 2743 if len(highlight) > 0: 2744 Nh = len(highlight) 2745 if Nh == 1: 2746 colors = {highlight[0]: (0,0,0)} 2747 elif Nh == 3: 2748 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])} 2749 elif Nh == 4: 2750 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2751 else: 2752 colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)} 2753 else: 2754 if N == 3: 2755 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])} 2756 elif N == 4: 2757 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2758 else: 2759 colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)} 2760 2761 ppl.sca(ax1) 2762 2763 ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75) 2764 2765 ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$')) 2766 2767 session = 
self[0]['Session'] 2768 x1 = 0 2769# ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self]) 2770 x_sessions = {} 2771 one_or_more_singlets = False 2772 one_or_more_multiplets = False 2773 multiplets = set() 2774 for k,r in enumerate(self): 2775 if r['Session'] != session: 2776 x2 = k-1 2777 x_sessions[session] = (x1+x2)/2 2778 ppl.axvline(k - 0.5, color = 'k', lw = .5) 2779 session = r['Session'] 2780 x1 = k 2781 singlet = len(self.samples[r['Sample']]['data']) == 1 2782 if not singlet: 2783 multiplets.add(r['Sample']) 2784 if r['Sample'] in self.unknowns: 2785 if singlet: 2786 one_or_more_singlets = True 2787 else: 2788 one_or_more_multiplets = True 2789 kw = dict( 2790 marker = 'x' if singlet else '+', 2791 ms = 4 if singlet else 5, 2792 ls = 'None', 2793 mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0), 2794 mew = 1, 2795 alpha = 0.2 if singlet else 1, 2796 ) 2797 if highlight and r['Sample'] not in highlight: 2798 kw['alpha'] = 0.2 2799 ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw) 2800 x2 = k 2801 x_sessions[session] = (x1+x2)/2 2802 2803 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1) 2804 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1) 2805 if not (hist or kde): 2806 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center') 2807 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f" 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center') 2808 2809 xmin, xmax, ymin, ymax = ppl.axis() 2810 if yspan != 1: 2811 ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2 2812 for s in x_sessions: 2813 ppl.text( 2814 x_sessions[s], 2815 ymax +1, 2816 s, 2817 va = 'bottom', 2818 **( 2819 dict(ha = 'center') 2820 if len(self.sessions[s]['data']) > (0.15 * len(self)) 2821 else dict(ha = 'left', rotation = 45) 2822 ) 2823 ) 2824 2825 if hist or kde: 2826 ppl.sca(ax2) 2827 2828 for s in colors: 2829 kw['marker'] = '+' 2830 kw['ms'] = 5 2831 kw['mec'] = colors[s] 2832 kw['label'] = s 2833 kw['alpha'] = 1 2834 ppl.plot([], [], **kw) 2835 2836 kw['mec'] = (0,0,0) 2837 2838 if one_or_more_singlets: 2839 kw['marker'] = 'x' 2840 kw['ms'] = 4 2841 kw['alpha'] = .2 2842 kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other' 2843 ppl.plot([], [], **kw) 2844 2845 if one_or_more_multiplets: 2846 kw['marker'] = '+' 2847 kw['ms'] = 4 2848 kw['alpha'] = 1 2849 kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other' 2850 ppl.plot([], [], **kw) 2851 2852 if hist or kde: 2853 leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9) 2854 else: 2855 leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5) 2856 leg.set_zorder(-1000) 2857 2858 ppl.sca(ax1) 2859 2860 ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)') 2861 ppl.xticks([]) 2862 ppl.axis([-1, len(self), None, None]) 2863 2864 if hist or kde: 2865 ppl.sca(ax2) 2866 X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors]) 2867 2868 if kde: 2869 from scipy.stats import 
gaussian_kde 2870 yi = np.linspace(ymin, ymax, 201) 2871 xi = gaussian_kde(X).evaluate(yi) 2872 ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1)) 2873# ppl.plot(xi, yi, 'k-', lw = 1) 2874 elif hist: 2875 ppl.hist( 2876 X, 2877 orientation = 'horizontal', 2878 histtype = 'stepfilled', 2879 ec = [.4]*3, 2880 fc = [.25]*3, 2881 alpha = .25, 2882 bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)), 2883 ) 2884 ppl.text(0, 0, 2885 f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", 2886 size = 7.5, 2887 alpha = 1, 2888 va = 'center', 2889 ha = 'left', 2890 ) 2891 2892 ppl.axis([0, None, ymin, ymax]) 2893 ppl.xticks([]) 2894 ppl.yticks([]) 2895# ax2.spines['left'].set_visible(False) 2896 ax2.spines['right'].set_visible(False) 2897 ax2.spines['top'].set_visible(False) 2898 ax2.spines['bottom'].set_visible(False) 2899 2900 ax1.axis([None, None, ymin, ymax]) 2901 2902 if not os.path.exists(dir): 2903 os.makedirs(dir) 2904 if filename is None: 2905 return fig 2906 elif filename == '': 2907 filename = f'D{self._4x}_residuals.pdf' 2908 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 2909 ppl.close(fig) 2910 2911 2912 def simulate(self, *args, **kwargs): 2913 ''' 2914 Legacy function with warning message pointing to `virtual_data()` 2915 ''' 2916 raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()') 2917 2918 def plot_distribution_of_analyses( 2919 self, 2920 dir = 'output', 2921 filename = None, 2922 vs_time = False, 2923 figsize = (6,4), 2924 subplots_adjust = (0.02, 0.13, 0.85, 0.8), 2925 output = None, 2926 dpi = 100, 2927 ): 2928 ''' 2929 Plot temporal distribution of all analyses in the data set. 2930 2931 **Parameters** 2932 2933 + `dir`: the directory in which to save the plot 2934 + `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially. 
2935 + `dpi`: resolution for PNG output 2936 + `figsize`: (width, height) of figure 2937 + `dpi`: resolution for PNG output 2938 ''' 2939 2940 asamples = [s for s in self.anchors] 2941 usamples = [s for s in self.unknowns] 2942 if output is None or output == 'fig': 2943 fig = ppl.figure(figsize = figsize) 2944 ppl.subplots_adjust(*subplots_adjust) 2945 Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)]) 2946 Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)]) 2947 Xmax += (Xmax-Xmin)/40 2948 Xmin -= (Xmax-Xmin)/41 2949 for k, s in enumerate(asamples + usamples): 2950 if vs_time: 2951 X = [r['TimeTag'] for r in self if r['Sample'] == s] 2952 else: 2953 X = [x for x,r in enumerate(self) if r['Sample'] == s] 2954 Y = [-k for x in X] 2955 ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75) 2956 ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25) 2957 ppl.text(Xmax, -k, f' {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r') 2958 ppl.axis([Xmin, Xmax, -k-1, 1]) 2959 ppl.xlabel('\ntime') 2960 ppl.gca().annotate('', 2961 xy = (0.6, -0.02), 2962 xycoords = 'axes fraction', 2963 xytext = (.4, -0.02), 2964 arrowprops = dict(arrowstyle = "->", color = 'k'), 2965 ) 2966 2967 2968 x2 = -1 2969 for session in self.sessions: 2970 x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session]) 2971 if vs_time: 2972 ppl.axvline(x1, color = 'k', lw = .75) 2973 if x2 > -1: 2974 if not vs_time: 2975 ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5) 2976 x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session]) 2977# from xlrd import xldate_as_datetime 2978# print(session, xldate_as_datetime(x1, 0), xldate_as_datetime(x2, 0)) 2979 if vs_time: 2980 ppl.axvline(x2, color = 'k', lw = .75) 2981 ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15) 2982 ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8) 2983 2984 ppl.xticks([]) 2985 ppl.yticks([]) 2986 2987 if output is None: 2988 if not os.path.exists(dir): 2989 os.makedirs(dir) 2990 if filename == None: 2991 filename = f'D{self._4x}_distribution_of_analyses.pdf' 2992 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 2993 ppl.close(fig) 2994 elif output == 'ax': 2995 return ppl.gca() 2996 elif output == 'fig': 2997 return fig 2998 2999 3000 def plot_bulk_compositions( 3001 self, 3002 samples = None, 3003 dir = 'output/bulk_compositions', 3004 figsize = (6,6), 3005 subplots_adjust = (0.15, 0.12, 0.95, 0.92), 3006 show = False, 3007 sample_color = (0,.5,1), 3008 analysis_color = (.7,.7,.7), 3009 labeldist = 0.3, 3010 radius = 0.05, 3011 ): 3012 ''' 3013 Plot δ13C_VBDP vs δ18O_VSMOW (of CO2) for all analyses. 3014 3015 By default, creates a directory `./output/bulk_compositions` where plots for 3016 each sample are saved. Another plot named `__all__.pdf` shows all analyses together. 3017 3018 3019 **Parameters** 3020 3021 + `samples`: Only these samples are processed (by default: all samples). 3022 + `dir`: where to save the plots 3023 + `figsize`: (width, height) of figure 3024 + `subplots_adjust`: passed to `subplots_adjust()` 3025 + `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples, 3026 allowing for interactive visualization/exploration in (δ13C, δ18O) space. 
3027 + `sample_color`: color used for replicate markers/labels 3028 + `analysis_color`: color used for sample markers/labels 3029 + `labeldist`: distance (in inches) from replicate markers to replicate labels 3030 + `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`. 3031 ''' 3032 3033 from matplotlib.patches import Ellipse 3034 3035 if samples is None: 3036 samples = [_ for _ in self.samples] 3037 3038 saved = {} 3039 3040 for s in samples: 3041 3042 fig = ppl.figure(figsize = figsize) 3043 fig.subplots_adjust(*subplots_adjust) 3044 ax = ppl.subplot(111) 3045 ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)') 3046 ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)') 3047 ppl.title(s) 3048 3049 3050 XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']]) 3051 UID = [_['UID'] for _ in self.samples[s]['data']] 3052 XY0 = XY.mean(0) 3053 3054 for xy in XY: 3055 ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color) 3056 3057 ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color) 3058 ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color) 3059 ppl.text(*XY0, f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold') 3060 saved[s] = [XY, XY0] 3061 3062 x1, x2, y1, y2 = ppl.axis() 3063 x0, dx = (x1+x2)/2, (x2-x1)/2 3064 y0, dy = (y1+y2)/2, (y2-y1)/2 3065 dx, dy = [max(max(dx, dy), radius)]*2 3066 3067 ppl.axis([ 3068 x0 - 1.2*dx, 3069 x0 + 1.2*dx, 3070 y0 - 1.2*dy, 3071 y0 + 1.2*dy, 3072 ]) 3073 3074 XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0)) 3075 3076 for xy, uid in zip(XY, UID): 3077 3078 xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy)) 3079 vector_in_display_space = xy_in_display_space - XY0_in_display_space 3080 3081 if (vector_in_display_space**2).sum() > 0: 3082 3083 unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5 3084 label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist 3085 label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space 3086 label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space)) 3087 3088 ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color) 3089 3090 else: 3091 3092 ppl.text(*xy, f'{uid} ', va = 'center', ha = 'right', color = analysis_color) 3093 3094 if radius: 3095 ax.add_artist(Ellipse( 3096 xy = XY0, 3097 width = radius*2, 3098 height = radius*2, 3099 ls = (0, (2,2)), 3100 lw = .7, 3101 ec = analysis_color, 3102 fc = 'None', 3103 )) 3104 ppl.text( 3105 XY0[0], 3106 XY0[1]-radius, 3107 f'\n± {radius*1e3:.0f} ppm', 3108 color = analysis_color, 3109 va = 'top', 3110 ha = 'center', 3111 linespacing = 0.4, 3112 size = 8, 3113 ) 3114 3115 if not os.path.exists(dir): 3116 os.makedirs(dir) 3117 fig.savefig(f'{dir}/{s}.pdf') 3118 ppl.close(fig) 3119 3120 fig = ppl.figure(figsize = figsize) 3121 fig.subplots_adjust(*subplots_adjust) 3122 ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)') 3123 ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)') 3124 3125 for s in saved: 3126 for xy in saved[s][0]: 3127 ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color) 3128 ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color) 3129 ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color) 3130 ppl.text(*saved[s][1], f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold') 
3131 3132 x1, x2, y1, y2 = ppl.axis() 3133 ppl.axis([ 3134 x1 - (x2-x1)/10, 3135 x2 + (x2-x1)/10, 3136 y1 - (y2-y1)/10, 3137 y2 + (y2-y1)/10, 3138 ]) 3139 3140 3141 if not os.path.exists(dir): 3142 os.makedirs(dir) 3143 fig.savefig(f'{dir}/__all__.pdf') 3144 if show: 3145 ppl.show() 3146 ppl.close(fig) 3147 3148 3149 def _save_D4x_correl( 3150 self, 3151 samples = None, 3152 dir = 'output', 3153 filename = None, 3154 D4x_precision = 4, 3155 correl_precision = 4, 3156 ): 3157 ''' 3158 Save D4x values along with their SE and correlation matrix. 3159 3160 **Parameters** 3161 3162 + `samples`: Only these samples are output (by default: all samples). 3163 + `dir`: the directory in which to save the faile (by defaut: `output`) 3164 + `filename`: the name to the csv file to write to (by default: `D4x_correl.csv`) 3165 + `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4) 3166 + `correl_precision`: the precision to use when writing correlation factor values (by default: 4) 3167 ''' 3168 if samples is None: 3169 samples = sorted([s for s in self.unknowns]) 3170 3171 out = [['Sample']] + [[s] for s in samples] 3172 out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl'] 3173 for k,s in enumerate(samples): 3174 out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.4f}', f'{self.samples[s][f"SE_D{self._4x}"]:.4f}'] 3175 for s2 in samples: 3176 out[k+1] += [f'{self.sample_D4x_correl(s,s2):.4f}'] 3177 3178 if not os.path.exists(dir): 3179 os.makedirs(dir) 3180 if filename is None: 3181 filename = f'D{self._4x}_correl.csv' 3182 with open(f'{dir}/{filename}', 'w') as fid: 3183 fid.write(make_csv(out))
Store and process data for a large set of Δ47 and/or Δ48 analyses, usually comprising more than one analytical session.
```python
def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
    '''
    **Parameters**

    + `l`: a list of dictionaries, with each dictionary including at least the keys
    `Sample`, `d45`, `d46`, and `d47` or `d48`.
    + `mass`: `'47'` or `'48'`
    + `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
    + `session`: define session name for analyses without a `Session` key
    + `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.

    Returns a `D4xdata` object derived from `list`.
    '''
    self._4x = mass
    self.verbose = verbose
    self.prefix = 'D4xdata'
    self.logfile = logfile
    list.__init__(self, l)
    self.Nf = None
    self.repeatability = {}
    self.refresh(session = session)
```
**Parameters**

+ `l`: a list of dictionaries, with each dictionary including at least the keys `Sample`, `d45`, `d46`, and `d47` or `d48`.
+ `mass`: `'47'` or `'48'`
+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
+ `session`: define session name for analyses without a `Session` key
+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.

Returns a `D4xdata` object derived from `list`.
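For instance, a `D47data` object can be built directly from such a list of dictionaries instead of reading a csv file. This is a minimal sketch, assuming the `D47data` subclass forwards these keyword arguments to the constructor above; the delta values are made up for illustration:

```python
import D47crunch

mydata = D47crunch.D47data([
    {'UID': 'A01', 'Sample': 'ETH-1', 'd45': 5.795, 'd46': 11.628, 'd47': 16.894},
    {'UID': 'A02', 'Sample': 'ETH-2', 'd45': -6.059, 'd46': -4.817, 'd47': -11.635},
], session = 'Session01', verbose = True)
```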
+ `R18_VSMOW`: Absolute (18O/16O) ratio of VSMOW. By default equal to 0.0020052 (Baertschi, 1976).
+ `LAMBDA_17`: Mass-dependent exponent for triple oxygen isotopes. By default equal to 0.528 (Barkan & Luz, 2005).
+ `R17_VSMOW`: Absolute (17O/16O) ratio of VSMOW. By default equal to 0.00038475 (Assonov & Brenninkmeijer, 2003, rescaled to `R13_VPDB`).
+ `R18_VPDB`: Absolute (18O/16O) ratio of VPDB. By definition equal to `R18_VSMOW * 1.03092`.
+ `R17_VPDB`: Absolute (17O/16O) ratio of VPDB. By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
After the Δ4x standardization step, each sample is tested to assess whether the Δ4x variance within all analyses for that sample differs significantly from that observed for a given reference sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test), which yields a p-value corresponding to the null hypothesis that the underlying variances are equal). `LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which sample should be used as a reference for this test.
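For example, to benchmark repeatability against a different anchor (a minimal sketch):

```python
import D47crunch

mydata = D47crunch.D47data()
# Compare each sample's Δ47 scatter to that of ETH-1 rather than ETH-3:
mydata.LEVENE_REF_SAMPLE = 'ETH-1'
```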
`ALPHA_18O_ACID_REACTION`: Specifies the 18O/16O fractionation factor generally applicable to acid reactions in the dataset. Currently used by `D4xdata.wg()`, `D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`. By default equal to 1.008129 (calcite reacted at 90 °C, Kim et al., 2007).
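If a different fractionation factor is appropriate (e.g., for another acid reaction temperature or mineralogy), it may be overridden before processing. A sketch; the numerical value below is a hypothetical placeholder, not a recommendation:

```python
mydata = D47crunch.D47data()
mydata.ALPHA_18O_ACID_REACTION = 1.00813  # hypothetical value, for illustration only
```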
`Nominal_d13C_VPDB`: Nominal δ13C_VPDB values assigned to carbonate standards, used by `D4xdata.standardize_d13C()`. By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after Bernasconi et al. (2018).
`Nominal_d18O_VPDB`: Nominal δ18O_VPDB values assigned to carbonate standards, used by `D4xdata.standardize_d18O()`. By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after Bernasconi et al. (2018).
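Both dictionaries may be edited before standardization, e.g. to restrict bulk-composition standardization to a subset of anchors. A sketch, reusing the default values quoted above:

```python
mydata = D47crunch.D47data()
# Only use ETH-1 and ETH-2 to standardize δ13C and δ18O values:
mydata.Nominal_d13C_VPDB = {'ETH-1': 2.02, 'ETH-2': -10.17}
mydata.Nominal_d18O_VPDB = {'ETH-1': -2.19, 'ETH-2': -18.69}
```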
`d13C_STANDARDIZATION_METHOD`: Method by which to standardize δ13C values:

+ `none`: do not apply any δ13C standardization.
+ `'1pt'`: within each session, offset all initial δ13C values so as to minimize the difference between final δ13C_VPDB values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
+ `'2pt'`: within each session, apply an affine transformation to all δ13C values so as to minimize the difference between final δ13C_VPDB values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
`d18O_STANDARDIZATION_METHOD`: Method by which to standardize δ18O values:

+ `none`: do not apply any δ18O standardization.
+ `'1pt'`: within each session, offset all initial δ18O values so as to minimize the difference between final δ18O_VPDB values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
+ `'2pt'`: within each session, apply an affine transformation to all δ18O values so as to minimize the difference between final δ18O_VPDB values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
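A sketch of selecting the two-point methods described above; these class-level attributes are copied into each session by `refresh_sessions()` (see below):

```python
mydata = D47crunch.D47data()
# Apply affine ('2pt') standardization to both δ13C and δ18O values:
mydata.d13C_STANDARDIZATION_METHOD = '2pt'
mydata.d18O_STANDARDIZATION_METHOD = '2pt'
```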
```python
def make_verbal(oldfun):
    '''
    Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
    '''
    @wraps(oldfun)
    def newfun(*args, verbose = '', **kwargs):
        myself = args[0]
        oldprefix = myself.prefix
        myself.prefix = oldfun.__name__
        if verbose != '':
            oldverbose = myself.verbose
            myself.verbose = verbose
        out = oldfun(*args, **kwargs)
        myself.prefix = oldprefix
        if verbose != '':
            myself.verbose = oldverbose
        return out
    return newfun
```
Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
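In practice, any method decorated with `@make_verbal` accepts an extra `verbose` keyword that overrides `self.verbose` for that call only, as in the tutorial:

```python
# Print the summary for this call even if mydata.verbose is False:
mydata.summary(verbose = True, save_to_file = False)
```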
```python
def msg(self, txt):
    '''
    Log a message to `self.logfile`, and print it out if `verbose = True`
    '''
    self.log(txt)
    if self.verbose:
        print(f'{f"[{self.prefix}]":<16} {txt}')
```
Log a message to `self.logfile`, and print it out if `verbose = True`.
```python
def vmsg(self, txt):
    '''
    Log a message to `self.logfile` and print it out
    '''
    self.log(txt)
    print(txt)
```
Log a message to `self.logfile` and print it out.
```python
def log(self, *txts):
    '''
    Log a message to `self.logfile`
    '''
    if self.logfile:
        with open(self.logfile, 'a') as fid:
            for txt in txts:
                fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
```
Log a message to `self.logfile`.
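A sketch of the logging behavior, assuming the `D47data` subclass forwards the `logfile` and `verbose` keywords to the constructor shown above:

```python
import D47crunch

# All messages are appended to the log file with a timestamp and the current
# prefix; msg() additionally prints them out when verbose is True:
mydata = D47crunch.D47data(logfile = 'D47crunch.log', verbose = True)
mydata.msg('starting data processing')
```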
```python
def refresh(self, session = 'mySession'):
    '''
    Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
    '''
    self.fill_in_missing_info(session = session)
    self.refresh_sessions()
    self.refresh_samples()
```
Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
```py
def refresh_sessions(self):
    '''
    Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
    to `False` for all sessions.
    '''
    self.sessions = {
        s: {'data': [r for r in self if r['Session'] == s]}
        for s in sorted({r['Session'] for r in self})
        }
    for s in self.sessions:
        self.sessions[s]['scrambling_drift'] = False
        self.sessions[s]['slope_drift'] = False
        self.sessions[s]['wg_drift'] = False
        self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
        self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
```
Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift` to `False` for all sessions.
```py
def refresh_samples(self):
    '''
    Define `self.samples`, `self.anchors`, and `self.unknowns`.
    '''
    self.samples = {
        s: {'data': [r for r in self if r['Sample'] == s]}
        for s in sorted({r['Sample'] for r in self})
        }
    self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
    self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
```
Define `self.samples`, `self.anchors`, and `self.unknowns`.
```py
def read(self, filename, sep = '', session = ''):
    '''
    Read file in csv format to load data into a `D47data` object.

    In the csv file, spaces before and after field separators (`','` by default)
    are optional. Each line corresponds to a single analysis.

    The required fields are:

    + `UID`: a unique identifier
    + `Session`: an identifier for the analytical session
    + `Sample`: a sample identifier
    + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

    Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
    VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas
    `d47`, `d48` and `d49` are optional, and set to NaN by default.

    **Parameters**

    + `filename`: the path of the file to read
    + `sep`: csv separator delimiting the fields
    + `session`: set `Session` field to this string for all analyses
    '''
    with open(filename) as fid:
        self.input(fid.read(), sep = sep, session = session)
```
Read file in csv format to load data into a `D47data` object.

In the csv file, spaces before and after field separators (`','` by default) are optional. Each line corresponds to a single analysis.

The required fields are:

- `UID`: a unique identifier
- `Session`: an identifier for the analytical session
- `Sample`: a sample identifier
- `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` and `d49` are optional, and set to NaN by default.

**Parameters**

- `filename`: the path of the file to read
- `sep`: csv separator delimiting the fields
- `session`: set `Session` field to this string for all analyses
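For example, assuming a csv file structured as described above:

```py
from D47crunch import D47data

mydata = D47data()
# Read a comma-separated file and assign all analyses to a single session:
mydata.read('rawdata.csv', sep = ',', session = 'Session01')
```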
```py
def input(self, txt, sep = '', session = ''):
    '''
    Read `txt` string in csv format to load analysis data into a `D47data` object.

    In the csv string, spaces before and after field separators (`','` by default)
    are optional. Each line corresponds to a single analysis.

    The required fields are:

    + `UID`: a unique identifier
    + `Session`: an identifier for the analytical session
    + `Sample`: a sample identifier
    + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

    Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
    VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas
    `d47`, `d48` and `d49` are optional, and set to NaN by default.

    **Parameters**

    + `txt`: the csv string to read
    + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
    whichever appears most often in `txt`.
    + `session`: set `Session` field to this string for all analyses
    '''
    if sep == '':
        sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
    txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
    data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]

    if session != '':
        for r in data:
            r['Session'] = session

    self += data
    self.refresh()
```
Read `txt` string in csv format to load analysis data into a `D47data` object.

In the csv string, spaces before and after field separators (`','` by default) are optional. Each line corresponds to a single analysis.

The required fields are:

- `UID`: a unique identifier
- `Session`: an identifier for the analytical session
- `Sample`: a sample identifier
- `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` and `d49` are optional, and set to NaN by default.

**Parameters**

- `txt`: the csv string to read
- `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, whichever appears most often in `txt`.
- `session`: set `Session` field to this string for all analyses
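A minimal sketch using a csv string with hypothetical delta values (the separator is inferred automatically):

```py
from D47crunch import D47data

csv_string = '''UID, Sample, d45, d46, d47
X01, ETH-1, 5.7, 11.6, 16.9
X02, ETH-2, -6.1, -4.8, -11.6
'''

mydata = D47data()
mydata.input(csv_string, session = 'Session01')
```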
```py
@make_verbal
def wg(self, samples = None, a18_acid = None):
    '''
    Compute bulk composition of the working gas for each session based on
    the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
    `self.Nominal_d18O_VPDB`.
    '''

    self.msg('Computing WG composition:')

    if a18_acid is None:
        a18_acid = self.ALPHA_18O_ACID_REACTION
    if samples is None:
        samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]

    assert a18_acid, f'Acid fractionation factor should not be zero.'

    samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
    R45R46_standards = {}
    for sample in samples:
        d13C_vpdb = self.Nominal_d13C_VPDB[sample]
        d18O_vpdb = self.Nominal_d18O_VPDB[sample]
        R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
        R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
        R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid

        C12_s = 1 / (1 + R13_s)
        C13_s = R13_s / (1 + R13_s)
        C16_s = 1 / (1 + R17_s + R18_s)
        C17_s = R17_s / (1 + R17_s + R18_s)
        C18_s = R18_s / (1 + R17_s + R18_s)

        C626_s = C12_s * C16_s ** 2
        C627_s = 2 * C12_s * C16_s * C17_s
        C628_s = 2 * C12_s * C16_s * C18_s
        C636_s = C13_s * C16_s ** 2
        C637_s = 2 * C13_s * C16_s * C17_s
        C727_s = C12_s * C17_s ** 2

        R45_s = (C627_s + C636_s) / C626_s
        R46_s = (C628_s + C637_s + C727_s) / C626_s
        R45R46_standards[sample] = (R45_s, R46_s)

    for s in self.sessions:
        db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
        assert db, f'No sample from {samples} found in session "{s}".'
        # dbsamples = sorted({r['Sample'] for r in db})

        X = [r['d45'] for r in db]
        Y = [R45R46_standards[r['Sample']][0] for r in db]
        x1, x2 = np.min(X), np.max(X)

        if x1 < x2:
            wgcoord = x1/(x1-x2)
        else:
            wgcoord = 999

        if wgcoord < -.5 or wgcoord > 1.5:
            # unreasonable to extrapolate to d45 = 0
            R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
        else:
            # d45 = 0 is reasonably well bracketed
            R45_wg = np.polyfit(X, Y, 1)[1]

        X = [r['d46'] for r in db]
        Y = [R45R46_standards[r['Sample']][1] for r in db]
        x1, x2 = np.min(X), np.max(X)

        if x1 < x2:
            wgcoord = x1/(x1-x2)
        else:
            wgcoord = 999

        if wgcoord < -.5 or wgcoord > 1.5:
            # unreasonable to extrapolate to d46 = 0
            R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
        else:
            # d46 = 0 is reasonably well bracketed
            R46_wg = np.polyfit(X, Y, 1)[1]

        d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)

        self.msg(f'Session {s} WG: δ13C_VPDB = {d13Cwg_VPDB:.3f} δ18O_VSMOW = {d18Owg_VSMOW:.3f}')

        self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
        self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
        for r in self.sessions[s]['data']:
            r['d13Cwg_VPDB'] = d13Cwg_VPDB
            r['d18Owg_VSMOW'] = d18Owg_VSMOW
```
Compute bulk composition of the working gas for each session based on the carbonate standards defined in both `self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.
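For instance, to compute the working-gas composition from a subset of anchors with an explicit acid fractionation factor (here the default value), assuming a data file like the one from the tutorial:

```py
from D47crunch import D47data

mydata = D47data()
mydata.read('rawdata.csv')
mydata.wg(samples = ['ETH-1', 'ETH-2'], a18_acid = 1.008129)
```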
```py
def compute_bulk_delta(self, R45, R46, D17O = 0):
    '''
    Compute δ13C_VPDB and δ18O_VSMOW,
    by solving the generalized form of equation (17) from
    [Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
    assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
    solving the corresponding second-order Taylor polynomial.
    (Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
    '''

    K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17

    A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
    B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
    C = 2 * self.R18_VSMOW
    D = -R46

    aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
    bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
    cc = A + B + C + D

    d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)

    R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
    R17 = K * R18 ** self.LAMBDA_17
    R13 = R45 - 2 * R17

    d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)

    return d13C_VPDB, d18O_VSMOW
```
Compute δ13C_VPDB and δ18O_VSMOW by solving the generalized form of equation (17) from Brand et al. (2010), assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and solving the corresponding second-order Taylor polynomial (Appendix A of Daëron et al., 2016).
```py
@make_verbal
def crunch(self, verbose = ''):
    '''
    Compute bulk composition and raw clumped isotope anomalies for all analyses.
    '''
    for r in self:
        self.compute_bulk_and_clumping_deltas(r)
    self.standardize_d13C()
    self.standardize_d18O()
    self.msg(f"Crunched {len(self)} analyses.")
```
Compute bulk composition and raw clumped isotope anomalies for all analyses.
```py
def fill_in_missing_info(self, session = 'mySession'):
    '''
    Fill in optional fields with default values
    '''
    for i,r in enumerate(self):
        if 'D17O' not in r:
            r['D17O'] = 0.
        if 'UID' not in r:
            r['UID'] = f'{i+1}'
        if 'Session' not in r:
            r['Session'] = session
        for k in ['d47', 'd48', 'd49']:
            if k not in r:
                r[k] = np.nan
```
Fill in optional fields with default values
```py
def standardize_d13C(self):
    '''
    Perform δ13C standardization within each session `s` according to
    `self.sessions[s]['d13C_standardization_method']`, which is defined by default
    by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
    may be redefined arbitrarily at a later stage.
    '''
    for s in self.sessions:
        if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
            XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
            X,Y = zip(*XY)
            if self.sessions[s]['d13C_standardization_method'] == '1pt':
                offset = np.mean(Y) - np.mean(X)
                for r in self.sessions[s]['data']:
                    r['d13C_VPDB'] += offset
            elif self.sessions[s]['d13C_standardization_method'] == '2pt':
                a,b = np.polyfit(X,Y,1)
                for r in self.sessions[s]['data']:
                    r['d13C_VPDB'] = a * r['d13C_VPDB'] + b
```
Perform δ13C standardization within each session `s` according to `self.sessions[s]['d13C_standardization_method']`, which is defined by default by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
```py
def standardize_d18O(self):
    '''
    Perform δ18O standardization within each session `s` according to
    `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
    which is defined by default by `D47data.refresh_sessions()` as equal to
    `self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
    '''
    for s in self.sessions:
        if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
            XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
            X,Y = zip(*XY)
            Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
            if self.sessions[s]['d18O_standardization_method'] == '1pt':
                offset = np.mean(Y) - np.mean(X)
                for r in self.sessions[s]['data']:
                    r['d18O_VSMOW'] += offset
            elif self.sessions[s]['d18O_standardization_method'] == '2pt':
                a,b = np.polyfit(X,Y,1)
                for r in self.sessions[s]['data']:
                    r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
```
Perform δ18O standardization within each session `s` according to `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`, which is defined by default by `D47data.refresh_sessions()` as equal to `self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
```py
def compute_bulk_and_clumping_deltas(self, r):
    '''
    Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
    '''

    # Compute working gas R13, R18, and isobar ratios
    R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
    R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
    R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)

    # Compute analyte isobar ratios
    R45 = (1 + r['d45'] / 1000) * R45_wg
    R46 = (1 + r['d46'] / 1000) * R46_wg
    R47 = (1 + r['d47'] / 1000) * R47_wg
    R48 = (1 + r['d48'] / 1000) * R48_wg
    R49 = (1 + r['d49'] / 1000) * R49_wg

    r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
    R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
    R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW

    # Compute stochastic isobar ratios of the analyte
    R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
        R13, R18, D17O = r['D17O']
    )

    # Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
    # and raise a warning if the corresponding anomalies exceed 0.05 ppm.
    if (R45 / R45stoch - 1) > 5e-8:
        self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
    if (R46 / R46stoch - 1) > 5e-8:
        self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')

    # Compute raw clumped isotope anomalies
    r['D47raw'] = 1000 * (R47 / R47stoch - 1)
    r['D48raw'] = 1000 * (R48 / R48stoch - 1)
    r['D49raw'] = 1000 * (R49 / R49stoch - 1)
```
Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
```py
def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
    '''
    Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
    optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
    anomalies (`D47`, `D48`, `D49`), all expressed in permil.
    '''

    # Compute R17
    R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17

    # Compute isotope concentrations
    C12 = (1 + R13) ** -1
    C13 = C12 * R13
    C16 = (1 + R17 + R18) ** -1
    C17 = C16 * R17
    C18 = C16 * R18

    # Compute stochastic isotopologue concentrations
    C626 = C16 * C12 * C16
    C627 = C16 * C12 * C17 * 2
    C628 = C16 * C12 * C18 * 2
    C636 = C16 * C13 * C16
    C637 = C16 * C13 * C17 * 2
    C638 = C16 * C13 * C18 * 2
    C727 = C17 * C12 * C17
    C728 = C17 * C12 * C18 * 2
    C737 = C17 * C13 * C17
    C738 = C17 * C13 * C18 * 2
    C828 = C18 * C12 * C18
    C838 = C18 * C13 * C18

    # Compute stochastic isobar ratios
    R45 = (C636 + C627) / C626
    R46 = (C628 + C637 + C727) / C626
    R47 = (C638 + C728 + C737) / C626
    R48 = (C738 + C828) / C626
    R49 = C838 / C626

    # Account for stochastic anomalies
    R47 *= 1 + D47 / 1000
    R48 *= 1 + D48 / 1000
    R49 *= 1 + D49 / 1000

    # Return isobar ratios
    return R45, R46, R47, R48, R49
```
Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`, optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope anomalies (`D47`, `D48`, `D49`), all expressed in permil.
```py
def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
    '''
    Split unknown samples by UID (treat all analyses as different samples)
    or by session (treat analyses of a given sample in different sessions as
    different samples).

    **Parameters**

    + `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
    + `grouping`: `by_uid` | `by_session`
    '''
    if samples_to_split == 'all':
        samples_to_split = [s for s in self.unknowns]
    gkeys = {'by_uid':'UID', 'by_session':'Session'}
    self.grouping = grouping.lower()
    if self.grouping in gkeys:
        gkey = gkeys[self.grouping]
    for r in self:
        if r['Sample'] in samples_to_split:
            r['Sample_original'] = r['Sample']
            r['Sample'] = f"{r['Sample']}__{r[gkey]}"
        elif r['Sample'] in self.unknowns:
            r['Sample_original'] = r['Sample']
    self.refresh_samples()
```
Split unknown samples by UID (treat all analyses as different samples) or by session (treat analyses of a given sample in different sessions as different samples).
**Parameters**

- `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
- `grouping`: `by_uid` | `by_session`
```py
def unsplit_samples(self, tables = False):
    '''
    Reverse the effects of `D47data.split_samples()`.

    This should only be used after `D4xdata.standardize()` with `method='pooled'`.

    After `D4xdata.standardize()` with `method='indep_sessions'`, one should
    probably use `D4xdata.combine_samples()` instead to reverse the effects of
    `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
    effects of `D47data.split_samples()` with `grouping='by_session'` (because in
    that case session-averaged Δ4x values are statistically independent).
    '''
    unknowns_old = sorted({s for s in self.unknowns})
    CM_old = self.standardization.covar[:,:]
    VD_old = self.standardization.params.valuesdict().copy()
    vars_old = self.standardization.var_names

    unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})

    Ns = len(vars_old) - len(unknowns_old)
    vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
    VD_new = {k: VD_old[k] for k in vars_old[:Ns]}

    W = np.zeros((len(vars_new), len(vars_old)))
    W[:Ns,:Ns] = np.eye(Ns)
    for u in unknowns_new:
        splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
        if self.grouping == 'by_session':
            weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
        elif self.grouping == 'by_uid':
            weights = [1 for s in splits]
        sw = sum(weights)
        weights = [w/sw for w in weights]
        W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]

    CM_new = W @ CM_old @ W.T
    V = W @ np.array([[VD_old[k]] for k in vars_old])
    VD_new = {k:v[0] for k,v in zip(vars_new, V)}

    self.standardization.covar = CM_new
    self.standardization.params.valuesdict = lambda : VD_new
    self.standardization.var_names = vars_new

    for r in self:
        if r['Sample'] in self.unknowns:
            r['Sample_split'] = r['Sample']
            r['Sample'] = r['Sample_original']

    self.refresh_samples()
    self.consolidate_samples()
    self.repeatabilities()

    if tables:
        self.table_of_analyses()
        self.table_of_samples()
```
Reverse the effects of `D47data.split_samples()`.

This should only be used after `D4xdata.standardize()` with `method='pooled'`.

After `D4xdata.standardize()` with `method='indep_sessions'`, one should probably use `D4xdata.combine_samples()` instead to reverse the effects of `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the effects of `D47data.split_samples()` with `grouping='by_session'` (because in that case session-averaged Δ4x values are statistically independent).
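A sketch of the intended round trip, e.g. to check whether an unknown behaves consistently from session to session:

```py
from D47crunch import D47data

mydata = D47data()
mydata.read('rawdata.csv')
mydata.wg()
mydata.crunch()

# Treat each session's analyses of every unknown as a separate sample:
mydata.split_samples(grouping = 'by_session')
mydata.standardize(method = 'pooled')
# ...inspect the per-session values, then recombine:
mydata.unsplit_samples()
```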
```py
def assign_timestamps(self):
    '''
    Assign a time field `t` of type `float` to each analysis.

    If `TimeTag` is one of the data fields, `t` is equal within a given session
    to `TimeTag` minus the mean value of `TimeTag` for that session.
    Otherwise, `TimeTag` is by default equal to the index of each analysis
    in the dataset and `t` is defined as above.
    '''
    for session in self.sessions:
        sdata = self.sessions[session]['data']
        try:
            t0 = np.mean([r['TimeTag'] for r in sdata])
            for r in sdata:
                r['t'] = r['TimeTag'] - t0
        except KeyError:
            t0 = (len(sdata)-1)/2
            for t,r in enumerate(sdata):
                r['t'] = t - t0
```
Assign a time field `t` of type `float` to each analysis.

If `TimeTag` is one of the data fields, `t` is equal within a given session to `TimeTag` minus the mean value of `TimeTag` for that session. Otherwise, `TimeTag` is by default equal to the index of each analysis in the dataset and `t` is defined as above.
```py
def report(self):
    '''
    Print a report on the standardization fit.
    Only applicable after `D4xdata.standardize(method='pooled')`.
    '''
    report_fit(self.standardization)
```
Print a report on the standardization fit. Only applicable after `D4xdata.standardize(method='pooled')`.
````py
def combine_samples(self, sample_groups):
    '''
    Combine analyses of different samples to compute weighted average Δ4x
    and new error (co)variances corresponding to the groups defined by the `sample_groups`
    dictionary.

    Caution: samples are weighted by number of replicate analyses, which is a
    reasonable default behavior but is not always optimal (e.g., in the case of strongly
    correlated analytical errors for one or more samples).

    Returns a tuple of:

    + the list of group names
    + an array of the corresponding Δ4x values
    + the corresponding (co)variance matrix

    **Parameters**

    + `sample_groups`: a dictionary of the form:
    ```py
    {'group1': ['sample_1', 'sample_2'],
     'group2': ['sample_3', 'sample_4', 'sample_5']}
    ```
    '''

    samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
    groups = sorted(sample_groups.keys())
    group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
    D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
    CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
    W = np.array([
        [self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
        for j in groups])
    D4x_new = W @ D4x_old
    CM_new = W @ CM_old @ W.T

    return groups, D4x_new[:,0], CM_new
````
Combine analyses of different samples to compute weighted average Δ4x and new error (co)variances corresponding to the groups defined by the `sample_groups` dictionary.

Caution: samples are weighted by number of replicate analyses, which is a reasonable default behavior but is not always optimal (e.g., in the case of strongly correlated analytical errors for one or more samples).

Returns a tuple of:

- the list of group names
- an array of the corresponding Δ4x values
- the corresponding (co)variance matrix

**Parameters**

- `sample_groups`: a dictionary of the form:

```py
{'group1': ['sample_1', 'sample_2'],
 'group2': ['sample_3', 'sample_4', 'sample_5']}
```
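For example, pooling two hypothetical unknowns (`SAMPLE-A`, `SAMPLE-B`) into a single group after standardization:

```py
from D47crunch import D47data

mydata = D47data()
mydata.read('rawdata.csv')
mydata.wg()
mydata.crunch()
mydata.standardize()

# 'SAMPLE-A' and 'SAMPLE-B' stand for unknowns present in rawdata.csv:
groups, D47_avg, CM = mydata.combine_samples({'GROUP-1': ['SAMPLE-A', 'SAMPLE-B']})
```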
```py
@make_verbal
def standardize(self,
    method = 'pooled',
    weighted_sessions = [],
    consolidate = True,
    consolidate_tables = False,
    consolidate_plots = False,
    constraints = {},
    ):
    '''
    Compute absolute Δ4x values for all replicate analyses and for sample averages.
    If `method` argument is set to `'pooled'`, the standardization processes all sessions
    in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
    i.e. that their true Δ4x value does not change between sessions
    ([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
    `'indep_sessions'`, the standardization processes each session independently, based only
    on anchor analyses.
    '''

    self.standardization_method = method
    self.assign_timestamps()

    if method == 'pooled':
        if weighted_sessions:
            for session_group in weighted_sessions:
                if self._4x == '47':
                    X = D47data([r for r in self if r['Session'] in session_group])
                elif self._4x == '48':
                    X = D48data([r for r in self if r['Session'] in session_group])
                X.Nominal_D4x = self.Nominal_D4x.copy()
                X.refresh()
                result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
                w = np.sqrt(result.redchi)
                self.msg(f'Session group {session_group} MRSWD = {w:.4f}')
                for r in X:
                    r[f'wD{self._4x}raw'] *= w
        else:
            self.msg(f'All D{self._4x}raw weights set to 1 ‰')
            for r in self:
                r[f'wD{self._4x}raw'] = 1.

        params = Parameters()
        for k,session in enumerate(self.sessions):
            self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
            self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
            self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
            s = pf(session)
            params.add(f'a_{s}', value = 0.9)
            params.add(f'b_{s}', value = 0.)
            params.add(f'c_{s}', value = -0.9)
            params.add(f'a2_{s}', value = 0.,
                # vary = self.sessions[session]['scrambling_drift'],
                )
            params.add(f'b2_{s}', value = 0.,
                # vary = self.sessions[session]['slope_drift'],
                )
            params.add(f'c2_{s}', value = 0.,
                # vary = self.sessions[session]['wg_drift'],
                )
            if not self.sessions[session]['scrambling_drift']:
                params[f'a2_{s}'].expr = '0'
            if not self.sessions[session]['slope_drift']:
                params[f'b2_{s}'].expr = '0'
            if not self.sessions[session]['wg_drift']:
                params[f'c2_{s}'].expr = '0'

        for sample in self.unknowns:
            params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)

        for k in constraints:
            params[k].expr = constraints[k]

        def residuals(p):
            R = []
            for r in self:
                session = pf(r['Session'])
                sample = pf(r['Sample'])
                if r['Sample'] in self.Nominal_D4x:
                    R += [ (
                        r[f'D{self._4x}raw'] - (
                            p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
                            + p[f'b_{session}'] * r[f'd{self._4x}']
                            + p[f'c_{session}']
                            + r['t'] * (
                                p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
                                + p[f'b2_{session}'] * r[f'd{self._4x}']
                                + p[f'c2_{session}']
                                )
                            )
                        ) / r[f'wD{self._4x}raw'] ]
                else:
                    R += [ (
                        r[f'D{self._4x}raw'] - (
                            p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
                            + p[f'b_{session}'] * r[f'd{self._4x}']
                            + p[f'c_{session}']
                            + r['t'] * (
                                p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
                                + p[f'b2_{session}'] * r[f'd{self._4x}']
                                + p[f'c2_{session}']
                                )
                            )
                        ) / r[f'wD{self._4x}raw'] ]
            return R

        M = Minimizer(residuals, params)
        result = M.least_squares()
        self.Nf = result.nfree
        self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
        new_names, new_covar, new_se = _fullcovar(result)[:3]
        result.var_names = new_names
        result.covar = new_covar

        for r in self:
            s = pf(r["Session"])
            a = result.params.valuesdict()[f'a_{s}']
            b = result.params.valuesdict()[f'b_{s}']
            c = result.params.valuesdict()[f'c_{s}']
            a2 = result.params.valuesdict()[f'a2_{s}']
            b2 = result.params.valuesdict()[f'b2_{s}']
            c2 = result.params.valuesdict()[f'c2_{s}']
            r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])

        self.standardization = result

        for session in self.sessions:
            self.sessions[session]['Np'] = 3
            for k in ['scrambling', 'slope', 'wg']:
                if self.sessions[session][f'{k}_drift']:
                    self.sessions[session]['Np'] += 1

        if consolidate:
            self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
        return result

    elif method == 'indep_sessions':

        if weighted_sessions:
            for session_group in weighted_sessions:
                X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
                X.Nominal_D4x = self.Nominal_D4x.copy()
                X.refresh()
                # This is only done to assign r['wD47raw'] for r in X:
                X.standardize(method = method, weighted_sessions = [], consolidate = False)
                self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
        else:
            self.msg('All weights set to 1 ‰')
            for r in self:
                r[f'wD{self._4x}raw'] = 1

        for session in self.sessions:
            s = self.sessions[session]
            p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
            p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
            s['Np'] = sum(p_active)
            sdata = s['data']

            A = np.array([
                [
                    self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
                    r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
                    1 / r[f'wD{self._4x}raw'],
                    self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
                    r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
                    r['t'] / r[f'wD{self._4x}raw']
                    ]
                for r in sdata if r['Sample'] in self.anchors
                ])[:,p_active] # only keep columns for the active parameters
            Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
            s['Na'] = Y.size
            CM = linalg.inv(A.T @ A)
            bf = (CM @ A.T @ Y).T[0,:]
            k = 0
            for n,a in zip(p_names, p_active):
                if a:
                    s[n] = bf[k]
                    # self.msg(f'{n} = {bf[k]}')
                    k += 1
                else:
                    s[n] = 0.
                    # self.msg(f'{n} = 0.0')

            for r in sdata:
                a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
                r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
                r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])

            s['CM'] = np.zeros((6,6))
            i = 0
            k_active = [j for j,a in enumerate(p_active) if a]
            for j,a in enumerate(p_active):
                if a:
                    s['CM'][j,k_active] = CM[i,:]
                    i += 1

        if not weighted_sessions:
            w = self.rmswd()['rmswd']
            for r in self:
                r[f'wD{self._4x}'] *= w
                r[f'wD{self._4x}raw'] *= w
            for session in self.sessions:
                self.sessions[session]['CM'] *= w**2

        for session in self.sessions:
            s = self.sessions[session]
            s['SE_a'] = s['CM'][0,0]**.5
            s['SE_b'] = s['CM'][1,1]**.5
            s['SE_c'] = s['CM'][2,2]**.5
            s['SE_a2'] = s['CM'][3,3]**.5
            s['SE_b2'] = s['CM'][4,4]**.5
            s['SE_c2'] = s['CM'][5,5]**.5

        if not weighted_sessions:
            self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
        else:
            self.Nf = 0
            for sg in weighted_sessions:
                self.Nf += self.rmswd(sessions = sg)['Nf']

        self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)

        avgD4x = {
            sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
            for sample in self.samples
            }
        chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
        rD4x = (chi2/self.Nf)**.5
        self.repeatability[f'sigma_{self._4x}'] = rD4x

        if consolidate:
            self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
```
Compute absolute Δ4x values for all replicate analyses and for sample averages. If the `method` argument is set to `'pooled'`, the standardization processes all sessions in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous, i.e. that their true Δ4x value does not change between sessions (Daëron, 2021). If the `method` argument is set to `'indep_sessions'`, the standardization processes each session independently, based only on anchor analyses.
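A sketch contrasting the two approaches, with a temporal drift of the WG offset allowed in one (hypothetical) session:

```py
from D47crunch import D47data

mydata = D47data()
mydata.read('rawdata.csv')
mydata.wg()
mydata.crunch()

# Allow the WG offset (c) to drift with time in one session:
mydata.sessions['Session01']['wg_drift'] = True

mydata.standardize(method = 'pooled')            # single global fit
# mydata.standardize(method = 'indep_sessions')  # or: session-by-session fit
```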
```py
def standardization_error(self, session, d4x, D4x, t = 0):
    '''
    Compute standardization error for a given session and
    (δ47, Δ47) composition.
    '''
    a = self.sessions[session]['a']
    b = self.sessions[session]['b']
    c = self.sessions[session]['c']
    a2 = self.sessions[session]['a2']
    b2 = self.sessions[session]['b2']
    c2 = self.sessions[session]['c2']
    CM = self.sessions[session]['CM']

    x, y = D4x, d4x
    z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
    # x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
    dxdy = -(b+b2*t) / (a+a2*t)
    dxdz = 1. / (a+a2*t)
    dxda = -x / (a+a2*t)
    dxdb = -y / (a+a2*t)
    dxdc = -1. / (a+a2*t)
    dxda2 = -x * t / (a+a2*t)
    dxdb2 = -y * t / (a+a2*t)
    dxdc2 = -t / (a+a2*t)
    V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
    sx = (V @ CM @ V.T) ** .5
    return sx
```
Compute standardization error for a given session and (δ47, Δ47) composition.
```py
@make_verbal
def summary(self,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    ):
    '''
    Print out and/or save to disk a summary of the standardization results.

    **Parameters**

    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    '''

    out = []
    out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
    out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
    out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
    out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
    out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
    out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
    out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
    out += [['Model degrees of freedom', f"{self.Nf}"]]
    out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
    out += [['Standardization method', self.standardization_method]]

    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_summary.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n' + pretty_table(out, header = 0))
```
Print out and/or save to disk a summary of the standardization results.

**Parameters**

- `dir`: the directory in which to save the table
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the table to disk
- `print_out`: whether to print out the table
```py
@make_verbal
def table_of_sessions(self,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out and/or save to disk a table of sessions.

    **Parameters**

    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
    if set to `'raw'`: return a list of lists of strings
    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
    include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
    include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])

    out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
    if include_a2:
        out[-1] += ['a2 ± SE']
    if include_b2:
        out[-1] += ['b2 ± SE']
    if include_c2:
        out[-1] += ['c2 ± SE']
    for session in self.sessions:
        out += [[
            session,
            f"{self.sessions[session]['Na']}",
            f"{self.sessions[session]['Nu']}",
            f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
            f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
            f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
            f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
            f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
            f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
            f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
            f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
            ]]
        if include_a2:
            if self.sessions[session]['scrambling_drift']:
                out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
            else:
                out[-1] += ['']
        if include_b2:
            if self.sessions[session]['slope_drift']:
                out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
            else:
                out[-1] += ['']
        if include_c2:
            if self.sessions[session]['wg_drift']:
                out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
            else:
                out[-1] += ['']

    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_sessions.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n' + pretty_table(out))
    if output == 'raw':
        return out
    elif output == 'pretty':
        return pretty_table(out)
```
Print out and/or save to disk a table of sessions.

**Parameters**

- `dir`: the directory in which to save the table
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the table to disk
- `print_out`: whether to print out the table
- `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) — see the sketch below
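The `output` argument behaves the same way in the other table methods; a minimal sketch:

```py
from D47crunch import D47data

mydata = D47data()
mydata.read('rawdata.csv')
mydata.wg()
mydata.crunch()
mydata.standardize()

# Return the table as a list of lists, without writing or printing anything:
table = mydata.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
header, first_session = table[0], table[1]
```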
```py
@make_verbal
def table_of_analyses(
    self,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out and/or save to disk a table of analyses.

    **Parameters**

    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
    if set to `'raw'`: return a list of lists of strings
    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''

    out = [['UID','Session','Sample']]
    extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
    for f in extra_fields:
        out[-1] += [f[0]]
    out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
    for r in self:
        out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
        for f in extra_fields:
            out[-1] += [f"{r[f[0]]:{f[1]}}"]
        out[-1] += [
            f"{r['d13Cwg_VPDB']:.3f}",
            f"{r['d18Owg_VSMOW']:.3f}",
            f"{r['d45']:.6f}",
            f"{r['d46']:.6f}",
            f"{r['d47']:.6f}",
            f"{r['d48']:.6f}",
            f"{r['d49']:.6f}",
            f"{r['d13C_VPDB']:.6f}",
            f"{r['d18O_VSMOW']:.6f}",
            f"{r['D47raw']:.6f}",
            f"{r['D48raw']:.6f}",
            f"{r['D49raw']:.6f}",
            f"{r[f'D{self._4x}']:.6f}"
            ]
    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_analyses.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n' + pretty_table(out))
    return out
```
Print out and/or save to disk a table of analyses.

**Parameters**

- `dir`: the directory in which to save the table
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the table to disk
- `print_out`: whether to print out the table
- `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
```py
@make_verbal
def covar_table(
    self,
    correl = False,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return the variance-covariance matrix of D4x
    for all unknown samples.

    **Parameters**

    + `dir`: the directory in which to save the csv
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the csv
    + `print_out`: whether to print out the matrix
    + `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
    if set to `'raw'`: return a list of lists of strings
    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    samples = sorted([u for u in self.unknowns])
    out = [[''] + samples]
    for s1 in samples:
        out.append([s1])
        for s2 in samples:
            if correl:
                out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
            else:
                out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')

    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            if correl:
                filename = f'D{self._4x}_correl.csv'
            else:
                filename = f'D{self._4x}_covar.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n'+pretty_table(out))
    if output == 'raw':
        return out
    elif output == 'pretty':
        return pretty_table(out)
```
Print out, save to disk and/or return the variance-covariance matrix of Δ4x for all unknown samples.

**Parameters**

- `dir`: the directory in which to save the csv
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the csv
- `print_out`: whether to print out the matrix
- `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) — see the sketch below
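For instance, to obtain the correlation (rather than covariance) matrix without writing it to disk:

```py
from D47crunch import D47data

mydata = D47data()
mydata.read('rawdata.csv')
mydata.wg()
mydata.crunch()
mydata.standardize()

correl = mydata.covar_table(correl = True, save_to_file = False, print_out = False, output = 'raw')
```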
```py
@make_verbal
def table_of_samples(
    self,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a table of samples.

    **Parameters**

    + `dir`: the directory in which to save the csv
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the csv
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
    if set to `'raw'`: return a list of lists of strings
    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''

    out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
    for sample in self.anchors:
        out += [[
            f"{sample}",
            f"{self.samples[sample]['N']}",
            f"{self.samples[sample]['d13C_VPDB']:.2f}",
            f"{self.samples[sample]['d18O_VSMOW']:.2f}",
            f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
            f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
            ]]
    for sample in self.unknowns:
        out += [[
            f"{sample}",
            f"{self.samples[sample]['N']}",
            f"{self.samples[sample]['d13C_VPDB']:.2f}",
            f"{self.samples[sample]['d18O_VSMOW']:.2f}",
            f"{self.samples[sample][f'D{self._4x}']:.4f}",
            f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
            f"± {self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
            f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
            f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
            ]]
    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_samples.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n'+pretty_table(out))
    if output == 'raw':
        return out
    elif output == 'pretty':
        return pretty_table(out)
```
Print out, save to disk and/or return a table of samples.

**Parameters**

- `dir`: the directory in which to save the csv
- `filename`: the name of the csv file to write to
- `save_to_file`: whether to save the csv
- `print_out`: whether to print out the table
- `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of lists of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
```py
def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
    '''
    Generate session plots and save them to disk.

    **Parameters**

    + `dir`: the directory in which to save the plots
    + `figsize`: the width and height (in inches) of each plot
    + `filetype`: 'pdf' or 'png'
    + `dpi`: resolution for PNG output
    '''
    if not os.path.exists(dir):
        os.makedirs(dir)

    for session in self.sessions:
        sp = self.plot_single_session(session, xylimits = 'constant')
        ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
        ppl.close(sp.fig)
```
Generate session plots and save them to disk (see the example below).

**Parameters**

- `dir`: the directory in which to save the plots
- `figsize`: the width and height (in inches) of each plot
- `filetype`: `'pdf'` or `'png'`
- `dpi`: resolution for PNG output
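For example, to write PNG plots at higher resolution into a custom directory:

```py
from D47crunch import D47data

mydata = D47data()
mydata.read('rawdata.csv')
mydata.wg()
mydata.crunch()
mydata.standardize()

mydata.plot_sessions(dir = 'output/plots', filetype = 'png', dpi = 300)
```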
```py
@make_verbal
def consolidate_samples(self):
    '''
    Compile various statistics for each sample.

    For each anchor sample:

    + `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
    + `SE_D47` or `SE_D48`: set to zero by definition

    For each unknown sample:

    + `D47` or `D48`: the standardized Δ4x value for this unknown
    + `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown

    For each anchor and unknown:

    + `N`: the total number of analyses of this sample
    + `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
    + `d13C_VPDB`: the average δ13C_VPDB value for this sample
    + `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
    + `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
    variance, indicating whether the Δ4x repeatability of this sample differs significantly from
    that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
    '''
    D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
    for sample in self.samples:
        self.samples[sample]['N'] = len(self.samples[sample]['data'])
        if self.samples[sample]['N'] > 1:
            self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])

        self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
        self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])

        D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
        if len(D4x_pop) > 2:
            self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]

    if self.standardization_method == 'pooled':
        for sample in self.anchors:
            self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
            self.samples[sample][f'SE_D{self._4x}'] = 0.
        for sample in self.unknowns:
            self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
            try:
                self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
            except ValueError:
                # when `sample` is constrained by self.standardize(constraints = {...}),
                # it is no longer listed in self.standardization.var_names.
                # Temporary fix: define SE as zero for now
                self.samples[sample][f'SE_D{self._4x}'] = 0.

    elif self.standardization_method == 'indep_sessions':
        for sample in self.anchors:
            self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
            self.samples[sample][f'SE_D{self._4x}'] = 0.
        for sample in self.unknowns:
            self.msg(f'Consolidating sample {sample}')
            self.unknowns[sample][f'session_D{self._4x}'] = {}
            session_avg = []
            for session in self.sessions:
                sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
                if sdata:
                    self.msg(f'{sample} found in session {session}')
                    avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
                    avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
                    # !! TODO: sigma_s below does not account for temporal changes in standardization error
                    sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
                    sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
                    session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
                    self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
            self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
            weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
            wsum = sum([weights[s] for s in weights])
            for s in weights:
                self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]

    for r in self:
        r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
```
Compile various statistics for each sample.

For each anchor sample:

- `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
- `SE_D47` or `SE_D48`: set to zero by definition

For each unknown sample:

- `D47` or `D48`: the standardized Δ4x value for this unknown
- `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown

For each anchor and unknown:

- `N`: the total number of analyses of this sample
- `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
- `d13C_VPDB`: the average δ13C_VPDB value for this sample
- `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
- `p_Levene`: the p-value from a Levene test of equal variance, indicating whether the Δ4x repeatability of this sample differs significantly from that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`
```py
def consolidate_sessions(self):
    '''
    Compute various statistics for each session.

    + `Na`: Number of anchor analyses in the session
    + `Nu`: Number of unknown analyses in the session
    + `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
    + `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
    + `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
    + `a`: scrambling factor
    + `b`: compositional slope
    + `c`: WG offset
    + `SE_a`: model standard error of `a`
    + `SE_b`: model standard error of `b`
    + `SE_c`: model standard error of `c`
    + `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
    + `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
    + `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
    + `a2`: scrambling factor drift
    + `b2`: compositional slope drift
    + `c2`: WG offset drift
    + `Np`: number of standardization parameters to fit
    + `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
    + `d13Cwg_VPDB`: δ13C_VPDB of the WG
    + `d18Owg_VSMOW`: δ18O_VSMOW of the WG
    '''
    for session in self.sessions:
        if 'd13Cwg_VPDB' not in self.sessions[session]:
            self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
        if 'd18Owg_VSMOW' not in self.sessions[session]:
            self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
        self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
        self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])

        self.msg(f'Computing repeatabilities for session {session}')
        self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
        self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
        self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])

    if self.standardization_method == 'pooled':
        for session in self.sessions:

            # different (better?) computation of D4x repeatability for each session:
            sqresiduals = [(r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}'])**2 for r in self.sessions[session]['data']]
            self.sessions[session][f'r_D{self._4x}'] = np.mean(sqresiduals)**.5

            self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
            i = self.standardization.var_names.index(f'a_{pf(session)}')
            self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5

            self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
            i = self.standardization.var_names.index(f'b_{pf(session)}')
            self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5

            self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
            i = self.standardization.var_names.index(f'c_{pf(session)}')
            self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5

            self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
            if self.sessions[session]['scrambling_drift']:
                i = self.standardization.var_names.index(f'a2_{pf(session)}')
                self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
            else:
                self.sessions[session]['SE_a2'] = 0.

            self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
            if self.sessions[session]['slope_drift']:
                i = self.standardization.var_names.index(f'b2_{pf(session)}')
                self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
            else:
                self.sessions[session]['SE_b2'] = 0.

            self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
            if self.sessions[session]['wg_drift']:
                i = self.standardization.var_names.index(f'c2_{pf(session)}')
                self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
            else:
                self.sessions[session]['SE_c2'] = 0.

            i = self.standardization.var_names.index(f'a_{pf(session)}')
            j = self.standardization.var_names.index(f'b_{pf(session)}')
            k = self.standardization.var_names.index(f'c_{pf(session)}')
            CM = np.zeros((6,6))
            CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
            try:
                i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
                CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
                CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
                try:
                    j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
                    CM[3,4] = self.standardization.covar[i2,j2]
                    CM[4,3] = self.standardization.covar[j2,i2]
                except ValueError:
                    pass
                try:
                    k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                    CM[3,5] = self.standardization.covar[i2,k2]
                    CM[5,3] = self.standardization.covar[k2,i2]
                except ValueError:
                    pass
            except ValueError:
                pass
            try:
                j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
                CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
                CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
                try:
                    k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                    CM[4,5] = self.standardization.covar[j2,k2]
                    CM[5,4] = self.standardization.covar[k2,j2]
                except ValueError:
                    pass
            except ValueError:
                pass
            try:
                k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
                CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
            except ValueError:
                pass

            self.sessions[session]['CM'] = CM

    elif self.standardization_method == 'indep_sessions':
        pass # Not implemented yet
```
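After standardization (which normally ends with a call to `consolidate()`, and hence `consolidate_sessions()`), these per-session statistics can be read directly from the `sessions` attribute. A minimal sketch, assuming `mydata` is the standardized `D47data` object from the tutorial, processed with the default `pooled` method:

```python
# Sketch: inspect per-session standardization parameters and repeatabilities
# (assumes `mydata` has already been read, crunched and standardized)
for session in mydata.sessions:
    s = mydata.sessions[session]
    print(f"{session}: Na = {s['Na']}, Nu = {s['Nu']}, r_D47 = {1000 * s['r_D47']:.1f} ppm")
    print(f"  a = {s['a']:.4f} (SE = {s['SE_a']:.4f})")
    print(f"  b = {s['b']:.4f} (SE = {s['SE_b']:.4f})")
    print(f"  c = {s['c']:.4f} (SE = {s['SE_c']:.4f})")
```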
```python
@make_verbal
def repeatabilities(self):
    '''
    Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
    (for all samples, for anchors, and for unknowns).
    '''
    self.msg('Computing repeatabilities for all sessions')

    self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
    self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
    self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
    self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
    self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
```
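The resulting pooled repeatabilities are stored in the `repeatability` attribute; for a `D47data` object the Δ4x keys are `r_D47`, `r_D47a` (anchors) and `r_D47u` (unknowns). A short sketch, again assuming the standardized `mydata` object from the tutorial:

```python
# Sketch: print pooled repeatabilities, converted from ‰ to ppm
print(f"r(d13C_VPDB)     = {1000 * mydata.repeatability['r_d13C_VPDB']:.1f} ppm")
print(f"r(d18O_VSMOW)    = {1000 * mydata.repeatability['r_d18O_VSMOW']:.1f} ppm")
print(f"r(D47, anchors)  = {1000 * mydata.repeatability['r_D47a']:.1f} ppm")
print(f"r(D47, unknowns) = {1000 * mydata.repeatability['r_D47u']:.1f} ppm")
print(f"r(D47, all)      = {1000 * mydata.repeatability['r_D47']:.1f} ppm")
```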
```python
@make_verbal
def consolidate(self, tables = True, plots = True):
    '''
    Collect information about samples, sessions and repeatabilities.
    '''
    self.consolidate_samples()
    self.consolidate_sessions()
    self.repeatabilities()

    if tables:
        self.summary()
        self.table_of_sessions()
        self.table_of_analyses()
        self.table_of_samples()

    if plots:
        self.plot_sessions()
```
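For instance, to refresh the consolidated results and reprint the tables without regenerating session plots (a sketch, using the standardized `mydata` object):

```python
# Sketch: re-run consolidation, printing tables but skipping plots
mydata.consolidate(tables = True, plots = False)
```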
```python
@make_verbal
def rmswd(self,
    samples = 'all samples',
    sessions = 'all sessions',
    ):
    '''
    Compute the χ2, root mean squared weighted deviation
    (i.e. reduced χ2), and corresponding degrees of freedom of the
    Δ4x values for samples in `samples` and sessions in `sessions`.

    Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
    '''
    if samples == 'all samples':
        mysamples = [k for k in self.samples]
    elif samples == 'anchors':
        mysamples = [k for k in self.anchors]
    elif samples == 'unknowns':
        mysamples = [k for k in self.unknowns]
    else:
        mysamples = samples

    if sessions == 'all sessions':
        sessions = [k for k in self.sessions]

    chisq, Nf = 0, 0
    for sample in mysamples:
        G = [r for r in self if r['Sample'] == sample and r['Session'] in sessions]
        if len(G) > 1:
            X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
            Nf += (len(G) - 1)
            chisq += np.sum([((r[f'D{self._4x}'] - X) / r[f'wD{self._4x}'])**2 for r in G])
    r = (chisq / Nf)**.5 if Nf > 0 else 0
    self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
    return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
```
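A usage sketch; note that `rmswd()` relies on the per-analysis weights (`wD47`) computed by the `indep_sessions` standardization method, so it assumes `mydata` was standardized with `method = 'indep_sessions'`:

```python
# Sketch: χ², RMSWD and degrees of freedom for the anchor samples
out = mydata.rmswd(samples = 'anchors')
print(f"RMSWD = {out['rmswd']:.3f} (chisq = {out['chisq']:.1f}, Nf = {out['Nf']})")
```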
```python
@make_verbal
def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
    '''
    Compute the repeatability of `[r[key] for r in self]`
    '''
    if samples == 'all samples':
        mysamples = [k for k in self.samples]
    elif samples == 'anchors':
        mysamples = [k for k in self.anchors]
    elif samples == 'unknowns':
        mysamples = [k for k in self.unknowns]
    else:
        mysamples = samples

    if sessions == 'all sessions':
        sessions = [k for k in self.sessions]

    if key in ['D47', 'D48']:
        # Full disclosure: the definition of Nf is tricky/debatable
        G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
        chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
        Nf = len(G)
        Nf -= len([s for s in mysamples if s in self.unknowns])
        for session in sessions:
            Np = len([
                _ for _ in self.standardization.params
                if (
                    self.standardization.params[_].expr is not None
                    and (
                        (_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
                        or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
                    )
                )
            ])
            Na = len({
                r['Sample'] for r in self.sessions[session]['data']
                if r['Sample'] in self.anchors and r['Sample'] in mysamples
            })
            Nf -= min(Np, Na)
        r = (chisq / Nf)**.5 if Nf > 0 else 0

    else:  # if key not in ['D47', 'D48']
        chisq, Nf = 0, 0
        for sample in mysamples:
            X = [r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions]
            if len(X) > 1:
                Nf += len(X) - 1
                chisq += np.sum([(x - np.mean(X))**2 for x in X])
        r = (chisq / Nf)**.5 if Nf > 0 else 0

    self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
    return r
```
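For example, to recompute the pooled Δ47 repeatability of the anchors only (a sketch, using the standardized `mydata` object from the tutorial):

```python
# Sketch: pooled SD of the D47 residuals of all anchor analyses
r = mydata.compute_r('D47', samples = 'anchors')
print(f"Repeatability of D47 (anchors): {1000 * r:.1f} ppm")
```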
````python
def sample_average(self, samples, weights = 'equal', normalize = True):
    '''
    Weighted average Δ4x value of a group of samples, accounting for covariance.

    Returns the weighted average Δ4x value and associated SE
    of a group of samples. Weights are equal by default. If `normalize` is
    true, `weights` will be rescaled so that their sum equals 1.

    **Examples**

    ```python
    self.sample_average(['X','Y'], [1, 2])
    ```

    returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
    where Δ4x(X) and Δ4x(Y) are the average Δ4x
    values of samples X and Y, respectively.

    ```python
    self.sample_average(['X','Y'], [1, -1], normalize = False)
    ```

    returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
    '''
    if weights == 'equal':
        weights = [1/len(samples)] * len(samples)

    if normalize:
        s = sum(weights)
        if s:
            weights = [w/s for w in weights]

    try:
        C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
        X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
        return correlated_sum(X, C, weights)
    except ValueError:
        return (0., 0.)
````
```python
def sample_D4x_covar(self, sample1, sample2 = None):
    '''
    Covariance between Δ4x values of samples

    Returns the error covariance between the average Δ4x values of two
    samples. If only `sample1` is specified, or if `sample1 == sample2`,
    returns the Δ4x variance for that sample.
    '''
    if sample2 is None:
        sample2 = sample1
    if self.standardization_method == 'pooled':
        i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
        j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
        return self.standardization.covar[i, j]
    elif self.standardization_method == 'indep_sessions':
        if sample1 == sample2:
            return self.samples[sample1][f'SE_D{self._4x}']**2
        else:
            c = 0
            for session in self.sessions:
                sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
                sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
                if sdata1 and sdata2:
                    a = self.sessions[session]['a']
                    # !! TODO: CM below does not account for temporal changes in standardization parameters
                    CM = self.sessions[session]['CM'][:3,:3]
                    avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
                    avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
                    avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
                    avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
                    c += (
                        self.unknowns[sample1][f'session_D{self._4x}'][session][2]
                        * self.unknowns[sample2][f'session_D{self._4x}'][session][2]
                        * np.array([[avg_D4x_1, avg_d4x_1, 1]])
                        @ CM
                        @ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
                    ) / a**2
            return float(c)
```
```python
def sample_D4x_correl(self, sample1, sample2 = None):
    '''
    Correlation between Δ4x errors of samples

    Returns the error correlation between the average Δ4x values of two samples.
    '''
    if sample2 is None or sample2 == sample1:
        return 1.
    return (
        self.sample_D4x_covar(sample1, sample2)
        / self.unknowns[sample1][f'SE_D{self._4x}']
        / self.unknowns[sample2][f'SE_D{self._4x}']
    )
```
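For instance, to quantify how strongly the standardized Δ47 values of two unknowns covary (a sketch; `MYSAMPLE-1` and `MYSAMPLE-2` are the unknowns from the tutorial data set):

```python
# Sketch: error covariance and correlation between two unknown samples
covar = mydata.sample_D4x_covar('MYSAMPLE-1', 'MYSAMPLE-2')
correl = mydata.sample_D4x_correl('MYSAMPLE-1', 'MYSAMPLE-2')
print(f"covariance = {covar:.2e}, correlation = {correl:.3f}")
```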
```python
def plot_single_session(self,
    session,
    kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
    kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
    kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
    kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
    kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
    xylimits = 'free', # | 'constant'
    x_label = None,
    y_label = None,
    error_contour_interval = 'auto',
    fig = 'new',
    ):
    '''
    Generate plot for a single session
    '''
    if x_label is None:
        x_label = f'δ$_{{{self._4x}}}$ (‰)'
    if y_label is None:
        y_label = f'Δ$_{{{self._4x}}}$ (‰)'

    out = _SessionPlot()
    anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
    unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]
    anchors_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
    anchors_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors]
    unknowns_d = [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
    unknowns_D = [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns]
    anchor_avg = (np.array([np.array([
            np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
            np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
            ]) for sample in anchors]).T,
        np.array([np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T)
    unknown_avg = (np.array([np.array([
            np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
            np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
            ]) for sample in unknowns]).T,
        np.array([np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T)

    if fig == 'new':
        out.fig = ppl.figure(figsize = (6,6))
        ppl.subplots_adjust(.1,.1,.9,.9)

    out.anchor_analyses, = ppl.plot(anchors_d, anchors_D, **kw_plot_anchors)
    out.unknown_analyses, = ppl.plot(unknowns_d, unknowns_D, **kw_plot_unknowns)
    out.anchor_avg = ppl.plot(*anchor_avg, **kw_plot_anchor_avg)
    out.unknown_avg = ppl.plot(*unknown_avg, **kw_plot_unknown_avg)
    if xylimits == 'constant':
        x = [r[f'd{self._4x}'] for r in self]
        y = [r[f'D{self._4x}'] for r in self]
        x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
        w, h = x2-x1, y2-y1
        x1 -= w/20
        x2 += w/20
        y1 -= h/20
        y2 += h/20
        ppl.axis([x1, x2, y1, y2])
    elif xylimits == 'free':
        x1, x2, y1, y2 = ppl.axis()
    else:
        x1, x2, y1, y2 = ppl.axis(xylimits)

    if error_contour_interval != 'none':
        xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
        XI, YI = np.meshgrid(xi, yi)
        SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
        if error_contour_interval == 'auto':
            rng = np.max(SI) - np.min(SI)
            if rng <= 0.01:
                cinterval = 0.001
            elif rng <= 0.03:
                cinterval = 0.004
            elif rng <= 0.1:
                cinterval = 0.01
            elif rng <= 0.3:
                cinterval = 0.03
            elif rng <= 1.:
                cinterval = 0.1
            else:
                cinterval = 0.5
        else:
            cinterval = error_contour_interval

        cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
        out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
        out.clabel = ppl.clabel(out.contour)
        contour = (XI, YI, SI, cval, cinterval)

    if fig is None:
        return {
            'anchors': anchors,
            'unknowns': unknowns,
            'anchors_d': anchors_d,
            'anchors_D': anchors_D,
            'unknowns_d': unknowns_d,
            'unknowns_D': unknowns_D,
            'anchor_avg': anchor_avg,
            'unknown_avg': unknown_avg,
            'contour': contour,
            }

    ppl.xlabel(x_label)
    ppl.ylabel(y_label)
    ppl.title(session, weight = 'bold')
    ppl.grid(alpha = .2)
    out.ax = ppl.gca()

    return out
```
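A usage sketch; `'mysession'` is a placeholder for one of the keys of `mydata.sessions`:

```python
# Sketch: generate the standardization plot for one session and save it
from matplotlib import pyplot as ppl

sp = mydata.plot_single_session('mysession')
ppl.savefig('mysession.pdf')
ppl.close(sp.fig)
```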
```python
def plot_residuals(
    self,
    kde = False,
    hist = False,
    binwidth = 2/3,
    dir = 'output',
    filename = None,
    highlight = [],
    colors = None,
    figsize = None,
    dpi = 100,
    yspan = None,
    ):
    '''
    Plot residuals of each analysis as a function of time (actually, as a function of
    the order of analyses in the `D4xdata` object)

    + `kde`: whether to add a kernel density estimate of residuals
    + `hist`: whether to add a histogram of residuals (incompatible with `kde`)
    + `binwidth`: width of the histogram bins, expressed as a fraction of the Δ4x repeatability
    + `dir`: the directory in which to save the plot
    + `filename`: name of the file to save the plot to; if `None` (default), return
      the figure instead of saving it; if `''`, use a default file name
    + `highlight`: a list of samples to highlight
    + `colors`: a dict of `{<sample>: <color>}` for all samples
    + `figsize`: (width, height) of figure
    + `dpi`: resolution for PNG output
    + `yspan`: factor controlling the range of y values shown in plot
      (by default: `yspan = 1.5 if kde else 1.0`)
    '''

    from matplotlib import ticker

    if yspan is None:
        if kde:
            yspan = 1.5
        else:
            yspan = 1.0

    # Layout
    fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
    if hist or kde:
        ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
        ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
    else:
        ppl.subplots_adjust(.08,.05,.78,.8)
        ax1 = ppl.subplot(111)

    # Colors
    N = len(self.anchors)
    if colors is None:
        if len(highlight) > 0:
            Nh = len(highlight)
            if Nh == 1:
                colors = {highlight[0]: (0,0,0)}
            elif Nh == 3:
                colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
            elif Nh == 4:
                colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
            else:
                colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
        else:
            if N == 3:
                colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
            elif N == 4:
                colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
            else:
                colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}

    ppl.sca(ax1)

    ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)

    ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))

    session = self[0]['Session']
    x1 = 0
    x_sessions = {}
    one_or_more_singlets = False
    one_or_more_multiplets = False
    multiplets = set()
    for k,r in enumerate(self):
        if r['Session'] != session:
            x2 = k-1
            x_sessions[session] = (x1+x2)/2
            ppl.axvline(k - 0.5, color = 'k', lw = .5)
            session = r['Session']
            x1 = k
        singlet = len(self.samples[r['Sample']]['data']) == 1
        if not singlet:
            multiplets.add(r['Sample'])
        if r['Sample'] in self.unknowns:
            if singlet:
                one_or_more_singlets = True
            else:
                one_or_more_multiplets = True
        kw = dict(
            marker = 'x' if singlet else '+',
            ms = 4 if singlet else 5,
            ls = 'None',
            mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
            mew = 1,
            alpha = 0.2 if singlet else 1,
            )
        if highlight and r['Sample'] not in highlight:
            kw['alpha'] = 0.2
        ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
    x2 = k
    x_sessions[session] = (x1+x2)/2

    ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
    ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
    if not (hist or kde):
        ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
        ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f" 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')

    xmin, xmax, ymin, ymax = ppl.axis()
    if yspan != 1:
        ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
    for s in x_sessions:
        ppl.text(
            x_sessions[s],
            ymax + 1,
            s,
            va = 'bottom',
            **(
                dict(ha = 'center')
                if len(self.sessions[s]['data']) > (0.15 * len(self))
                else dict(ha = 'left', rotation = 45)
                )
            )

    if hist or kde:
        ppl.sca(ax2)

    for s in colors:
        kw['marker'] = '+'
        kw['ms'] = 5
        kw['mec'] = colors[s]
        kw['label'] = s
        kw['alpha'] = 1
        ppl.plot([], [], **kw)

    kw['mec'] = (0,0,0)

    if one_or_more_singlets:
        kw['marker'] = 'x'
        kw['ms'] = 4
        kw['alpha'] = .2
        kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
        ppl.plot([], [], **kw)

    if one_or_more_multiplets:
        kw['marker'] = '+'
        kw['ms'] = 4
        kw['alpha'] = 1
        kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
        ppl.plot([], [], **kw)

    if hist or kde:
        leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform = fig.transFigure, borderaxespad = 1.5, fontsize = 9)
    else:
        leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform = fig.transFigure, borderaxespad = 1.5)
    leg.set_zorder(-1000)

    ppl.sca(ax1)

    ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
    ppl.xticks([])
    ppl.axis([-1, len(self), None, None])

    if hist or kde:
        ppl.sca(ax2)
        X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])

        if kde:
            from scipy.stats import gaussian_kde
            yi = np.linspace(ymin, ymax, 201)
            xi = gaussian_kde(X).evaluate(yi)
            ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
        elif hist:
            ppl.hist(
                X,
                orientation = 'horizontal',
                histtype = 'stepfilled',
                ec = [.4]*3,
                fc = [.25]*3,
                alpha = .25,
                bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
                )
            ppl.text(0, 0,
                f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
                size = 7.5,
                alpha = 1,
                va = 'center',
                ha = 'left',
                )

        ppl.axis([0, None, ymin, ymax])
        ppl.xticks([])
        ppl.yticks([])
        ax2.spines['right'].set_visible(False)
        ax2.spines['top'].set_visible(False)
        ax2.spines['bottom'].set_visible(False)

    ax1.axis([None, None, ymin, ymax])

    if not os.path.exists(dir):
        os.makedirs(dir)
    if filename is None:
        return fig
    elif filename == '':
        filename = f'D{self._4x}_residuals.pdf'
    ppl.savefig(f'{dir}/{filename}', dpi = dpi)
    ppl.close(fig)
```
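A usage sketch:

```python
# Sketch: save a residual plot with a kernel density estimate panel;
# passing filename = '' selects the default name ('output/D47_residuals.pdf')
mydata.plot_residuals(kde = True, filename = '')
```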
```python
def simulate(self, *args, **kwargs):
    '''
    Legacy function with warning message pointing to `virtual_data()`
    '''
    raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
```
```python
def plot_distribution_of_analyses(
    self,
    dir = 'output',
    filename = None,
    vs_time = False,
    figsize = (6,4),
    subplots_adjust = (0.02, 0.13, 0.85, 0.8),
    output = None,
    dpi = 100,
    ):
    '''
    Plot temporal distribution of all analyses in the data set.

    **Parameters**

    + `dir`: the directory in which to save the plot
    + `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
    + `figsize`: (width, height) of figure
    + `dpi`: resolution for PNG output
    '''

    asamples = [s for s in self.anchors]
    usamples = [s for s in self.unknowns]
    if output is None or output == 'fig':
        fig = ppl.figure(figsize = figsize)
        ppl.subplots_adjust(*subplots_adjust)
    Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
    Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
    Xmax += (Xmax-Xmin)/40
    Xmin -= (Xmax-Xmin)/41
    for k, s in enumerate(asamples + usamples):
        if vs_time:
            X = [r['TimeTag'] for r in self if r['Sample'] == s]
        else:
            X = [x for x,r in enumerate(self) if r['Sample'] == s]
        Y = [-k for x in X]
        ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
        ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
        ppl.text(Xmax, -k, f' {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
    ppl.axis([Xmin, Xmax, -k-1, 1])
    ppl.xlabel('\ntime')
    ppl.gca().annotate('',
        xy = (0.6, -0.02),
        xycoords = 'axes fraction',
        xytext = (.4, -0.02),
        arrowprops = dict(arrowstyle = "->", color = 'k'),
        )

    x2 = -1
    for session in self.sessions:
        x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
        if vs_time:
            ppl.axvline(x1, color = 'k', lw = .75)
        if x2 > -1:
            if not vs_time:
                ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
        x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
        if vs_time:
            ppl.axvline(x2, color = 'k', lw = .75)
            ppl.axvspan(x1, x2, color = 'k', zorder = -100, alpha = .15)
        ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)

    ppl.xticks([])
    ppl.yticks([])

    if output is None:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_distribution_of_analyses.pdf'
        ppl.savefig(f'{dir}/{filename}', dpi = dpi)
        ppl.close(fig)
    elif output == 'ax':
        return ppl.gca()
    elif output == 'fig':
        return fig
```
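A usage sketch; plotting against `TimeTag` (`vs_time = True`) assumes that each analysis carries a `TimeTag` field (e.g. assigned by `assign_timestamps()`):

```python
# Sketch: saves 'output/D47_distribution_of_analyses.pdf' by default
mydata.plot_distribution_of_analyses(vs_time = False)
```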
```python
def plot_bulk_compositions(
    self,
    samples = None,
    dir = 'output/bulk_compositions',
    figsize = (6,6),
    subplots_adjust = (0.15, 0.12, 0.95, 0.92),
    show = False,
    sample_color = (0,.5,1),
    analysis_color = (.7,.7,.7),
    labeldist = 0.3,
    radius = 0.05,
    ):
    '''
    Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

    By default, creates a directory `./output/bulk_compositions` where plots for
    each sample are saved. Another plot named `__all__.pdf` shows all analyses together.

    **Parameters**

    + `samples`: Only these samples are processed (by default: all samples).
    + `dir`: where to save the plots
    + `figsize`: (width, height) of figure
    + `subplots_adjust`: passed to `subplots_adjust()`
    + `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
      allowing for interactive visualization/exploration in (δ13C, δ18O) space.
    + `sample_color`: color used for sample markers/labels
    + `analysis_color`: color used for replicate (analysis) markers/labels
    + `labeldist`: distance (in inches) from replicate markers to replicate labels
    + `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
    '''

    from matplotlib.patches import Ellipse

    if samples is None:
        samples = [_ for _ in self.samples]

    saved = {}

    for s in samples:

        fig = ppl.figure(figsize = figsize)
        fig.subplots_adjust(*subplots_adjust)
        ax = ppl.subplot(111)
        ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
        ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
        ppl.title(s)

        XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
        UID = [_['UID'] for _ in self.samples[s]['data']]
        XY0 = XY.mean(0)

        for xy in XY:
            ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)

        ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
        ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
        ppl.text(*XY0, f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
        saved[s] = [XY, XY0]

        x1, x2, y1, y2 = ppl.axis()
        x0, dx = (x1+x2)/2, (x2-x1)/2
        y0, dy = (y1+y2)/2, (y2-y1)/2
        dx, dy = [max(max(dx, dy), radius)]*2

        ppl.axis([
            x0 - 1.2*dx,
            x0 + 1.2*dx,
            y0 - 1.2*dy,
            y0 + 1.2*dy,
            ])

        XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))

        for xy, uid in zip(XY, UID):

            xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
            vector_in_display_space = xy_in_display_space - XY0_in_display_space

            if (vector_in_display_space**2).sum() > 0:

                unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
                label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
                label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
                label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))

                ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)

            else:

                ppl.text(*xy, f'{uid} ', va = 'center', ha = 'right', color = analysis_color)

        if radius:
            ax.add_artist(Ellipse(
                xy = XY0,
                width = radius*2,
                height = radius*2,
                ls = (0, (2,2)),
                lw = .7,
                ec = analysis_color,
                fc = 'None',
                ))
            ppl.text(
                XY0[0],
                XY0[1] - radius,
                f'\n± {radius*1e3:.0f} ppm',
                color = analysis_color,
                va = 'top',
                ha = 'center',
                linespacing = 0.4,
                size = 8,
                )

        if not os.path.exists(dir):
            os.makedirs(dir)
        fig.savefig(f'{dir}/{s}.pdf')
        ppl.close(fig)

    fig = ppl.figure(figsize = figsize)
    fig.subplots_adjust(*subplots_adjust)
    ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
    ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')

    for s in saved:
        for xy in saved[s][0]:
            ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
        ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
        ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
        ppl.text(*saved[s][1], f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')

    x1, x2, y1, y2 = ppl.axis()
    ppl.axis([
        x1 - (x2-x1)/10,
        x2 + (x2-x1)/10,
        y1 - (y2-y1)/10,
        y2 + (y2-y1)/10,
        ])

    if not os.path.exists(dir):
        os.makedirs(dir)
    fig.savefig(f'{dir}/__all__.pdf')
    if show:
        ppl.show()
    ppl.close(fig)
```
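A usage sketch:

```python
# Sketch: writes one bulk-composition plot per sample, plus '__all__.pdf',
# under 'output/bulk_compositions/'
mydata.plot_bulk_compositions(show = False)
```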
````python
class D47data(D4xdata):
    '''
    Store and process data for a large set of Δ47 analyses,
    usually comprising more than one analytical session.
    '''

    Nominal_D4x = {
        'ETH-1': 0.2052,
        'ETH-2': 0.2085,
        'ETH-3': 0.6132,
        'ETH-4': 0.4511,
        'IAEA-C1': 0.3018,
        'IAEA-C2': 0.6409,
        'MERCK': 0.5135,
        } # I-CDES (Bernasconi et al., 2021)
    '''
    Nominal Δ47 values assigned to the Δ47 anchor samples, used by
    `D47data.standardize()` to normalize unknown samples to an absolute Δ47
    reference frame.

    By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
    ```py
    {
        'ETH-1'   : 0.2052,
        'ETH-2'   : 0.2085,
        'ETH-3'   : 0.6132,
        'ETH-4'   : 0.4511,
        'IAEA-C1' : 0.3018,
        'IAEA-C2' : 0.6409,
        'MERCK'   : 0.5135,
    }
    ```
    '''

    @property
    def Nominal_D47(self):
        return self.Nominal_D4x

    @Nominal_D47.setter
    def Nominal_D47(self, new):
        self.Nominal_D4x = dict(**new)
        self.refresh()

    def __init__(self, l = [], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l = l, mass = '47', **kwargs)

    def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
        '''
        Find all samples for which `Teq` is specified, compute the equilibrium Δ47
        value for that temperature, and treat these samples as additional anchors.

        **Parameters**

        + `fCo2eqD47`: Which CO2 equilibrium law to use
        (`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
        `wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
        + `priority`: if `replace`: forget old anchors and only use the new ones;
        if `new`: keep pre-existing anchors but update them in case of conflict
        between old and new Δ47 values;
        if `old`: keep pre-existing anchors but preserve their original Δ47
        values in case of conflict.
        '''
        f = {
            'petersen': fCO2eqD47_Petersen,
            'wang': fCO2eqD47_Wang,
            }[fCo2eqD47]
        foo = {}
        for r in self:
            if 'Teq' in r:
                if r['Sample'] in foo:
                    assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
                else:
                    foo[r['Sample']] = f(r['Teq'])
            else:
                assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'

        if priority == 'replace':
            self.Nominal_D47 = {}
        for s in foo:
            if priority != 'old' or s not in self.Nominal_D47:
                self.Nominal_D47[s] = foo[s]

    def save_D47_correl(self, *args, **kwargs):
        return self._save_D4x_correl(*args, **kwargs)

    save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47')
````
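Because `Nominal_D47` is a settable property wrapping `Nominal_D4x`, anchor values can be customized before standardizing. A sketch (the values shown simply restrict the I-CDES defaults to three anchors; `data47` is a hypothetical fresh object):

```python
import D47crunch

data47 = D47crunch.D47data()
# use only three anchors, keeping their I-CDES nominal values:
data47.Nominal_D47 = {
    'ETH-1': 0.2052,
    'ETH-2': 0.2085,
    'ETH-3': 0.6132,
}
```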
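A usage sketch for `D47fromTeq()`; it assumes the relevant analyses carry a `Teq` field, in the units expected by the chosen CO2 equilibrium law (an assumption to verify against your data):

```python
# Sketch: treat CO2 equilibrated at known temperatures as extra anchors,
# updating pre-existing anchors in case of conflict, then standardize.
# Assumes `data47` holds analyses, some of which include a 'Teq' field.
data47.D47fromTeq(fCo2eqD47 = 'petersen', priority = 'new')
data47.standardize()
```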
`D47data.save_D47_correl()` saves D47 values along with their SE and correlation matrix.

**Parameters**

+ `samples`: Only these samples are output (by default: all samples).
+ `dir`: the directory in which to save the file (by default: `output`)
+ `filename`: the name of the csv file to write to (by default: `D47_correl.csv`)
+ `D47_precision`: the precision to use when writing `D47` and `D47_SE` values (by default: 4)
+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
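A usage sketch:

```python
# Sketch: writes 'output/D47_correl.csv' with sample D47 values,
# their SE, and the full error correlation matrix
mydata.save_D47_correl()
```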
````python
class D48data(D4xdata):
    '''
    Store and process data for a large set of Δ48 analyses,
    usually comprising more than one analytical session.
    '''

    Nominal_D4x = {
        'ETH-1': 0.138,
        'ETH-2': 0.138,
        'ETH-3': 0.270,
        'ETH-4': 0.223,
        'GU-1': -0.419,
        } # (Fiebig et al., 2019, 2021)
    '''
    Nominal Δ48 values assigned to the Δ48 anchor samples, used by
    `D48data.standardize()` to normalize unknown samples to an absolute Δ48
    reference frame.

    By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
    [Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)):

    ```py
    {
        'ETH-1' : 0.138,
        'ETH-2' : 0.138,
        'ETH-3' : 0.270,
        'ETH-4' : 0.223,
        'GU-1'  : -0.419,
    }
    ```
    '''

    @property
    def Nominal_D48(self):
        return self.Nominal_D4x

    @Nominal_D48.setter
    def Nominal_D48(self, new):
        self.Nominal_D4x = dict(**new)
        self.refresh()

    def __init__(self, l = [], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l = l, mass = '48', **kwargs)

    def save_D48_correl(self, *args, **kwargs):
        return self._save_D4x_correl(*args, **kwargs)

    save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48')
````
`D48data.save_D48_correl()` saves D48 values along with their SE and correlation matrix.

**Parameters**

+ `samples`: Only these samples are output (by default: all samples).
+ `dir`: the directory in which to save the file (by default: `output`)
+ `filename`: the name of the csv file to write to (by default: `D48_correl.csv`)
+ `D48_precision`: the precision to use when writing `D48` and `D48_SE` values (by default: 4)
+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
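Δ48 processing mirrors the Δ47 workflow. A sketch, assuming the raw data file also contains usable δ48 measurements:

```python
import D47crunch

data48 = D47crunch.D48data()
data48.read('rawdata.csv')  # same input format as for D47data
data48.wg()
data48.crunch()
data48.standardize()
data48.save_D48_correl()    # writes 'output/D48_correl.csv' by default
```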
````python
class D49data(D4xdata):
    '''
    Store and process data for a large set of Δ49 analyses,
    usually comprising more than one analytical session.
    '''

    Nominal_D4x = {"1000C": 0.0, "25C": 2.228} # Wang et al. (2004)
    '''
    Nominal Δ49 values assigned to the Δ49 anchor samples, used by
    `D49data.standardize()` to normalize unknown samples to an absolute Δ49
    reference frame.

    By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)):

    ```py
    {
        "1000C": 0.0,
        "25C": 2.228
    }
    ```
    '''

    @property
    def Nominal_D49(self):
        return self.Nominal_D4x

    @Nominal_D49.setter
    def Nominal_D49(self, new):
        self.Nominal_D4x = dict(**new)
        self.refresh()

    def __init__(self, l=[], **kwargs):
        '''
        **Parameters:** same as `D4xdata.__init__()`
        '''
        D4xdata.__init__(self, l=l, mass='49', **kwargs)

    def save_D49_correl(self, *args, **kwargs):
        return self._save_D4x_correl(*args, **kwargs)

    save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49')
````
`D49data.save_D49_correl()` saves D49 values along with their SE and correlation matrix.

**Parameters**

+ `samples`: Only these samples are output (by default: all samples).
+ `dir`: the directory in which to save the file (by default: `output`)
+ `filename`: the name of the csv file to write to (by default: `D49_correl.csv`)
+ `D49_precision`: the precision to use when writing `D49` and `D49_SE` values (by default: 4)
+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)