D47crunch
Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements
Process and standardize carbonate and/or CO2 clumped-isotope analyses, from low-level data out of a dual-inlet mass spectrometer to final, “absolute” Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates (Daëron, 2021).
The **tutorial** section takes you through a series of simple steps to import/process data and print out the results. The **how-to** section provides instructions applicable to various specific tasks.
1. Tutorial
1.1 Installation
The easy option is to use `pip`; open a shell terminal and simply type:

python -m pip install D47crunch
For those wishing to experiment with the bleeding-edge development version, this can be done through the following steps:

- Download the `dev` branch source code here and rename it to `D47crunch.py`.
- Do any of the following:
  - copy `D47crunch.py` to somewhere in your Python path
  - copy `D47crunch.py` to a working directory (`import D47crunch` will only work if called within that directory)
  - copy `D47crunch.py` to any other location (e.g., `/foo/bar`) and then use the following code snippet in your own code to import `D47crunch`:

import sys
sys.path.append('/foo/bar')
import D47crunch
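If you prefer not to mutate `sys.path`, an equivalent approach (a sketch using the standard `importlib` machinery, not part of D47crunch itself; `/foo/bar` is the example location from above) is to load the module directly from its file:

```python
# Load a module from an explicit file path instead of modifying sys.path.
import importlib.util

def import_from_path(name, path):
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module

# D47crunch = import_from_path('D47crunch', '/foo/bar/D47crunch.py')
```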
Documentation for the development version can be downloaded here (save the HTML file and open it locally).
1.2 Usage
Start by creating a file named `rawdata.csv` with the following contents:
UID, Sample, d45, d46, d47, d48, d49
A01, ETH-1, 5.79502, 11.62767, 16.89351, 24.56708, 0.79486
A02, MYSAMPLE-1, 6.21907, 11.49107, 17.27749, 24.58270, 1.56318
A03, ETH-2, -6.05868, -4.81718, -11.63506, -10.32578, 0.61352
A04, MYSAMPLE-2, -3.86184, 4.94184, 0.60612, 10.52732, 0.57118
A05, ETH-3, 5.54365, 12.05228, 17.40555, 25.96919, 0.74608
A06, ETH-2, -6.06706, -4.87710, -11.69927, -10.64421, 1.61234
A07, ETH-1, 5.78821, 11.55910, 16.80191, 24.56423, 1.47963
A08, MYSAMPLE-2, -3.87692, 4.86889, 0.52185, 10.40390, 1.07032
Then instantiate a `D47data` object, which will store and process this data:
import D47crunch
mydata = D47crunch.D47data()
For now, this object is empty:
>>> print(mydata)
[]
To load the analyses saved in `rawdata.csv` into our `D47data` object and process the data:
mydata.read('rawdata.csv')
# compute δ13C, δ18O of working gas:
mydata.wg()
# compute δ13C, δ18O, raw Δ47 values for each analysis:
mydata.crunch()
# compute absolute Δ47 values for each analysis
# as well as average Δ47 values for each sample:
mydata.standardize()
We can now print a summary of the data processing:
>>> mydata.summary(verbose = True, save_to_file = False)
[summary]
––––––––––––––––––––––––––––––– –––––––––
N samples (anchors + unknowns) 5 (3 + 2)
N analyses (anchors + unknowns) 8 (5 + 3)
Repeatability of δ13C_VPDB 4.2 ppm
Repeatability of δ18O_VSMOW 47.5 ppm
Repeatability of Δ47 (anchors) 13.4 ppm
Repeatability of Δ47 (unknowns) 2.5 ppm
Repeatability of Δ47 (all) 9.6 ppm
Model degrees of freedom 3
Student's 95% t-factor 3.18
Standardization method pooled
––––––––––––––––––––––––––––––– –––––––––
This tells us that our data set contains 5 different samples: 3 anchors (ETH-1, ETH-2, ETH-3) and 2 unknowns (MYSAMPLE-1, MYSAMPLE-2). The total number of analyses is 8, with 5 anchor analyses and 3 unknown analyses. We get an estimate of the analytical repeatability (i.e. the overall, pooled standard deviation) for δ13C, δ18O and Δ47, as well as the number of degrees of freedom (here, 3) that these estimated standard deviations are based on, along with the corresponding Student's t-factor (here, 3.18) for 95 % confidence limits. Finally, the summary indicates that we used a “pooled” standardization approach (see [Daëron, 2021]).
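The "repeatability" figures above are pooled standard deviations: the squared residuals of each analysis relative to its sample average are summed over all samples and divided by the pooled degrees of freedom. Below is a minimal sketch of that computation, using made-up Δ47 values rather than the ones from `rawdata.csv`:

```python
import numpy as np

# two samples with two replicates each (made-up values)
replicates = {
    'ETH-1': [0.2050, 0.2061],
    'ETH-2': [0.2079, 0.2093],
}
# sum of squared residuals relative to each sample's own mean:
ssr = sum(((np.array(v) - np.mean(v)) ** 2).sum() for v in replicates.values())
# pooled degrees of freedom (N - 1 per sample):
dof = sum(len(v) - 1 for v in replicates.values())
print(f'pooled SD = {(ssr / dof) ** 0.5:.4f}')
```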
To see the actual results:
>>> mydata.table_of_samples(verbose = True, save_to_file = False)
[table_of_samples]
–––––––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
Sample N d13C_VPDB d18O_VSMOW D47 SE 95% CL SD p_Levene
–––––––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
ETH-1 2 2.01 37.01 0.2052 0.0131
ETH-2 2 -10.17 19.88 0.2085 0.0026
ETH-3 1 1.73 37.49 0.6132
MYSAMPLE-1 1 2.48 36.90 0.2996 0.0091 ± 0.0291
MYSAMPLE-2 2 -8.17 30.05 0.6600 0.0115 ± 0.0366 0.0025
–––––––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
This table lists, for each sample, the number of analytical replicates, average δ13C and δ18O values (for the analyte CO2, not for the carbonate itself), the average Δ47 value, and the Δ47 SD over all replicates of this sample. For unknown samples, the SE and 95 % confidence limits for the mean Δ47 are also listed. These 95 % CL take into account the number of degrees of freedom of the regression model, so that in large data sets the 95 % CL will tend to 1.96 times the SE, whereas in this small example the applicable t-factor is much larger.
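The 95 % CL column is simply the SE column scaled by the Student's t-factor reported in the summary. A quick sketch with scipy (the SE value below is the rounded MYSAMPLE-1 entry from the table, so the product only approximately reproduces the tabulated ±0.0291):

```python
from scipy.stats import t as tstudent

df = 3                                     # model degrees of freedom (from the summary)
tfactor = tstudent.ppf(1 - 0.05 / 2, df)   # two-sided 95% t-factor, ~3.18
SE = 0.0091                                # rounded SE of MYSAMPLE-1's mean D47
print(f't = {tfactor:.2f}, 95% CL = ±{tfactor * SE:.4f}')
# with many degrees of freedom the factor tends to 1.96:
print(f'{tstudent.ppf(0.975, 10000):.2f}')
```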
We can also generate a table of all analyses in the data set (again, note that `d18O_VSMOW` is the composition of the CO2 analyte):
>>> mydata.table_of_analyses(verbose = True, save_to_file = False)
[table_of_analyses]
––– ––––––––– –––––––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––– –––––––––– –––––––––– ––––––––– ––––––––– –––––––––– ––––––––
UID Session Sample d13Cwg_VPDB d18Owg_VSMOW d45 d46 d47 d48 d49 d13C_VPDB d18O_VSMOW D47raw D48raw D49raw D47
––– ––––––––– –––––––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––– –––––––––– –––––––––– ––––––––– ––––––––– –––––––––– ––––––––
A01 mySession ETH-1 -3.807 24.921 5.795020 11.627670 16.893510 24.567080 0.794860 2.014086 37.041843 -0.574686 1.149684 -27.690250 0.214454
A02 mySession MYSAMPLE-1 -3.807 24.921 6.219070 11.491070 17.277490 24.582700 1.563180 2.476827 36.898281 -0.499264 1.435380 -27.122614 0.299589
A03 mySession ETH-2 -3.807 24.921 -6.058680 -4.817180 -11.635060 -10.325780 0.613520 -10.166796 19.907706 -0.685979 -0.721617 16.716901 0.206693
A04 mySession MYSAMPLE-2 -3.807 24.921 -3.861840 4.941840 0.606120 10.527320 0.571180 -8.159927 30.087230 -0.248531 0.613099 -4.979413 0.658270
A05 mySession ETH-3 -3.807 24.921 5.543650 12.052280 17.405550 25.969190 0.746080 1.727029 37.485567 -0.226150 1.678699 -28.280301 0.613200
A06 mySession ETH-2 -3.807 24.921 -6.067060 -4.877100 -11.699270 -10.644210 1.612340 -10.173599 19.845192 -0.683054 -0.922832 17.861363 0.210328
A07 mySession ETH-1 -3.807 24.921 5.788210 11.559100 16.801910 24.564230 1.479630 2.009281 36.970298 -0.591129 1.282632 -26.888335 0.195926
A08 mySession MYSAMPLE-2 -3.807 24.921 -3.876920 4.868890 0.521850 10.403900 1.070320 -8.173486 30.011134 -0.245768 0.636159 -4.324964 0.661803
––– ––––––––– –––––––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––– –––––––––– –––––––––– ––––––––– ––––––––– –––––––––– ––––––––
2. How-to
2.1 Simulate a virtual data set to play with
It is sometimes convenient to quickly build a virtual data set of analyses, for instance to assess the final analytical precision achievable for a given combination of anchor and unknown analyses (see also Fig. 6 of Daëron, 2021).
This can be achieved with `virtual_data()`. The example below creates a data set with four sessions, each of which comprises three analyses of anchor ETH-1, three of ETH-2, three of ETH-3, and three analyses each of two unknown samples named `FOO` and `BAR` with arbitrarily defined isotopic compositions. Analytical repeatabilities for Δ47 and Δ48 are also specified arbitrarily. See the `virtual_data()` documentation for additional configuration parameters.
from D47crunch import virtual_data, D47data
args = dict(
samples = [
dict(Sample = 'ETH-1', N = 3),
dict(Sample = 'ETH-2', N = 3),
dict(Sample = 'ETH-3', N = 3),
dict(Sample = 'FOO', N = 3,
d13C_VPDB = -5., d18O_VPDB = -10.,
D47 = 0.3, D48 = 0.15),
dict(Sample = 'BAR', N = 3,
d13C_VPDB = -15., d18O_VPDB = -2.,
D47 = 0.6, D48 = 0.2),
], rD47 = 0.010, rD48 = 0.030)
session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)
D = D47data(session1 + session2 + session3 + session4)
D.crunch()
D.standardize()
D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)
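As a rough sanity check on such simulations, the standard error of a sample's mean Δ47 scales as the per-analysis repeatability divided by √N. This back-of-the-envelope estimate is not part of D47crunch and ignores the standardization errors that the full model propagates, so the fully propagated SEs reported by `standardize()` will be somewhat larger:

```python
# Naive SE of the mean for rD47 = 0.010 and various replicate counts.
rD47 = 0.010
for N in (3, 6, 12):
    print(f'N = {N:2d}: naive SE = {rD47 / N**0.5:.4f}')
```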
2.2 Control data quality
`D47crunch` offers several tools to visualize processed data. The examples below use the same virtual data set, generated with:
from D47crunch import *
from random import shuffle
# generate virtual data:
args = dict(
samples = [
dict(Sample = 'ETH-1', N = 8),
dict(Sample = 'ETH-2', N = 8),
dict(Sample = 'ETH-3', N = 8),
dict(Sample = 'FOO', N = 4,
d13C_VPDB = -5., d18O_VPDB = -10.,
D47 = 0.3, D48 = 0.15),
dict(Sample = 'BAR', N = 4,
d13C_VPDB = -15., d18O_VPDB = -15.,
D47 = 0.5, D48 = 0.2),
])
sessions = [
virtual_data(session = f'Session_{k+1:02.0f}', seed = 123456+k, **args)
for k in range(10)]
# shuffle the data:
data = [r for s in sessions for r in s]
shuffle(data)
data = sorted(data, key = lambda r: r['Session'])
# create D47data instance:
data47 = D47data(data)
# process D47data instance:
data47.crunch()
data47.standardize()
2.2.1 Plotting the distribution of analyses through time
data47.plot_distribution_of_analyses(filename = 'time_distribution.pdf')
The plot above shows the succession of analyses as if they were all distributed at regular time intervals. See `D4xdata.plot_distribution_of_analyses()` for how to plot analyses as a function of “true” time (based on the `TimeTag` of each analysis).
2.2.2 Generating session plots
data47.plot_sessions()
Below is one of the resulting sessions plots. Each cross marker is an analysis. Anchors are in red and unknowns in blue. Short horizontal lines show the nominal Δ47 value for anchors, in red, or the average Δ47 value for unknowns, in blue (overall average for all sessions). Curved grey contours correspond to Δ47 standardization errors in this session.
2.2.3 Plotting Δ47 or Δ48 residuals
data47.plot_residuals(filename = 'residuals.pdf', kde = True)
Again, note that this plot only shows the succession of analyses as if they were all distributed at regular time intervals.
2.2.4 Checking δ13C and δ18O dispersion
mydata = D47data(virtual_data(
session = 'mysession',
samples = [
dict(Sample = 'ETH-1', N = 4),
dict(Sample = 'ETH-2', N = 4),
dict(Sample = 'ETH-3', N = 4),
dict(Sample = 'MYSAMPLE', N = 8, D47 = 0.6, D48 = 0.1, d13C_VPDB = -4.0, d18O_VPDB = -12.0),
], seed = 123))
mydata.refresh()
mydata.wg()
mydata.crunch()
mydata.plot_bulk_compositions()
`D4xdata.plot_bulk_compositions()` produces a series of plots, one for each sample, and an additional plot with all samples together. For example, here is the plot for sample `MYSAMPLE`:
2.3 Use a different set of anchors, change anchor nominal values, and/or change oxygen-17 correction parameters
Nominal values for various carbonate standards are defined in four places:
- `D4xdata.Nominal_d13C_VPDB`
- `D4xdata.Nominal_d18O_VPDB`
- `D47data.Nominal_D4x` (also accessible through `D47data.Nominal_D47`)
- `D48data.Nominal_D4x` (also accessible through `D48data.Nominal_D48`)
17O correction parameters are defined by:
- `D4xdata.R13_VPDB`
- `D4xdata.R18_VSMOW`
- `D4xdata.R18_VPDB`
- `D4xdata.LAMBDA_17`
- `D4xdata.R17_VSMOW`
- `D4xdata.R17_VPDB`
When creating a new instance of `D47data` or `D48data`, the current values of these variables are copied as properties of the new object. Applying custom values for, e.g., `R17_VSMOW` and `Nominal_D47` can thus be done in several ways:

Option 1: by redefining `D4xdata.R17_VSMOW` and `D47data.Nominal_D47` _before_ creating a `D47data` object:
from D47crunch import D4xdata, D47data
# redefine R17_VSMOW:
D4xdata.R17_VSMOW = 0.00037 # new value
# redefine R17_VPDB for consistency:
D4xdata.R17_VPDB = D4xdata.R17_VSMOW * (D4xdata.R18_VPDB/D4xdata.R18_VSMOW) ** D4xdata.LAMBDA_17
# edit Nominal_D47 to only include ETH-1/2/3:
D47data.Nominal_D4x = {
a: D47data.Nominal_D4x[a]
for a in ['ETH-1', 'ETH-2', 'ETH-3']
}
# redefine ETH-3:
D47data.Nominal_D4x['ETH-3'] = 0.600
# only now create D47data object:
mydata = D47data()
# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)
# NB: mydata.Nominal_D47 is just an alias for mydata.Nominal_D4x
# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}
Option 2: by redefining `R17_VSMOW` and `Nominal_D47` _after_ creating a `D47data` object:
from D47crunch import D47data
# first create D47data object:
mydata = D47data()
# redefine R17_VSMOW:
mydata.R17_VSMOW = 0.00037 # new value
# redefine R17_VPDB for consistency:
mydata.R17_VPDB = mydata.R17_VSMOW * (mydata.R18_VPDB/mydata.R18_VSMOW) ** mydata.LAMBDA_17
# edit Nominal_D47 to only include ETH-1/2/3:
mydata.Nominal_D47 = {
a: mydata.Nominal_D47[a]
for a in ['ETH-1', 'ETH-2', 'ETH-3']
}
# redefine ETH-3:
mydata.Nominal_D47['ETH-3'] = 0.600
# check the results:
print(mydata.R17_VSMOW, mydata.R17_VPDB)
print(mydata.Nominal_D47)
# should print out:
# 0.00037 0.00037599710894149464
# {'ETH-1': 0.2052, 'ETH-2': 0.2085, 'ETH-3': 0.6}
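The `R17_VPDB` value printed above follows directly from the mass-dependent relation used in both options. Assuming the default constants of Daëron (2021) (`R18_VSMOW` = 0.0020052, `R18_VPDB` = 1.03092 × `R18_VSMOW`, `LAMBDA_17` = 0.528; check the attributes of your own instance), the arithmetic can be verified by hand:

```python
# Verify the printed R17_VPDB (~0.000375997), using the assumed default
# constants (inspect mydata.R18_VSMOW etc. to confirm on your setup):
R18_VSMOW = 0.0020052
R18_VPDB = 1.03092 * R18_VSMOW   # i.e. R18_VPDB / R18_VSMOW = 1.03092
LAMBDA_17 = 0.528
R17_VSMOW = 0.00037              # the custom value set above
R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17
print(f'{R17_VPDB:.9f}')         # ~0.000375997
```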
The two options above are equivalent, but the latter provides a simple way to compare different data processing choices:
from D47crunch import D47data
# create two D47data objects:
foo = D47data()
bar = D47data()
# modify foo in various ways:
foo.LAMBDA_17 = 0.52
foo.R17_VSMOW = 0.00037 # new value
foo.R17_VPDB = foo.R17_VSMOW * (foo.R18_VPDB/foo.R18_VSMOW) ** foo.LAMBDA_17
foo.Nominal_D47 = {
'ETH-1': foo.Nominal_D47['ETH-1'],
'ETH-2': foo.Nominal_D47['ETH-2'],
'IAEA-C2': foo.Nominal_D47['IAEA-C2'],
'INLAB_REF_MATERIAL': 0.666,
}
# now import the same raw data into foo and bar:
foo.read('rawdata.csv')
foo.wg() # compute δ13C, δ18O of working gas
foo.crunch() # compute all δ13C, δ18O and raw Δ47 values
foo.standardize() # compute absolute Δ47 values
bar.read('rawdata.csv')
bar.wg() # compute δ13C, δ18O of working gas
bar.crunch() # compute all δ13C, δ18O and raw Δ47 values
bar.standardize() # compute absolute Δ47 values
# and compare the final results:
foo.table_of_samples(verbose = True, save_to_file = False)
bar.table_of_samples(verbose = True, save_to_file = False)
2.4 Process paired Δ47 and Δ48 values
Purely in terms of data processing, it is not obvious why Δ47 and Δ48 data should not be handled separately. For now, `D47crunch` uses two independent classes, `D47data` and `D48data`, which crunch numbers and deal with standardization in very similar ways. The following example demonstrates how to print out combined outputs for `D47data` and `D48data`.
from D47crunch import *
# generate virtual data:
args = dict(
samples = [
dict(Sample = 'ETH-1', N = 3),
dict(Sample = 'ETH-2', N = 3),
dict(Sample = 'ETH-3', N = 3),
dict(Sample = 'FOO', N = 3,
d13C_VPDB = -5., d18O_VPDB = -10.,
D47 = 0.3, D48 = 0.15),
], rD47 = 0.010, rD48 = 0.030)
session1 = virtual_data(session = 'Session_01', **args)
session2 = virtual_data(session = 'Session_02', **args)
# create D47data instance:
data47 = D47data(session1 + session2)
# process D47data instance:
data47.crunch()
data47.standardize()
# create D48data instance:
data48 = D48data(data47) # alternatively: data48 = D48data(session1 + session2)
# process D48data instance:
data48.crunch()
data48.standardize()
# output combined results:
table_of_sessions(data47, data48)
table_of_samples(data47, data48)
table_of_analyses(data47, data48)
Expected output:
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––––– –––––––––––––– –––––– ––––––––––––– ––––––––––––––– ––––––––––––––
Session Na Nu d13Cwg_VPDB d18Owg_VSMOW r_d13C r_d18O r_D47 a_47 ± SE 1e3 x b_47 ± SE c_47 ± SE r_D48 a_48 ± SE 1e3 x b_48 ± SE c_48 ± SE
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––––– –––––––––––––– –––––– ––––––––––––– ––––––––––––––– ––––––––––––––
Session_01 9 3 -4.000 26.000 0.0000 0.0000 0.0098 1.021 ± 0.019 -0.398 ± 0.260 -0.903 ± 0.006 0.0486 0.540 ± 0.151 1.235 ± 0.607 -0.390 ± 0.025
Session_02 9 3 -4.000 26.000 0.0000 0.0000 0.0090 1.015 ± 0.019 0.376 ± 0.260 -0.905 ± 0.006 0.0186 1.350 ± 0.156 -0.871 ± 0.608 -0.504 ± 0.027
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––––– –––––––––––––– –––––– ––––––––––––– ––––––––––––––– ––––––––––––––
–––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– –––––––– –––––– –––––– –––––––– –––––– ––––––––
Sample N d13C_VPDB d18O_VSMOW D47 SE 95% CL SD p_Levene D48 SE 95% CL SD p_Levene
–––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– –––––––– –––––– –––––– –––––––– –––––– ––––––––
ETH-1 6 2.02 37.02 0.2052 0.0078 0.1380 0.0223
ETH-2 6 -10.17 19.88 0.2085 0.0036 0.1380 0.0482
ETH-3 6 1.71 37.45 0.6132 0.0080 0.2700 0.0176
FOO 6 -5.00 28.91 0.3026 0.0044 ± 0.0093 0.0121 0.164 0.1397 0.0121 ± 0.0255 0.0267 0.127
–––––– – ––––––––– –––––––––– –––––– –––––– –––––––– –––––– –––––––– –––––– –––––– –––––––– –––––– ––––––––
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– –––––––– ––––––––
UID Session Sample d13Cwg_VPDB d18Owg_VSMOW d45 d46 d47 d48 d49 d13C_VPDB d18O_VSMOW D47raw D48raw D49raw D47 D48
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– –––––––– ––––––––
1 Session_01 ETH-1 -4.000 26.000 6.018962 10.747026 16.120787 21.286237 27.780042 2.020000 37.024281 -0.708176 -0.316435 -0.000013 0.197297 0.087763
2 Session_01 ETH-1 -4.000 26.000 6.018962 10.747026 16.132240 21.307795 27.780042 2.020000 37.024281 -0.696913 -0.295333 -0.000013 0.208328 0.126791
3 Session_01 ETH-1 -4.000 26.000 6.018962 10.747026 16.132438 21.313884 27.780042 2.020000 37.024281 -0.696718 -0.289374 -0.000013 0.208519 0.137813
4 Session_01 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.700300 -12.210735 -18.023381 -10.170000 19.875825 -0.683938 -0.297902 -0.000002 0.209785 0.198705
5 Session_01 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.707421 -12.270781 -18.023381 -10.170000 19.875825 -0.691145 -0.358673 -0.000002 0.202726 0.086308
6 Session_01 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.700061 -12.278310 -18.023381 -10.170000 19.875825 -0.683696 -0.366292 -0.000002 0.210022 0.072215
7 Session_01 ETH-3 -4.000 26.000 5.742374 11.161270 16.684379 22.225827 28.306614 1.710000 37.450394 -0.273094 -0.216392 -0.000014 0.623472 0.270873
8 Session_01 ETH-3 -4.000 26.000 5.742374 11.161270 16.660163 22.233729 28.306614 1.710000 37.450394 -0.296906 -0.208664 -0.000014 0.600150 0.285167
9 Session_01 ETH-3 -4.000 26.000 5.742374 11.161270 16.675191 22.215632 28.306614 1.710000 37.450394 -0.282128 -0.226363 -0.000014 0.614623 0.252432
10 Session_01 FOO -4.000 26.000 -0.840413 2.828738 1.328380 5.374933 4.665655 -5.000000 28.907344 -0.582131 -0.288924 -0.000006 0.314928 0.175105
11 Session_01 FOO -4.000 26.000 -0.840413 2.828738 1.302220 5.384454 4.665655 -5.000000 28.907344 -0.608241 -0.279457 -0.000006 0.289356 0.192614
12 Session_01 FOO -4.000 26.000 -0.840413 2.828738 1.322530 5.372841 4.665655 -5.000000 28.907344 -0.587970 -0.291004 -0.000006 0.309209 0.171257
13 Session_02 ETH-1 -4.000 26.000 6.018962 10.747026 16.140853 21.267202 27.780042 2.020000 37.024281 -0.688442 -0.335067 -0.000013 0.207730 0.138730
14 Session_02 ETH-1 -4.000 26.000 6.018962 10.747026 16.127087 21.256983 27.780042 2.020000 37.024281 -0.701980 -0.345071 -0.000013 0.194396 0.131311
15 Session_02 ETH-1 -4.000 26.000 6.018962 10.747026 16.148253 21.287779 27.780042 2.020000 37.024281 -0.681165 -0.314926 -0.000013 0.214898 0.153668
16 Session_02 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.715859 -12.204791 -18.023381 -10.170000 19.875825 -0.699685 -0.291887 -0.000002 0.207349 0.149128
17 Session_02 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.709763 -12.188685 -18.023381 -10.170000 19.875825 -0.693516 -0.275587 -0.000002 0.213426 0.161217
18 Session_02 ETH-2 -4.000 26.000 -5.995859 -5.976076 -12.715427 -12.253049 -18.023381 -10.170000 19.875825 -0.699249 -0.340727 -0.000002 0.207780 0.112907
19 Session_02 ETH-3 -4.000 26.000 5.742374 11.161270 16.685994 22.249463 28.306614 1.710000 37.450394 -0.271506 -0.193275 -0.000014 0.618328 0.244431
20 Session_02 ETH-3 -4.000 26.000 5.742374 11.161270 16.681351 22.298166 28.306614 1.710000 37.450394 -0.276071 -0.145641 -0.000014 0.613831 0.279758
21 Session_02 ETH-3 -4.000 26.000 5.742374 11.161270 16.676169 22.306848 28.306614 1.710000 37.450394 -0.281167 -0.137150 -0.000014 0.608813 0.286056
22 Session_02 FOO -4.000 26.000 -0.840413 2.828738 1.324359 5.339497 4.665655 -5.000000 28.907344 -0.586144 -0.324160 -0.000006 0.314015 0.136535
23 Session_02 FOO -4.000 26.000 -0.840413 2.828738 1.297658 5.325854 4.665655 -5.000000 28.907344 -0.612794 -0.337727 -0.000006 0.287767 0.126473
24 Session_02 FOO -4.000 26.000 -0.840413 2.828738 1.310185 5.339898 4.665655 -5.000000 28.907344 -0.600291 -0.323761 -0.000006 0.300082 0.136830
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– –––––––– ––––––––
3. Command-Line Interface (CLI)
Instead of writing Python code, you may directly use the CLI to process raw Δ47 and Δ48 data using reasonable defaults. The simplest way is to call:
D47crunch rawdata.csv
This will create a directory named `output` and populate it by calling the following methods:
- `D47data.wg()`
- `D47data.crunch()`
- `D47data.standardize()`
- `D47data.summary()`
- `D47data.table_of_samples()`
- `D47data.table_of_sessions()`
- `D47data.plot_sessions()`
- `D47data.plot_residuals()`
- `D47data.table_of_analyses()`
- `D47data.plot_distribution_of_analyses()`
- `D47data.plot_bulk_compositions()`
- `D47data.save_D47_correl()`
You may specify a custom set of anchors instead of the default ones using the `--anchors` or `-a` option:
D47crunch -a anchors.csv rawdata.csv
In this case, the `anchors.csv` file (you may use any other file name) must have the following format:
Sample, d13C_VPDB, d18O_VPDB, D47
ETH-1, 2.02, -2.19, 0.2052
ETH-2, -10.17, -18.69, 0.2085
ETH-3, 1.71, -1.78, 0.6132
ETH-4, , , 0.4511
The samples with non-empty `d13C_VPDB`, `d18O_VPDB`, and `D47` values are used to standardize δ13C, δ18O, and Δ47 values, respectively.
You may also provide a list of analyses and/or samples to exclude from the input. This is done with the `--exclude` or `-e` option:
D47crunch -e badbatch.csv rawdata.csv
In this case, the `badbatch.csv` file (again, you may use a different file name) must have the following format:
UID, Sample
A03
A09
B06
, MYBADSAMPLE-1
, MYBADSAMPLE-2
This will exclude (ignore) the analyses with UIDs `A03`, `A09`, and `B06`, as well as all analyses of samples `MYBADSAMPLE-1` and `MYBADSAMPLE-2`. An exclude file may have only the `UID` column, only the `Sample` column, or both, in any order.
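For reference, the effect of an exclude file can be reproduced by filtering the raw data yourself before reading it in. The sketch below uses the standard `csv` module with a shortened, inline copy of the raw data (column names as in `rawdata.csv` above):

```python
import csv, io

# inline stand-in for rawdata.csv, truncated for brevity:
rawdata = '''UID,Sample,d45
A01,ETH-1,5.79502
A03,ETH-2,-6.05868
A04,MYBADSAMPLE-1,1.00000
'''
bad_uids = {'A03', 'A09', 'B06'}
bad_samples = {'MYBADSAMPLE-1', 'MYBADSAMPLE-2'}
kept = [row for row in csv.DictReader(io.StringIO(rawdata))
        if row['UID'] not in bad_uids and row['Sample'] not in bad_samples]
print([row['UID'] for row in kept])  # ['A01']
```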
The `--output-dir` or `-o` option may be used to specify a custom directory name for the output. For example, in unix-like shells the following command will create a time-stamped output directory:

D47crunch -o `date "+%Y-%m-%d-%Hh%M"` rawdata.csv
To process Δ48 as well as Δ47, just add the `--D48` option.
4. API Documentation
1''' 2Standardization and analytical error propagation of Δ47 and Δ48 clumped-isotope measurements 3 4Process and standardize carbonate and/or CO2 clumped-isotope analyses, 5from low-level data out of a dual-inlet mass spectrometer to final, “absolute” 6Δ47, Δ48 and Δ49 values with fully propagated analytical error estimates 7([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). 8 9The **tutorial** section takes you through a series of simple steps to import/process data and print out the results. 10The **how-to** section provides instructions applicable to various specific tasks. 11 12.. include:: ../docs/tutorial.md 13.. include:: ../docs/howto.md 14.. include:: ../docs/cli.md 15 16# 4. API Documentation 17''' 18 19__docformat__ = "restructuredtext" 20__author__ = 'Mathieu Daëron' 21__contact__ = 'daeron@lsce.ipsl.fr' 22__copyright__ = 'Copyright (c) 2023 Mathieu Daëron' 23__license__ = 'Modified BSD License - https://opensource.org/licenses/BSD-3-Clause' 24__date__ = '2023-10-04' 25__version__ = '2.4.0' 26 27import os 28import numpy as np 29import typer 30from typing_extensions import Annotated 31from statistics import stdev 32from scipy.stats import t as tstudent 33from scipy.stats import levene 34from scipy.interpolate import interp1d 35from numpy import linalg 36from lmfit import Minimizer, Parameters, report_fit 37from matplotlib import pyplot as ppl 38from datetime import datetime as dt 39from functools import wraps 40from colorsys import hls_to_rgb 41from matplotlib import rcParams 42 43typer.rich_utils.STYLE_HELPTEXT = '' 44 45rcParams['font.family'] = 'sans-serif' 46rcParams['font.sans-serif'] = 'Helvetica' 47rcParams['font.size'] = 10 48rcParams['mathtext.fontset'] = 'custom' 49rcParams['mathtext.rm'] = 'sans' 50rcParams['mathtext.bf'] = 'sans:bold' 51rcParams['mathtext.it'] = 'sans:italic' 52rcParams['mathtext.cal'] = 'sans:italic' 53rcParams['mathtext.default'] = 'rm' 54rcParams['xtick.major.size'] = 4 55rcParams['xtick.major.width'] = 1 
56rcParams['ytick.major.size'] = 4 57rcParams['ytick.major.width'] = 1 58rcParams['axes.grid'] = False 59rcParams['axes.linewidth'] = 1 60rcParams['grid.linewidth'] = .75 61rcParams['grid.linestyle'] = '-' 62rcParams['grid.alpha'] = .15 63rcParams['savefig.dpi'] = 150 64 65Petersen_etal_CO2eqD47 = np.array([[-12, 1.147113572], [-11, 1.139961218], [-10, 1.132872856], [-9, 1.125847677], [-8, 1.118884889], [-7, 1.111983708], [-6, 1.105143366], [-5, 1.098363105], [-4, 1.091642182], [-3, 1.084979862], [-2, 1.078375423], [-1, 1.071828156], [0, 1.065337360], [1, 1.058902349], [2, 1.052522443], [3, 1.046196976], [4, 1.039925291], [5, 1.033706741], [6, 1.027540690], [7, 1.021426510], [8, 1.015363585], [9, 1.009351306], [10, 1.003389075], [11, 0.997476303], [12, 0.991612409], [13, 0.985796821], [14, 0.980028975], [15, 0.974308318], [16, 0.968634304], [17, 0.963006392], [18, 0.957424055], [19, 0.951886769], [20, 0.946394020], [21, 0.940945302], [22, 0.935540114], [23, 0.930177964], [24, 0.924858369], [25, 0.919580851], [26, 0.914344938], [27, 0.909150167], [28, 0.903996080], [29, 0.898882228], [30, 0.893808167], [31, 0.888773459], [32, 0.883777672], [33, 0.878820382], [34, 0.873901170], [35, 0.869019623], [36, 0.864175334], [37, 0.859367901], [38, 0.854596929], [39, 0.849862028], [40, 0.845162813], [41, 0.840498905], [42, 0.835869931], [43, 0.831275522], [44, 0.826715314], [45, 0.822188950], [46, 0.817696075], [47, 0.813236341], [48, 0.808809404], [49, 0.804414926], [50, 0.800052572], [51, 0.795722012], [52, 0.791422922], [53, 0.787154979], [54, 0.782917869], [55, 0.778711277], [56, 0.774534898], [57, 0.770388426], [58, 0.766271562], [59, 0.762184010], [60, 0.758125479], [61, 0.754095680], [62, 0.750094329], [63, 0.746121147], [64, 0.742175856], [65, 0.738258184], [66, 0.734367860], [67, 0.730504620], [68, 0.726668201], [69, 0.722858343], [70, 0.719074792], [71, 0.715317295], [72, 0.711585602], [73, 0.707879469], [74, 0.704198652], [75, 0.700542912], [76, 0.696912012], [77, 
0.693305719], [78, 0.689723802], [79, 0.686166034], [80, 0.682632189], [81, 0.679122047], [82, 0.675635387], [83, 0.672171994], [84, 0.668731654], [85, 0.665314156], [86, 0.661919291], [87, 0.658546854], [88, 0.655196641], [89, 0.651868451], [90, 0.648562087], [91, 0.645277352], [92, 0.642014054], [93, 0.638771999], [94, 0.635551001], [95, 0.632350872], [96, 0.629171428], [97, 0.626012487], [98, 0.622873870], [99, 0.619755397], [100, 0.616656895], [102, 0.610519107], [104, 0.604459143], [106, 0.598475670], [108, 0.592567388], [110, 0.586733026], [112, 0.580971342], [114, 0.575281125], [116, 0.569661187], [118, 0.564110371], [120, 0.558627545], [122, 0.553211600], [124, 0.547861454], [126, 0.542576048], [128, 0.537354347], [130, 0.532195337], [132, 0.527098028], [134, 0.522061450], [136, 0.517084654], [138, 0.512166711], [140, 0.507306712], [142, 0.502503768], [144, 0.497757006], [146, 0.493065573], [148, 0.488428634], [150, 0.483845370], [152, 0.479314980], [154, 0.474836677], [156, 0.470409692], [158, 0.466033271], [160, 0.461706674], [162, 0.457429176], [164, 0.453200067], [166, 0.449018650], [168, 0.444884242], [170, 0.440796174], [172, 0.436753787], [174, 0.432756438], [176, 0.428803494], [178, 0.424894334], [180, 0.421028350], [182, 0.417204944], [184, 0.413423530], [186, 0.409683531], [188, 0.405984383], [190, 0.402325531], [192, 0.398706429], [194, 0.395126543], [196, 0.391585347], [198, 0.388082324], [200, 0.384616967], [202, 0.381188778], [204, 0.377797268], [206, 0.374441954], [208, 0.371122364], [210, 0.367838033], [212, 0.364588505], [214, 0.361373329], [216, 0.358192065], [218, 0.355044277], [220, 0.351929540], [222, 0.348847432], [224, 0.345797540], [226, 0.342779460], [228, 0.339792789], [230, 0.336837136], [232, 0.333912113], [234, 0.331017339], [236, 0.328152439], [238, 0.325317046], [240, 0.322510795], [242, 0.319733329], [244, 0.316984297], [246, 0.314263352], [248, 0.311570153], [250, 0.308904364], [252, 0.306265654], [254, 0.303653699], [256, 
0.301068176], [258, 0.298508771], [260, 0.295975171], [262, 0.293467070], [264, 0.290984167], [266, 0.288526163], [268, 0.286092765], [270, 0.283683684], [272, 0.281298636], [274, 0.278937339], [276, 0.276599517], [278, 0.274284898], [280, 0.271993211], [282, 0.269724193], [284, 0.267477582], [286, 0.265253121], [288, 0.263050554], [290, 0.260869633], [292, 0.258710110], [294, 0.256571741], [296, 0.254454286], [298, 0.252357508], [300, 0.250281174], [302, 0.248225053], [304, 0.246188917], [306, 0.244172542], [308, 0.242175707], [310, 0.240198194], [312, 0.238239786], [314, 0.236300272], [316, 0.234379441], [318, 0.232477087], [320, 0.230593005], [322, 0.228726993], [324, 0.226878853], [326, 0.225048388], [328, 0.223235405], [330, 0.221439711], [332, 0.219661118], [334, 0.217899439], [336, 0.216154491], [338, 0.214426091], [340, 0.212714060], [342, 0.211018220], [344, 0.209338398], [346, 0.207674420], [348, 0.206026115], [350, 0.204393315], [355, 0.200378063], [360, 0.196456139], [365, 0.192625077], [370, 0.188882487], [375, 0.185226048], [380, 0.181653511], [385, 0.178162694], [390, 0.174751478], [395, 0.171417807], [400, 0.168159686], [405, 0.164975177], [410, 0.161862398], [415, 0.158819521], [420, 0.155844772], [425, 0.152936426], [430, 0.150092806], [435, 0.147312286], [440, 0.144593281], [445, 0.141934254], [450, 0.139333710], [455, 0.136790195], [460, 0.134302294], [465, 0.131868634], [470, 0.129487876], [475, 0.127158722], [480, 0.124879906], [485, 0.122650197], [490, 0.120468398], [495, 0.118333345], [500, 0.116243903], [505, 0.114198970], [510, 0.112197471], [515, 0.110238362], [520, 0.108320625], [525, 0.106443271], [530, 0.104605335], [535, 0.102805877], [540, 0.101043985], [545, 0.099318768], [550, 0.097629359], [555, 0.095974915], [560, 0.094354612], [565, 0.092767650], [570, 0.091213248], [575, 0.089690648], [580, 0.088199108], [585, 0.086737906], [590, 0.085306341], [595, 0.083903726], [600, 0.082529395], [605, 0.081182697], [610, 0.079862998], [615, 
0.078569680], [620, 0.077302141], [625, 0.076059794], [630, 0.074842066], [635, 0.073648400], [640, 0.072478251], [645, 0.071331090], [650, 0.070206399], [655, 0.069103674], [660, 0.068022424], [665, 0.066962168], [670, 0.065922439], [675, 0.064902780], [680, 0.063902748], [685, 0.062921909], [690, 0.061959837], [695, 0.061016122], [700, 0.060090360], [705, 0.059182157], [710, 0.058291131], [715, 0.057416907], [720, 0.056559120], [725, 0.055717414], [730, 0.054891440], [735, 0.054080860], [740, 0.053285343], [745, 0.052504565], [750, 0.051738210], [755, 0.050985971], [760, 0.050247546], [765, 0.049522643], [770, 0.048810974], [775, 0.048112260], [780, 0.047426227], [785, 0.046752609], [790, 0.046091145], [795, 0.045441581], [800, 0.044803668], [805, 0.044177164], [810, 0.043561831], [815, 0.042957438], [820, 0.042363759], [825, 0.041780573], [830, 0.041207664], [835, 0.040644822], [840, 0.040091839], [845, 0.039548516], [850, 0.039014654], [855, 0.038490063], [860, 0.037974554], [865, 0.037467944], [870, 0.036970054], [875, 0.036480707], [880, 0.035999734], [885, 0.035526965], [890, 0.035062238], [895, 0.034605393], [900, 0.034156272], [905, 0.033714724], [910, 0.033280598], [915, 0.032853749], [920, 0.032434032], [925, 0.032021309], [930, 0.031615443], [935, 0.031216300], [940, 0.030823749], [945, 0.030437663], [950, 0.030057915], [955, 0.029684385], [960, 0.029316951], [965, 0.028955498], [970, 0.028599910], [975, 0.028250075], [980, 0.027905884], [985, 0.027567229], [990, 0.027234006], [995, 0.026906112], [1000, 0.026583445], [1005, 0.026265908], [1010, 0.025953405], [1015, 0.025645841], [1020, 0.025343124], [1025, 0.025045163], [1030, 0.024751871], [1035, 0.024463160], [1040, 0.024178947], [1045, 0.023899147], [1050, 0.023623680], [1055, 0.023352467], [1060, 0.023085429], [1065, 0.022822491], [1070, 0.022563577], [1075, 0.022308615], [1080, 0.022057533], [1085, 0.021810260], [1090, 0.021566729], [1095, 0.021326872], [1100, 0.021090622]])
_fCO2eqD47_Petersen = interp1d(Petersen_etal_CO2eqD47[:,0], Petersen_etal_CO2eqD47[:,1])

def fCO2eqD47_Petersen(T):
	'''
	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
	according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
	'''
	return float(_fCO2eqD47_Petersen(T))


Wang_etal_CO2eqD47 = np.array([[-83., 1.8954], [-73., 1.7530], [-63., 1.6261], [-53., 1.5126], [-43., 1.4104], [-33., 1.3182], [-23., 1.2345], [-13., 1.1584], [-3., 1.0888], [7., 1.0251], [17., 0.9665], [27., 0.9125], [37., 0.8626], [47., 0.8164], [57., 0.7734], [67., 0.7334], [87., 0.6612], [97., 0.6286], [107., 0.5980], [117., 0.5693], [127., 0.5423], [137., 0.5169], [147., 0.4930], [157., 0.4704], [167., 0.4491], [177., 0.4289], [187., 0.4098], [197., 0.3918], [207., 0.3747], [217., 0.3585], [227., 0.3431], [237., 0.3285], [247., 0.3147], [257., 0.3015], [267., 0.2890], [277., 0.2771], [287., 0.2657], [297., 0.2550], [307., 0.2447], [317., 0.2349], [327., 0.2256], [337., 0.2167], [347., 0.2083], [357., 0.2002], [367., 0.1925], [377., 0.1851], [387., 0.1781], [397., 0.1714], [407., 0.1650], [417., 0.1589], [427., 0.1530], [437., 0.1474], [447., 0.1421], [457., 0.1370], [467., 0.1321], [477., 0.1274], [487., 0.1229], [497., 0.1186], [507., 0.1145], [517., 0.1105], [527., 0.1068], [537., 0.1031], [547., 0.0997], [557., 0.0963], [567., 0.0931], [577., 0.0901], [587., 0.0871], [597., 0.0843], [607., 0.0816], [617., 0.0790], [627., 0.0765], [637., 0.0741], [647., 0.0718], [657., 0.0695], [667., 0.0674], [677., 0.0654], [687., 0.0634], [697., 0.0615], [707., 0.0597], [717., 0.0579], [727., 0.0562], [737., 0.0546], [747., 0.0530], [757., 0.0515], [767., 0.0500], [777., 0.0486], [787., 0.0472], [797., 0.0459], [807., 0.0447], [817., 0.0435], [827., 0.0423], [837., 0.0411], [847., 0.0400], [857., 0.0390], [867., 0.0380], [877., 0.0370], [887., 0.0360], [897., 0.0351], [907., 0.0342], [917., 0.0333], [927., 0.0325], [937., 0.0317], [947., 0.0309], [957., 0.0302], [967., 0.0294], [977., 0.0287], [987., 0.0281], [997., 0.0274], [1007., 0.0268], [1017., 0.0261], [1027., 0.0255], [1037., 0.0249], [1047., 0.0244], [1057., 0.0238], [1067., 0.0233], [1077., 0.0228], [1087., 0.0223], [1097., 0.0218]])
_fCO2eqD47_Wang = interp1d(Wang_etal_CO2eqD47[:,0] - 0.15, Wang_etal_CO2eqD47[:,1])

def fCO2eqD47_Wang(T):
	'''
	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
	according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
	(supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
	'''
	return float(_fCO2eqD47_Wang(T))


def correlated_sum(X, C, w = None):
	'''
	Compute covariance-aware linear combinations

	**Parameters**

	+ `X`: list or 1-D array of values to sum
	+ `C`: covariance matrix for the elements of `X`
	+ `w`: list or 1-D array of weights to apply to the elements of `X`
	  (all equal to 1 by default)

	Return the sum (and its SE) of the elements of `X`, with optional weights equal
	to the elements of `w`, accounting for covariances between the elements of `X`.
	'''
	if w is None:
		w = [1 for x in X]
	return np.dot(w, X), (np.dot(w, np.dot(C, w)))**.5


def make_csv(x, hsep = ',', vsep = '\n'):
	'''
	Formats a list of lists of strings as a CSV

	**Parameters**

	+ `x`: the list of lists of strings to format
	+ `hsep`: the field separator (`,` by default)
	+ `vsep`: the line-ending convention to use (`\\n` by default)

	**Example**

	```py
	print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
	```

	outputs:

	```py
	a,b,c
	d,e,f
	```
	'''
	return vsep.join([hsep.join(l) for l in x])


def pf(txt):
	'''
	Modify string `txt` to follow `lmfit.Parameter()` naming rules.
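As a quick illustration of what `correlated_sum()` computes, the following self-contained sketch (the values and covariance matrix are made up for illustration) spells out the same formula:

```python
import numpy as np

# Hypothetical example: sum two correlated values.
X = [0.5, 0.3]                    # values to sum
C = np.array([[0.010, 0.005],
              [0.005, 0.040]])    # their covariance matrix
w = [1, 1]                        # default weights

total = np.dot(w, X)              # 0.8
SE = np.dot(w, np.dot(C, w))**0.5 # sqrt(0.010 + 0.040 + 2 * 0.005)
```

Note that the off-diagonal covariance terms contribute to the SE: ignoring them would yield `sqrt(0.05)` instead of `sqrt(0.06)`. Calling `correlated_sum(X, C)` on the same inputs should return this same `(total, SE)` pair.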
	'''
	return txt.replace('-','_').replace('.','_').replace(' ','_')


def smart_type(x):
	'''
	Tries to convert string `x` to a float if it includes a decimal point, or
	to an integer if it does not. If both attempts fail, return the original
	string unchanged.
	'''
	try:
		y = float(x)
	except ValueError:
		return x
	if '.' not in x:
		return int(y)
	return y


def pretty_table(x, header = 1, hsep = ' ', vsep = '–', align = '<'):
	'''
	Reads a list of lists of strings and outputs an ascii table

	**Parameters**

	+ `x`: a list of lists of strings
	+ `header`: the number of lines to treat as header lines
	+ `hsep`: the horizontal separator between columns
	+ `vsep`: the character to use as vertical separator
	+ `align`: string of left (`<`) or right (`>`) alignment characters.

	**Example**

	```py
	x = [['A', 'B', 'C'], ['1', '1.9999', 'foo'], ['10', 'x', 'bar']]
	print(pretty_table(x))
	```
	yields:
	```
	–– –––––– –––
	A       B   C
	–– –––––– –––
	1  1.9999 foo
	10      x bar
	–– –––––– –––
	```
	'''
	txt = []
	widths = [np.max([len(e) for e in c]) for c in zip(*x)]

	if len(widths) > len(align):
		align += '>' * (len(widths) - len(align))
	sepline = hsep.join([vsep*w for w in widths])
	txt += [sepline]
	for k,l in enumerate(x):
		if k and k == header:
			txt += [sepline]
		txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
	txt += [sepline]
	txt += ['']
	return '\n'.join(txt)


def transpose_table(x):
	'''
	Transpose a list of lists

	**Parameters**

	+ `x`: a list of lists

	**Example**

	```py
	x = [[1, 2], [3, 4]]
	print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
	```
	'''
	return [[e for e in c] for c in zip(*x)]


def w_avg(X, sX):
	'''
	Compute variance-weighted average

	Returns the value and
	SE of the weighted average of the elements of `X`,
	with relative weights equal to their inverse variances (`1/sX**2`).

	**Parameters**

	+ `X`: array-like of elements to average
	+ `sX`: array-like of the corresponding SE values

	**Tip**

	If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
	they may be rearranged using `zip()`:

	```python
	foo = [(0, 1), (1, 0.5), (2, 0.5)]
	print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
	```
	'''
	X = [x for x in X]
	sX = [sx for sx in sX]
	W = [sx**-2 for sx in sX]
	W = [w/sum(W) for w in W]
	Xavg = sum([w*x for w,x in zip(W,X)])
	sXavg = sum([w**2*sx**2 for w,sx in zip(W,sX)])**.5
	return Xavg, sXavg


def read_csv(filename, sep = ''):
	'''
	Read contents of `filename` in csv format and return a list of dictionaries.

	In the csv string, spaces before and after field separators (`','` by default)
	are optional.

	**Parameters**

	+ `filename`: the csv file to read
	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
	  whichever appears most often in the contents of `filename`.
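Two details of `read_csv()` are worth illustrating: each field is coerced with `smart_type()` (reproduced below from its definition above), and when `sep` is left empty the separator is guessed by frequency. A small self-contained sketch:

```python
def smart_type(x):
    # reproduced from above: int if no decimal point, float otherwise,
    # unchanged string if conversion fails
    try:
        y = float(x)
    except ValueError:
        return x
    if '.' not in x:
        return int(y)
    return y

print(smart_type('12'))     # -> 12 (int)
print(smart_type('12.5'))   # -> 12.5 (float)
print(smart_type('ETH-1'))  # -> ETH-1 (unchanged string)

# Separator guessing: pick whichever of ',', ';', '\t' appears most often.
txt = 'UID;Sample;d45\nA01;ETH-1;5.795'
sep = sorted(',;\t', key = lambda c: -txt.count(c))[0]
print(sep)                  # -> ;
```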
	'''
	with open(filename) as fid:
		txt = fid.read()

	if sep == '':
		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
	return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]]


def simulate_single_analysis(
	sample = 'MYSAMPLE',
	d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
	d13C_VPDB = None, d18O_VPDB = None,
	D47 = None, D48 = None, D49 = 0., D17O = 0.,
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	Nominal_D47 = None,
	Nominal_D48 = None,
	Nominal_d13C_VPDB = None,
	Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	):
	'''
	Compute working-gas delta values for a single analysis, assuming a stochastic working
	gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

	**Parameters**

	+ `sample`: sample name
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
	  (respectively –4 and +26 ‰ by default)
	+ `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
	+ `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
	  of the carbonate sample
	+ `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and
	  Δ48 values if `D47` or `D48` are not specified
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
	  δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
	  correction parameters (by default equal to the `D4xdata` default values)

	Returns a dictionary with fields
	`['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
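The working-gas deltas returned here follow the usual definition δ = 1000 × (R/R_wg − 1). A minimal numeric sketch (the isobar ratios below are made up for illustration):

```python
# Hypothetical isobar ratios for the sample and the working gas:
R45, R45wg = 0.0123456, 0.0123000

# Working-gas delta for mass 45, in permil:
d45 = 1000 * (R45 / R45wg - 1)
print(round(d45, 4))   # -> 3.7073
```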
	'''

	if Nominal_d13C_VPDB is None:
		Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB

	if Nominal_d18O_VPDB is None:
		Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB

	if ALPHA_18O_ACID_REACTION is None:
		ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION

	if R13_VPDB is None:
		R13_VPDB = D4xdata().R13_VPDB

	if R17_VSMOW is None:
		R17_VSMOW = D4xdata().R17_VSMOW

	if R18_VSMOW is None:
		R18_VSMOW = D4xdata().R18_VSMOW

	if LAMBDA_17 is None:
		LAMBDA_17 = D4xdata().LAMBDA_17

	if R18_VPDB is None:
		R18_VPDB = D4xdata().R18_VPDB

	R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17

	if Nominal_D47 is None:
		Nominal_D47 = D47data().Nominal_D47

	if Nominal_D48 is None:
		Nominal_D48 = D48data().Nominal_D48

	if d13C_VPDB is None:
		if sample in Nominal_d13C_VPDB:
			d13C_VPDB = Nominal_d13C_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing a d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")

	if d18O_VPDB is None:
		if sample in Nominal_d18O_VPDB:
			d18O_VPDB = Nominal_d18O_VPDB[sample]
		else:
			raise KeyError(f"Sample {sample} is missing a d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")

	if D47 is None:
		if sample in Nominal_D47:
			D47 = Nominal_D47[sample]
		else:
			raise KeyError(f"Sample {sample} is missing a D47 value, and it is not defined in Nominal_D47.")

	if D48 is None:
		if sample in Nominal_D48:
			D48 = Nominal_D48[sample]
		else:
			raise KeyError(f"Sample {sample} is missing a D48 value, and it is not defined in Nominal_D48.")

	X = D4xdata()
	X.R13_VPDB = R13_VPDB
	X.R17_VSMOW = R17_VSMOW
	X.R18_VSMOW = R18_VSMOW
	X.LAMBDA_17 = LAMBDA_17
	X.R18_VPDB = R18_VPDB
	X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17

	R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 +
		d13Cwg_VPDB/1000),
		R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
		)
	R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O = D17O, D47 = D47, D48 = D48, D49 = D49,
		)
	R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
		R13 = R13_VPDB * (1 + d13C_VPDB/1000),
		R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
		D17O = D17O,
		)

	d45 = 1000 * (R45/R45wg - 1)
	d46 = 1000 * (R46/R46wg - 1)
	d47 = 1000 * (R47/R47wg - 1)
	d48 = 1000 * (R48/R48wg - 1)
	d49 = 1000 * (R49/R49wg - 1)

	for k in range(3): # fixed-point iteration to adjust for small changes in d47 and d48
		R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
		R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch
		d47 = 1000 * (R47raw/R47wg - 1)
		d48 = 1000 * (R48raw/R48wg - 1)

	return dict(
		Sample = sample,
		D17O = D17O,
		d13Cwg_VPDB = d13Cwg_VPDB,
		d18Owg_VSMOW = d18Owg_VSMOW,
		d45 = d45,
		d46 = d46,
		d47 = d47,
		d48 = d48,
		d49 = d49,
		)


def virtual_data(
	samples = [],
	a47 = 1., b47 = 0., c47 = -0.9,
	a48 = 1., b48 = 0., c48 = -0.45,
	rd45 = 0.020, rd46 = 0.060,
	rD47 = 0.015, rD48 = 0.045,
	d13Cwg_VPDB = None, d18Owg_VSMOW = None,
	session = None,
	Nominal_D47 = None, Nominal_D48 = None,
	Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
	ALPHA_18O_ACID_REACTION = None,
	R13_VPDB = None,
	R17_VSMOW = None,
	R18_VSMOW = None,
	LAMBDA_17 = None,
	R18_VPDB = None,
	seed = 0,
	shuffle = True,
	):
	'''
	Return a list of simulated analyses from a single session.

	**Parameters**

	+ `samples`: a list of entries; each entry is a dictionary with the following fields:
		* `Sample`: the name of the sample
		* `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
		* `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
		* `N`: how many analyses to generate for this sample
	+ `a47`: scrambling factor for Δ47
	+ `b47`: compositional nonlinearity for Δ47
	+ `c47`: working gas offset for Δ47
	+ `a48`: scrambling factor for Δ48
	+ `b48`: compositional nonlinearity for Δ48
	+ `c48`: working gas offset for Δ48
	+ `rd45`: analytical repeatability of δ45
	+ `rd46`: analytical repeatability of δ46
	+ `rD47`: analytical repeatability of Δ47
	+ `rD48`: analytical repeatability of Δ48
	+ `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
	  (by default equal to the `simulate_single_analysis` default values)
	+ `session`: name of the session (no name by default)
	+ `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and Δ48 values
	  if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
	+ `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
	  δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
	  (by default equal to the `simulate_single_analysis` defaults)
	+ `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
	  (by default equal to the `simulate_single_analysis` default)
	+ `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
	  correction parameters (by default equal to the `simulate_single_analysis` defaults)
	+ `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
	+ `shuffle`: randomly reorder the sequence of analyses

	Here is an example of using this function to generate an arbitrary combination of
	anchors and unknowns for a number
	of sessions:

	```py
	.. include:: ../code_examples/virtual_data/example.py
	```

	This should output something like:

	```
	.. include:: ../code_examples/virtual_data/output.txt
	```
	'''

	kwargs = locals().copy()

	from numpy import random as nprandom
	if seed:
		rng = nprandom.default_rng(seed)
	else:
		rng = nprandom.default_rng()

	N = sum([s['N'] for s in samples])
	errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors45 *= rd45 / stdev(errors45) # scale errors to rd45
	errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors46 *= rd46 / stdev(errors46) # scale errors to rd46
	errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors47 *= rD47 / stdev(errors47) # scale errors to rD47
	errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
	errors48 *= rD48 / stdev(errors48) # scale errors to rD48

	k = 0
	out = []
	for s in samples:
		kw = {}
		kw['sample'] = s['Sample']
		kw = {
			**kw,
			**{var: kwargs[var]
				for var in [
					'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
					'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
					'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
					'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
					]
				if kwargs[var] is not None},
			**{var: s[var]
				for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
				if var in s},
			}

		sN = s['N']
		while sN:
			out.append(simulate_single_analysis(**kw))
			out[-1]['d45'] += errors45[k]
			out[-1]['d46'] += errors46[k]
			out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
			out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
			sN -= 1
			k += 1

	if session is not None:
		for r in out:
			r['Session'] = session

	if shuffle:
		rng.shuffle(out) # use the seeded generator, so that shuffling is repeatable too

	return out


def table_of_samples(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of samples
	for a pair of `D47data` and `D48data` objects.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
	  if set to `'raw'`: return a list of lists of strings
	  (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_samples(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
			out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_samples.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n' + pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)


def table_of_sessions(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of sessions
	for a pair of `D47data` and `D48data` objects.
	***Only applicable if the sessions in `data47` and those in `data48`
	consist of the exact same sets of analyses.***

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
	  if set to `'raw'`: return a list of lists of strings
	  (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_sessions(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_sessions(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
			for k,x in enumerate(out47[0]):
				if k > 7:
					out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
					out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
			out = transpose_table(transpose_table(out47) +
				transpose_table(out48)[7:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_sessions.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n' + pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)


def table_of_analyses(
	data47 = None,
	data48 = None,
	dir = 'output',
	filename = None,
	save_to_file = True,
	print_out = True,
	output = None,
	):
	'''
	Print out, save to disk and/or return a combined table of analyses
	for a pair of `D47data` and `D48data` objects.

	If the sessions in `data47` and those in `data48` do not consist of
	the exact same sets of analyses, the table will have two columns
	`Session_47` and `Session_48` instead of a single `Session` column.

	**Parameters**

	+ `data47`: `D47data` instance
	+ `data48`: `D48data` instance
	+ `dir`: the directory in which to save the table
	+ `filename`: the name of the csv file to write to
	+ `save_to_file`: whether to save the table to disk
	+ `print_out`: whether to print out the table
	+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
	  if set to `'raw'`: return a list of lists of strings
	  (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
	'''
	if data47 is None:
		if data48 is None:
			raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
		else:
			return data48.table_of_analyses(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
	else:
		if data48 is None:
			return data47.table_of_analyses(
				dir = dir,
				filename = filename,
				save_to_file = save_to_file,
				print_out = print_out,
				output = output
				)
		else:
			out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
			out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')

			if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
				out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
			else:
				out47[0][1] = 'Session_47'
				out48[0][1] = 'Session_48'
				out47 = transpose_table(out47)
				out48 = transpose_table(out48)
				out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])

			if save_to_file:
				if not os.path.exists(dir):
					os.makedirs(dir)
				if filename is None:
					filename = 'D47D48_analyses.csv'
				with open(f'{dir}/{filename}', 'w') as fid:
					fid.write(make_csv(out))
			if print_out:
				print('\n' + pretty_table(out))
			if output == 'raw':
				return out
			elif output == 'pretty':
				return pretty_table(out)


def _fullcovar(minresult, epsilon = 0.01, named = False):
	'''
	Construct full covariance matrix in the case of constrained parameters
	'''

	import asteval

	def f(values):
		interp = asteval.Interpreter()
		for n,v in zip(minresult.var_names, values):
			interp(f'{n} = {v}')
		for q in minresult.params:
			if minresult.params[q].expr:
				interp(f'{q} = {minresult.params[q].expr}')
		return np.array([interp.symtable[q] for q in minresult.params])

	# construct Jacobian
	J = np.zeros((minresult.nvarys, len(minresult.params)))
	X = np.array([minresult.params[p].value for p in minresult.var_names])
	sX = np.array([minresult.params[p].stderr for p in minresult.var_names])

	for j in range(minresult.nvarys):
		x1 = [_ for _ in X]
		x1[j] += epsilon * sX[j]
		x2 = [_ for _ in X]
		x2[j] -= epsilon * sX[j]
		J[j,:] = (f(x1) - f(x2)) / (2 * epsilon * sX[j])

	_names = [q for q in minresult.params]
	_covar = J.T @ minresult.covar @ J
	_se = np.diag(_covar)**.5
	_correl = _covar.copy()
	for k,s in enumerate(_se):
		if s:
			_correl[k,:] /= s
			_correl[:,k] /= s

	if named:
		# index the arrays by parameter name rather than by position
		_covar = {i: {j: _covar[_names.index(i), _names.index(j)] for j in _names} for i in _names}
		_se = {i: _se[_names.index(i)] for i in _names}
		_correl = {i: {j: _correl[_names.index(i), _names.index(j)] for j in _names} for i in _names}

	return _names, _covar, _se, _correl


class D4xdata(list):
	'''
	Store and process data for a large set of Δ47 and/or Δ48
	analyses, usually comprising more than one analytical session.
	'''

	### 17O CORRECTION PARAMETERS
	R13_VPDB = 0.01118 # (Chang & Li, 1990)
	'''
	Absolute (13C/12C) ratio of VPDB.
	By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm))
	'''

	R18_VSMOW = 0.0020052 # (Baertschi, 1976)
	'''
	Absolute (18O/16O) ratio of VSMOW.
	By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1))
	'''

	LAMBDA_17 = 0.528 # (Barkan & Luz, 2005)
	'''
	Mass-dependent exponent for triple oxygen isotopes.
	By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250))
	'''

	R17_VSMOW = 0.00038475 # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB)
	'''
	Absolute (17O/16O) ratio of VSMOW.
	By default equal to 0.00038475
	([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011),
	rescaled to `R13_VPDB`)
	'''

	R18_VPDB = R18_VSMOW * 1.03092
	'''
	Absolute (18O/16O) ratio of VPDB.
	By definition equal to `R18_VSMOW * 1.03092`.
	'''

	R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17
	'''
	Absolute (17O/16O) ratio of VPDB.
	By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`.
	'''

	LEVENE_REF_SAMPLE = 'ETH-3'
	'''
	After the Δ4x standardization step, each sample is tested to
	assess whether the Δ4x variance within all analyses for that
	sample differs significantly from that observed for a given reference
	sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test),
	which yields a p-value corresponding to the null hypothesis that the
	underlying variances are equal).

	`LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which
	sample should be used as a reference for this test.
	'''

	ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6) # (Kim et al., 2007, calcite)
	'''
	Specifies the 18O/16O fractionation factor generally applicable
	to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
	`D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`.

	By default equal to 1.008129 (calcite reacted at 90 °C,
	[Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
	'''

	Nominal_d13C_VPDB = {
		'ETH-1': 2.02,
		'ETH-2': -10.17,
		'ETH-3': 1.71,
		} # (Bernasconi et al., 2018)
	'''
	Nominal δ13C_VPDB values assigned to carbonate standards, used by
	`D4xdata.standardize_d13C()`.

	By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
	'''

	Nominal_d18O_VPDB = {
		'ETH-1': -2.19,
		'ETH-2': -18.69,
		'ETH-3': -1.78,
		} # (Bernasconi et al., 2018)
	'''
	Nominal δ18O_VPDB values assigned to carbonate standards, used by
	`D4xdata.standardize_d18O()`.

	By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
	[Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
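As a consistency check, the derived constants above can be recomputed from the primary ones (a sketch using the default values listed above):

```python
import numpy as np

# Primary 17O-correction constants (module defaults):
R18_VSMOW = 0.0020052
R17_VSMOW = 0.00038475
LAMBDA_17 = 0.528

# Derived VPDB ratios:
R18_VPDB = R18_VSMOW * 1.03092                 # ~ 0.0020672
R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17    # ~ 0.00039099

# Acid fractionation factor for calcite reacted at 90 degrees C (Kim et al., 2007):
ALPHA = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6)
print(ALPHA)   # -> 1.008129
```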
	'''

	d13C_STANDARDIZATION_METHOD = '2pt'
	'''
	Method by which to standardize δ13C values:

	+ `'none'`: do not apply any δ13C standardization.
	+ `'1pt'`: within each session, offset all initial δ13C values so as to
	  minimize the difference between final δ13C_VPDB values and
	  `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
	+ `'2pt'`: within each session, apply an affine transformation to all δ13C
	  values so as to minimize the difference between final δ13C_VPDB
	  values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
	  is defined).
	'''

	d18O_STANDARDIZATION_METHOD = '2pt'
	'''
	Method by which to standardize δ18O values:

	+ `'none'`: do not apply any δ18O standardization.
	+ `'1pt'`: within each session, offset all initial δ18O values so as to
	  minimize the difference between final δ18O_VPDB values and
	  `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
	+ `'2pt'`: within each session, apply an affine transformation to all δ18O
	  values so as to minimize the difference between final δ18O_VPDB
	  values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
	  is defined).
	'''

	def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
		'''
		**Parameters**

		+ `l`: a list of dictionaries, with each dictionary including at least the keys
		  `Sample`, `d45`, `d46`, and `d47` or `d48`.
		+ `mass`: `'47'` or `'48'`
		+ `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
		+ `session`: define session name for analyses without a `Session` key
		+ `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.

		Returns a `D4xdata` object derived from `list`.
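Schematically, the `'2pt'` standardization described above amounts to fitting an affine transformation that maps observed anchor values onto their nominal values. The following is a sketch of that idea, not the library's internal code; the observed values below are made up:

```python
import numpy as np

nominal = np.array([2.02, -10.17, 1.71])         # Nominal_d13C_VPDB of ETH-1/2/3
observed = np.array([2.0902, -10.2217, 1.7771])  # hypothetical session-level observations

# Least-squares fit of nominal = a * observed + b:
A = np.vstack([observed, np.ones_like(observed)]).T
(a, b), *_ = np.linalg.lstsq(A, nominal, rcond = None)

corrected = a * observed + b   # standardized d13C_VPDB values
```

Here the made-up observations are an exact affine distortion of the nominal values, so the fit recovers them exactly; with real data the fit minimizes the residual misfit instead.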
		'''
		self._4x = mass
		self.verbose = verbose
		self.prefix = 'D4xdata'
		self.logfile = logfile
		list.__init__(self, l)
		self.Nf = None
		self.repeatability = {}
		self.refresh(session = session)


	def make_verbal(oldfun):
		'''
		Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
		'''
		@wraps(oldfun)
		def newfun(*args, verbose = '', **kwargs):
			myself = args[0]
			oldprefix = myself.prefix
			myself.prefix = oldfun.__name__
			if verbose != '':
				oldverbose = myself.verbose
				myself.verbose = verbose
			out = oldfun(*args, **kwargs)
			myself.prefix = oldprefix
			if verbose != '':
				myself.verbose = oldverbose
			return out
		return newfun


	def msg(self, txt):
		'''
		Log a message to `self.logfile`, and print it out if `verbose = True`
		'''
		self.log(txt)
		if self.verbose:
			print(f'{f"[{self.prefix}]":<16} {txt}')


	def vmsg(self, txt):
		'''
		Log a message to `self.logfile` and print it out
		'''
		self.log(txt)
		print(txt)


	def log(self, *txts):
		'''
		Log a message to `self.logfile`
		'''
		if self.logfile:
			with open(self.logfile, 'a') as fid:
				for txt in txts:
					fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')


	def refresh(self, session = 'mySession'):
		'''
		Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`.
		'''
		self.fill_in_missing_info(session = session)
		self.refresh_sessions()
		self.refresh_samples()


	def refresh_sessions(self):
		'''
		Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift`
		to `False` for all sessions.
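The grouping performed by `refresh_sessions()` can be sketched with plain dictionaries (the records below are hypothetical; only the `Session` key matters here):

```python
data = [
    {'Session': 'S1', 'Sample': 'ETH-1'},
    {'Session': 'S2', 'Sample': 'ETH-2'},
    {'Session': 'S1', 'Sample': 'ETH-3'},
]

# Same comprehension pattern as refresh_sessions(): one entry per session,
# each holding the list of analyses recorded in that session.
sessions = {
    s: {'data': [r for r in data if r['Session'] == s]}
    for s in sorted({r['Session'] for r in data})
}

print(sorted(sessions))             # -> ['S1', 'S2']
print(len(sessions['S1']['data']))  # -> 2
```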
		'''
		self.sessions = {
			s: {'data': [r for r in self if r['Session'] == s]}
			for s in sorted({r['Session'] for r in self})
			}
		for s in self.sessions:
			self.sessions[s]['scrambling_drift'] = False
			self.sessions[s]['slope_drift'] = False
			self.sessions[s]['wg_drift'] = False
			self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD
			self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD


	def refresh_samples(self):
		'''
		Define `self.samples`, `self.anchors`, and `self.unknowns`.
		'''
		self.samples = {
			s: {'data': [r for r in self if r['Sample'] == s]}
			for s in sorted({r['Sample'] for r in self})
			}
		self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x}
		self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}


	def read(self, filename, sep = '', session = ''):
		'''
		Read file in csv format to load data into a `D47data` object.

		In the csv file, spaces before and after field separators (`','` by default)
		are optional. Each line corresponds to a single analysis.

		The required fields are:

		+ `UID`: a unique identifier
		+ `Session`: an identifier for the analytical session
		+ `Sample`: a sample identifier
		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
		and `d49` are optional, and set to NaN by default.
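For instance, whitespace-tolerant parsing means the following two-analysis string (values from the tutorial data set) loads cleanly; the sketch below mimics only the row-splitting step, leaving all field values as strings (the actual reader also converts numeric fields via `smart_type()`):

```py
csv_text = 'UID, Sample, d45, d46, d47\nA01, ETH-1, 5.79502, 11.62767, 16.89351\nA02, MYSAMPLE-1, 6.21907, 11.49107, 17.27749'

# split into stripped fields, skipping empty lines
rows = [[x.strip() for x in l.split(',')] for l in csv_text.splitlines() if l.strip()]
# map the header row onto each data row
data = [dict(zip(rows[0], l)) for l in rows[1:]]
```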

		**Parameters**

		+ `filename`: the path of the file to read
		+ `sep`: csv separator delimiting the fields
		+ `session`: set `Session` field to this string for all analyses
		'''
		with open(filename) as fid:
			self.input(fid.read(), sep = sep, session = session)


	def input(self, txt, sep = '', session = ''):
		'''
		Read `txt` string in csv format to load analysis data into a `D47data` object.

		In the csv string, spaces before and after field separators (`','` by default)
		are optional. Each line corresponds to a single analysis.

		The required fields are:

		+ `UID`: a unique identifier
		+ `Session`: an identifier for the analytical session
		+ `Sample`: a sample identifier
		+ `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

		Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
		VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
		and `d49` are optional, and set to NaN by default.

		**Parameters**

		+ `txt`: the csv string to read
		+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
		whichever appears most often in `txt`.
		+ `session`: set `Session` field to this string for all analyses
		'''
		if sep == '':
			sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
		txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
		data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]]

		if session != '':
			for r in data:
				r['Session'] = session

		self += data
		self.refresh()


	@make_verbal
	def wg(self, samples = None, a18_acid = None):
		'''
		Compute bulk composition of the working gas for each session based on
		the carbonate standards defined in both `self.Nominal_d13C_VPDB` and
		`self.Nominal_d18O_VPDB`.
		'''

		self.msg('Computing WG composition:')

		if a18_acid is None:
			a18_acid = self.ALPHA_18O_ACID_REACTION
		if samples is None:
			samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB]

		assert a18_acid, 'Acid fractionation factor should not be zero.'

		samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB]
		R45R46_standards = {}
		for sample in samples:
			d13C_vpdb = self.Nominal_d13C_VPDB[sample]
			d18O_vpdb = self.Nominal_d18O_VPDB[sample]
			R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000)
			R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17
			R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid

			C12_s = 1 / (1 + R13_s)
			C13_s = R13_s / (1 + R13_s)
			C16_s = 1 / (1 + R17_s + R18_s)
			C17_s = R17_s / (1 + R17_s + R18_s)
			C18_s = R18_s / (1 + R17_s + R18_s)

			C626_s = C12_s * C16_s ** 2
			C627_s = 2 * C12_s * C16_s * C17_s
			C628_s = 2 * C12_s * C16_s * C18_s
			C636_s = C13_s * C16_s ** 2
			C637_s = 2 * C13_s * C16_s * C17_s
			C727_s = C12_s * C17_s ** 2

			R45_s = (C627_s + C636_s) / C626_s
			R46_s = (C628_s + C637_s + C727_s) / C626_s
			R45R46_standards[sample] = (R45_s, R46_s)

		for s in self.sessions:
			db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples]
			assert db, f'No sample from {samples} found in session "{s}".'
#			dbsamples = sorted({r['Sample'] for r in db})

			X = [r['d45'] for r in db]
			Y = [R45R46_standards[r['Sample']][0] for r in db]
			x1, x2 = np.min(X), np.max(X)

			if x1 < x2:
				wgcoord = x1/(x1-x2)
			else:
				wgcoord = 999

			if wgcoord < -.5 or wgcoord > 1.5:
				# unreasonable to extrapolate to d45 = 0
				R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
			else:
				# d45 = 0 is reasonably well bracketed
				R45_wg = np.polyfit(X, Y, 1)[1]

			X = [r['d46'] for r in db]
			Y = [R45R46_standards[r['Sample']][1] for r in db]
			x1, x2 = np.min(X), np.max(X)

			if x1 < x2:
				wgcoord = x1/(x1-x2)
			else:
				wgcoord = 999

			if wgcoord < -.5 or wgcoord > 1.5:
				# unreasonable to extrapolate to d46 = 0
				R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)])
			else:
				# d46 = 0 is reasonably well bracketed
				R46_wg = np.polyfit(X, Y, 1)[1]

			d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg)

			self.msg(f'Session {s} WG: δ13C_VPDB = {d13Cwg_VPDB:.3f}  δ18O_VSMOW = {d18Owg_VSMOW:.3f}')

			self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB
			self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW
			for r in self.sessions[s]['data']:
				r['d13Cwg_VPDB'] = d13Cwg_VPDB
				r['d18Owg_VSMOW'] = d18Owg_VSMOW


	def compute_bulk_delta(self, R45, R46, D17O = 0):
		'''
		Compute δ13C_VPDB and δ18O_VSMOW,
		by solving the generalized form of equation (17) from
		[Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
		assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
		solving the corresponding second-order Taylor polynomial.
		(Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
		'''

		K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17

		A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
		B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
		C = 2 * self.R18_VSMOW
		D = -R46

		aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
		bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
		cc = A + B + C + D

		d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)

		R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
		R17 = K * R18 ** self.LAMBDA_17
		R13 = R45 - 2 * R17

		d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)

		return d13C_VPDB, d18O_VSMOW


	@make_verbal
	def crunch(self, verbose = ''):
		'''
		Compute bulk composition and raw clumped isotope anomalies for all analyses.
		'''
		for r in self:
			self.compute_bulk_and_clumping_deltas(r)
		self.standardize_d13C()
		self.standardize_d18O()
		self.msg(f"Crunched {len(self)} analyses.")


	def fill_in_missing_info(self, session = 'mySession'):
		'''
		Fill in optional fields with default values
		'''
		for i,r in enumerate(self):
			if 'D17O' not in r:
				r['D17O'] = 0.
			if 'UID' not in r:
				r['UID'] = f'{i+1}'
			if 'Session' not in r:
				r['Session'] = session
			for k in ['d47', 'd48', 'd49']:
				if k not in r:
					r[k] = np.nan


	def standardize_d13C(self):
		'''
		Perform δ13C standardization within each session `s` according to
		`self.sessions[s]['d13C_standardization_method']`, which is defined by default
		by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
		may be redefined arbitrarily at a later stage.
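The `'2pt'` case thus reduces to an ordinary least-squares affine fit mapping measured onto nominal values; a standalone sketch with hypothetical anchor values:

```py
import numpy as np

d13C_measured = np.array([1.95, -10.35, 1.60])  # hypothetical within-session values
d13C_nominal = np.array([2.02, -10.17, 1.71])   # corresponding nominal values

a, b = np.polyfit(d13C_measured, d13C_nominal, 1)
d13C_corrected = a * d13C_measured + b          # applied to every analysis in the session
```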
		'''
		for s in self.sessions:
			if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
				XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
				X,Y = zip(*XY)
				if self.sessions[s]['d13C_standardization_method'] == '1pt':
					offset = np.mean(Y) - np.mean(X)
					for r in self.sessions[s]['data']:
						r['d13C_VPDB'] += offset
				elif self.sessions[s]['d13C_standardization_method'] == '2pt':
					a,b = np.polyfit(X,Y,1)
					for r in self.sessions[s]['data']:
						r['d13C_VPDB'] = a * r['d13C_VPDB'] + b

	def standardize_d18O(self):
		'''
		Perform δ18O standardization within each session `s` according to
		`self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
		which is defined by default by `D47data.refresh_sessions()` as equal to
		`self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
		'''
		for s in self.sessions:
			if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
				XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
				X,Y = zip(*XY)
				Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
				if self.sessions[s]['d18O_standardization_method'] == '1pt':
					offset = np.mean(Y) - np.mean(X)
					for r in self.sessions[s]['data']:
						r['d18O_VSMOW'] += offset
				elif self.sessions[s]['d18O_standardization_method'] == '2pt':
					a,b = np.polyfit(X,Y,1)
					for r in self.sessions[s]['data']:
						r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b


	def compute_bulk_and_clumping_deltas(self, r):
		'''
		Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
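The bulk composition step relies on the second-order Taylor solution of `compute_bulk_delta()`; the standalone round-trip check below assumes the usual reference values for `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW` and `LAMBDA_17` (these constants are defined elsewhere in the class) and verifies that the solver recovers the composition used to construct `R45` and `R46`:

```py
import math

# assumed reference constants (defined elsewhere in the class)
R13_VPDB, R17_VSMOW, R18_VSMOW, LAMBDA_17 = 0.01118, 0.00038475, 0.0020052, 0.528

def bulk_delta(R45, R46, D17O = 0.):
	# second-order Taylor solution of the generalized Brand et al. (2010) eq. (17)
	K = math.exp(D17O / 1000) * R17_VSMOW * R18_VSMOW ** -LAMBDA_17
	A = -3 * K ** 2 * R18_VSMOW ** (2 * LAMBDA_17)
	B = 2 * K * R45 * R18_VSMOW ** LAMBDA_17
	C = 2 * R18_VSMOW
	D = -R46
	aa = A * LAMBDA_17 * (2 * LAMBDA_17 - 1) + B * LAMBDA_17 * (LAMBDA_17 - 1) / 2
	bb = 2 * A * LAMBDA_17 + B * LAMBDA_17 + C
	cc = A + B + C + D
	d18O = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
	R18 = (1 + d18O / 1000) * R18_VSMOW
	R17 = K * R18 ** LAMBDA_17
	return 1000 * ((R45 - 2 * R17) / R13_VPDB - 1), d18O

# forward-compute exact isobar ratios for d13C_VPDB = +2 ‰, d18O_VSMOW = +30 ‰
R18 = 1.030 * R18_VSMOW
R17 = R17_VSMOW * (R18 / R18_VSMOW) ** LAMBDA_17
R13 = 1.002 * R13_VPDB
R45 = R13 + 2 * R17
R46 = 2 * R18 + 2 * R13 * R17 + R17 ** 2

d13C, d18O = bulk_delta(R45, R46)  # recovers approximately (2.000, 30.000)
```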
		'''

		# Compute working gas R13, R18, and isobar ratios
		R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
		R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
		R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)

		# Compute analyte isobar ratios
		R45 = (1 + r['d45'] / 1000) * R45_wg
		R46 = (1 + r['d46'] / 1000) * R46_wg
		R47 = (1 + r['d47'] / 1000) * R47_wg
		R48 = (1 + r['d48'] / 1000) * R48_wg
		R49 = (1 + r['d49'] / 1000) * R49_wg

		r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
		R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
		R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW

		# Compute stochastic isobar ratios of the analyte
		R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
			R13, R18, D17O = r['D17O']
			)

		# Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
		# and raise a warning if the corresponding anomalies exceed 0.05 ppm.
		if (R45 / R45stoch - 1) > 5e-8:
			self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
		if (R46 / R46stoch - 1) > 5e-8:
			self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')

		# Compute raw clumped isotope anomalies
		r['D47raw'] = 1000 * (R47 / R47stoch - 1)
		r['D48raw'] = 1000 * (R48 / R48stoch - 1)
		r['D49raw'] = 1000 * (R49 / R49stoch - 1)


	def compute_isobar_ratios(self, R13, R18, D17O = 0, D47 = 0, D48 = 0, D49 = 0):
		'''
		Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
		optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
		anomalies (`D47`, `D48`, `D49`), all expressed in permil.
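Under the stochastic assumption, the mass-45 ratio reduces exactly to `R45 = R13 + 2 * R17`; a standalone sketch of the underlying concentration bookkeeping (isotope ratio values below are hypothetical):

```py
R13, R17, R18 = 0.0112, 0.00039, 0.0021  # hypothetical isotope ratios

# isotope concentrations
C12 = (1 + R13) ** -1
C13 = C12 * R13
C16 = (1 + R17 + R18) ** -1
C17 = C16 * R17
C18 = C16 * R18

# stochastic isotopologue concentrations at masses 44 and 45
C626 = C16 * C12 * C16
C627 = C16 * C12 * C17 * 2
C636 = C16 * C13 * C16

R45 = (C636 + C627) / C626  # equals R13 + 2 * R17
```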
		'''

		# Compute R17
		R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17

		# Compute isotope concentrations
		C12 = (1 + R13) ** -1
		C13 = C12 * R13
		C16 = (1 + R17 + R18) ** -1
		C17 = C16 * R17
		C18 = C16 * R18

		# Compute stochastic isotopologue concentrations
		C626 = C16 * C12 * C16
		C627 = C16 * C12 * C17 * 2
		C628 = C16 * C12 * C18 * 2
		C636 = C16 * C13 * C16
		C637 = C16 * C13 * C17 * 2
		C638 = C16 * C13 * C18 * 2
		C727 = C17 * C12 * C17
		C728 = C17 * C12 * C18 * 2
		C737 = C17 * C13 * C17
		C738 = C17 * C13 * C18 * 2
		C828 = C18 * C12 * C18
		C838 = C18 * C13 * C18

		# Compute stochastic isobar ratios
		R45 = (C636 + C627) / C626
		R46 = (C628 + C637 + C727) / C626
		R47 = (C638 + C728 + C737) / C626
		R48 = (C738 + C828) / C626
		R49 = C838 / C626

		# Account for stochastic anomalies
		R47 *= 1 + D47 / 1000
		R48 *= 1 + D48 / 1000
		R49 *= 1 + D49 / 1000

		# Return isobar ratios
		return R45, R46, R47, R48, R49


	def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
		'''
		Split unknown samples by UID (treat all analyses as different samples)
		or by session (treat analyses of a given sample in different sessions as
		different samples).

		**Parameters**

		+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
		+ `grouping`: `by_uid` | `by_session`
		'''
		if samples_to_split == 'all':
			samples_to_split = [s for s in self.unknowns]
		gkeys = {'by_uid': 'UID', 'by_session': 'Session'}
		self.grouping = grouping.lower()
		if self.grouping in gkeys:
			gkey = gkeys[self.grouping]
		for r in self:
			if r['Sample'] in samples_to_split:
				r['Sample_original'] = r['Sample']
				r['Sample'] = f"{r['Sample']}__{r[gkey]}"
			elif r['Sample'] in self.unknowns:
				r['Sample_original'] = r['Sample']
		self.refresh_samples()


	def unsplit_samples(self, tables = False):
		'''
		Reverse the effects of `D47data.split_samples()`.

		This should only be used after `D4xdata.standardize()` with `method='pooled'`.

		After `D4xdata.standardize()` with `method='indep_sessions'`, one should
		probably use `D4xdata.combine_samples()` instead to reverse the effects of
		`D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
		effects of `D47data.split_samples()` with `grouping='by_session'` (because in
		that case session-averaged Δ4x values are statistically independent).
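Reversing a split amounts to applying a linear map `W` to the fitted parameters and their covariance matrix; the minimal sketch below combines two hypothetical session-level estimates of one sample using the inverse-variance weights applied when `grouping = 'by_session'`:

```py
import numpy as np

D47 = np.array([0.601, 0.613])  # hypothetical session-level estimates of one sample
SE = np.array([0.008, 0.012])   # their standard errors

w = SE ** -2                    # inverse-variance weights
w = w / w.sum()

W = w[np.newaxis, :]            # 1 x 2 collapsing matrix
CM = np.diag(SE ** 2)           # covariance of the split estimates

D47_combined = (W @ D47[:, np.newaxis])[0, 0]
SE_combined = (W @ CM @ W.T)[0, 0] ** 0.5  # smaller than either input SE
```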
		'''
		unknowns_old = sorted({s for s in self.unknowns})
		CM_old = self.standardization.covar[:,:]
		VD_old = self.standardization.params.valuesdict().copy()
		vars_old = self.standardization.var_names

		unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})

		Ns = len(vars_old) - len(unknowns_old)
		vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
		VD_new = {k: VD_old[k] for k in vars_old[:Ns]}

		W = np.zeros((len(vars_new), len(vars_old)))
		W[:Ns,:Ns] = np.eye(Ns)
		for u in unknowns_new:
			splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
			if self.grouping == 'by_session':
				weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
			elif self.grouping == 'by_uid':
				weights = [1 for s in splits]
			sw = sum(weights)
			weights = [w/sw for w in weights]
			W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]

		CM_new = W @ CM_old @ W.T
		V = W @ np.array([[VD_old[k]] for k in vars_old])
		VD_new = {k:v[0] for k,v in zip(vars_new, V)}

		self.standardization.covar = CM_new
		self.standardization.params.valuesdict = lambda : VD_new
		self.standardization.var_names = vars_new

		for r in self:
			if r['Sample'] in self.unknowns:
				r['Sample_split'] = r['Sample']
				r['Sample'] = r['Sample_original']

		self.refresh_samples()
		self.consolidate_samples()
		self.repeatabilities()

		if tables:
			self.table_of_analyses()
			self.table_of_samples()

	def assign_timestamps(self):
		'''
		Assign a time field `t` of type `float` to each analysis.

		If `TimeTag` is one of the data fields, `t` is equal within a given session
		to `TimeTag` minus the mean value of `TimeTag` for that session.
		Otherwise, `TimeTag` is by default equal to the index of each analysis
		in the dataset and `t` is defined as above.
		'''
		for session in self.sessions:
			sdata = self.sessions[session]['data']
			try:
				t0 = np.mean([r['TimeTag'] for r in sdata])
				for r in sdata:
					r['t'] = r['TimeTag'] - t0
			except KeyError:
				t0 = (len(sdata)-1)/2
				for t,r in enumerate(sdata):
					r['t'] = t - t0


	def report(self):
		'''
		Prints a report on the standardization fit.
		Only applicable after `D4xdata.standardize(method='pooled')`.
		'''
		report_fit(self.standardization)


	def combine_samples(self, sample_groups):
		'''
		Combine analyses of different samples to compute weighted average Δ4x
		and new error (co)variances corresponding to the groups defined by the `sample_groups`
		dictionary.

		Caution: samples are weighted by number of replicate analyses, which is a
		reasonable default behavior but is not always optimal (e.g., in the case of strongly
		correlated analytical errors for one or more samples).

		Returns a tuple of:

		+ the list of group names
		+ an array of the corresponding Δ4x values
		+ the corresponding (co)variance matrix

		**Parameters**

		+ `sample_groups`: a dictionary of the form:
		```py
		{'group1': ['sample_1', 'sample_2'],
		 'group2': ['sample_3', 'sample_4', 'sample_5']}
		```
		'''

		samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
		groups = sorted(sample_groups.keys())
		group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
		D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
		CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
		W = np.array([
			[self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
			for j in groups])
		D4x_new = W @ D4x_old
		CM_new = W @ CM_old @ W.T

		return groups, D4x_new[:,0], CM_new


	@make_verbal
	def standardize(self,
		method = 'pooled',
		weighted_sessions = [],
		consolidate = True,
		consolidate_tables = False,
		consolidate_plots = False,
		constraints = {},
		):
		'''
		Compute absolute Δ4x values for all replicate analyses and for sample averages.
		If `method` argument is set to `'pooled'`, the standardization processes all sessions
		in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
		i.e. that their true Δ4x value does not change between sessions
		([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
		`'indep_sessions'`, the standardization processes each session independently, based only
		on anchor analyses.
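In both cases each session is described by the affine model `Δ4x_raw = a·Δ4x + b·δ4x + c`, optionally with linear drifts `a2`, `b2`, `c2` in time `t`, which is then inverted analysis by analysis; a standalone round-trip sketch with hypothetical parameter values:

```py
# hypothetical session parameters and drift terms
a, b, c = 0.92, 1.2e-3, -0.88
a2, b2, c2 = 1e-4, 2e-6, 5e-5

D47, d47, t = 0.617, 18.5, 3.0  # true Δ47, measured δ47, time within session

# forward model: raw (observed) value
D47raw = a * D47 + b * d47 + c + t * (a2 * D47 + b2 * d47 + c2)

# inversion, as applied to each analysis after the fit
D47_abs = (D47raw - c - b * d47 - c2 * t - b2 * t * d47) / (a + a2 * t)
```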
		'''

		self.standardization_method = method
		self.assign_timestamps()

		if method == 'pooled':
			if weighted_sessions:
				for session_group in weighted_sessions:
					if self._4x == '47':
						X = D47data([r for r in self if r['Session'] in session_group])
					elif self._4x == '48':
						X = D48data([r for r in self if r['Session'] in session_group])
					X.Nominal_D4x = self.Nominal_D4x.copy()
					X.refresh()
					result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
					w = np.sqrt(result.redchi)
					self.msg(f'Session group {session_group} RMSWD = {w:.4f}')
					for r in X:
						r[f'wD{self._4x}raw'] *= w
			else:
				self.msg(f'All D{self._4x}raw weights set to 1 ‰')
				for r in self:
					r[f'wD{self._4x}raw'] = 1.

			params = Parameters()
			for k,session in enumerate(self.sessions):
				self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
				self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
				self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
				s = pf(session)
				params.add(f'a_{s}', value = 0.9)
				params.add(f'b_{s}', value = 0.)
				params.add(f'c_{s}', value = -0.9)
				params.add(f'a2_{s}', value = 0.,
#					vary = self.sessions[session]['scrambling_drift'],
					)
				params.add(f'b2_{s}', value = 0.,
#					vary = self.sessions[session]['slope_drift'],
					)
				params.add(f'c2_{s}', value = 0.,
#					vary = self.sessions[session]['wg_drift'],
					)
				if not self.sessions[session]['scrambling_drift']:
					params[f'a2_{s}'].expr = '0'
				if not self.sessions[session]['slope_drift']:
					params[f'b2_{s}'].expr = '0'
				if not self.sessions[session]['wg_drift']:
					params[f'c2_{s}'].expr = '0'

			for sample in self.unknowns:
				params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)

			for k in constraints:
				params[k].expr = constraints[k]

			def residuals(p):
				R = []
				for r in self:
					session = pf(r['Session'])
					sample = pf(r['Sample'])
					if r['Sample'] in self.Nominal_D4x:
						R += [ (
							r[f'D{self._4x}raw'] - (
								p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
								+ p[f'b_{session}'] * r[f'd{self._4x}']
								+ p[f'c_{session}']
								+ r['t'] * (
									p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
									+ p[f'b2_{session}'] * r[f'd{self._4x}']
									+ p[f'c2_{session}']
									)
								)
							) / r[f'wD{self._4x}raw'] ]
					else:
						R += [ (
							r[f'D{self._4x}raw'] - (
								p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
								+ p[f'b_{session}'] * r[f'd{self._4x}']
								+ p[f'c_{session}']
								+ r['t'] * (
									p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
									+ p[f'b2_{session}'] * r[f'd{self._4x}']
									+ p[f'c2_{session}']
									)
								)
							) / r[f'wD{self._4x}raw'] ]
				return R

			M = Minimizer(residuals, params)
			result = M.least_squares()
			self.Nf = result.nfree
			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
			new_names, new_covar, new_se = _fullcovar(result)[:3]
			result.var_names = new_names
			result.covar = new_covar

			for r in self:
				s = pf(r["Session"])
				a = result.params.valuesdict()[f'a_{s}']
				b = result.params.valuesdict()[f'b_{s}']
				c = result.params.valuesdict()[f'c_{s}']
				a2 = result.params.valuesdict()[f'a2_{s}']
				b2 = result.params.valuesdict()[f'b2_{s}']
				c2 = result.params.valuesdict()[f'c2_{s}']
				r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])

			self.standardization = result

			for session in self.sessions:
				self.sessions[session]['Np'] = 3
				for k in ['scrambling', 'slope', 'wg']:
					if self.sessions[session][f'{k}_drift']:
						self.sessions[session]['Np'] += 1

			if consolidate:
				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
			return result


		elif method == 'indep_sessions':

			if weighted_sessions:
				for session_group in weighted_sessions:
					X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
					X.Nominal_D4x = self.Nominal_D4x.copy()
					X.refresh()
					# This is only done to assign r['wD47raw'] for r in X:
					X.standardize(method = method, weighted_sessions = [], consolidate = False)
					self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
			else:
				self.msg('All weights set to 1 ‰')
				for r in self:
					r[f'wD{self._4x}raw'] = 1

			for session in self.sessions:
				s = self.sessions[session]
				p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
				p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
				s['Np'] = sum(p_active)
				sdata = s['data']

				A = np.array([
					[
						self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
						r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
						1 / r[f'wD{self._4x}raw'],
						self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
						r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
						r['t'] / r[f'wD{self._4x}raw']
						]
					for r in sdata if r['Sample'] in self.anchors
					])[:,p_active]  # only keep columns for the active parameters
				Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
				s['Na'] = Y.size
				CM = linalg.inv(A.T @ A)
				bf = (CM @ A.T @ Y).T[0,:]
				k = 0
				for n,a in zip(p_names, p_active):
					if a:
						s[n] = bf[k]
#						self.msg(f'{n} = {bf[k]}')
						k += 1
					else:
						s[n] = 0.
#						self.msg(f'{n} = 0.0')

				for r in sdata:
					a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
					r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
					r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])

				s['CM'] = np.zeros((6,6))
				i = 0
				k_active = [j for j,a in enumerate(p_active) if a]
				for j,a in enumerate(p_active):
					if a:
						s['CM'][j,k_active] = CM[i,:]
						i += 1

			if not weighted_sessions:
				w = self.rmswd()['rmswd']
				for r in self:
					r[f'wD{self._4x}'] *= w
					r[f'wD{self._4x}raw'] *= w
				for session in self.sessions:
					self.sessions[session]['CM'] *= w**2

			for session in self.sessions:
				s = self.sessions[session]
				s['SE_a'] = s['CM'][0,0]**.5
				s['SE_b'] = s['CM'][1,1]**.5
				s['SE_c'] = s['CM'][2,2]**.5
				s['SE_a2'] = s['CM'][3,3]**.5
				s['SE_b2'] = s['CM'][4,4]**.5
				s['SE_c2'] = s['CM'][5,5]**.5

			if not weighted_sessions:
				self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
			else:
				self.Nf = 0
				for sg in weighted_sessions:
					self.Nf += self.rmswd(sessions = sg)['Nf']

			self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)

			avgD4x = {
				sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
				for sample in self.samples
				}
			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
			rD4x = (chi2/self.Nf)**.5
			self.repeatability[f'sigma_{self._4x}'] = rD4x

			if consolidate:
				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)


	def standardization_error(self, session, d4x, D4x, t = 0):
		'''
		Compute standardization error for a given session and
		(δ47, Δ47) composition.
		'''
		a = self.sessions[session]['a']
		b = self.sessions[session]['b']
		c = self.sessions[session]['c']
		a2 = self.sessions[session]['a2']
		b2 = self.sessions[session]['b2']
		c2 = self.sessions[session]['c2']
		CM = self.sessions[session]['CM']

		x, y = D4x, d4x
		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
#		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
		dxdy = -(b+b2*t) / (a+a2*t)
		dxdz = 1. / (a+a2*t)
		dxda = -x / (a+a2*t)
		dxdb = -y / (a+a2*t)
		dxdc = -1. / (a+a2*t)
		dxda2 = -x * t / (a+a2*t)
		dxdb2 = -y * t / (a+a2*t)
		dxdc2 = -t / (a+a2*t)
		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
		sx = (V @ CM @ V.T) ** .5
		return sx


	@make_verbal
	def summary(self,
		dir = 'output',
		filename = None,
		save_to_file = True,
		print_out = True,
		):
		'''
		Print out and/or save to disk a summary of the standardization results.

		**Parameters**

		+ `dir`: the directory in which to save the table
		+ `filename`: the name of the csv file to write to
		+ `save_to_file`: whether to save the table to disk
		+ `print_out`: whether to print out the table
		'''

		out = []
		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
		out += [['Model degrees of freedom', f"{self.Nf}"]]
		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
		out += [['Standardization method', self.standardization_method]]

		if save_to_file:
			if not os.path.exists(dir):
				os.makedirs(dir)
			if filename is None:
				filename = f'D{self._4x}_summary.csv'
			with open(f'{dir}/{filename}', 'w') as fid:
				fid.write(make_csv(out))
		if print_out:
			self.msg('\n' + pretty_table(out, header = 0))


	@make_verbal
	def table_of_sessions(self,
		dir = 'output',
		filename = None,
		save_to_file = True,
		print_out = True,
		output = None,
		):
		'''
		Print out and/or save to disk a table of sessions.
    **Parameters**

    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
    if set to `'raw'`: return a list of list of strings
    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
    include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
    include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])

    out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
    if include_a2:
        out[-1] += ['a2 ± SE']
    if include_b2:
        out[-1] += ['b2 ± SE']
    if include_c2:
        out[-1] += ['c2 ± SE']
    for session in self.sessions:
        out += [[
            session,
            f"{self.sessions[session]['Na']}",
            f"{self.sessions[session]['Nu']}",
            f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
            f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
            f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
            f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
            f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
            f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
            f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
            f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
            ]]
        if include_a2:
            if self.sessions[session]['scrambling_drift']:
                out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
            else:
                out[-1] += ['']
        if include_b2:
            if self.sessions[session]['slope_drift']:
                out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
            else:
                out[-1] += ['']
        if include_c2:
            if self.sessions[session]['wg_drift']:
                out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
            else:
                out[-1] += ['']

    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_sessions.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n' + pretty_table(out))
    if output == 'raw':
        return out
    elif output == 'pretty':
        return pretty_table(out)


@make_verbal
def table_of_analyses(
    self,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out and/or save to disk a table of analyses.

    **Parameters**

    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
    if set to `'raw'`: return a list of list of strings
    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''

    out = [['UID','Session','Sample']]
    extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
    for f in extra_fields:
        out[-1] += [f[0]]
    out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
    for r in self:
        out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
        for f in extra_fields:
            out[-1] += [f"{r[f[0]]:{f[1]}}"]
        out[-1] += [
            f"{r['d13Cwg_VPDB']:.3f}",
            f"{r['d18Owg_VSMOW']:.3f}",
            f"{r['d45']:.6f}",
            f"{r['d46']:.6f}",
            f"{r['d47']:.6f}",
            f"{r['d48']:.6f}",
            f"{r['d49']:.6f}",
            f"{r['d13C_VPDB']:.6f}",
            f"{r['d18O_VSMOW']:.6f}",
            f"{r['D47raw']:.6f}",
            f"{r['D48raw']:.6f}",
            f"{r['D49raw']:.6f}",
            f"{r[f'D{self._4x}']:.6f}"
            ]
    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_analyses.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n' + pretty_table(out))
    return out


@make_verbal
def covar_table(
    self,
    correl = False,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return the variance-covariance matrix of D4x
    for all unknown samples.

    **Parameters**

    + `correl`: whether to output correlations instead of covariances
    + `dir`: the directory in which to save the csv
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the csv
    + `print_out`: whether to print out the matrix
    + `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
    if set to `'raw'`: return a list of list of strings
    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    samples = sorted([u for u in self.unknowns])
    out = [[''] + samples]
    for s1 in samples:
        out.append([s1])
        for s2 in samples:
            if correl:
                out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
            else:
                out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')

    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            if correl:
                filename = f'D{self._4x}_correl.csv'
            else:
                filename = f'D{self._4x}_covar.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n' + pretty_table(out))
    if output == 'raw':
        return out
    elif output == 'pretty':
        return pretty_table(out)


@make_verbal
def table_of_samples(
    self,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a table of samples.

    **Parameters**

    + `dir`: the directory in which to save the csv
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the csv
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
    if set to `'raw'`: return a list of list of strings
    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''

    out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
    for sample in self.anchors:
        out += [[
            f"{sample}",
            f"{self.samples[sample]['N']}",
            f"{self.samples[sample]['d13C_VPDB']:.2f}",
            f"{self.samples[sample]['d18O_VSMOW']:.2f}",
            f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
            f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
            ]]
    for sample in self.unknowns:
        out += [[
            f"{sample}",
            f"{self.samples[sample]['N']}",
            f"{self.samples[sample]['d13C_VPDB']:.2f}",
            f"{self.samples[sample]['d18O_VSMOW']:.2f}",
            f"{self.samples[sample][f'D{self._4x}']:.4f}",
            f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
            f"± {self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
            f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
            f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
            ]]
    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_samples.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n' + pretty_table(out))
    if output == 'raw':
        return out
    elif output == 'pretty':
        return pretty_table(out)


def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
    '''
    Generate session plots and save them to disk.

    **Parameters**

    + `dir`: the directory in which to save the plots
    + `figsize`: the width and height (in inches) of each plot
    + `filetype`: 'pdf' or 'png'
    + `dpi`: resolution for PNG output
    '''
    if not os.path.exists(dir):
        os.makedirs(dir)

    for session in self.sessions:
        sp = self.plot_single_session(session, xylimits = 'constant')
        ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
        ppl.close(sp.fig)


@make_verbal
def consolidate_samples(self):
    '''
    Compile various statistics for each sample.
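    The per-sample SD reported here is the unbiased (n − 1) sample standard deviation.
    A standalone sketch with hypothetical Δ47 replicates:

    ```python
    from statistics import mean, stdev

    # hypothetical Δ47 replicates (in ‰) for one sample
    replicates = [0.6321, 0.6289, 0.6305, 0.6350]

    N = len(replicates)     # number of analyses of this sample
    avg = mean(replicates)  # average Δ47 value
    SD = stdev(replicates)  # unbiased sample standard deviation
    ```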
    For each anchor sample:

    + `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
    + `SE_D47` or `SE_D48`: set to zero by definition

    For each unknown sample:

    + `D47` or `D48`: the standardized Δ4x value for this unknown
    + `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown

    For each anchor and unknown:

    + `N`: the total number of analyses of this sample
    + `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
    + `d13C_VPDB`: the average δ13C_VPDB value for this sample
    + `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
    + `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
    variance, indicating whether the Δ4x repeatability of this sample differs significantly from
    that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
    '''
    D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
    for sample in self.samples:
        self.samples[sample]['N'] = len(self.samples[sample]['data'])
        if self.samples[sample]['N'] > 1:
            self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])

        self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
        self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])

        D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
        if len(D4x_pop) > 2:
            self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]

    if self.standardization_method == 'pooled':
        for sample in self.anchors:
            self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
            self.samples[sample][f'SE_D{self._4x}'] = 0.
        for sample in self.unknowns:
            self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
            try:
                self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
            except ValueError:
                # when `sample` is constrained by self.standardize(constraints = {...}),
                # it is no longer listed in self.standardization.var_names.
                # Temporary fix: define SE as zero for now
                self.samples[sample][f'SE_D{self._4x}'] = 0.

    elif self.standardization_method == 'indep_sessions':
        for sample in self.anchors:
            self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
            self.samples[sample][f'SE_D{self._4x}'] = 0.
        for sample in self.unknowns:
            self.msg(f'Consolidating sample {sample}')
            self.unknowns[sample][f'session_D{self._4x}'] = {}
            session_avg = []
            for session in self.sessions:
                sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
                if sdata:
                    self.msg(f'{sample} found in session {session}')
                    avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
                    avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
                    # !! TODO: sigma_s below does not account for temporal changes in standardization error
                    sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
                    sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
                    session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
                    self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
            self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
            weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
            wsum = sum([weights[s] for s in weights])
            for s in weights:
                self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]

    for r in self:
        r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']


def consolidate_sessions(self):
    '''
    Compute various statistics for each session.
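    The parameters `a`, `b` and `c` listed below define the per-session affine
    standardization model of Daëron (2021), Δ4x_raw = a·Δ4x + b·δ4x + c.
    A sketch inverting that relation for one analysis, with hypothetical parameter values:

    ```python
    # hypothetical session parameters: scrambling factor, compositional slope, WG offset
    a, b, c = 1.02, 1.5e-3, -0.85

    # one hypothetical analysis
    d47, D47raw = 20.0, -0.17

    # invert D47raw = a * D47 + b * d47 + c to standardize the analysis
    D47 = (D47raw - b * d47 - c) / a
    ```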
    + `Na`: Number of anchor analyses in the session
    + `Nu`: Number of unknown analyses in the session
    + `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
    + `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
    + `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
    + `a`: scrambling factor
    + `b`: compositional slope
    + `c`: WG offset
    + `SE_a`: Model standard error of `a`
    + `SE_b`: Model standard error of `b`
    + `SE_c`: Model standard error of `c`
    + `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
    + `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
    + `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
    + `a2`: scrambling factor drift
    + `b2`: compositional slope drift
    + `c2`: WG offset drift
    + `Np`: Number of standardization parameters to fit
    + `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
    + `d13Cwg_VPDB`: δ13C_VPDB of WG
    + `d18Owg_VSMOW`: δ18O_VSMOW of WG
    '''
    for session in self.sessions:
        if 'd13Cwg_VPDB' not in self.sessions[session]:
            self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
        if 'd18Owg_VSMOW' not in self.sessions[session]:
            self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
        self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
        self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])

        self.msg(f'Computing repeatabilities for session {session}')
        self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
        self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
        self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])

    if self.standardization_method == 'pooled':
        for session in self.sessions:

            self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
            i = self.standardization.var_names.index(f'a_{pf(session)}')
            self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5

            self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
            i = self.standardization.var_names.index(f'b_{pf(session)}')
            self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5

            self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
            i = self.standardization.var_names.index(f'c_{pf(session)}')
            self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5

            self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
            if self.sessions[session]['scrambling_drift']:
                i = self.standardization.var_names.index(f'a2_{pf(session)}')
                self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
            else:
                self.sessions[session]['SE_a2'] = 0.

            self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
            if self.sessions[session]['slope_drift']:
                i = self.standardization.var_names.index(f'b2_{pf(session)}')
                self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
            else:
                self.sessions[session]['SE_b2'] = 0.

            self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
            if self.sessions[session]['wg_drift']:
                i = self.standardization.var_names.index(f'c2_{pf(session)}')
                self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
            else:
                self.sessions[session]['SE_c2'] = 0.

            i = self.standardization.var_names.index(f'a_{pf(session)}')
            j = self.standardization.var_names.index(f'b_{pf(session)}')
            k = self.standardization.var_names.index(f'c_{pf(session)}')
            CM = np.zeros((6,6))
            CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
            try:
                i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
                CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
                CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
                try:
                    j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
                    CM[3,4] = self.standardization.covar[i2,j2]
                    CM[4,3] = self.standardization.covar[j2,i2]
                except ValueError:
                    pass
                try:
                    k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                    CM[3,5] = self.standardization.covar[i2,k2]
                    CM[5,3] = self.standardization.covar[k2,i2]
                except ValueError:
                    pass
            except ValueError:
                pass
            try:
                j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
                CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
                CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
                try:
                    k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                    CM[4,5] = self.standardization.covar[j2,k2]
                    CM[5,4] = self.standardization.covar[k2,j2]
                except ValueError:
                    pass
            except ValueError:
                pass
            try:
                k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
                CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
            except ValueError:
                pass

            self.sessions[session]['CM'] = CM

    elif self.standardization_method == 'indep_sessions':
        pass # Not implemented yet


@make_verbal
def repeatabilities(self):
    '''
    Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
    (for all samples, for anchors, and for unknowns).
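    These repeatabilities are pooled standard deviations: squared deviations from each
    sample's mean, summed over all samples, then divided by the pooled degrees of freedom.
    A standalone sketch with hypothetical values, mirroring the generic branch of `compute_r()`:

    ```python
    from statistics import mean

    # hypothetical replicate analyses, grouped by sample
    groups = {
        'SAMPLE-1': [0.631, 0.628, 0.634],
        'SAMPLE-2': [0.702, 0.699],
    }

    chisq = sum((x - mean(g))**2 for g in groups.values() for x in g)
    Nf = sum(len(g) - 1 for g in groups.values())  # pooled degrees of freedom
    r = (chisq / Nf)**0.5  # pooled repeatability (1SD)
    ```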
    '''
    self.msg('Computing repeatabilities for all sessions')

    self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
    self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
    self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
    self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
    self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')


@make_verbal
def consolidate(self, tables = True, plots = True):
    '''
    Collect information about samples, sessions and repeatabilities.
    '''
    self.consolidate_samples()
    self.consolidate_sessions()
    self.repeatabilities()

    if tables:
        self.summary()
        self.table_of_sessions()
        self.table_of_analyses()
        self.table_of_samples()

    if plots:
        self.plot_sessions()


@make_verbal
def rmswd(self,
    samples = 'all samples',
    sessions = 'all sessions',
    ):
    '''
    Compute the χ2, the root mean squared weighted deviation
    (i.e., the square root of the reduced χ2), and the corresponding degrees of freedom of the
    Δ4x values for samples in `samples` and sessions in `sessions`.

    Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
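    A standalone numerical sketch of the quantities returned, using hypothetical
    Δ47 values and analytical SEs for one sample:

    ```python
    # hypothetical replicate Δ47 values and their analytical SEs
    X  = [0.630, 0.641, 0.625]
    sX = [0.010, 0.012, 0.008]

    # inverse-variance weighted mean (as computed by w_avg)
    w = [1 / s**2 for s in sX]
    Xavg = sum(wi * xi for wi, xi in zip(w, X)) / sum(w)

    # weighted deviations and RMSWD
    chisq = sum(((x - Xavg) / s)**2 for x, s in zip(X, sX))
    Nf = len(X) - 1
    rmswd = (chisq / Nf)**0.5
    ```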
    '''
    if samples == 'all samples':
        mysamples = [k for k in self.samples]
    elif samples == 'anchors':
        mysamples = [k for k in self.anchors]
    elif samples == 'unknowns':
        mysamples = [k for k in self.unknowns]
    else:
        mysamples = samples

    if sessions == 'all sessions':
        sessions = [k for k in self.sessions]

    chisq, Nf = 0, 0
    for sample in mysamples:
        G = [r for r in self if r['Sample'] == sample and r['Session'] in sessions]
        if len(G) > 1:
            X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
            Nf += (len(G) - 1)
            chisq += np.sum([((r[f'D{self._4x}'] - X) / r[f'wD{self._4x}'])**2 for r in G])
    r = (chisq / Nf)**.5 if Nf > 0 else 0
    self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
    return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}


@make_verbal
def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
    '''
    Compute the repeatability of `[r[key] for r in self]`
    '''

    if samples == 'all samples':
        mysamples = [k for k in self.samples]
    elif samples == 'anchors':
        mysamples = [k for k in self.anchors]
    elif samples == 'unknowns':
        mysamples = [k for k in self.unknowns]
    else:
        mysamples = samples

    if sessions == 'all sessions':
        sessions = [k for k in self.sessions]

    if key in ['D47', 'D48']:
        # Full disclosure: the definition of Nf is tricky/debatable
        G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
        chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
        Nf = len(G)
        Nf -= len([s for s in mysamples if s in self.unknowns])
        for session in sessions:
            Np = len([
                _ for _ in self.standardization.params
                if (
                    self.standardization.params[_].expr is not None
                    and (
                        (_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
                        or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
                        )
                    )
                ])
            Na = len({
                r['Sample'] for r in self.sessions[session]['data']
                if r['Sample'] in self.anchors and r['Sample'] in mysamples
                })
            Nf -= min(Np, Na)
        r = (chisq / Nf)**.5 if Nf > 0 else 0

    else: # if key not in ['D47', 'D48']
        chisq, Nf = 0, 0
        for sample in mysamples:
            X = [r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions]
            if len(X) > 1:
                Nf += len(X) - 1
                chisq += np.sum([(x - np.mean(X))**2 for x in X])
        r = (chisq / Nf)**.5 if Nf > 0 else 0

    self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
    return r


def sample_average(self, samples, weights = 'equal', normalize = True):
    '''
    Weighted average Δ4x value of a group of samples, accounting for covariance.

    Returns the weighted average Δ4x value and associated SE
    of a group of samples. Weights are equal by default. If `normalize` is
    true, `weights` will be rescaled so that their sum equals 1.
    **Examples**

    ```python
    self.sample_average(['X','Y'], [1, 2])
    ```

    returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
    where Δ4x(X) and Δ4x(Y) are the average Δ4x
    values of samples X and Y, respectively.

    ```python
    self.sample_average(['X','Y'], [1, -1], normalize = False)
    ```

    returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
    '''
    if weights == 'equal':
        weights = [1/len(samples)] * len(samples)

    if normalize:
        s = sum(weights)
        if s:
            weights = [w/s for w in weights]

    try:
#       indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
#       C = self.standardization.covar[indices,:][:,indices]
        C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
        X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
        return correlated_sum(X, C, weights)
    except ValueError:
        return (0., 0.)


def sample_D4x_covar(self, sample1, sample2 = None):
    '''
    Covariance between Δ4x values of samples

    Returns the error covariance between the average Δ4x values of two
    samples. If only `sample1` is specified, or if `sample1 == sample2`,
    returns the Δ4x variance for that sample.
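    The relationships between this covariance, the sample SEs, and the correlation
    returned by `sample_D4x_correl()` can be sketched with a hypothetical 2 × 2
    covariance matrix:

    ```python
    # hypothetical Δ4x error (co)variances for two unknowns X and Y
    Cxx, Cyy, Cxy = 2.5e-5, 1.6e-5, 0.9e-5

    SE_X = Cxx**0.5  # standard error of Δ4x(X)
    SE_Y = Cyy**0.5  # standard error of Δ4x(Y)

    correl = Cxy / (SE_X * SE_Y)  # error correlation between X and Y

    # variance of the difference Δ4x(X) - Δ4x(Y),
    # as propagated by sample_average(['X', 'Y'], [1, -1], normalize = False)
    var_diff = Cxx + Cyy - 2 * Cxy
    ```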
    '''
    if sample2 is None:
        sample2 = sample1
    if self.standardization_method == 'pooled':
        i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
        j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
        return self.standardization.covar[i, j]
    elif self.standardization_method == 'indep_sessions':
        if sample1 == sample2:
            return self.samples[sample1][f'SE_D{self._4x}']**2
        else:
            c = 0
            for session in self.sessions:
                sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
                sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
                if sdata1 and sdata2:
                    a = self.sessions[session]['a']
                    # !! TODO: CM below does not account for temporal changes in standardization parameters
                    CM = self.sessions[session]['CM'][:3,:3]
                    avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
                    avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
                    avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
                    avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
                    c += (
                        self.unknowns[sample1][f'session_D{self._4x}'][session][2]
                        * self.unknowns[sample2][f'session_D{self._4x}'][session][2]
                        * np.array([[avg_D4x_1, avg_d4x_1, 1]])
                        @ CM
                        @ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
                        ) / a**2
            return float(c)


def sample_D4x_correl(self, sample1, sample2 = None):
    '''
    Correlation between Δ4x errors of samples

    Returns the error correlation between the average Δ4x values of two samples.
    '''
    if sample2 is None or sample2 == sample1:
        return 1.
    return (
        self.sample_D4x_covar(sample1, sample2)
        / self.unknowns[sample1][f'SE_D{self._4x}']
        / self.unknowns[sample2][f'SE_D{self._4x}']
        )


def plot_single_session(self,
    session,
    kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
    kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
    kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
    kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
    kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
    xylimits = 'free', # | 'constant'
    x_label = None,
    y_label = None,
    error_contour_interval = 'auto',
    fig = 'new',
    ):
    '''
    Generate plot for a single session
    '''
    if x_label is None:
        x_label = f'δ$_{{{self._4x}}}$ (‰)'
    if y_label is None:
        y_label = f'Δ$_{{{self._4x}}}$ (‰)'

    out = _SessionPlot()
    anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
    unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]

    if fig == 'new':
        out.fig = ppl.figure(figsize = (6,6))
        ppl.subplots_adjust(.1,.1,.9,.9)

    out.anchor_analyses, = ppl.plot(
        [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors],
        [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors],
        **kw_plot_anchors)
    out.unknown_analyses, = ppl.plot(
        [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns],
        [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns],
        **kw_plot_unknowns)
    out.anchor_avg = ppl.plot(
        np.array([ np.array([
            np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
            np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
            ]) for sample in anchors]).T,
        np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T,
        **kw_plot_anchor_avg)
    out.unknown_avg = ppl.plot(
        np.array([ np.array([
            np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
            np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
            ]) for sample in unknowns]).T,
        np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T,
        **kw_plot_unknown_avg)
    if xylimits == 'constant':
        x = [r[f'd{self._4x}'] for r in self]
        y = [r[f'D{self._4x}'] for r in self]
        x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
        w, h = x2-x1, y2-y1
        x1 -= w/20
        x2 += w/20
        y1 -= h/20
        y2 += h/20
        ppl.axis([x1, x2, y1, y2])
    elif xylimits == 'free':
        x1, x2, y1, y2 = ppl.axis()
    else:
        x1, x2, y1, y2 = ppl.axis(xylimits)

    if error_contour_interval != 'none':
        xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
        XI, YI = np.meshgrid(xi, yi)
        SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
        if error_contour_interval == 'auto':
            rng = np.max(SI) - np.min(SI)
            if rng <= 0.01:
                cinterval = 0.001
            elif rng <= 0.03:
                cinterval = 0.004
            elif rng <= 0.1:
                cinterval = 0.01
            elif rng <= 0.3:
                cinterval = 0.03
            elif rng <= 1.:
                cinterval = 0.1
            else:
                cinterval = 0.5
        else:
            cinterval = error_contour_interval

        cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
        out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
        out.clabel = ppl.clabel(out.contour)

    ppl.xlabel(x_label)
    ppl.ylabel(y_label)
    ppl.title(session, weight = 'bold')
    ppl.grid(alpha = .2)
    out.ax = ppl.gca()

    return out


def plot_residuals(
    self,
    kde = False,
    hist = False,
    binwidth = 2/3,
    dir = 'output',
    filename = None,
    highlight = [],
    colors = None,
    figsize = None,
    dpi = 100,
    yspan = None,
    ):
    '''
    Plot residuals of each analysis as a function of time (actually, as a function of
    the order of analyses in the `D4xdata` object)

    + `kde`: whether to add a kernel density estimate of residuals
    + `hist`: whether to add a histogram of residuals (incompatible with `kde`)
    + `binwidth`: the bin width of the histogram
    + `dir`: the directory in which to save the plot
    + `filename`: the name of the file to save the plot to
    + `highlight`: a list of samples to highlight
    + `colors`: a dict of `{<sample>: <color>}` for all samples
    + `figsize`: (width, height) of figure
    + `dpi`: resolution for PNG output
    + `yspan`: factor controlling the range of y values shown in plot
    (by default: `yspan = 1.5 if kde else 1.0`)
    '''

    from matplotlib import ticker

    if yspan is None:
        if kde:
            yspan = 1.5
        else:
            yspan = 1.0

    # Layout
    fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
    if hist or kde:
        ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
        ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
    else:
        ppl.subplots_adjust(.08,.05,.78,.8)
        ax1 = ppl.subplot(111)

    # Colors
    N = len(self.anchors)
    if colors is None:
        if len(highlight) > 0:
            Nh = len(highlight)
            if Nh == 1:
                colors = {highlight[0]: (0,0,0)}
            elif Nh == 3:
                colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
            elif Nh == 4:
                colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
            else:
                colors = {a:
hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)} 2694 else: 2695 if N == 3: 2696 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])} 2697 elif N == 4: 2698 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2699 else: 2700 colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)} 2701 2702 ppl.sca(ax1) 2703 2704 ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75) 2705 2706 ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$')) 2707 2708 session = self[0]['Session'] 2709 x1 = 0 2710# ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self]) 2711 x_sessions = {} 2712 one_or_more_singlets = False 2713 one_or_more_multiplets = False 2714 multiplets = set() 2715 for k,r in enumerate(self): 2716 if r['Session'] != session: 2717 x2 = k-1 2718 x_sessions[session] = (x1+x2)/2 2719 ppl.axvline(k - 0.5, color = 'k', lw = .5) 2720 session = r['Session'] 2721 x1 = k 2722 singlet = len(self.samples[r['Sample']]['data']) == 1 2723 if not singlet: 2724 multiplets.add(r['Sample']) 2725 if r['Sample'] in self.unknowns: 2726 if singlet: 2727 one_or_more_singlets = True 2728 else: 2729 one_or_more_multiplets = True 2730 kw = dict( 2731 marker = 'x' if singlet else '+', 2732 ms = 4 if singlet else 5, 2733 ls = 'None', 2734 mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0), 2735 mew = 1, 2736 alpha = 0.2 if singlet else 1, 2737 ) 2738 if highlight and r['Sample'] not in highlight: 2739 kw['alpha'] = 0.2 2740 ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw) 2741 x2 = k 2742 x_sessions[session] = (x1+x2)/2 2743 2744 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1) 2745 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1) 2746 if not (hist or 
kde): 2747 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center') 2748 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f" 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center') 2749 2750 xmin, xmax, ymin, ymax = ppl.axis() 2751 if yspan != 1: 2752 ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2 2753 for s in x_sessions: 2754 ppl.text( 2755 x_sessions[s], 2756 ymax +1, 2757 s, 2758 va = 'bottom', 2759 **( 2760 dict(ha = 'center') 2761 if len(self.sessions[s]['data']) > (0.15 * len(self)) 2762 else dict(ha = 'left', rotation = 45) 2763 ) 2764 ) 2765 2766 if hist or kde: 2767 ppl.sca(ax2) 2768 2769 for s in colors: 2770 kw['marker'] = '+' 2771 kw['ms'] = 5 2772 kw['mec'] = colors[s] 2773 kw['label'] = s 2774 kw['alpha'] = 1 2775 ppl.plot([], [], **kw) 2776 2777 kw['mec'] = (0,0,0) 2778 2779 if one_or_more_singlets: 2780 kw['marker'] = 'x' 2781 kw['ms'] = 4 2782 kw['alpha'] = .2 2783 kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other' 2784 ppl.plot([], [], **kw) 2785 2786 if one_or_more_multiplets: 2787 kw['marker'] = '+' 2788 kw['ms'] = 4 2789 kw['alpha'] = 1 2790 kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other' 2791 ppl.plot([], [], **kw) 2792 2793 if hist or kde: 2794 leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9) 2795 else: 2796 leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5) 2797 leg.set_zorder(-1000) 2798 2799 ppl.sca(ax1) 2800 2801 ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)') 2802 ppl.xticks([]) 2803 ppl.axis([-1, len(self), None, None]) 2804 2805 if hist or kde: 2806 ppl.sca(ax2) 2807 X = 1e3 * 
np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])

			if kde:
				from scipy.stats import gaussian_kde
				yi = np.linspace(ymin, ymax, 201)
				xi = gaussian_kde(X).evaluate(yi)
				ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
#				ppl.plot(xi, yi, 'k-', lw = 1)
			elif hist:
				ppl.hist(
					X,
					orientation = 'horizontal',
					histtype = 'stepfilled',
					ec = [.4]*3,
					fc = [.25]*3,
					alpha = .25,
					bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
					)
			ppl.text(0, 0,
				f"  SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n  95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
				size = 7.5,
				alpha = 1,
				va = 'center',
				ha = 'left',
				)

			ppl.axis([0, None, ymin, ymax])
			ppl.xticks([])
			ppl.yticks([])
#			ax2.spines['left'].set_visible(False)
			ax2.spines['right'].set_visible(False)
			ax2.spines['top'].set_visible(False)
			ax2.spines['bottom'].set_visible(False)

		ax1.axis([None, None, ymin, ymax])

		if not os.path.exists(dir):
			os.makedirs(dir)
		if filename is None:
			return fig
		elif filename == '':
			filename = f'D{self._4x}_residuals.pdf'
		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
		ppl.close(fig)


	def simulate(self, *args, **kwargs):
		'''
		Legacy function with warning message pointing to `virtual_data()`
		'''
		raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')

	def plot_distribution_of_analyses(
		self,
		dir = 'output',
		filename = None,
		vs_time = False,
		figsize = (6,4),
		subplots_adjust = (0.02, 0.13, 0.85, 0.8),
		output = None,
		dpi = 100,
		):
		'''
		Plot temporal distribution of all analyses in the data
set. 2871 2872 **Parameters** 2873 2874 + `dir`: the directory in which to save the plot 2875 + `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially. 2876 + `dpi`: resolution for PNG output 2877 + `figsize`: (width, height) of figure 2878 + `dpi`: resolution for PNG output 2879 ''' 2880 2881 asamples = [s for s in self.anchors] 2882 usamples = [s for s in self.unknowns] 2883 if output is None or output == 'fig': 2884 fig = ppl.figure(figsize = figsize) 2885 ppl.subplots_adjust(*subplots_adjust) 2886 Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)]) 2887 Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)]) 2888 Xmax += (Xmax-Xmin)/40 2889 Xmin -= (Xmax-Xmin)/41 2890 for k, s in enumerate(asamples + usamples): 2891 if vs_time: 2892 X = [r['TimeTag'] for r in self if r['Sample'] == s] 2893 else: 2894 X = [x for x,r in enumerate(self) if r['Sample'] == s] 2895 Y = [-k for x in X] 2896 ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75) 2897 ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25) 2898 ppl.text(Xmax, -k, f' {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r') 2899 ppl.axis([Xmin, Xmax, -k-1, 1]) 2900 ppl.xlabel('\ntime') 2901 ppl.gca().annotate('', 2902 xy = (0.6, -0.02), 2903 xycoords = 'axes fraction', 2904 xytext = (.4, -0.02), 2905 arrowprops = dict(arrowstyle = "->", color = 'k'), 2906 ) 2907 2908 2909 x2 = -1 2910 for session in self.sessions: 2911 x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session]) 2912 if vs_time: 2913 ppl.axvline(x1, color = 'k', lw = .75) 2914 if x2 > -1: 2915 if not vs_time: 2916 ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5) 2917 x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session]) 2918# from xlrd import xldate_as_datetime 2919# print(session, xldate_as_datetime(x1, 
0), xldate_as_datetime(x2, 0))
			if vs_time:
				ppl.axvline(x2, color = 'k', lw = .75)
				ppl.axvspan(x1, x2, color = 'k', zorder = -100, alpha = .15)
			ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)

		ppl.xticks([])
		ppl.yticks([])

		if output is None:
			if not os.path.exists(dir):
				os.makedirs(dir)
			if filename is None:
				filename = f'D{self._4x}_distribution_of_analyses.pdf'
			ppl.savefig(f'{dir}/{filename}', dpi = dpi)
			ppl.close(fig)
		elif output == 'ax':
			return ppl.gca()
		elif output == 'fig':
			return fig


	def plot_bulk_compositions(
		self,
		samples = None,
		dir = 'output/bulk_compositions',
		figsize = (6,6),
		subplots_adjust = (0.15, 0.12, 0.95, 0.92),
		show = False,
		sample_color = (0,.5,1),
		analysis_color = (.7,.7,.7),
		labeldist = 0.3,
		radius = 0.05,
		):
		'''
		Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

		By default, creates a directory `./output/bulk_compositions` where plots for
		each sample are saved. Another plot named `__all__.pdf` shows all analyses together.

		**Parameters**

		+ `samples`: Only these samples are processed (by default: all samples).
		+ `dir`: where to save the plots
		+ `figsize`: (width, height) of figure
		+ `subplots_adjust`: passed to `subplots_adjust()`
		+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
		allowing for interactive visualization/exploration in (δ13C, δ18O) space.
		+ `sample_color`: color used for sample markers/labels
		+ `analysis_color`: color used for replicate markers/labels
		+ `labeldist`: distance (in inches) from replicate markers to replicate labels
		+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
2972 ''' 2973 2974 from matplotlib.patches import Ellipse 2975 2976 if samples is None: 2977 samples = [_ for _ in self.samples] 2978 2979 saved = {} 2980 2981 for s in samples: 2982 2983 fig = ppl.figure(figsize = figsize) 2984 fig.subplots_adjust(*subplots_adjust) 2985 ax = ppl.subplot(111) 2986 ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)') 2987 ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)') 2988 ppl.title(s) 2989 2990 2991 XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']]) 2992 UID = [_['UID'] for _ in self.samples[s]['data']] 2993 XY0 = XY.mean(0) 2994 2995 for xy in XY: 2996 ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color) 2997 2998 ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color) 2999 ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color) 3000 ppl.text(*XY0, f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold') 3001 saved[s] = [XY, XY0] 3002 3003 x1, x2, y1, y2 = ppl.axis() 3004 x0, dx = (x1+x2)/2, (x2-x1)/2 3005 y0, dy = (y1+y2)/2, (y2-y1)/2 3006 dx, dy = [max(max(dx, dy), radius)]*2 3007 3008 ppl.axis([ 3009 x0 - 1.2*dx, 3010 x0 + 1.2*dx, 3011 y0 - 1.2*dy, 3012 y0 + 1.2*dy, 3013 ]) 3014 3015 XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0)) 3016 3017 for xy, uid in zip(XY, UID): 3018 3019 xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy)) 3020 vector_in_display_space = xy_in_display_space - XY0_in_display_space 3021 3022 if (vector_in_display_space**2).sum() > 0: 3023 3024 unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5 3025 label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist 3026 label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space 3027 label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space)) 3028 3029 
ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color) 3030 3031 else: 3032 3033 ppl.text(*xy, f'{uid} ', va = 'center', ha = 'right', color = analysis_color) 3034 3035 if radius: 3036 ax.add_artist(Ellipse( 3037 xy = XY0, 3038 width = radius*2, 3039 height = radius*2, 3040 ls = (0, (2,2)), 3041 lw = .7, 3042 ec = analysis_color, 3043 fc = 'None', 3044 )) 3045 ppl.text( 3046 XY0[0], 3047 XY0[1]-radius, 3048 f'\n± {radius*1e3:.0f} ppm', 3049 color = analysis_color, 3050 va = 'top', 3051 ha = 'center', 3052 linespacing = 0.4, 3053 size = 8, 3054 ) 3055 3056 if not os.path.exists(dir): 3057 os.makedirs(dir) 3058 fig.savefig(f'{dir}/{s}.pdf') 3059 ppl.close(fig) 3060 3061 fig = ppl.figure(figsize = figsize) 3062 fig.subplots_adjust(*subplots_adjust) 3063 ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)') 3064 ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)') 3065 3066 for s in saved: 3067 for xy in saved[s][0]: 3068 ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color) 3069 ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color) 3070 ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color) 3071 ppl.text(*saved[s][1], f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold') 3072 3073 x1, x2, y1, y2 = ppl.axis() 3074 ppl.axis([ 3075 x1 - (x2-x1)/10, 3076 x2 + (x2-x1)/10, 3077 y1 - (y2-y1)/10, 3078 y2 + (y2-y1)/10, 3079 ]) 3080 3081 3082 if not os.path.exists(dir): 3083 os.makedirs(dir) 3084 fig.savefig(f'{dir}/__all__.pdf') 3085 if show: 3086 ppl.show() 3087 ppl.close(fig) 3088 3089 3090 def _save_D4x_correl( 3091 self, 3092 samples = None, 3093 dir = 'output', 3094 filename = None, 3095 D4x_precision = 4, 3096 correl_precision = 4, 3097 ): 3098 ''' 3099 Save D4x values along with their SE and correlation matrix. 3100 3101 **Parameters** 3102 3103 + `samples`: Only these samples are output (by default: all samples). 
	+ `dir`: the directory in which to save the file (by default: `output`)
	+ `filename`: the name of the csv file to write to (by default: `D4x_correl.csv`)
	+ `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4)
	+ `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
	'''
	if samples is None:
		samples = sorted([s for s in self.unknowns])

	out = [['Sample']] + [[s] for s in samples]
	out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl']
	for k,s in enumerate(samples):
		out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.{D4x_precision}f}', f'{self.samples[s][f"SE_D{self._4x}"]:.{D4x_precision}f}']
		for s2 in samples:
			out[k+1] += [f'{self.sample_D4x_correl(s,s2):.{correl_precision}f}']

	if not os.path.exists(dir):
		os.makedirs(dir)
	if filename is None:
		filename = f'D{self._4x}_correl.csv'
	with open(f'{dir}/{filename}', 'w') as fid:
		fid.write(make_csv(out))


class D47data(D4xdata):
	'''
	Store and process data for a large set of Δ47 analyses,
	usually comprising more than one analytical session.
	'''

	Nominal_D4x = {
		'ETH-1': 0.2052,
		'ETH-2': 0.2085,
		'ETH-3': 0.6132,
		'ETH-4': 0.4511,
		'IAEA-C1': 0.3018,
		'IAEA-C2': 0.6409,
		'MERCK': 0.5135,
		} # I-CDES (Bernasconi et al., 2021)
	'''
	Nominal Δ47 values assigned to the Δ47 anchor samples, used by
	`D47data.standardize()` to normalize unknown samples to an absolute Δ47
	reference frame.

	By default equal to (after [Bernasconi et al.
(2021)](https://doi.org/10.1029/2020GC009588)):
	```py
	{
		'ETH-1'   : 0.2052,
		'ETH-2'   : 0.2085,
		'ETH-3'   : 0.6132,
		'ETH-4'   : 0.4511,
		'IAEA-C1' : 0.3018,
		'IAEA-C2' : 0.6409,
		'MERCK'   : 0.5135,
	}
	```
	'''


	@property
	def Nominal_D47(self):
		return self.Nominal_D4x


	@Nominal_D47.setter
	def Nominal_D47(self, new):
		self.Nominal_D4x = dict(**new)
		self.refresh()


	def __init__(self, l = [], **kwargs):
		'''
		**Parameters:** same as `D4xdata.__init__()`
		'''
		D4xdata.__init__(self, l = l, mass = '47', **kwargs)


	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
		'''
		Find all samples for which `Teq` is specified, compute the equilibrium Δ47
		value for that temperature, and treat these samples as additional anchors.

		**Parameters**

		+ `fCo2eqD47`: which CO2 equilibrium law to use
		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
		+ `priority`: if `replace`: forget old anchors and only use the new ones;
		if `new`: keep pre-existing anchors but update them in case of conflict
		between old and new Δ47 values;
		if `old`: keep pre-existing anchors but preserve their original Δ47
		values in case of conflict.
		'''
		f = {
			'petersen': fCO2eqD47_Petersen,
			'wang': fCO2eqD47_Wang,
			}[fCo2eqD47]
		foo = {}
		for r in self:
			if 'Teq' in r:
				if r['Sample'] in foo:
					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
				else:
					foo[r['Sample']] = f(r['Teq'])
			else:
				assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'
3211 3212 if priority == 'replace': 3213 self.Nominal_D47 = {} 3214 for s in foo: 3215 if priority != 'old' or s not in self.Nominal_D47: 3216 self.Nominal_D47[s] = foo[s] 3217 3218 def save_D47_correl(self, *args, **kwargs): 3219 return self._save_D4x_correl(*args, **kwargs) 3220 3221 save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47') 3222 3223 3224class D48data(D4xdata): 3225 ''' 3226 Store and process data for a large set of Δ48 analyses, 3227 usually comprising more than one analytical session. 3228 ''' 3229 3230 Nominal_D4x = { 3231 'ETH-1': 0.138, 3232 'ETH-2': 0.138, 3233 'ETH-3': 0.270, 3234 'ETH-4': 0.223, 3235 'GU-1': -0.419, 3236 } # (Fiebig et al., 2019, 2021) 3237 ''' 3238 Nominal Δ48 values assigned to the Δ48 anchor samples, used by 3239 `D48data.standardize()` to normalize unknown samples to an absolute Δ48 3240 reference frame. 3241 3242 By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019), 3243 [Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)): 3244 3245 ```py 3246 { 3247 'ETH-1' : 0.138, 3248 'ETH-2' : 0.138, 3249 'ETH-3' : 0.270, 3250 'ETH-4' : 0.223, 3251 'GU-1' : -0.419, 3252 } 3253 ``` 3254 ''' 3255 3256 3257 @property 3258 def Nominal_D48(self): 3259 return self.Nominal_D4x 3260 3261 3262 @Nominal_D48.setter 3263 def Nominal_D48(self, new): 3264 self.Nominal_D4x = dict(**new) 3265 self.refresh() 3266 3267 3268 def __init__(self, l = [], **kwargs): 3269 ''' 3270 **Parameters:** same as `D4xdata.__init__()` 3271 ''' 3272 D4xdata.__init__(self, l = l, mass = '48', **kwargs) 3273 3274 def save_D48_correl(self, *args, **kwargs): 3275 return self._save_D4x_correl(*args, **kwargs) 3276 3277 save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48') 3278 3279 3280class D49data(D4xdata): 3281 ''' 3282 Store and process data for a large set of Δ49 analyses, 3283 usually comprising more than one analytical session. 
3284 ''' 3285 3286 Nominal_D4x = {"1000C": 0.0, "25C": 2.228} # Wang 2004 3287 ''' 3288 Nominal Δ49 values assigned to the Δ49 anchor samples, used by 3289 `D49data.standardize()` to normalize unknown samples to an absolute Δ49 3290 reference frame. 3291 3292 By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)): 3293 3294 ```py 3295 { 3296 "1000C": 0.0, 3297 "25C": 2.228 3298 } 3299 ``` 3300 ''' 3301 3302 @property 3303 def Nominal_D49(self): 3304 return self.Nominal_D4x 3305 3306 @Nominal_D49.setter 3307 def Nominal_D49(self, new): 3308 self.Nominal_D4x = dict(**new) 3309 self.refresh() 3310 3311 def __init__(self, l=[], **kwargs): 3312 ''' 3313 **Parameters:** same as `D4xdata.__init__()` 3314 ''' 3315 D4xdata.__init__(self, l=l, mass='49', **kwargs) 3316 3317 def save_D49_correl(self, *args, **kwargs): 3318 return self._save_D4x_correl(*args, **kwargs) 3319 3320 save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49') 3321 3322class _SessionPlot(): 3323 ''' 3324 Simple placeholder class 3325 ''' 3326 def __init__(self): 3327 pass 3328 3329_app = typer.Typer( 3330 add_completion = False, 3331 context_settings={'help_option_names': ['-h', '--help']}, 3332 rich_markup_mode = 'rich', 3333 ) 3334 3335@_app.command() 3336def _cli( 3337 rawdata: Annotated[str, typer.Argument(help = "Specify the path of a rawdata input file")], 3338 exclude: Annotated[str, typer.Option('--exclude', '-e', help = 'The path of a file specifying UIDs and/or Samples to exclude')] = 'none', 3339 anchors: Annotated[str, typer.Option('--anchors', '-a', help = 'The path of a file specifying custom anchors')] = 'none', 3340 output_dir: Annotated[str, typer.Option('--output-dir', '-o', help = 'Specify the output directory')] = 'output', 3341 run_D48: Annotated[bool, typer.Option('--D48', help = 'Also standardize D48')] = False, 3342 ): 3343 """ 3344 Process raw D47 data and return standardized results. 
	See [b]https://mdaeron.github.io/D47crunch/#3-command-line-interface-cli[/b] for more details.

	Reads raw data from an input file, optionally excluding some samples and/or analyses, then standardizes
	the data based either on the default [b]d13C_VPDB[/b], [b]d18O_VPDB[/b], [b]D47[/b], and [b]D48[/b] anchors or on different
	user-specified anchors. A new directory (named `output` by default) is created to store the results and
	the following sequence is applied:

	* [b]D47data.wg()[/b]
	* [b]D47data.crunch()[/b]
	* [b]D47data.standardize()[/b]
	* [b]D47data.summary()[/b]
	* [b]D47data.table_of_samples()[/b]
	* [b]D47data.table_of_sessions()[/b]
	* [b]D47data.plot_sessions()[/b]
	* [b]D47data.plot_residuals()[/b]
	* [b]D47data.table_of_analyses()[/b]
	* [b]D47data.plot_distribution_of_analyses()[/b]
	* [b]D47data.plot_bulk_compositions()[/b]
	* [b]D47data.save_D47_correl()[/b]

	Optionally, also apply similar methods for [b]D48[/b].

	[b]Example CSV file for --anchors option:[/b]
	[i]
	Sample, d13C_VPDB, d18O_VPDB, D47, D48
	ETH-1, 2.02, -2.19, 0.2052, 0.138
	ETH-2, -10.17, -18.69, 0.2085, 0.138
	ETH-3, 1.71, -1.78, 0.6132, 0.270
	ETH-4, , , 0.4511, 0.223
	[/i]
	Except for [i]Sample[/i], none of the columns above are mandatory.

	[b]Example CSV file for --exclude option:[/b]
	[i]
	Sample, UID
	FOO-1,
	BAR-2,
	, A04
	, A17
	, A88
	[/i]
	This will exclude all analyses of samples [i]FOO-1[/i] and [i]BAR-2[/i],
	and the analyses with UIDs [i]A04[/i], [i]A17[/i], and [i]A88[/i].
	Neither column is mandatory.
3390 """ 3391 3392 data = D47data() 3393 data.read(rawdata) 3394 3395 if exclude != 'none': 3396 exclude = read_csv(exclude) 3397 exclude_uid = {r['UID'] for r in exclude if 'UID' in r} 3398 exclude_sample = {r['Sample'] for r in exclude if 'Sample' in r} 3399 else: 3400 exclude_uid = [] 3401 exclude_sample = [] 3402 3403 data = D47data([r for r in data if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample]) 3404 3405 if anchors != 'none': 3406 anchors = read_csv(anchors) 3407 if len([_ for _ in anchors if 'd13C_VPDB' in _]): 3408 data.Nominal_d13C_VPDB = { 3409 _['Sample']: _['d13C_VPDB'] 3410 for _ in anchors 3411 if 'd13C_VPDB' in _ 3412 } 3413 if len([_ for _ in anchors if 'd18O_VPDB' in _]): 3414 data.Nominal_d18O_VPDB = { 3415 _['Sample']: _['d18O_VPDB'] 3416 for _ in anchors 3417 if 'd18O_VPDB' in _ 3418 } 3419 if len([_ for _ in anchors if 'D47' in _]): 3420 data.Nominal_D4x = { 3421 _['Sample']: _['D47'] 3422 for _ in anchors 3423 if 'D47' in _ 3424 } 3425 3426 data.refresh() 3427 data.wg() 3428 data.crunch() 3429 data.standardize() 3430 data.summary(dir = output_dir) 3431 data.plot_residuals(dir = output_dir, filename = 'D47_residuals.pdf', kde = True) 3432 data.plot_bulk_compositions(dir = output_dir + '/bulk_compositions') 3433 data.plot_sessions(dir = output_dir) 3434 data.save_D47_correl(dir = output_dir) 3435 3436 if not run_D48: 3437 data.table_of_samples(dir = output_dir) 3438 data.table_of_analyses(dir = output_dir) 3439 data.table_of_sessions(dir = output_dir) 3440 3441 3442 if run_D48: 3443 data2 = D48data() 3444 print(rawdata) 3445 data2.read(rawdata) 3446 3447 data2 = D48data([r for r in data2 if r['UID'] not in exclude_uid and r['Sample'] not in exclude_sample]) 3448 3449 if anchors != 'none': 3450 if len([_ for _ in anchors if 'd13C_VPDB' in _]): 3451 data2.Nominal_d13C_VPDB = { 3452 _['Sample']: _['d13C_VPDB'] 3453 for _ in anchors 3454 if 'd13C_VPDB' in _ 3455 } 3456 if len([_ for _ in anchors if 'd18O_VPDB' in _]): 3457 
data2.Nominal_d18O_VPDB = { 3458 _['Sample']: _['d18O_VPDB'] 3459 for _ in anchors 3460 if 'd18O_VPDB' in _ 3461 } 3462 if len([_ for _ in anchors if 'D48' in _]): 3463 data2.Nominal_D4x = { 3464 _['Sample']: _['D48'] 3465 for _ in anchors 3466 if 'D48' in _ 3467 } 3468 3469 data2.refresh() 3470 data2.wg() 3471 data2.crunch() 3472 data2.standardize() 3473 data2.summary(dir = output_dir) 3474 data2.plot_sessions(dir = output_dir) 3475 data2.plot_residuals(dir = output_dir, filename = 'D48_residuals.pdf', kde = True) 3476 data2.plot_distribution_of_analyses(dir = output_dir) 3477 data2.save_D48_correl(dir = output_dir) 3478 3479 table_of_analyses(data, data2, dir = output_dir) 3480 table_of_samples(data, data2, dir = output_dir) 3481 table_of_sessions(data, data2, dir = output_dir) 3482 3483def __cli(): 3484 _app()
def fCO2eqD47_Petersen(T):
	'''
	CO2 equilibrium Δ47 value as a function of T (in degrees C)
	according to [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127).
	'''
	return float(_fCO2eqD47_Petersen(T))
def fCO2eqD47_Wang(T):
	'''
	CO2 equilibrium Δ47 value as a function of `T` (in degrees C)
	according to [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)
	(supplementary data of [Dennis et al., 2011](https://doi.org/10.1016/j.gca.2011.09.025)).
	'''
	return float(_fCO2eqD47_Wang(T))
def make_csv(x, hsep = ',', vsep = '\n'):
	'''
	Formats a list of lists of strings as a CSV

	**Parameters**

	+ `x`: the list of lists of strings to format
	+ `hsep`: the field separator (`,` by default)
	+ `vsep`: the line-ending convention to use (`\\n` by default)

	**Example**

	```py
	print(make_csv([['a', 'b', 'c'], ['d', 'e', 'f']]))
	```

	outputs:

	```py
	a,b,c
	d,e,f
	```
	'''
	return vsep.join([hsep.join(l) for l in x])
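Since `make_csv` is a one-liner, its behavior is easy to check in isolation. The sketch below copies the implementation shown above to illustrate the default and custom separators:

```python
# Standalone copy of make_csv (as listed above), for illustration.
def make_csv(x, hsep = ',', vsep = '\n'):
	# Join fields with hsep, then join rows with vsep.
	return vsep.join([hsep.join(l) for l in x])

table = [['Sample', 'd45'], ['ETH-1', '5.79502']]
print(make_csv(table))              # default: comma-separated fields, newline rows
print(make_csv(table, hsep = '; ')) # custom field separator
```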
def pf(txt):
	'''
	Modify string `txt` to follow `lmfit.Parameter()` naming rules.
	'''
	return txt.replace('-','_').replace('.','_').replace(' ','_')
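Sample names such as `ETH-1` contain characters that are not valid in `lmfit` parameter names, which is why `pf` substitutes underscores. A quick standalone check (copying the one-liner above):

```python
# Standalone copy of pf (as listed above): sanitize names for lmfit.Parameter().
def pf(txt):
	# Hyphens, periods and spaces are replaced by underscores.
	return txt.replace('-','_').replace('.','_').replace(' ','_')

print(pf('ETH-1'))      # ETH_1
print(pf('IAEA C2.b'))  # IAEA_C2_b
```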
def smart_type(x):
	'''
	Tries to convert string `x` to a float if it includes a decimal point, or
	to an integer if it does not. If both attempts fail, return the original
	string unchanged.
	'''
	try:
		y = float(x)
	except ValueError:
		return x
	if '.' not in x:
		return int(y)
	return y
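The distinction between int, float, and string matters when CSV fields are later used in arithmetic. The standalone copy below shows the three possible outcomes:

```python
# Standalone copy of smart_type (as listed above).
def smart_type(x):
	# Try float conversion; fall back to the original string on failure.
	try:
		y = float(x)
	except ValueError:
		return x
	# No decimal point in the original string: return an int instead.
	if '.' not in x:
		return int(y)
	return y

print(smart_type('12'))     # 12 (int)
print(smart_type('12.0'))   # 12.0 (float)
print(smart_type('ETH-1'))  # 'ETH-1' (unchanged string)
```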
def pretty_table(x, header = 1, hsep = ' ', vsep = '–', align = '<'):
	'''
	Reads a list of lists of strings and outputs an ascii table

	**Parameters**

	+ `x`: a list of lists of strings
	+ `header`: the number of lines to treat as header lines
	+ `hsep`: the horizontal separator between columns
	+ `vsep`: the character to use as vertical separator
	+ `align`: string of left (`<`) or right (`>`) alignment characters.

	**Example**

	```py
	x = [['A', 'B', 'C'], ['1', '1.9999', 'foo'], ['10', 'x', 'bar']]
	print(pretty_table(x))
	```
	yields:
	```
	–– –––––– –––
	A       B   C
	–– –––––– –––
	1  1.9999 foo
	10      x bar
	–– –––––– –––
	```
	'''
	txt = []
	widths = [np.max([len(e) for e in c]) for c in zip(*x)]

	if len(widths) > len(align):
		align += '>' * (len(widths)-len(align))
	sepline = hsep.join([vsep*w for w in widths])
	txt += [sepline]
	for k,l in enumerate(x):
		if k and k == header:
			txt += [sepline]
		txt += [hsep.join([f'{e:{a}{w}}' for e, w, a in zip(l, widths, align)])]
	txt += [sepline]
	txt += ['']
	return '\n'.join(txt)
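Note that when `align` is shorter than the number of columns, the missing alignment characters default to `>` (right-aligned). The sketch below is a minimal standalone version of the same logic (using the builtin `max` in place of `np.max`, and an ASCII `-` separator for portability), forcing all columns left-aligned:

```python
# Minimal standalone sketch of the pretty_table logic shown above.
def pretty_table(x, header = 1, hsep = ' ', vsep = '-', align = '<'):
	# Column widths are the longest cell in each column.
	widths = [max(len(e) for e in c) for c in zip(*x)]
	# Missing alignment characters default to right-aligned ('>').
	if len(widths) > len(align):
		align += '>' * (len(widths) - len(align))
	sepline = hsep.join(vsep * w for w in widths)
	txt = [sepline]
	for k, row in enumerate(x):
		if k and k == header:
			txt.append(sepline)  # separator after the header block
		txt.append(hsep.join(f'{e:{a}{w}}' for e, w, a in zip(row, widths, align)))
	txt += [sepline, '']
	return '\n'.join(txt)

x = [['A', 'B', 'C'], ['1', '1.9999', 'foo'], ['10', 'x', 'bar']]
print(pretty_table(x, align = '<<<'))  # force all three columns left-aligned
```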
def transpose_table(x):
	'''
	Transpose a list of lists

	**Parameters**

	+ `x`: a list of lists

	**Example**

	```py
	x = [[1, 2], [3, 4]]
	print(transpose_table(x)) # yields: [[1, 3], [2, 4]]
	```
	'''
	return [[e for e in c] for c in zip(*x)]
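One caveat worth knowing: because the transposition relies on `zip(*x)`, ragged input (rows of unequal length) is silently truncated to the shortest row. A standalone copy of the one-liner demonstrates both cases:

```python
# Standalone copy of transpose_table (as listed above).
def transpose_table(x):
	# zip(*x) groups the k-th element of every row;
	# ragged rows are silently truncated to the shortest row.
	return [[e for e in c] for c in zip(*x)]

print(transpose_table([[1, 2], [3, 4]]))     # [[1, 3], [2, 4]]
print(transpose_table([[1, 2, 3], [4, 5]]))  # [[1, 4], [2, 5]] — the 3 is dropped
```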
def w_avg(X, sX):
	'''
	Compute variance-weighted average

	Returns the value and SE of the weighted average of the elements of `X`,
	with relative weights equal to their inverse variances (`1/sX**2`).

	**Parameters**

	+ `X`: array-like of elements to average
	+ `sX`: array-like of the corresponding SE values

	**Tip**

	If `X` and `sX` are initially arranged as a list of `(x, sx)` doublets,
	they may be rearranged using `zip()`:

	```python
	foo = [(0, 1), (1, 0.5), (2, 0.5)]
	print(w_avg(*zip(*foo))) # yields: (1.3333333333333333, 0.3333333333333333)
	```
	'''
	X = [x for x in X]
	sX = [sx for sx in sX]
	W = [sx**-2 for sx in sX]
	W = [w/sum(W) for w in W]
	Xavg = sum([w*x for w,x in zip(W,X)])
	sXavg = sum([w**2*sx**2 for w,sx in zip(W,sX)])**.5
	return Xavg, sXavg
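The docstring's own example can be verified by hand: with SE values (1, 0.5, 0.5) the normalized weights are (1/9, 4/9, 4/9), giving an average of 4/3 and a propagated SE of 1/3. A standalone copy of the function reproduces this:

```python
# Standalone copy of w_avg (as listed above).
def w_avg(X, sX):
	X = [x for x in X]
	sX = [sx for sx in sX]
	# Relative weights proportional to inverse variances, normalized to sum to 1.
	W = [sx**-2 for sx in sX]
	W = [w/sum(W) for w in W]
	Xavg = sum([w*x for w, x in zip(W, X)])
	# SE of the weighted mean: sqrt of the weighted sum of variances.
	sXavg = sum([w**2*sx**2 for w, sx in zip(W, sX)])**.5
	return Xavg, sXavg

foo = [(0, 1), (1, 0.5), (2, 0.5)]
print(w_avg(*zip(*foo)))  # (1.3333333333333333, 0.3333333333333333)
```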
def read_csv(filename, sep = ''):
	'''
	Read contents of `filename` in csv format and return a list of dictionaries.

	In the csv string, spaces before and after field separators (`','` by default)
	are optional.

	**Parameters**

	+ `filename`: the csv file to read
	+ `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\\t`,
	whichever appears most often in the contents of `filename`.
	'''
	with open(filename) as fid:
		txt = fid.read()

	if sep == '':
		sep = sorted(',;\t', key = lambda x: - txt.count(x))[0]
	txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
	return [{k: smart_type(v) for k,v in zip(txt[0], l) if v} for l in txt[1:]]
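The separator auto-detection is a one-line heuristic: pick whichever of `,`, `;`, or tab occurs most often in the file contents. The sketch below extracts that heuristic (the `detect_sep` name is ours, not part of the D47crunch API) so it can be tried on strings directly:

```python
# detect_sep is a hypothetical helper copying the separator-detection
# one-liner from read_csv() above; it is not part of the D47crunch API.
def detect_sep(txt):
	# Sort candidate separators by descending occurrence count.
	return sorted(',;\t', key = lambda x: - txt.count(x))[0]

print(repr(detect_sep('UID, Sample\nA01, ETH-1')))   # ','
print(repr(detect_sep('UID; Sample\nA01; ETH-1')))   # ';'
print(repr(detect_sep('UID\tSample\nA01\tETH-1')))   # '\t'
```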
```python
def simulate_single_analysis(
    sample = 'MYSAMPLE',
    d13Cwg_VPDB = -4., d18Owg_VSMOW = 26.,
    d13C_VPDB = None, d18O_VPDB = None,
    D47 = None, D48 = None, D49 = 0., D17O = 0.,
    a47 = 1., b47 = 0., c47 = -0.9,
    a48 = 1., b48 = 0., c48 = -0.45,
    Nominal_D47 = None,
    Nominal_D48 = None,
    Nominal_d13C_VPDB = None,
    Nominal_d18O_VPDB = None,
    ALPHA_18O_ACID_REACTION = None,
    R13_VPDB = None,
    R17_VSMOW = None,
    R18_VSMOW = None,
    LAMBDA_17 = None,
    R18_VPDB = None,
    ):
    '''
    Compute working-gas delta values for a single analysis, assuming a stochastic working
    gas and a “perfect” measurement (i.e. raw Δ values are identical to absolute values).

    **Parameters**

    + `sample`: sample name
    + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
      (respectively –4 and +26 ‰ by default)
    + `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
    + `D47`, `D48`, `D49`, `D17O`: clumped-isotope and oxygen-17 anomalies
      of the carbonate sample
    + `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and
      Δ48 values if `D47` or `D48` are not specified
    + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
      δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
    + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
    + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
      correction parameters (by default equal to the `D4xdata` default values)

    Returns a dictionary with fields
    `['Sample', 'D17O', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49']`.
    '''

    if Nominal_d13C_VPDB is None:
        Nominal_d13C_VPDB = D4xdata().Nominal_d13C_VPDB

    if Nominal_d18O_VPDB is None:
        Nominal_d18O_VPDB = D4xdata().Nominal_d18O_VPDB

    if ALPHA_18O_ACID_REACTION is None:
        ALPHA_18O_ACID_REACTION = D4xdata().ALPHA_18O_ACID_REACTION

    if R13_VPDB is None:
        R13_VPDB = D4xdata().R13_VPDB

    if R17_VSMOW is None:
        R17_VSMOW = D4xdata().R17_VSMOW

    if R18_VSMOW is None:
        R18_VSMOW = D4xdata().R18_VSMOW

    if LAMBDA_17 is None:
        LAMBDA_17 = D4xdata().LAMBDA_17

    if R18_VPDB is None:
        R18_VPDB = D4xdata().R18_VPDB

    R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW) ** LAMBDA_17

    if Nominal_D47 is None:
        Nominal_D47 = D47data().Nominal_D47

    if Nominal_D48 is None:
        Nominal_D48 = D48data().Nominal_D48

    if d13C_VPDB is None:
        if sample in Nominal_d13C_VPDB:
            d13C_VPDB = Nominal_d13C_VPDB[sample]
        else:
            raise KeyError(f"Sample {sample} is missing d13C_VPDB value, and it is not defined in Nominal_d13C_VPDB.")

    if d18O_VPDB is None:
        if sample in Nominal_d18O_VPDB:
            d18O_VPDB = Nominal_d18O_VPDB[sample]
        else:
            raise KeyError(f"Sample {sample} is missing d18O_VPDB value, and it is not defined in Nominal_d18O_VPDB.")

    if D47 is None:
        if sample in Nominal_D47:
            D47 = Nominal_D47[sample]
        else:
            raise KeyError(f"Sample {sample} is missing D47 value, and it is not defined in Nominal_D47.")

    if D48 is None:
        if sample in Nominal_D48:
            D48 = Nominal_D48[sample]
        else:
            raise KeyError(f"Sample {sample} is missing D48 value, and it is not defined in Nominal_D48.")

    X = D4xdata()
    X.R13_VPDB = R13_VPDB
    X.R17_VSMOW = R17_VSMOW
    X.R18_VSMOW = R18_VSMOW
    X.LAMBDA_17 = LAMBDA_17
    X.R18_VPDB = R18_VPDB
    X.R17_VPDB = R17_VSMOW * (R18_VPDB / R18_VSMOW)**LAMBDA_17

    R45wg, R46wg, R47wg, R48wg, R49wg = X.compute_isobar_ratios(
        R13 = R13_VPDB * (1 + d13Cwg_VPDB/1000),
        R18 = R18_VSMOW * (1 + d18Owg_VSMOW/1000),
        )
    R45, R46, R47, R48, R49 = X.compute_isobar_ratios(
        R13 = R13_VPDB * (1 + d13C_VPDB/1000),
        R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
        D17O = D17O, D47 = D47, D48 = D48, D49 = D49,
        )
    R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = X.compute_isobar_ratios(
        R13 = R13_VPDB * (1 + d13C_VPDB/1000),
        R18 = R18_VPDB * (1 + d18O_VPDB/1000) * ALPHA_18O_ACID_REACTION,
        D17O = D17O,
        )

    d45 = 1000 * (R45/R45wg - 1)
    d46 = 1000 * (R46/R46wg - 1)
    d47 = 1000 * (R47/R47wg - 1)
    d48 = 1000 * (R48/R48wg - 1)
    d49 = 1000 * (R49/R49wg - 1)

    for k in range(3): # dumb iteration to adjust for small changes in d47
        R47raw = (1 + (a47 * D47 + b47 * d47 + c47)/1000) * R47stoch
        R48raw = (1 + (a48 * D48 + b48 * d48 + c48)/1000) * R48stoch
        d47 = 1000 * (R47raw/R47wg - 1)
        d48 = 1000 * (R48raw/R48wg - 1)

    return dict(
        Sample = sample,
        D17O = D17O,
        d13Cwg_VPDB = d13Cwg_VPDB,
        d18Owg_VSMOW = d18Owg_VSMOW,
        d45 = d45,
        d46 = d46,
        d47 = d47,
        d48 = d48,
        d49 = d49,
        )
```
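The δ values returned above follow the usual definition relative to the working gas, δ = 1000 × (R/Rwg − 1), applied to each isobar ratio in turn. A minimal standalone sketch of that transformation and its inverse (the ratio values here are made up, for illustration only):

```python
def delta(R, Rwg):
    # delta value (in permil) of a sample ratio R relative to the working-gas ratio Rwg
    return 1000 * (R / Rwg - 1)

# hypothetical isobar ratios, for illustration only
R45, R45wg = 1.222e-2, 1.215e-2
d45 = delta(R45, R45wg)

# the inverse transformation recovers the original ratio
R45_back = R45wg * (1 + d45 / 1000)
```

The same relation is used in both directions by the simulation code: ratios are converted to raw δ values, and raw Δ values are folded back into ratios before re-deriving δ47 and δ48.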
````python
def virtual_data(
    samples = [],
    a47 = 1., b47 = 0., c47 = -0.9,
    a48 = 1., b48 = 0., c48 = -0.45,
    rd45 = 0.020, rd46 = 0.060,
    rD47 = 0.015, rD48 = 0.045,
    d13Cwg_VPDB = None, d18Owg_VSMOW = None,
    session = None,
    Nominal_D47 = None, Nominal_D48 = None,
    Nominal_d13C_VPDB = None, Nominal_d18O_VPDB = None,
    ALPHA_18O_ACID_REACTION = None,
    R13_VPDB = None,
    R17_VSMOW = None,
    R18_VSMOW = None,
    LAMBDA_17 = None,
    R18_VPDB = None,
    seed = 0,
    shuffle = True,
    ):
    '''
    Return list with simulated analyses from a single session.

    **Parameters**

    + `samples`: a list of entries; each entry is a dictionary with the following fields:
        * `Sample`: the name of the sample
        * `d13C_VPDB`, `d18O_VPDB`: bulk composition of the carbonate sample
        * `D47`, `D48`, `D49`, `D17O` (all optional): clumped-isotope and oxygen-17 anomalies of the carbonate sample
        * `N`: how many analyses to generate for this sample
    + `a47`: scrambling factor for Δ47
    + `b47`: compositional nonlinearity for Δ47
    + `c47`: working gas offset for Δ47
    + `a48`: scrambling factor for Δ48
    + `b48`: compositional nonlinearity for Δ48
    + `c48`: working gas offset for Δ48
    + `rd45`: analytical repeatability of δ45
    + `rd46`: analytical repeatability of δ46
    + `rD47`: analytical repeatability of Δ47
    + `rD48`: analytical repeatability of Δ48
    + `d13Cwg_VPDB`, `d18Owg_VSMOW`: bulk composition of the working gas
      (by default equal to the `simulate_single_analysis` default values)
    + `session`: name of the session (no name by default)
    + `Nominal_D47`, `Nominal_D48`: where to look up Δ47 and Δ48 values
      if `D47` or `D48` are not specified (by default equal to the `simulate_single_analysis` defaults)
    + `Nominal_d13C_VPDB`, `Nominal_d18O_VPDB`: where to look up δ13C and
      δ18O values if `d13C_VPDB` or `d18O_VPDB` are not specified
      (by default equal to the `simulate_single_analysis` defaults)
    + `ALPHA_18O_ACID_REACTION`: 18O/16O acid fractionation factor
      (by default equal to the `simulate_single_analysis` default)
    + `R13_VPDB`, `R17_VSMOW`, `R18_VSMOW`, `LAMBDA_17`, `R18_VPDB`: oxygen-17
      correction parameters (by default equal to the `simulate_single_analysis` defaults)
    + `seed`: explicitly set to a non-zero value to achieve random but repeatable simulations
    + `shuffle`: randomly reorder the sequence of analyses

    Here is an example of using this method to generate an arbitrary combination of
    anchors and unknowns for a bunch of sessions:

    ```py
    .. include:: ../code_examples/virtual_data/example.py
    ```

    This should output something like:

    ```
    .. include:: ../code_examples/virtual_data/output.txt
    ```
    '''

    kwargs = locals().copy()

    from numpy import random as nprandom
    if seed:
        rng = nprandom.default_rng(seed)
    else:
        rng = nprandom.default_rng()

    N = sum([s['N'] for s in samples])
    errors45 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
    errors45 *= rd45 / stdev(errors45) # scale errors to rd45
    errors46 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
    errors46 *= rd46 / stdev(errors46) # scale errors to rd46
    errors47 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
    errors47 *= rD47 / stdev(errors47) # scale errors to rD47
    errors48 = rng.normal(loc = 0, scale = 1, size = N) # generate random measurement errors
    errors48 *= rD48 / stdev(errors48) # scale errors to rD48

    k = 0
    out = []
    for s in samples:
        kw = {}
        kw['sample'] = s['Sample']
        kw = {
            **kw,
            **{var: kwargs[var]
                for var in [
                    'd13Cwg_VPDB', 'd18Owg_VSMOW', 'ALPHA_18O_ACID_REACTION',
                    'Nominal_D47', 'Nominal_D48', 'Nominal_d13C_VPDB', 'Nominal_d18O_VPDB',
                    'R13_VPDB', 'R17_VSMOW', 'R18_VSMOW', 'LAMBDA_17', 'R18_VPDB',
                    'a47', 'b47', 'c47', 'a48', 'b48', 'c48',
                    ]
                if kwargs[var] is not None},
            **{var: s[var]
                for var in ['d13C_VPDB', 'd18O_VPDB', 'D47', 'D48', 'D49', 'D17O']
                if var in s},
            }

        sN = s['N']
        while sN:
            out.append(simulate_single_analysis(**kw))
            out[-1]['d45'] += errors45[k]
            out[-1]['d46'] += errors46[k]
            out[-1]['d47'] += (errors45[k] + errors46[k] + errors47[k]) * a47
            out[-1]['d48'] += (2*errors46[k] + errors48[k]) * a48
            sN -= 1
            k += 1

    if session is not None:
        for r in out:
            r['Session'] = session

    if shuffle:
        nprandom.shuffle(out)

    return out
````
Here is an example of using this method to generate an arbitrary combination of anchors and unknowns for a bunch of sessions:
```py
from D47crunch import virtual_data, D47data

args = dict(
    samples = [
        dict(Sample = 'ETH-1', N = 3),
        dict(Sample = 'ETH-2', N = 3),
        dict(Sample = 'ETH-3', N = 3),
        dict(Sample = 'FOO', N = 3,
            d13C_VPDB = -5., d18O_VPDB = -10.,
            D47 = 0.3, D48 = 0.15),
        dict(Sample = 'BAR', N = 3,
            d13C_VPDB = -15., d18O_VPDB = -2.,
            D47 = 0.6, D48 = 0.2),
        ], rD47 = 0.010, rD48 = 0.030)

session1 = virtual_data(session = 'Session_01', **args, seed = 123)
session2 = virtual_data(session = 'Session_02', **args, seed = 1234)
session3 = virtual_data(session = 'Session_03', **args, seed = 12345)
session4 = virtual_data(session = 'Session_04', **args, seed = 123456)

D = D47data(session1 + session2 + session3 + session4)

D.crunch()
D.standardize()

D.table_of_sessions(verbose = True, save_to_file = False)
D.table_of_samples(verbose = True, save_to_file = False)
D.table_of_analyses(verbose = True, save_to_file = False)
```
This should output something like:
```
[table_of_sessions]
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––– ––––––––––––––
Session Na Nu d13Cwg_VPDB d18Owg_VSMOW r_d13C r_d18O r_D47 a ± SE 1e3 x b ± SE c ± SE
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––– ––––––––––––––
Session_01 9 6 -4.000 26.000 0.0205 0.0633 0.0091 1.015 ± 0.015 0.427 ± 0.232 -0.909 ± 0.006
Session_02 9 6 -4.000 26.000 0.0210 0.0882 0.0100 0.990 ± 0.015 0.484 ± 0.232 -0.905 ± 0.006
Session_03 9 6 -4.000 26.000 0.0186 0.0505 0.0111 0.997 ± 0.015 0.167 ± 0.233 -0.901 ± 0.006
Session_04 9 6 -4.000 26.000 0.0192 0.0467 0.0086 1.017 ± 0.015 0.229 ± 0.232 -0.910 ± 0.006
–––––––––– –– –– ––––––––––– –––––––––––– –––––– –––––– –––––– ––––––––––––– ––––––––––––– ––––––––––––––

[table_of_samples]
–––––– –– ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
Sample N d13C_VPDB d18O_VSMOW D47 SE 95% CL SD p_Levene
–––––– –– ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––
ETH-1 12 2.02 37.01 0.2052 0.0083
ETH-2 12 -10.17 19.88 0.2085 0.0090
ETH-3 12 1.71 37.46 0.6132 0.0083
BAR 12 -15.02 37.22 0.6057 0.0042 ± 0.0085 0.0088 0.753
FOO 12 -5.00 28.89 0.3024 0.0031 ± 0.0062 0.0070 0.497
–––––– –– ––––––––– –––––––––– –––––– –––––– –––––––– –––––– ––––––––

[table_of_analyses]
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– ––––––––
UID Session Sample d13Cwg_VPDB d18Owg_VSMOW d45 d46 d47 d48 d49 d13C_VPDB d18O_VSMOW D47raw D48raw D49raw D47
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– ––––––––
1 Session_01 ETH-1 -4.000 26.000 6.049381 10.706856 16.135579 21.196941 27.780042 2.057827 36.937067 -0.685751 -0.324384 0.045870 0.212791
2 Session_01 ETH-3 -4.000 26.000 5.755174 11.255104 16.792797 22.451660 28.306614 1.723596 37.497816 -0.270825 -0.181089 -0.195908 0.621458
3 Session_01 ETH-2 -4.000 26.000 -5.982229 -6.110437 -12.827036 -12.492272 -18.023381 -10.166188 19.784916 -0.693555 -0.312598 0.251040 0.217274
4 Session_01 ETH-1 -4.000 26.000 5.995601 10.755323 16.116087 21.285428 27.780042 1.998631 36.986704 -0.696924 -0.333640 0.008600 0.201787
5 Session_01 BAR -4.000 26.000 -9.920507 10.903408 0.065076 21.704075 10.707292 -14.998270 37.174839 -0.307018 -0.216978 -0.026076 0.592818
6 Session_01 FOO -4.000 26.000 -0.876454 2.906764 1.341194 5.490264 4.665655 -5.048760 28.984806 -0.608593 -0.329808 -0.114437 0.295055
7 Session_01 FOO -4.000 26.000 -0.838118 2.819853 1.310384 5.326005 4.665655 -5.004629 28.895933 -0.593755 -0.319861 0.014956 0.309692
8 Session_01 ETH-2 -4.000 26.000 -5.974124 -5.955517 -12.668784 -12.208184 -18.023381 -10.163274 19.943159 -0.694902 -0.336672 -0.063946 0.215880
9 Session_01 ETH-3 -4.000 26.000 5.727341 11.211663 16.713472 22.364770 28.306614 1.695479 37.453503 -0.278056 -0.180158 -0.082015 0.614365
10 Session_01 FOO -4.000 26.000 -0.848028 2.874679 1.346196 5.439150 4.665655 -5.017230 28.951964 -0.601502 -0.316664 -0.081898 0.302042
11 Session_01 BAR -4.000 26.000 -9.959983 10.926995 0.053806 21.724901 10.707292 -15.041279 37.199026 -0.300066 -0.243252 -0.029371 0.599675
12 Session_01 BAR -4.000 26.000 -9.915975 10.968470 0.153453 21.749385 10.707292 -14.995822 37.241294 -0.286638 -0.301325 -0.157376 0.612868
13 Session_01 ETH-3 -4.000 26.000 5.734896 11.229855 16.740410 22.402091 28.306614 1.702875 37.472070 -0.276998 -0.179635 -0.125368 0.615396
14 Session_01 ETH-2 -4.000 26.000 -5.991278 -5.995054 -12.741562 -12.184075 -18.023381 -10.180122 19.902809 -0.711697 -0.232746 0.032602 0.199357
15 Session_01 ETH-1 -4.000 26.000 6.010276 10.840276 16.207960 21.475150 27.780042 2.011176 37.073454 -0.704188 -0.315986 -0.172089 0.194589
16 Session_02 ETH-3 -4.000 26.000 5.757137 11.232751 16.744567 22.398244 28.306614 1.731295 37.514660 -0.298533 -0.189123 -0.154557 0.604363
17 Session_02 ETH-1 -4.000 26.000 5.993918 10.617469 15.991900 21.070358 27.780042 2.006934 36.882679 -0.683329 -0.271476 0.278458 0.216152
18 Session_02 ETH-3 -4.000 26.000 5.719281 11.207303 16.681693 22.370886 28.306614 1.691780 37.488633 -0.296801 -0.165556 -0.065004 0.606143
19 Session_02 ETH-3 -4.000 26.000 5.716356 11.091821 16.582487 22.123857 28.306614 1.692901 37.370126 -0.279100 -0.178789 0.162540 0.624067
20 Session_02 ETH-1 -4.000 26.000 6.030532 10.851030 16.245571 21.457100 27.780042 2.037466 37.122284 -0.698413 -0.354920 -0.214443 0.200795
21 Session_02 BAR -4.000 26.000 -9.963888 10.865863 -0.023549 21.615868 10.707292 -15.053743 37.174715 -0.313906 -0.229031 0.093637 0.597041
22 Session_02 FOO -4.000 26.000 -0.819742 2.826793 1.317044 5.330616 4.665655 -4.986618 28.903335 -0.612871 -0.329113 -0.018244 0.294481
23 Session_02 ETH-1 -4.000 26.000 6.019963 10.773112 16.163825 21.331060 27.780042 2.029040 37.042346 -0.692234 -0.324161 -0.051788 0.207075
24 Session_02 ETH-2 -4.000 26.000 -5.982371 -6.036210 -12.762399 -12.309944 -18.023381 -10.175178 19.819614 -0.701348 -0.277354 0.104418 0.212021
25 Session_02 FOO -4.000 26.000 -0.835046 2.870518 1.355370 5.487896 4.665655 -5.004585 28.948243 -0.601666 -0.259900 -0.087592 0.305777
26 Session_02 ETH-2 -4.000 26.000 -5.950370 -5.959974 -12.650784 -12.197864 -18.023381 -10.143809 19.897777 -0.696916 -0.317263 -0.080604 0.216441
27 Session_02 BAR -4.000 26.000 -9.936020 10.862339 0.024660 21.563307 10.707292 -15.023836 37.171034 -0.291333 -0.273498 0.070452 0.619812
28 Session_02 FOO -4.000 26.000 -0.848415 2.849823 1.308081 5.427767 4.665655 -5.018107 28.927036 -0.614791 -0.278426 -0.032784 0.292547
29 Session_02 BAR -4.000 26.000 -9.957566 10.903888 0.031785 21.739434 10.707292 -15.048386 37.213724 -0.302139 -0.183327 0.012926 0.608897
30 Session_02 ETH-2 -4.000 26.000 -5.993476 -5.944866 -12.696865 -12.149754 -18.023381 -10.190430 19.913381 -0.713779 -0.298963 -0.064251 0.199436
31 Session_03 FOO -4.000 26.000 -0.800284 2.851299 1.376828 5.379547 4.665655 -4.951581 28.910199 -0.597293 -0.329315 -0.087015 0.304784
32 Session_03 ETH-3 -4.000 26.000 5.753467 11.206589 16.719131 22.373244 28.306614 1.723960 37.511190 -0.294350 -0.161838 -0.099835 0.606103
33 Session_03 ETH-2 -4.000 26.000 -5.997147 -5.905858 -12.655382 -12.081612 -18.023381 -10.165400 19.891551 -0.706536 -0.308464 -0.137414 0.197550
34 Session_03 FOO -4.000 26.000 -0.873798 2.820799 1.272165 5.370745 4.665655 -5.028782 28.878917 -0.596008 -0.277258 0.051165 0.306090
35 Session_03 BAR -4.000 26.000 -9.928709 10.989665 0.148059 21.852677 10.707292 -14.976237 37.324152 -0.299358 -0.242185 -0.184835 0.603855
36 Session_03 ETH-2 -4.000 26.000 -6.000290 -5.947172 -12.697463 -12.164602 -18.023381 -10.167221 19.848953 -0.705037 -0.309350 -0.052386 0.199061
37 Session_03 ETH-2 -4.000 26.000 -6.008525 -5.909707 -12.647727 -12.075913 -18.023381 -10.177379 19.887608 -0.683183 -0.294956 -0.117608 0.220975
38 Session_03 ETH-3 -4.000 26.000 5.748546 11.079879 16.580826 22.120063 28.306614 1.723364 37.380534 -0.302133 -0.158882 0.151641 0.598318
39 Session_03 FOO -4.000 26.000 -0.823857 2.761300 1.258060 5.239992 4.665655 -4.973383 28.817444 -0.603327 -0.288652 0.114488 0.298751
40 Session_03 ETH-1 -4.000 26.000 5.994622 10.743980 16.116098 21.243734 27.780042 1.997857 37.033567 -0.684883 -0.352014 0.031692 0.214449
41 Session_03 ETH-3 -4.000 26.000 5.718991 11.146227 16.640814 22.243185 28.306614 1.689442 37.449023 -0.277332 -0.169668 0.053997 0.623187
42 Session_03 ETH-1 -4.000 26.000 6.040566 10.786620 16.205283 21.374963 27.780042 2.045244 37.077432 -0.685706 -0.307909 -0.099869 0.213609
43 Session_03 BAR -4.000 26.000 -9.952115 11.034508 0.169809 21.885915 10.707292 -15.002819 37.370451 -0.296804 -0.298351 -0.246731 0.606414
44 Session_03 ETH-1 -4.000 26.000 6.004078 10.683951 16.045192 21.214355 27.780042 2.010134 36.971642 -0.705956 -0.262026 0.138399 0.193323
45 Session_03 BAR -4.000 26.000 -9.957114 10.898997 0.044946 21.602296 10.707292 -15.003175 37.230716 -0.284699 -0.307849 0.021944 0.618578
46 Session_04 ETH-2 -4.000 26.000 -5.966627 -5.893789 -12.597717 -12.120719 -18.023381 -10.161842 19.911776 -0.691757 -0.372308 -0.193986 0.217132
47 Session_04 ETH-3 -4.000 26.000 5.751908 11.207110 16.726741 22.380392 28.306614 1.705481 37.480657 -0.285776 -0.155878 -0.099197 0.609567
48 Session_04 BAR -4.000 26.000 -9.951025 10.951923 0.089386 21.738926 10.707292 -15.031949 37.254709 -0.298065 -0.278834 -0.087463 0.601230
49 Session_04 FOO -4.000 26.000 -0.848192 2.777763 1.251297 5.280272 4.665655 -5.023358 28.822585 -0.601094 -0.281419 0.108186 0.303128
50 Session_04 ETH-1 -4.000 26.000 6.017312 10.735930 16.123043 21.270597 27.780042 2.005824 36.995214 -0.693479 -0.309795 0.023309 0.208980
51 Session_04 ETH-2 -4.000 26.000 -5.973623 -5.975018 -12.694278 -12.194472 -18.023381 -10.166297 19.828211 -0.701951 -0.283570 -0.025935 0.207135
52 Session_04 BAR -4.000 26.000 -9.931741 10.819830 -0.023748 21.529372 10.707292 -15.006533 37.118743 -0.302866 -0.222623 0.148462 0.596536
53 Session_04 ETH-1 -4.000 26.000 6.023822 10.730714 16.121184 21.235757 27.780042 2.012958 36.989833 -0.696908 -0.333582 0.026555 0.205610
54 Session_04 FOO -4.000 26.000 -0.791191 2.708220 1.256167 5.145784 4.665655 -4.960004 28.750896 -0.586913 -0.276505 0.183674 0.317065
55 Session_04 FOO -4.000 26.000 -0.853969 2.805035 1.267571 5.353907 4.665655 -5.030523 28.850660 -0.605611 -0.262571 0.060903 0.298685
56 Session_04 ETH-2 -4.000 26.000 -5.986501 -5.915157 -12.656583 -12.060382 -18.023381 -10.182247 19.889836 -0.709603 -0.268277 -0.130450 0.199604
57 Session_04 ETH-3 -4.000 26.000 5.739420 11.128582 16.641344 22.166106 28.306614 1.695046 37.399884 -0.280608 -0.210162 0.066645 0.614665
58 Session_04 BAR -4.000 26.000 -9.926078 10.884823 0.060864 21.650722 10.707292 -15.002880 37.185606 -0.287358 -0.232425 0.016044 0.611760
59 Session_04 ETH-1 -4.000 26.000 6.029937 10.766997 16.151273 21.345479 27.780042 2.018148 37.027152 -0.708855 -0.297953 -0.050465 0.193862
60 Session_04 ETH-3 -4.000 26.000 5.798016 11.254135 16.832228 22.432473 28.306614 1.752928 37.528936 -0.275047 -0.197935 -0.239408 0.620088
––– –––––––––– –––––– ––––––––––– –––––––––––– ––––––––– ––––––––– –––––––––– –––––––––– –––––––––– –––––––––– –––––––––– ––––––––– ––––––––– ––––––––– ––––––––
```
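A detail worth noting in `virtual_data` is that the simulated errors are not simply drawn with `scale = rD47`: each batch of N(0, 1) draws is rescaled so that its *sample* standard deviation equals the requested repeatability exactly, not just on average. Here is a standalone sketch of that trick (with `statistics.stdev` standing in for the module's own `stdev` helper):

```python
import numpy as np
from statistics import stdev

rng = np.random.default_rng(123)
rD47 = 0.010  # target repeatability

errors = rng.normal(loc = 0, scale = 1, size = 12)  # raw N(0, 1) draws
errors *= rD47 / stdev(errors)                      # rescale so the sample SD is exactly rD47
```

After the rescaling, `stdev(errors)` equals `rD47` up to floating-point rounding, which makes the simulated sessions' repeatabilities reproducible regardless of sampling noise.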
```python
def table_of_samples(
    data47 = None,
    data48 = None,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a combined table of samples
    for a pair of `D47data` and `D48data` objects.

    **Parameters**

    + `data47`: `D47data` instance
    + `data48`: `D48data` instance
    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
    if set to `'raw'`: return a list of list of strings
    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    if data47 is None:
        if data48 is None:
            raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
        else:
            return data48.table_of_samples(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
    else:
        if data48 is None:
            return data47.table_of_samples(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
        else:
            out47 = data47.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
            out48 = data48.table_of_samples(save_to_file = False, print_out = False, output = 'raw')
            out = transpose_table(transpose_table(out47) + transpose_table(out48)[4:])

            if save_to_file:
                if not os.path.exists(dir):
                    os.makedirs(dir)
                if filename is None:
                    filename = 'D47D48_samples.csv'
                with open(f'{dir}/{filename}', 'w') as fid:
                    fid.write(make_csv(out))
            if print_out:
                print('\n'+pretty_table(out))
            if output == 'raw':
                return out
            elif output == 'pretty':
                return pretty_table(out)
```
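In the combined case above, the two raw tables are merged column-wise by transposing each table into a list of columns, concatenating the column lists while skipping the leading sample-description columns of the Δ48 table, and transposing back. A standalone sketch with a zip-based transpose (assumed here to behave like the module's `transpose_table`) and a toy two-column skip:

```python
def transpose_table(t):
    # swap rows and columns of a rectangular list-of-lists
    return [list(col) for col in zip(*t)]

# toy raw tables (strings, as produced by the 'raw' output mode)
out47 = [['Sample', 'N', 'D47'], ['ETH-1', '3', '0.2052']]
out48 = [['Sample', 'N', 'D48'], ['ETH-1', '3', '0.1380']]

# keep all columns of out47, then append the columns of out48 from index 2 onward,
# dropping the duplicated sample-description columns
merged = transpose_table(transpose_table(out47) + transpose_table(out48)[2:])
```

The result is a single row-oriented table with one `Sample` block followed by both the Δ47 and Δ48 columns.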
```python
def table_of_sessions(
    data47 = None,
    data48 = None,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a combined table of sessions
    for a pair of `D47data` and `D48data` objects.
    ***Only applicable if the sessions in `data47` and those in `data48`
    consist of the exact same sets of analyses.***

    **Parameters**

    + `data47`: `D47data` instance
    + `data48`: `D48data` instance
    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
    if set to `'raw'`: return a list of list of strings
    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    if data47 is None:
        if data48 is None:
            raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
        else:
            return data48.table_of_sessions(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
    else:
        if data48 is None:
            return data47.table_of_sessions(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
        else:
            out47 = data47.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
            out48 = data48.table_of_sessions(save_to_file = False, print_out = False, output = 'raw')
            for k,x in enumerate(out47[0]):
                if k>7:
                    out47[0][k] = out47[0][k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
                    out48[0][k] = out48[0][k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
            out = transpose_table(transpose_table(out47) + transpose_table(out48)[7:])

            if save_to_file:
                if not os.path.exists(dir):
                    os.makedirs(dir)
                if filename is None:
                    filename = 'D47D48_sessions.csv'
                with open(f'{dir}/{filename}', 'w') as fid:
                    fid.write(make_csv(out))
            if print_out:
                print('\n'+pretty_table(out))
            if output == 'raw':
                return out
            elif output == 'pretty':
                return pretty_table(out)
```
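Before the two session tables are merged, the standardization-parameter headers (`a ± SE`, `1e3 x b ± SE`, `c ± SE`) would collide between them, so each is suffixed with the corresponding mass. A minimal sketch of that renaming step, using header names taken from the session table shown earlier:

```python
headers47 = ['Session', 'Na', 'Nu', 'd13Cwg_VPDB', 'd18Owg_VSMOW',
             'r_d13C', 'r_d18O', 'r_D47', 'a ± SE', '1e3 x b ± SE', 'c ± SE']
headers48 = ['Session', 'Na', 'Nu', 'd13Cwg_VPDB', 'd18Owg_VSMOW',
             'r_d13C', 'r_d18O', 'r_D48', 'a ± SE', '1e3 x b ± SE', 'c ± SE']

for k, x in enumerate(headers47):
    if k > 7:  # only the trailing standardization-parameter columns collide
        headers47[k] = headers47[k].replace('a', 'a_47').replace('b', 'b_47').replace('c', 'c_47')
        headers48[k] = headers48[k].replace('a', 'a_48').replace('b', 'b_48').replace('c', 'c_48')
```

After renaming, the merged table carries unambiguous `a_47 ± SE` … `c_48 ± SE` columns; the leading shared columns (session name, bulk composition, repeatabilities) are left untouched.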
```python
def table_of_analyses(
    data47 = None,
    data48 = None,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a combined table of analyses
    for a pair of `D47data` and `D48data` objects.

    If the sessions in `data47` and those in `data48` do not consist of
    the exact same sets of analyses, the table will have two columns
    `Session_47` and `Session_48` instead of a single `Session` column.

    **Parameters**

    + `data47`: `D47data` instance
    + `data48`: `D48data` instance
    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
    if set to `'raw'`: return a list of list of strings
    (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    if data47 is None:
        if data48 is None:
            raise TypeError("Arguments must include at least one D47data() or D48data() instance.")
        else:
            return data48.table_of_analyses(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
    else:
        if data48 is None:
            return data47.table_of_analyses(
                dir = dir,
                filename = filename,
                save_to_file = save_to_file,
                print_out = print_out,
                output = output
                )
        else:
            out47 = data47.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')
            out48 = data48.table_of_analyses(save_to_file = False, print_out = False, output = 'raw')

            if [l[1] for l in out47[1:]] == [l[1] for l in out48[1:]]: # if sessions are identical
                out = transpose_table(transpose_table(out47) + transpose_table(out48)[-1:])
            else:
                out47[0][1] = 'Session_47'
                out48[0][1] = 'Session_48'
                out47 = transpose_table(out47)
                out48 = transpose_table(out48)
                out = transpose_table(out47[:2] + out48[1:2] + out47[2:] + out48[-1:])

            if save_to_file:
                if not os.path.exists(dir):
                    os.makedirs(dir)
                if filename is None:
                    filename = 'D47D48_analyses.csv'
                with open(f'{dir}/{filename}', 'w') as fid:
                    fid.write(make_csv(out))
            if print_out:
                print('\n'+pretty_table(out))
            if output == 'raw':
                return out
            elif output == 'pretty':
                return pretty_table(out)
```
801class D4xdata(list): 802 ''' 803 Store and process data for a large set of Δ47 and/or Δ48 804 analyses, usually comprising more than one analytical session. 805 ''' 806 807 ### 17O CORRECTION PARAMETERS 808 R13_VPDB = 0.01118 # (Chang & Li, 1990) 809 ''' 810 Absolute (13C/12C) ratio of VPDB. 811 By default equal to 0.01118 ([Chang & Li, 1990](http://www.cnki.com.cn/Article/CJFDTotal-JXTW199004006.htm)) 812 ''' 813 814 R18_VSMOW = 0.0020052 # (Baertschi, 1976) 815 ''' 816 Absolute (18O/16C) ratio of VSMOW. 817 By default equal to 0.0020052 ([Baertschi, 1976](https://doi.org/10.1016/0012-821X(76)90115-1)) 818 ''' 819 820 LAMBDA_17 = 0.528 # (Barkan & Luz, 2005) 821 ''' 822 Mass-dependent exponent for triple oxygen isotopes. 823 By default equal to 0.528 ([Barkan & Luz, 2005](https://doi.org/10.1002/rcm.2250)) 824 ''' 825 826 R17_VSMOW = 0.00038475 # (Assonov & Brenninkmeijer, 2003, rescaled to R13_VPDB) 827 ''' 828 Absolute (17O/16C) ratio of VSMOW. 829 By default equal to 0.00038475 830 ([Assonov & Brenninkmeijer, 2003](https://dx.doi.org/10.1002/rcm.1011), 831 rescaled to `R13_VPDB`) 832 ''' 833 834 R18_VPDB = R18_VSMOW * 1.03092 835 ''' 836 Absolute (18O/16C) ratio of VPDB. 837 By definition equal to `R18_VSMOW * 1.03092`. 838 ''' 839 840 R17_VPDB = R17_VSMOW * 1.03092 ** LAMBDA_17 841 ''' 842 Absolute (17O/16C) ratio of VPDB. 843 By definition equal to `R17_VSMOW * 1.03092 ** LAMBDA_17`. 844 ''' 845 846 LEVENE_REF_SAMPLE = 'ETH-3' 847 ''' 848 After the Δ4x standardization step, each sample is tested to 849 assess whether the Δ4x variance within all analyses for that 850 sample differs significantly from that observed for a given reference 851 sample (using [Levene's test](https://en.wikipedia.org/wiki/Levene%27s_test), 852 which yields a p-value corresponding to the null hypothesis that the 853 underlying variances are equal). 854 855 `LEVENE_REF_SAMPLE` (by default equal to `'ETH-3'`) specifies which 856 sample should be used as a reference for this test. 
    '''

    ALPHA_18O_ACID_REACTION = round(np.exp(3.59 / (90 + 273.15) - 1.79e-3), 6)  # (Kim et al., 2007, calcite)
    '''
    Specifies the 18O/16O fractionation factor generally applicable
    to acid reactions in the dataset. Currently used by `D4xdata.wg()`,
    `D4xdata.standardize_d13C`, and `D4xdata.standardize_d18O`.

    By default equal to 1.008129 (calcite reacted at 90 °C,
    [Kim et al., 2007](https://dx.doi.org/10.1016/j.chemgeo.2007.08.005)).
    '''

    Nominal_d13C_VPDB = {
        'ETH-1': 2.02,
        'ETH-2': -10.17,
        'ETH-3': 1.71,
    }  # (Bernasconi et al., 2018)
    '''
    Nominal δ13C_VPDB values assigned to carbonate standards, used by
    `D4xdata.standardize_d13C()`.

    By default equal to `{'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}` after
    [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
    '''

    Nominal_d18O_VPDB = {
        'ETH-1': -2.19,
        'ETH-2': -18.69,
        'ETH-3': -1.78,
    }  # (Bernasconi et al., 2018)
    '''
    Nominal δ18O_VPDB values assigned to carbonate standards, used by
    `D4xdata.standardize_d18O()`.

    By default equal to `{'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}` after
    [Bernasconi et al. (2018)](https://doi.org/10.1029/2017GC007385).
    '''

    d13C_STANDARDIZATION_METHOD = '2pt'
    '''
    Method by which to standardize δ13C values:

    + `'none'`: do not apply any δ13C standardization.
    + `'1pt'`: within each session, offset all initial δ13C values so as to
    minimize the difference between final δ13C_VPDB values and
    `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB` is defined).
    + `'2pt'`: within each session, apply an affine transformation to all δ13C
    values so as to minimize the difference between final δ13C_VPDB
    values and `Nominal_d13C_VPDB` (averaged over all analyses for which `Nominal_d13C_VPDB`
    is defined).
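As a quick check of the acid fractionation default defined earlier in this class, the Kim et al. (2007) calcite expression can be evaluated directly (a sketch reproducing the one-line computation of `ALPHA_18O_ACID_REACTION` for a 90 °C reaction):

```python
import numpy as np

T_celsius = 90  # reaction temperature used for the default value
alpha = round(np.exp(3.59 / (T_celsius + 273.15) - 1.79e-3), 6)
print(alpha)  # 1.008129, the default ALPHA_18O_ACID_REACTION
```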
    '''

    d18O_STANDARDIZATION_METHOD = '2pt'
    '''
    Method by which to standardize δ18O values:

    + `'none'`: do not apply any δ18O standardization.
    + `'1pt'`: within each session, offset all initial δ18O values so as to
    minimize the difference between final δ18O_VPDB values and
    `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB` is defined).
    + `'2pt'`: within each session, apply an affine transformation to all δ18O
    values so as to minimize the difference between final δ18O_VPDB
    values and `Nominal_d18O_VPDB` (averaged over all analyses for which `Nominal_d18O_VPDB`
    is defined).
    '''

    def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False):
        '''
        **Parameters**

        + `l`: a list of dictionaries, with each dictionary including at least the keys
        `Sample`, `d45`, `d46`, and `d47` or `d48`.
        + `mass`: `'47'` or `'48'`
        + `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods.
        + `session`: define session name for analyses without a `Session` key
        + `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods.

        Returns a `D4xdata` object derived from `list`.
        '''
        self._4x = mass
        self.verbose = verbose
        self.prefix = 'D4xdata'
        self.logfile = logfile
        list.__init__(self, l)
        self.Nf = None
        self.repeatability = {}
        self.refresh(session = session)


    def make_verbal(oldfun):
        '''
        Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`.
949 ''' 950 @wraps(oldfun) 951 def newfun(*args, verbose = '', **kwargs): 952 myself = args[0] 953 oldprefix = myself.prefix 954 myself.prefix = oldfun.__name__ 955 if verbose != '': 956 oldverbose = myself.verbose 957 myself.verbose = verbose 958 out = oldfun(*args, **kwargs) 959 myself.prefix = oldprefix 960 if verbose != '': 961 myself.verbose = oldverbose 962 return out 963 return newfun 964 965 966 def msg(self, txt): 967 ''' 968 Log a message to `self.logfile`, and print it out if `verbose = True` 969 ''' 970 self.log(txt) 971 if self.verbose: 972 print(f'{f"[{self.prefix}]":<16} {txt}') 973 974 975 def vmsg(self, txt): 976 ''' 977 Log a message to `self.logfile` and print it out 978 ''' 979 self.log(txt) 980 print(txt) 981 982 983 def log(self, *txts): 984 ''' 985 Log a message to `self.logfile` 986 ''' 987 if self.logfile: 988 with open(self.logfile, 'a') as fid: 989 for txt in txts: 990 fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}') 991 992 993 def refresh(self, session = 'mySession'): 994 ''' 995 Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`. 996 ''' 997 self.fill_in_missing_info(session = session) 998 self.refresh_sessions() 999 self.refresh_samples() 1000 1001 1002 def refresh_sessions(self): 1003 ''' 1004 Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift` 1005 to `False` for all sessions. 
1006 ''' 1007 self.sessions = { 1008 s: {'data': [r for r in self if r['Session'] == s]} 1009 for s in sorted({r['Session'] for r in self}) 1010 } 1011 for s in self.sessions: 1012 self.sessions[s]['scrambling_drift'] = False 1013 self.sessions[s]['slope_drift'] = False 1014 self.sessions[s]['wg_drift'] = False 1015 self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD 1016 self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD 1017 1018 1019 def refresh_samples(self): 1020 ''' 1021 Define `self.samples`, `self.anchors`, and `self.unknowns`. 1022 ''' 1023 self.samples = { 1024 s: {'data': [r for r in self if r['Sample'] == s]} 1025 for s in sorted({r['Sample'] for r in self}) 1026 } 1027 self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x} 1028 self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x} 1029 1030 1031 def read(self, filename, sep = '', session = ''): 1032 ''' 1033 Read file in csv format to load data into a `D47data` object. 1034 1035 In the csv file, spaces before and after field separators (`','` by default) 1036 are optional. Each line corresponds to a single analysis. 1037 1038 The required fields are: 1039 1040 + `UID`: a unique identifier 1041 + `Session`: an identifier for the analytical session 1042 + `Sample`: a sample identifier 1043 + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values 1044 1045 Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to 1046 VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` 1047 and `d49` are optional, and set to NaN by default. 
        **Parameters**

        + `filename`: the path of the file to read
        + `sep`: csv separator delimiting the fields
        + `session`: set `Session` field to this string for all analyses
        '''
        with open(filename) as fid:
            self.input(fid.read(), sep = sep, session = session)


    def input(self, txt, sep = '', session = ''):
        '''
        Read `txt` string in csv format to load analysis data into a `D47data` object.

        In the csv string, spaces before and after field separators (`','` by default)
        are optional. Each line corresponds to a single analysis.

        The required fields are:

        + `UID`: a unique identifier
        + `Session`: an identifier for the analytical session
        + `Sample`: a sample identifier
        + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values

        Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to
        VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48`
        and `d49` are optional, and set to NaN by default.

        **Parameters**

        + `txt`: the csv string to read
        + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`,
        whichever appears most often in `txt`.
1082 + `session`: set `Session` field to this string for all analyses 1083 ''' 1084 if sep == '': 1085 sep = sorted(',;\t', key = lambda x: - txt.count(x))[0] 1086 txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()] 1087 data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]] 1088 1089 if session != '': 1090 for r in data: 1091 r['Session'] = session 1092 1093 self += data 1094 self.refresh() 1095 1096 1097 @make_verbal 1098 def wg(self, samples = None, a18_acid = None): 1099 ''' 1100 Compute bulk composition of the working gas for each session based on 1101 the carbonate standards defined in both `self.Nominal_d13C_VPDB` and 1102 `self.Nominal_d18O_VPDB`. 1103 ''' 1104 1105 self.msg('Computing WG composition:') 1106 1107 if a18_acid is None: 1108 a18_acid = self.ALPHA_18O_ACID_REACTION 1109 if samples is None: 1110 samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB] 1111 1112 assert a18_acid, f'Acid fractionation factor should not be zero.' 
1113 1114 samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB] 1115 R45R46_standards = {} 1116 for sample in samples: 1117 d13C_vpdb = self.Nominal_d13C_VPDB[sample] 1118 d18O_vpdb = self.Nominal_d18O_VPDB[sample] 1119 R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000) 1120 R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17 1121 R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid 1122 1123 C12_s = 1 / (1 + R13_s) 1124 C13_s = R13_s / (1 + R13_s) 1125 C16_s = 1 / (1 + R17_s + R18_s) 1126 C17_s = R17_s / (1 + R17_s + R18_s) 1127 C18_s = R18_s / (1 + R17_s + R18_s) 1128 1129 C626_s = C12_s * C16_s ** 2 1130 C627_s = 2 * C12_s * C16_s * C17_s 1131 C628_s = 2 * C12_s * C16_s * C18_s 1132 C636_s = C13_s * C16_s ** 2 1133 C637_s = 2 * C13_s * C16_s * C17_s 1134 C727_s = C12_s * C17_s ** 2 1135 1136 R45_s = (C627_s + C636_s) / C626_s 1137 R46_s = (C628_s + C637_s + C727_s) / C626_s 1138 R45R46_standards[sample] = (R45_s, R46_s) 1139 1140 for s in self.sessions: 1141 db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples] 1142 assert db, f'No sample from {samples} found in session "{s}".' 
1143# dbsamples = sorted({r['Sample'] for r in db}) 1144 1145 X = [r['d45'] for r in db] 1146 Y = [R45R46_standards[r['Sample']][0] for r in db] 1147 x1, x2 = np.min(X), np.max(X) 1148 1149 if x1 < x2: 1150 wgcoord = x1/(x1-x2) 1151 else: 1152 wgcoord = 999 1153 1154 if wgcoord < -.5 or wgcoord > 1.5: 1155 # unreasonable to extrapolate to d45 = 0 1156 R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)]) 1157 else : 1158 # d45 = 0 is reasonably well bracketed 1159 R45_wg = np.polyfit(X, Y, 1)[1] 1160 1161 X = [r['d46'] for r in db] 1162 Y = [R45R46_standards[r['Sample']][1] for r in db] 1163 x1, x2 = np.min(X), np.max(X) 1164 1165 if x1 < x2: 1166 wgcoord = x1/(x1-x2) 1167 else: 1168 wgcoord = 999 1169 1170 if wgcoord < -.5 or wgcoord > 1.5: 1171 # unreasonable to extrapolate to d46 = 0 1172 R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)]) 1173 else : 1174 # d46 = 0 is reasonably well bracketed 1175 R46_wg = np.polyfit(X, Y, 1)[1] 1176 1177 d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg) 1178 1179 self.msg(f'Session {s} WG: δ13C_VPDB = {d13Cwg_VPDB:.3f} δ18O_VSMOW = {d18Owg_VSMOW:.3f}') 1180 1181 self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB 1182 self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW 1183 for r in self.sessions[s]['data']: 1184 r['d13Cwg_VPDB'] = d13Cwg_VPDB 1185 r['d18Owg_VSMOW'] = d18Owg_VSMOW 1186 1187 1188 def compute_bulk_delta(self, R45, R46, D17O = 0): 1189 ''' 1190 Compute δ13C_VPDB and δ18O_VSMOW, 1191 by solving the generalized form of equation (17) from 1192 [Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05), 1193 assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and 1194 solving the corresponding second-order Taylor polynomial. 
        (Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
        '''

        K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17

        A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
        B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
        C = 2 * self.R18_VSMOW
        D = -R46

        aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
        bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
        cc = A + B + C + D

        d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)

        R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
        R17 = K * R18 ** self.LAMBDA_17
        R13 = R45 - 2 * R17

        d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)

        return d13C_VPDB, d18O_VSMOW


    @make_verbal
    def crunch(self, verbose = ''):
        '''
        Compute bulk composition and raw clumped isotope anomalies for all analyses.
        '''
        for r in self:
            self.compute_bulk_and_clumping_deltas(r)
        self.standardize_d13C()
        self.standardize_d18O()
        self.msg(f"Crunched {len(self)} analyses.")


    def fill_in_missing_info(self, session = 'mySession'):
        '''
        Fill in optional fields with default values
        '''
        for i,r in enumerate(self):
            if 'D17O' not in r:
                r['D17O'] = 0.
            if 'UID' not in r:
                r['UID'] = f'{i+1}'
            if 'Session' not in r:
                r['Session'] = session
            for k in ['d47', 'd48', 'd49']:
                if k not in r:
                    r[k] = np.nan


    def standardize_d13C(self):
        '''
        Perform δ13C standardization within each session `s` according to
        `self.sessions[s]['d13C_standardization_method']`, which is defined by default
        by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
        may be redefined arbitrarily at a later stage.
        '''
        for s in self.sessions:
            if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
                XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
                X,Y = zip(*XY)
                if self.sessions[s]['d13C_standardization_method'] == '1pt':
                    offset = np.mean(Y) - np.mean(X)
                    for r in self.sessions[s]['data']:
                        r['d13C_VPDB'] += offset
                elif self.sessions[s]['d13C_standardization_method'] == '2pt':
                    a,b = np.polyfit(X,Y,1)
                    for r in self.sessions[s]['data']:
                        r['d13C_VPDB'] = a * r['d13C_VPDB'] + b

    def standardize_d18O(self):
        '''
        Perform δ18O standardization within each session `s` according to
        `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
        which is defined by default by `D47data.refresh_sessions()` as equal to
        `self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
        '''
        for s in self.sessions:
            if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
                XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
                X,Y = zip(*XY)
                Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
                if self.sessions[s]['d18O_standardization_method'] == '1pt':
                    offset = np.mean(Y) - np.mean(X)
                    for r in self.sessions[s]['data']:
                        r['d18O_VSMOW'] += offset
                elif self.sessions[s]['d18O_standardization_method'] == '2pt':
                    a,b = np.polyfit(X,Y,1)
                    for r in self.sessions[s]['data']:
                        r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b


    def compute_bulk_and_clumping_deltas(self, r):
        '''
        Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
        '''

        # Compute working gas R13, R18, and isobar ratios
        R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
        R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
        R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)

        # Compute analyte isobar ratios
        R45 = (1 + r['d45'] / 1000) * R45_wg
        R46 = (1 + r['d46'] / 1000) * R46_wg
        R47 = (1 + r['d47'] / 1000) * R47_wg
        R48 = (1 + r['d48'] / 1000) * R48_wg
        R49 = (1 + r['d49'] / 1000) * R49_wg

        r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
        R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
        R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW

        # Compute stochastic isobar ratios of the analyte
        R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
            R13, R18, D17O = r['D17O']
        )

        # Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
        # and raise a warning if the corresponding anomalies exceed 0.05 ppm.
        if (R45 / R45stoch - 1) > 5e-8:
            self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
        if (R46 / R46stoch - 1) > 5e-8:
            self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')

        # Compute raw clumped isotope anomalies
        r['D47raw'] = 1000 * (R47 / R47stoch - 1)
        r['D48raw'] = 1000 * (R48 / R48stoch - 1)
        r['D49raw'] = 1000 * (R49 / R49stoch - 1)


    def compute_isobar_ratios(self, R13, R18, D17O=0, D47=0, D48=0, D49=0):
        '''
        Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
        optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
        anomalies (`D47`, `D48`, `D49`), all expressed in permil.
1334 ''' 1335 1336 # Compute R17 1337 R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17 1338 1339 # Compute isotope concentrations 1340 C12 = (1 + R13) ** -1 1341 C13 = C12 * R13 1342 C16 = (1 + R17 + R18) ** -1 1343 C17 = C16 * R17 1344 C18 = C16 * R18 1345 1346 # Compute stochastic isotopologue concentrations 1347 C626 = C16 * C12 * C16 1348 C627 = C16 * C12 * C17 * 2 1349 C628 = C16 * C12 * C18 * 2 1350 C636 = C16 * C13 * C16 1351 C637 = C16 * C13 * C17 * 2 1352 C638 = C16 * C13 * C18 * 2 1353 C727 = C17 * C12 * C17 1354 C728 = C17 * C12 * C18 * 2 1355 C737 = C17 * C13 * C17 1356 C738 = C17 * C13 * C18 * 2 1357 C828 = C18 * C12 * C18 1358 C838 = C18 * C13 * C18 1359 1360 # Compute stochastic isobar ratios 1361 R45 = (C636 + C627) / C626 1362 R46 = (C628 + C637 + C727) / C626 1363 R47 = (C638 + C728 + C737) / C626 1364 R48 = (C738 + C828) / C626 1365 R49 = C838 / C626 1366 1367 # Account for stochastic anomalies 1368 R47 *= 1 + D47 / 1000 1369 R48 *= 1 + D48 / 1000 1370 R49 *= 1 + D49 / 1000 1371 1372 # Return isobar ratios 1373 return R45, R46, R47, R48, R49 1374 1375 1376 def split_samples(self, samples_to_split = 'all', grouping = 'by_session'): 1377 ''' 1378 Split unknown samples by UID (treat all analyses as different samples) 1379 or by session (treat analyses of a given sample in different sessions as 1380 different samples). 
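The stochastic-ratio algebra in `compute_isobar_ratios()` above implies, for instance, that R45 reduces exactly to R13 + 2·R17. A self-contained sketch of that identity (simplified from the method above, with R17 passed directly rather than derived from R18 and Δ17O):

```python
def stochastic_R45(R13, R17, R18):
    # Isotope concentrations from the isotopic ratios
    C12 = 1 / (1 + R13)
    C13 = C12 * R13
    C16 = 1 / (1 + R17 + R18)
    C17 = C16 * R17
    # Stochastic abundances of the mass-44 and mass-45 isotopologues
    C626 = C12 * C16 * C16
    C636 = C13 * C16 * C16
    C627 = 2 * C12 * C16 * C17
    return (C636 + C627) / C626

R13, R17, R18 = 0.01118, 0.00038475, 0.0020052
print(abs(stochastic_R45(R13, R17, R18) - (R13 + 2 * R17)))  # ≈ 0 (identity holds)
```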
        **Parameters**

        + `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
        + `grouping`: `by_uid` | `by_session`
        '''
        if samples_to_split == 'all':
            samples_to_split = [s for s in self.unknowns]
        gkeys = {'by_uid':'UID', 'by_session':'Session'}
        self.grouping = grouping.lower()
        if self.grouping in gkeys:
            gkey = gkeys[self.grouping]
        for r in self:
            if r['Sample'] in samples_to_split:
                r['Sample_original'] = r['Sample']
                r['Sample'] = f"{r['Sample']}__{r[gkey]}"
            elif r['Sample'] in self.unknowns:
                r['Sample_original'] = r['Sample']
        self.refresh_samples()


    def unsplit_samples(self, tables = False):
        '''
        Reverse the effects of `D47data.split_samples()`.

        This should only be used after `D4xdata.standardize()` with `method='pooled'`.

        After `D4xdata.standardize()` with `method='indep_sessions'`, one should
        probably use `D4xdata.combine_samples()` instead to reverse the effects of
        `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
        effects of `D47data.split_samples()` with `grouping='by_session'` (because in
        that case session-averaged Δ4x values are statistically independent).
1413 ''' 1414 unknowns_old = sorted({s for s in self.unknowns}) 1415 CM_old = self.standardization.covar[:,:] 1416 VD_old = self.standardization.params.valuesdict().copy() 1417 vars_old = self.standardization.var_names 1418 1419 unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r}) 1420 1421 Ns = len(vars_old) - len(unknowns_old) 1422 vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new] 1423 VD_new = {k: VD_old[k] for k in vars_old[:Ns]} 1424 1425 W = np.zeros((len(vars_new), len(vars_old))) 1426 W[:Ns,:Ns] = np.eye(Ns) 1427 for u in unknowns_new: 1428 splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u}) 1429 if self.grouping == 'by_session': 1430 weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits] 1431 elif self.grouping == 'by_uid': 1432 weights = [1 for s in splits] 1433 sw = sum(weights) 1434 weights = [w/sw for w in weights] 1435 W[vars_new.index(f'D{self._4x}_{pf(u)}'),[vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:] 1436 1437 CM_new = W @ CM_old @ W.T 1438 V = W @ np.array([[VD_old[k]] for k in vars_old]) 1439 VD_new = {k:v[0] for k,v in zip(vars_new, V)} 1440 1441 self.standardization.covar = CM_new 1442 self.standardization.params.valuesdict = lambda : VD_new 1443 self.standardization.var_names = vars_new 1444 1445 for r in self: 1446 if r['Sample'] in self.unknowns: 1447 r['Sample_split'] = r['Sample'] 1448 r['Sample'] = r['Sample_original'] 1449 1450 self.refresh_samples() 1451 self.consolidate_samples() 1452 self.repeatabilities() 1453 1454 if tables: 1455 self.table_of_analyses() 1456 self.table_of_samples() 1457 1458 def assign_timestamps(self): 1459 ''' 1460 Assign a time field `t` of type `float` to each analysis. 1461 1462 If `TimeTag` is one of the data fields, `t` is equal within a given session 1463 to `TimeTag` minus the mean value of `TimeTag` for that session. 
1464 Otherwise, `TimeTag` is by default equal to the index of each analysis 1465 in the dataset and `t` is defined as above. 1466 ''' 1467 for session in self.sessions: 1468 sdata = self.sessions[session]['data'] 1469 try: 1470 t0 = np.mean([r['TimeTag'] for r in sdata]) 1471 for r in sdata: 1472 r['t'] = r['TimeTag'] - t0 1473 except KeyError: 1474 t0 = (len(sdata)-1)/2 1475 for t,r in enumerate(sdata): 1476 r['t'] = t - t0 1477 1478 1479 def report(self): 1480 ''' 1481 Prints a report on the standardization fit. 1482 Only applicable after `D4xdata.standardize(method='pooled')`. 1483 ''' 1484 report_fit(self.standardization) 1485 1486 1487 def combine_samples(self, sample_groups): 1488 ''' 1489 Combine analyses of different samples to compute weighted average Δ4x 1490 and new error (co)variances corresponding to the groups defined by the `sample_groups` 1491 dictionary. 1492 1493 Caution: samples are weighted by number of replicate analyses, which is a 1494 reasonable default behavior but is not always optimal (e.g., in the case of strongly 1495 correlated analytical errors for one or more samples). 
        Returns a tuple of:

        + the list of group names
        + an array of the corresponding Δ4x values
        + the corresponding (co)variance matrix

        **Parameters**

        + `sample_groups`: a dictionary of the form:
        ```py
        {'group1': ['sample_1', 'sample_2'],
         'group2': ['sample_3', 'sample_4', 'sample_5']}
        ```
        '''

        samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
        groups = sorted(sample_groups.keys())
        group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
        D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
        CM_old = np.array([[self.sample_D4x_covar(x,y) for x in samples] for y in samples])
        W = np.array([
            [self.samples[i]['N']/group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
            for j in groups])
        D4x_new = W @ D4x_old
        CM_new = W @ CM_old @ W.T

        return groups, D4x_new[:,0], CM_new


    @make_verbal
    def standardize(self,
        method = 'pooled',
        weighted_sessions = [],
        consolidate = True,
        consolidate_tables = False,
        consolidate_plots = False,
        constraints = {},
        ):
        '''
        Compute absolute Δ4x values for all replicate analyses and for sample averages.
        If `method` argument is set to `'pooled'`, the standardization processes all sessions
        in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
        i.e. that their true Δ4x value does not change between sessions
        ([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
        `'indep_sessions'`, the standardization processes each session independently, based only
        on anchor analyses.
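The linear error propagation used by `combine_samples()` above (and, analogously, by `unsplit_samples()`) boils down to `D4x_new = W @ D4x_old` and `CM_new = W @ CM_old @ W.T` for a weight matrix `W`. A minimal numpy sketch with made-up numbers:

```python
import numpy as np

# Two samples with independent 1-sigma errors of 0.01 permil each:
CM_old = np.diag([0.01**2, 0.01**2])
D_old = np.array([[0.60], [0.62]])

# Average them with equal weights (e.g., equal replicate counts):
W = np.array([[0.5, 0.5]])
D_new = W @ D_old
CM_new = W @ CM_old @ W.T

# Combined value and its propagated 1-sigma error (0.01/sqrt(2) here):
print(D_new[0, 0], CM_new[0, 0] ** 0.5)
```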
1543 ''' 1544 1545 self.standardization_method = method 1546 self.assign_timestamps() 1547 1548 if method == 'pooled': 1549 if weighted_sessions: 1550 for session_group in weighted_sessions: 1551 if self._4x == '47': 1552 X = D47data([r for r in self if r['Session'] in session_group]) 1553 elif self._4x == '48': 1554 X = D48data([r for r in self if r['Session'] in session_group]) 1555 X.Nominal_D4x = self.Nominal_D4x.copy() 1556 X.refresh() 1557 result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False) 1558 w = np.sqrt(result.redchi) 1559 self.msg(f'Session group {session_group} MRSWD = {w:.4f}') 1560 for r in X: 1561 r[f'wD{self._4x}raw'] *= w 1562 else: 1563 self.msg(f'All D{self._4x}raw weights set to 1 ‰') 1564 for r in self: 1565 r[f'wD{self._4x}raw'] = 1. 1566 1567 params = Parameters() 1568 for k,session in enumerate(self.sessions): 1569 self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.") 1570 self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.") 1571 self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.") 1572 s = pf(session) 1573 params.add(f'a_{s}', value = 0.9) 1574 params.add(f'b_{s}', value = 0.) 
1575 params.add(f'c_{s}', value = -0.9) 1576 params.add(f'a2_{s}', value = 0., 1577# vary = self.sessions[session]['scrambling_drift'], 1578 ) 1579 params.add(f'b2_{s}', value = 0., 1580# vary = self.sessions[session]['slope_drift'], 1581 ) 1582 params.add(f'c2_{s}', value = 0., 1583# vary = self.sessions[session]['wg_drift'], 1584 ) 1585 if not self.sessions[session]['scrambling_drift']: 1586 params[f'a2_{s}'].expr = '0' 1587 if not self.sessions[session]['slope_drift']: 1588 params[f'b2_{s}'].expr = '0' 1589 if not self.sessions[session]['wg_drift']: 1590 params[f'c2_{s}'].expr = '0' 1591 1592 for sample in self.unknowns: 1593 params.add(f'D{self._4x}_{pf(sample)}', value = 0.5) 1594 1595 for k in constraints: 1596 params[k].expr = constraints[k] 1597 1598 def residuals(p): 1599 R = [] 1600 for r in self: 1601 session = pf(r['Session']) 1602 sample = pf(r['Sample']) 1603 if r['Sample'] in self.Nominal_D4x: 1604 R += [ ( 1605 r[f'D{self._4x}raw'] - ( 1606 p[f'a_{session}'] * self.Nominal_D4x[r['Sample']] 1607 + p[f'b_{session}'] * r[f'd{self._4x}'] 1608 + p[f'c_{session}'] 1609 + r['t'] * ( 1610 p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']] 1611 + p[f'b2_{session}'] * r[f'd{self._4x}'] 1612 + p[f'c2_{session}'] 1613 ) 1614 ) 1615 ) / r[f'wD{self._4x}raw'] ] 1616 else: 1617 R += [ ( 1618 r[f'D{self._4x}raw'] - ( 1619 p[f'a_{session}'] * p[f'D{self._4x}_{sample}'] 1620 + p[f'b_{session}'] * r[f'd{self._4x}'] 1621 + p[f'c_{session}'] 1622 + r['t'] * ( 1623 p[f'a2_{session}'] * p[f'D{self._4x}_{sample}'] 1624 + p[f'b2_{session}'] * r[f'd{self._4x}'] 1625 + p[f'c2_{session}'] 1626 ) 1627 ) 1628 ) / r[f'wD{self._4x}raw'] ] 1629 return R 1630 1631 M = Minimizer(residuals, params) 1632 result = M.least_squares() 1633 self.Nf = result.nfree 1634 self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf) 1635 new_names, new_covar, new_se = _fullcovar(result)[:3] 1636 result.var_names = new_names 1637 result.covar = new_covar 1638 1639 for r in self: 1640 s = pf(r["Session"]) 1641 a 
= result.params.valuesdict()[f'a_{s}'] 1642 b = result.params.valuesdict()[f'b_{s}'] 1643 c = result.params.valuesdict()[f'c_{s}'] 1644 a2 = result.params.valuesdict()[f'a2_{s}'] 1645 b2 = result.params.valuesdict()[f'b2_{s}'] 1646 c2 = result.params.valuesdict()[f'c2_{s}'] 1647 r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t']) 1648 1649 1650 self.standardization = result 1651 1652 for session in self.sessions: 1653 self.sessions[session]['Np'] = 3 1654 for k in ['scrambling', 'slope', 'wg']: 1655 if self.sessions[session][f'{k}_drift']: 1656 self.sessions[session]['Np'] += 1 1657 1658 if consolidate: 1659 self.consolidate(tables = consolidate_tables, plots = consolidate_plots) 1660 return result 1661 1662 1663 elif method == 'indep_sessions': 1664 1665 if weighted_sessions: 1666 for session_group in weighted_sessions: 1667 X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x) 1668 X.Nominal_D4x = self.Nominal_D4x.copy() 1669 X.refresh() 1670 # This is only done to assign r['wD47raw'] for r in X: 1671 X.standardize(method = method, weighted_sessions = [], consolidate = False) 1672 self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}') 1673 else: 1674 self.msg('All weights set to 1 ‰') 1675 for r in self: 1676 r[f'wD{self._4x}raw'] = 1 1677 1678 for session in self.sessions: 1679 s = self.sessions[session] 1680 p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2'] 1681 p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']] 1682 s['Np'] = sum(p_active) 1683 sdata = s['data'] 1684 1685 A = np.array([ 1686 [ 1687 self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'], 1688 r[f'd{self._4x}'] / r[f'wD{self._4x}raw'], 1689 1 / r[f'wD{self._4x}raw'], 1690 self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'], 1691 r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'], 1692 
r['t'] / r[f'wD{self._4x}raw'] 1693 ] 1694 for r in sdata if r['Sample'] in self.anchors 1695 ])[:,p_active] # only keep columns for the active parameters 1696 Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors]) 1697 s['Na'] = Y.size 1698 CM = linalg.inv(A.T @ A) 1699 bf = (CM @ A.T @ Y).T[0,:] 1700 k = 0 1701 for n,a in zip(p_names, p_active): 1702 if a: 1703 s[n] = bf[k] 1704# self.msg(f'{n} = {bf[k]}') 1705 k += 1 1706 else: 1707 s[n] = 0. 1708# self.msg(f'{n} = 0.0') 1709 1710 for r in sdata : 1711 a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2'] 1712 r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t']) 1713 r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t']) 1714 1715 s['CM'] = np.zeros((6,6)) 1716 i = 0 1717 k_active = [j for j,a in enumerate(p_active) if a] 1718 for j,a in enumerate(p_active): 1719 if a: 1720 s['CM'][j,k_active] = CM[i,:] 1721 i += 1 1722 1723 if not weighted_sessions: 1724 w = self.rmswd()['rmswd'] 1725 for r in self: 1726 r[f'wD{self._4x}'] *= w 1727 r[f'wD{self._4x}raw'] *= w 1728 for session in self.sessions: 1729 self.sessions[session]['CM'] *= w**2 1730 1731 for session in self.sessions: 1732 s = self.sessions[session] 1733 s['SE_a'] = s['CM'][0,0]**.5 1734 s['SE_b'] = s['CM'][1,1]**.5 1735 s['SE_c'] = s['CM'][2,2]**.5 1736 s['SE_a2'] = s['CM'][3,3]**.5 1737 s['SE_b2'] = s['CM'][4,4]**.5 1738 s['SE_c2'] = s['CM'][5,5]**.5 1739 1740 if not weighted_sessions: 1741 self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions]) 1742 else: 1743 self.Nf = 0 1744 for sg in weighted_sessions: 1745 self.Nf += self.rmswd(sessions = sg)['Nf'] 1746 1747 self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf) 1748 1749 avgD4x = { 1750 sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample]) 1751 for sample in self.samples 1752 } 
			chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
			rD4x = (chi2/self.Nf)**.5
			self.repeatability[f'sigma_{self._4x}'] = rD4x

			if consolidate:
				self.consolidate(tables = consolidate_tables, plots = consolidate_plots)


	def standardization_error(self, session, d4x, D4x, t = 0):
		'''
		Compute standardization error for a given session and
		(δ47, Δ47) composition.
		'''
		a = self.sessions[session]['a']
		b = self.sessions[session]['b']
		c = self.sessions[session]['c']
		a2 = self.sessions[session]['a2']
		b2 = self.sessions[session]['b2']
		c2 = self.sessions[session]['c2']
		CM = self.sessions[session]['CM']

		x, y = D4x, d4x
		z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
#		x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
		dxdy = -(b+b2*t) / (a+a2*t)
		dxdz = 1. / (a+a2*t)
		dxda = -x / (a+a2*t)
		dxdb = -y / (a+a2*t)
		dxdc = -1. / (a+a2*t)
		dxda2 = -x * t / (a+a2*t)
		dxdb2 = -y * t / (a+a2*t)
		dxdc2 = -t / (a+a2*t)
		V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
		sx = (V @ CM @ V.T) ** .5
		return sx


	@make_verbal
	def summary(self,
		dir = 'output',
		filename = None,
		save_to_file = True,
		print_out = True,
		):
		'''
		Print out and/or save to disk a summary of the standardization results.

		**Parameters**

		+ `dir`: the directory in which to save the table
		+ `filename`: the name of the csv file to write to
		+ `save_to_file`: whether to save the table to disk
		+ `print_out`: whether to print out the table
		'''

		out = []
		out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
		out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
		out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
		out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
		out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
		out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
		out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
		out += [['Model degrees of freedom', f"{self.Nf}"]]
		out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
		out += [['Standardization method', self.standardization_method]]

		if save_to_file:
			if not os.path.exists(dir):
				os.makedirs(dir)
			if filename is None:
				filename = f'D{self._4x}_summary.csv'
			with open(f'{dir}/{filename}', 'w') as fid:
				fid.write(make_csv(out))
		if print_out:
			self.msg('\n' + pretty_table(out, header = 0))


	@make_verbal
	def table_of_sessions(self,
		dir = 'output',
		filename = None,
		save_to_file = True,
		print_out = True,
		output = None,
		):
		'''
		Print out and/or save to disk a table of sessions.

		**Parameters**

		+ `dir`: the directory in which to save the table
		+ `filename`: the name of the csv file to write to
		+ `save_to_file`: whether to save the table to disk
		+ `print_out`: whether to print out the table
		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		  if set to `'raw'`: return a list of list of strings
		  (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
		'''
		include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
		include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
		include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])

		out = [['Session','Na','Nu','d13Cwg_VPDB','d18Owg_VSMOW','r_d13C','r_d18O',f'r_D{self._4x}','a ± SE','1e3 x b ± SE','c ± SE']]
		if include_a2:
			out[-1] += ['a2 ± SE']
		if include_b2:
			out[-1] += ['b2 ± SE']
		if include_c2:
			out[-1] += ['c2 ± SE']
		for session in self.sessions:
			out += [[
				session,
				f"{self.sessions[session]['Na']}",
				f"{self.sessions[session]['Nu']}",
				f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
				f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
				f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
				f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
				f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
				f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
				f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
				f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
				]]
			if include_a2:
				if self.sessions[session]['scrambling_drift']:
					out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
				else:
					out[-1] += ['']
			if include_b2:
				if self.sessions[session]['slope_drift']:
					out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
				else:
					out[-1] += ['']
			if include_c2:
				if self.sessions[session]['wg_drift']:
					out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
				else:
					out[-1] += ['']

		if save_to_file:
			if not os.path.exists(dir):
				os.makedirs(dir)
			if filename is None:
				filename = f'D{self._4x}_sessions.csv'
			with open(f'{dir}/{filename}', 'w') as fid:
				fid.write(make_csv(out))
		if print_out:
			self.msg('\n' + pretty_table(out))
		if output == 'raw':
			return out
		elif output == 'pretty':
			return pretty_table(out)


	@make_verbal
	def table_of_analyses(
		self,
		dir = 'output',
		filename = None,
		save_to_file = True,
		print_out = True,
		output = None,
		):
		'''
		Print out and/or save to disk a table of analyses.

		**Parameters**

		+ `dir`: the directory in which to save the table
		+ `filename`: the name of the csv file to write to
		+ `save_to_file`: whether to save the table to disk
		+ `print_out`: whether to print out the table
		+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
		  if set to `'raw'`: return a list of list of strings
		  (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
		'''

		out = [['UID','Session','Sample']]
		extra_fields = [f for f in [('SampleMass','.2f'),('ColdFingerPressure','.1f'),('AcidReactionYield','.3f')] if f[0] in {k for r in self for k in r}]
		for f in extra_fields:
			out[-1] += [f[0]]
		out[-1] += ['d13Cwg_VPDB','d18Owg_VSMOW','d45','d46','d47','d48','d49','d13C_VPDB','d18O_VSMOW','D47raw','D48raw','D49raw',f'D{self._4x}']
		for r in self:
			out += [[f"{r['UID']}",f"{r['Session']}",f"{r['Sample']}"]]
			for f in extra_fields:
				out[-1] += [f"{r[f[0]]:{f[1]}}"]
			out[-1] += [
				f"{r['d13Cwg_VPDB']:.3f}",
				f"{r['d18Owg_VSMOW']:.3f}",
				f"{r['d45']:.6f}",
				f"{r['d46']:.6f}",
				f"{r['d47']:.6f}",
				f"{r['d48']:.6f}",
				f"{r['d49']:.6f}",
				f"{r['d13C_VPDB']:.6f}",
				f"{r['d18O_VSMOW']:.6f}",
				f"{r['D47raw']:.6f}",
				f"{r['D48raw']:.6f}",
				f"{r['D49raw']:.6f}",
				f"{r[f'D{self._4x}']:.6f}"
				]
		if save_to_file:
			if not os.path.exists(dir):
				os.makedirs(dir)
			if filename is None:
				filename = f'D{self._4x}_analyses.csv'
			with open(f'{dir}/{filename}', 'w') as fid:
				fid.write(make_csv(out))
		if print_out:
			self.msg('\n' + pretty_table(out))
		return out

	@make_verbal
	def covar_table(
		self,
		correl = False,
		dir = 'output',
		filename = None,
		save_to_file = True,
		print_out = True,
		output = None,
		):
		'''
		Print out, save to disk and/or return the variance-covariance matrix of D4x
		for all unknown samples.

		**Parameters**

		+ `dir`: the directory in which to save the csv
		+ `filename`: the name of the csv file to write to
		+ `save_to_file`: whether to save the csv
		+ `print_out`: whether to print out the matrix
		+ `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
		  if set to `'raw'`: return a list of list of strings
		  (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
		'''
		samples = sorted([u for u in self.unknowns])
		out = [[''] + samples]
		for s1 in samples:
			out.append([s1])
			for s2 in samples:
				if correl:
					out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
				else:
					out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')

		if save_to_file:
			if not os.path.exists(dir):
				os.makedirs(dir)
			if filename is None:
				if correl:
					filename = f'D{self._4x}_correl.csv'
				else:
					filename = f'D{self._4x}_covar.csv'
			with open(f'{dir}/{filename}', 'w') as fid:
fid.write(make_csv(out)) 2010 if print_out: 2011 self.msg('\n'+pretty_table(out)) 2012 if output == 'raw': 2013 return out 2014 elif output == 'pretty': 2015 return pretty_table(out) 2016 2017 @make_verbal 2018 def table_of_samples( 2019 self, 2020 dir = 'output', 2021 filename = None, 2022 save_to_file = True, 2023 print_out = True, 2024 output = None, 2025 ): 2026 ''' 2027 Print out, save to disk and/or return a table of samples. 2028 2029 **Parameters** 2030 2031 + `dir`: the directory in which to save the csv 2032 + `filename`: the name of the csv file to write to 2033 + `save_to_file`: whether to save the csv 2034 + `print_out`: whether to print out the table 2035 + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); 2036 if set to `'raw'`: return a list of list of strings 2037 (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`) 2038 ''' 2039 2040 out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']] 2041 for sample in self.anchors: 2042 out += [[ 2043 f"{sample}", 2044 f"{self.samples[sample]['N']}", 2045 f"{self.samples[sample]['d13C_VPDB']:.2f}", 2046 f"{self.samples[sample]['d18O_VSMOW']:.2f}", 2047 f"{self.samples[sample][f'D{self._4x}']:.4f}",'','', 2048 f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', '' 2049 ]] 2050 for sample in self.unknowns: 2051 out += [[ 2052 f"{sample}", 2053 f"{self.samples[sample]['N']}", 2054 f"{self.samples[sample]['d13C_VPDB']:.2f}", 2055 f"{self.samples[sample]['d18O_VSMOW']:.2f}", 2056 f"{self.samples[sample][f'D{self._4x}']:.4f}", 2057 f"{self.samples[sample][f'SE_D{self._4x}']:.4f}", 2058 f"± {self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}", 2059 f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', 2060 f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else '' 2061 ]] 2062 if save_to_file: 2063 if not os.path.exists(dir): 2064 
				os.makedirs(dir)
			if filename is None:
				filename = f'D{self._4x}_samples.csv'
			with open(f'{dir}/{filename}', 'w') as fid:
				fid.write(make_csv(out))
		if print_out:
			self.msg('\n'+pretty_table(out))
		if output == 'raw':
			return out
		elif output == 'pretty':
			return pretty_table(out)


	def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
		'''
		Generate session plots and save them to disk.

		**Parameters**

		+ `dir`: the directory in which to save the plots
		+ `figsize`: the width and height (in inches) of each plot
		+ `filetype`: 'pdf' or 'png'
		+ `dpi`: resolution for PNG output
		'''
		if not os.path.exists(dir):
			os.makedirs(dir)

		for session in self.sessions:
			sp = self.plot_single_session(session, xylimits = 'constant')
			ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
			ppl.close(sp.fig)


	@make_verbal
	def consolidate_samples(self):
		'''
		Compile various statistics for each sample.

		For each anchor sample:

		+ `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
		+ `SE_D47` or `SE_D48`: set to zero by definition

		For each unknown sample:

		+ `D47` or `D48`: the standardized Δ4x value for this unknown
		+ `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown

		For each anchor and unknown:

		+ `N`: the total number of analyses of this sample
		+ `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
		+ `d13C_VPDB`: the average δ13C_VPDB value for this sample
		+ `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
		+ `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
		  variance, indicating whether the Δ4x repeatability of this sample differs significantly from
		  that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
		'''
		D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
		for sample in self.samples:
			self.samples[sample]['N'] = len(self.samples[sample]['data'])
			if self.samples[sample]['N'] > 1:
				self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])

			self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
			self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])

			D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
			if len(D4x_pop) > 2:
				self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]

		if self.standardization_method == 'pooled':
			for sample in self.anchors:
				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
				self.samples[sample][f'SE_D{self._4x}'] = 0.
			for sample in self.unknowns:
				self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
				try:
					self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
				except ValueError:
					# when `sample` is constrained by self.standardize(constraints = {...}),
					# it is no longer listed in self.standardization.var_names.
					# Temporary fix: define SE as zero for now
					self.samples[sample][f'SE_D{self._4x}'] = 0.

		elif self.standardization_method == 'indep_sessions':
			for sample in self.anchors:
				self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
				self.samples[sample][f'SE_D{self._4x}'] = 0.
			for sample in self.unknowns:
				self.msg(f'Consolidating sample {sample}')
				self.unknowns[sample][f'session_D{self._4x}'] = {}
				session_avg = []
				for session in self.sessions:
					sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
					if sdata:
						self.msg(f'{sample} found in session {session}')
						avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
						avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
						# !! TODO: sigma_s below does not account for temporal changes in standardization error
						sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
						sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
						session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
						self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
				self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
				weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
				wsum = sum([weights[s] for s in weights])
				for s in weights:
					self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]

		for r in self:
			r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']


	def consolidate_sessions(self):
		'''
		Compute various statistics for each session.

		+ `Na`: Number of anchor analyses in the session
		+ `Nu`: Number of unknown analyses in the session
		+ `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
		+ `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
		+ `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
		+ `a`: scrambling factor
		+ `b`: compositional slope
		+ `c`: WG offset
		+ `SE_a`: Model standard error of `a`
		+ `SE_b`: Model standard error of `b`
		+ `SE_c`: Model standard error of `c`
		+ `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
		+ `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
		+ `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
		+ `a2`: scrambling factor drift
		+ `b2`: compositional slope drift
		+ `c2`: WG offset drift
		+ `Np`: Number of standardization parameters to fit
		+ `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
		+ `d13Cwg_VPDB`: δ13C_VPDB of WG
		+ `d18Owg_VSMOW`: δ18O_VSMOW of WG
		'''
		for session in self.sessions:
			if 'd13Cwg_VPDB' not in self.sessions[session]:
				self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
			if 'd18Owg_VSMOW' not in self.sessions[session]:
				self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
			self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
			self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])

			self.msg(f'Computing repeatabilities for session {session}')
			self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
			self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW',
samples = 'anchors', sessions = [session]) 2216 self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session]) 2217 2218 if self.standardization_method == 'pooled': 2219 for session in self.sessions: 2220 2221 self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}'] 2222 i = self.standardization.var_names.index(f'a_{pf(session)}') 2223 self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5 2224 2225 self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}'] 2226 i = self.standardization.var_names.index(f'b_{pf(session)}') 2227 self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5 2228 2229 self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}'] 2230 i = self.standardization.var_names.index(f'c_{pf(session)}') 2231 self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5 2232 2233 self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}'] 2234 if self.sessions[session]['scrambling_drift']: 2235 i = self.standardization.var_names.index(f'a2_{pf(session)}') 2236 self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5 2237 else: 2238 self.sessions[session]['SE_a2'] = 0. 2239 2240 self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}'] 2241 if self.sessions[session]['slope_drift']: 2242 i = self.standardization.var_names.index(f'b2_{pf(session)}') 2243 self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5 2244 else: 2245 self.sessions[session]['SE_b2'] = 0. 2246 2247 self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}'] 2248 if self.sessions[session]['wg_drift']: 2249 i = self.standardization.var_names.index(f'c2_{pf(session)}') 2250 self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5 2251 else: 2252 self.sessions[session]['SE_c2'] = 0. 
2253 2254 i = self.standardization.var_names.index(f'a_{pf(session)}') 2255 j = self.standardization.var_names.index(f'b_{pf(session)}') 2256 k = self.standardization.var_names.index(f'c_{pf(session)}') 2257 CM = np.zeros((6,6)) 2258 CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]] 2259 try: 2260 i2 = self.standardization.var_names.index(f'a2_{pf(session)}') 2261 CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]] 2262 CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2] 2263 try: 2264 j2 = self.standardization.var_names.index(f'b2_{pf(session)}') 2265 CM[3,4] = self.standardization.covar[i2,j2] 2266 CM[4,3] = self.standardization.covar[j2,i2] 2267 except ValueError: 2268 pass 2269 try: 2270 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2271 CM[3,5] = self.standardization.covar[i2,k2] 2272 CM[5,3] = self.standardization.covar[k2,i2] 2273 except ValueError: 2274 pass 2275 except ValueError: 2276 pass 2277 try: 2278 j2 = self.standardization.var_names.index(f'b2_{pf(session)}') 2279 CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]] 2280 CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2] 2281 try: 2282 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2283 CM[4,5] = self.standardization.covar[j2,k2] 2284 CM[5,4] = self.standardization.covar[k2,j2] 2285 except ValueError: 2286 pass 2287 except ValueError: 2288 pass 2289 try: 2290 k2 = self.standardization.var_names.index(f'c2_{pf(session)}') 2291 CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]] 2292 CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2] 2293 except ValueError: 2294 pass 2295 2296 self.sessions[session]['CM'] = CM 2297 2298 elif self.standardization_method == 'indep_sessions': 2299 pass # Not implemented yet 2300 2301 2302 @make_verbal 2303 def repeatabilities(self): 2304 ''' 2305 Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x 2306 (for all samples, for anchors, and for unknowns). 
		'''
		self.msg('Computing repeatabilities for all sessions')

		self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
		self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
		self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
		self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
		self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')


	@make_verbal
	def consolidate(self, tables = True, plots = True):
		'''
		Collect information about samples, sessions and repeatabilities.
		'''
		self.consolidate_samples()
		self.consolidate_sessions()
		self.repeatabilities()

		if tables:
			self.summary()
			self.table_of_sessions()
			self.table_of_analyses()
			self.table_of_samples()

		if plots:
			self.plot_sessions()


	@make_verbal
	def rmswd(self,
		samples = 'all samples',
		sessions = 'all sessions',
		):
		'''
		Compute the χ2, root mean squared weighted deviation
		(i.e. reduced χ2), and corresponding degrees of freedom of the
		Δ4x values for samples in `samples` and sessions in `sessions`.

		Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
2347 ''' 2348 if samples == 'all samples': 2349 mysamples = [k for k in self.samples] 2350 elif samples == 'anchors': 2351 mysamples = [k for k in self.anchors] 2352 elif samples == 'unknowns': 2353 mysamples = [k for k in self.unknowns] 2354 else: 2355 mysamples = samples 2356 2357 if sessions == 'all sessions': 2358 sessions = [k for k in self.sessions] 2359 2360 chisq, Nf = 0, 0 2361 for sample in mysamples : 2362 G = [ r for r in self if r['Sample'] == sample and r['Session'] in sessions ] 2363 if len(G) > 1 : 2364 X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G]) 2365 Nf += (len(G) - 1) 2366 chisq += np.sum([ ((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G]) 2367 r = (chisq / Nf)**.5 if Nf > 0 else 0 2368 self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.') 2369 return {'rmswd': r, 'chisq': chisq, 'Nf': Nf} 2370 2371 2372 @make_verbal 2373 def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'): 2374 ''' 2375 Compute the repeatability of `[r[key] for r in self]` 2376 ''' 2377 2378 if samples == 'all samples': 2379 mysamples = [k for k in self.samples] 2380 elif samples == 'anchors': 2381 mysamples = [k for k in self.anchors] 2382 elif samples == 'unknowns': 2383 mysamples = [k for k in self.unknowns] 2384 else: 2385 mysamples = samples 2386 2387 if sessions == 'all sessions': 2388 sessions = [k for k in self.sessions] 2389 2390 if key in ['D47', 'D48']: 2391 # Full disclosure: the definition of Nf is tricky/debatable 2392 G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions] 2393 chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum() 2394 Nf = len(G) 2395# print(f'len(G) = {Nf}') 2396 Nf -= len([s for s in mysamples if s in self.unknowns]) 2397# print(f'{len([s for s in mysamples if s in self.unknowns])} unknown samples to consider') 2398 for session in sessions: 2399 Np = len([ 2400 _ for _ in self.standardization.params 2401 if ( 2402 
						self.standardization.params[_].expr is not None
						and (
							(_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
							or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
							)
						)
					])
#				print(f'session {session}: {Np} parameters to consider')
				Na = len({
					r['Sample'] for r in self.sessions[session]['data']
					if r['Sample'] in self.anchors and r['Sample'] in mysamples
					})
#				print(f'session {session}: {Na} different anchors in that session')
				Nf -= min(Np, Na)
#			print(f'Nf = {Nf}')

#			for sample in mysamples :
#				X = [ r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions ]
#				if len(X) > 1 :
#					chisq += np.sum([ (x-self.samples[sample][key])**2 for x in X ])
#					if sample in self.unknowns:
#						Nf += len(X) - 1
#					else:
#						Nf += len(X)
#			if samples in ['anchors', 'all samples']:
#				Nf -= sum([self.sessions[s]['Np'] for s in sessions])
			r = (chisq / Nf)**.5 if Nf > 0 else 0

		else: # if key not in ['D47', 'D48']
			chisq, Nf = 0, 0
			for sample in mysamples:
				X = [r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions]
				if len(X) > 1:
					Nf += len(X) - 1
					chisq += np.sum([(x - np.mean(X))**2 for x in X])
			r = (chisq / Nf)**.5 if Nf > 0 else 0

		self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
		return r

	def sample_average(self, samples, weights = 'equal', normalize = True):
		'''
		Weighted average Δ4x value of a group of samples, accounting for covariance.

		Returns the weighted average Δ4x value and associated SE
		of a group of samples. Weights are equal by default. If `normalize` is
		true, `weights` will be rescaled so that their sum equals 1.

		**Examples**

		```python
		self.sample_average(['X','Y'], [1, 2])
		```

		returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
		where Δ4x(X) and Δ4x(Y) are the average Δ4x
		values of samples X and Y, respectively.

		```python
		self.sample_average(['X','Y'], [1, -1], normalize = False)
		```

		returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
		'''
		if weights == 'equal':
			weights = [1/len(samples)] * len(samples)

		if normalize:
			s = sum(weights)
			if s:
				weights = [w/s for w in weights]

		try:
#			indices = [self.standardization.var_names.index(f'D47_{pf(sample)}') for sample in samples]
#			C = self.standardization.covar[indices,:][:,indices]
			C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
			X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
			return correlated_sum(X, C, weights)
		except ValueError:
			return (0., 0.)


	def sample_D4x_covar(self, sample1, sample2 = None):
		'''
		Covariance between Δ4x values of samples

		Returns the error covariance between the average Δ4x values of two
		samples. If only `sample1` is specified, or if `sample1 == sample2`,
		returns the Δ4x variance for that sample.
2491 ''' 2492 if sample2 is None: 2493 sample2 = sample1 2494 if self.standardization_method == 'pooled': 2495 i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}') 2496 j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}') 2497 return self.standardization.covar[i, j] 2498 elif self.standardization_method == 'indep_sessions': 2499 if sample1 == sample2: 2500 return self.samples[sample1][f'SE_D{self._4x}']**2 2501 else: 2502 c = 0 2503 for session in self.sessions: 2504 sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1] 2505 sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2] 2506 if sdata1 and sdata2: 2507 a = self.sessions[session]['a'] 2508 # !! TODO: CM below does not account for temporal changes in standardization parameters 2509 CM = self.sessions[session]['CM'][:3,:3] 2510 avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1]) 2511 avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1]) 2512 avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2]) 2513 avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2]) 2514 c += ( 2515 self.unknowns[sample1][f'session_D{self._4x}'][session][2] 2516 * self.unknowns[sample2][f'session_D{self._4x}'][session][2] 2517 * np.array([[avg_D4x_1, avg_d4x_1, 1]]) 2518 @ CM 2519 @ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T 2520 ) / a**2 2521 return float(c) 2522 2523 def sample_D4x_correl(self, sample1, sample2 = None): 2524 ''' 2525 Correlation between Δ4x errors of samples 2526 2527 Returns the error correlation between the average Δ4x values of two samples. 2528 ''' 2529 if sample2 is None or sample2 == sample1: 2530 return 1. 
2531 return ( 2532 self.sample_D4x_covar(sample1, sample2) 2533 / self.unknowns[sample1][f'SE_D{self._4x}'] 2534 / self.unknowns[sample2][f'SE_D{self._4x}'] 2535 ) 2536 2537 def plot_single_session(self, 2538 session, 2539 kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4), 2540 kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4), 2541 kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75), 2542 kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75), 2543 kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75), 2544 xylimits = 'free', # | 'constant' 2545 x_label = None, 2546 y_label = None, 2547 error_contour_interval = 'auto', 2548 fig = 'new', 2549 ): 2550 ''' 2551 Generate plot for a single session 2552 ''' 2553 if x_label is None: 2554 x_label = f'δ$_{{{self._4x}}}$ (‰)' 2555 if y_label is None: 2556 y_label = f'Δ$_{{{self._4x}}}$ (‰)' 2557 2558 out = _SessionPlot() 2559 anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]] 2560 unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]] 2561 2562 if fig == 'new': 2563 out.fig = ppl.figure(figsize = (6,6)) 2564 ppl.subplots_adjust(.1,.1,.9,.9) 2565 2566 out.anchor_analyses, = ppl.plot( 2567 [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors], 2568 [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors], 2569 **kw_plot_anchors) 2570 out.unknown_analyses, = ppl.plot( 2571 [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns], 2572 [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns], 2573 **kw_plot_unknowns) 2574 out.anchor_avg = ppl.plot( 2575 np.array([ np.array([ 2576 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if 
r['Sample'] == sample]) - 1, 2577 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2578 ]) for sample in anchors]).T, 2579 np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T, 2580 **kw_plot_anchor_avg) 2581 out.unknown_avg = ppl.plot( 2582 np.array([ np.array([ 2583 np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1, 2584 np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1 2585 ]) for sample in unknowns]).T, 2586 np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T, 2587 **kw_plot_unknown_avg) 2588 if xylimits == 'constant': 2589 x = [r[f'd{self._4x}'] for r in self] 2590 y = [r[f'D{self._4x}'] for r in self] 2591 x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y) 2592 w, h = x2-x1, y2-y1 2593 x1 -= w/20 2594 x2 += w/20 2595 y1 -= h/20 2596 y2 += h/20 2597 ppl.axis([x1, x2, y1, y2]) 2598 elif xylimits == 'free': 2599 x1, x2, y1, y2 = ppl.axis() 2600 else: 2601 x1, x2, y1, y2 = ppl.axis(xylimits) 2602 2603 if error_contour_interval != 'none': 2604 xi, yi = np.linspace(x1, x2), np.linspace(y1, y2) 2605 XI,YI = np.meshgrid(xi, yi) 2606 SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi]) 2607 if error_contour_interval == 'auto': 2608 rng = np.max(SI) - np.min(SI) 2609 if rng <= 0.01: 2610 cinterval = 0.001 2611 elif rng <= 0.03: 2612 cinterval = 0.004 2613 elif rng <= 0.1: 2614 cinterval = 0.01 2615 elif rng <= 0.3: 2616 cinterval = 0.03 2617 elif rng <= 1.: 2618 cinterval = 0.1 2619 else: 2620 cinterval = 0.5 2621 else: 2622 cinterval = error_contour_interval 2623 2624 cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval) 2625 out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error) 2626 out.clabel = ppl.clabel(out.contour) 2627 2628 ppl.xlabel(x_label) 2629 
ppl.ylabel(y_label) 2630 ppl.title(session, weight = 'bold') 2631 ppl.grid(alpha = .2) 2632 out.ax = ppl.gca() 2633 2634 return out 2635 2636 def plot_residuals( 2637 self, 2638 kde = False, 2639 hist = False, 2640 binwidth = 2/3, 2641 dir = 'output', 2642 filename = None, 2643 highlight = [], 2644 colors = None, 2645 figsize = None, 2646 dpi = 100, 2647 yspan = None, 2648 ): 2649 ''' 2650 Plot residuals of each analysis as a function of time (actually, as a function of 2651 the order of analyses in the `D4xdata` object) 2652 2653 + `kde`: whether to add a kernel density estimate of residuals 2654 + `hist`: whether to add a histogram of residuals (incompatible with `kde`) 2655 + `binwidth`: bin width of the histogram, in units of the Δ4x repeatability 2656 + `dir`: the directory in which to save the plot 2657 + `highlight`: a list of samples to highlight 2658 + `colors`: a dict of `{<sample>: <color>}` for all samples 2659 + `figsize`: (width, height) of figure 2660 + `dpi`: resolution for PNG output 2661 + `yspan`: factor controlling the range of y values shown in plot 2662 (by default: `yspan = 1.5 if kde else 1.0`) 2663 ''' 2664 2665 from matplotlib import ticker 2666 2667 if yspan is None: 2668 if kde: 2669 yspan = 1.5 2670 else: 2671 yspan = 1.0 2672 2673 # Layout 2674 fig = ppl.figure(figsize = (8,4) if figsize is None else figsize) 2675 if hist or kde: 2676 ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72) 2677 ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15) 2678 else: 2679 ppl.subplots_adjust(.08,.05,.78,.8) 2680 ax1 = ppl.subplot(111) 2681 2682 # Colors 2683 N = len(self.anchors) 2684 if colors is None: 2685 if len(highlight) > 0: 2686 Nh = len(highlight) 2687 if Nh == 1: 2688 colors = {highlight[0]: (0,0,0)} 2689 elif Nh == 3: 2690 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])} 2691 elif Nh == 4: 2692 colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2693 else: 2694 colors = {a: 
hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)} 2695 else: 2696 if N == 3: 2697 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])} 2698 elif N == 4: 2699 colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])} 2700 else: 2701 colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)} 2702 2703 ppl.sca(ax1) 2704 2705 ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75) 2706 2707 ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$')) 2708 2709 session = self[0]['Session'] 2710 x1 = 0 2711# ymax = np.max([1e3 * (r['D47'] - self.samples[r['Sample']]['D47']) for r in self]) 2712 x_sessions = {} 2713 one_or_more_singlets = False 2714 one_or_more_multiplets = False 2715 multiplets = set() 2716 for k,r in enumerate(self): 2717 if r['Session'] != session: 2718 x2 = k-1 2719 x_sessions[session] = (x1+x2)/2 2720 ppl.axvline(k - 0.5, color = 'k', lw = .5) 2721 session = r['Session'] 2722 x1 = k 2723 singlet = len(self.samples[r['Sample']]['data']) == 1 2724 if not singlet: 2725 multiplets.add(r['Sample']) 2726 if r['Sample'] in self.unknowns: 2727 if singlet: 2728 one_or_more_singlets = True 2729 else: 2730 one_or_more_multiplets = True 2731 kw = dict( 2732 marker = 'x' if singlet else '+', 2733 ms = 4 if singlet else 5, 2734 ls = 'None', 2735 mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0), 2736 mew = 1, 2737 alpha = 0.2 if singlet else 1, 2738 ) 2739 if highlight and r['Sample'] not in highlight: 2740 kw['alpha'] = 0.2 2741 ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw) 2742 x2 = k 2743 x_sessions[session] = (x1+x2)/2 2744 2745 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1) 2746 ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1) 2747 if not (hist or 
kde): 2748 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center') 2749 ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f" 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center') 2750 2751 xmin, xmax, ymin, ymax = ppl.axis() 2752 if yspan != 1: 2753 ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2 2754 for s in x_sessions: 2755 ppl.text( 2756 x_sessions[s], 2757 ymax +1, 2758 s, 2759 va = 'bottom', 2760 **( 2761 dict(ha = 'center') 2762 if len(self.sessions[s]['data']) > (0.15 * len(self)) 2763 else dict(ha = 'left', rotation = 45) 2764 ) 2765 ) 2766 2767 if hist or kde: 2768 ppl.sca(ax2) 2769 2770 for s in colors: 2771 kw['marker'] = '+' 2772 kw['ms'] = 5 2773 kw['mec'] = colors[s] 2774 kw['label'] = s 2775 kw['alpha'] = 1 2776 ppl.plot([], [], **kw) 2777 2778 kw['mec'] = (0,0,0) 2779 2780 if one_or_more_singlets: 2781 kw['marker'] = 'x' 2782 kw['ms'] = 4 2783 kw['alpha'] = .2 2784 kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other' 2785 ppl.plot([], [], **kw) 2786 2787 if one_or_more_multiplets: 2788 kw['marker'] = '+' 2789 kw['ms'] = 4 2790 kw['alpha'] = 1 2791 kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other' 2792 ppl.plot([], [], **kw) 2793 2794 if hist or kde: 2795 leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform=fig.transFigure, borderaxespad = 1.5, fontsize = 9) 2796 else: 2797 leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform=fig.transFigure, borderaxespad = 1.5) 2798 leg.set_zorder(-1000) 2799 2800 ppl.sca(ax1) 2801 2802 ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)') 2803 ppl.xticks([]) 2804 ppl.axis([-1, len(self), None, None]) 2805 2806 if hist or kde: 2807 ppl.sca(ax2) 2808 X = 1e3 * 
np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors]) 2809 2810 if kde: 2811 from scipy.stats import gaussian_kde 2812 yi = np.linspace(ymin, ymax, 201) 2813 xi = gaussian_kde(X).evaluate(yi) 2814 ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1)) 2815# ppl.plot(xi, yi, 'k-', lw = 1) 2816 elif hist: 2817 ppl.hist( 2818 X, 2819 orientation = 'horizontal', 2820 histtype = 'stepfilled', 2821 ec = [.4]*3, 2822 fc = [.25]*3, 2823 alpha = .25, 2824 bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)), 2825 ) 2826 ppl.text(0, 0, 2827 f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", 2828 size = 7.5, 2829 alpha = 1, 2830 va = 'center', 2831 ha = 'left', 2832 ) 2833 2834 ppl.axis([0, None, ymin, ymax]) 2835 ppl.xticks([]) 2836 ppl.yticks([]) 2837# ax2.spines['left'].set_visible(False) 2838 ax2.spines['right'].set_visible(False) 2839 ax2.spines['top'].set_visible(False) 2840 ax2.spines['bottom'].set_visible(False) 2841 2842 ax1.axis([None, None, ymin, ymax]) 2843 2844 if not os.path.exists(dir): 2845 os.makedirs(dir) 2846 if filename is None: 2847 return fig 2848 elif filename == '': 2849 filename = f'D{self._4x}_residuals.pdf' 2850 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 2851 ppl.close(fig) 2852 2853 2854 def simulate(self, *args, **kwargs): 2855 ''' 2856 Legacy function with warning message pointing to `virtual_data()` 2857 ''' 2858 raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()') 2859 2860 def plot_distribution_of_analyses( 2861 self, 2862 dir = 'output', 2863 filename = None, 2864 vs_time = False, 2865 figsize = (6,4), 2866 subplots_adjust = (0.02, 0.13, 0.85, 0.8), 2867 output = None, 2868 dpi = 100, 2869 ): 2870 ''' 2871 Plot temporal distribution of all analyses in the data 
set. 2872 2873 **Parameters** 2874 2875 + `dir`: the directory in which to save the plot 2876 + `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially. 2877 + `filename`: the name of the file to save the plot to 2878 + `figsize`: (width, height) of figure 2879 + `dpi`: resolution for PNG output 2880 ''' 2881 2882 asamples = [s for s in self.anchors] 2883 usamples = [s for s in self.unknowns] 2884 if output is None or output == 'fig': 2885 fig = ppl.figure(figsize = figsize) 2886 ppl.subplots_adjust(*subplots_adjust) 2887 Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)]) 2888 Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)]) 2889 Xmax += (Xmax-Xmin)/40 2890 Xmin -= (Xmax-Xmin)/41 2891 for k, s in enumerate(asamples + usamples): 2892 if vs_time: 2893 X = [r['TimeTag'] for r in self if r['Sample'] == s] 2894 else: 2895 X = [x for x,r in enumerate(self) if r['Sample'] == s] 2896 Y = [-k for x in X] 2897 ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75) 2898 ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25) 2899 ppl.text(Xmax, -k, f' {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r') 2900 ppl.axis([Xmin, Xmax, -k-1, 1]) 2901 ppl.xlabel('\ntime') 2902 ppl.gca().annotate('', 2903 xy = (0.6, -0.02), 2904 xycoords = 'axes fraction', 2905 xytext = (.4, -0.02), 2906 arrowprops = dict(arrowstyle = "->", color = 'k'), 2907 ) 2908 2909 2910 x2 = -1 2911 for session in self.sessions: 2912 x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session]) 2913 if vs_time: 2914 ppl.axvline(x1, color = 'k', lw = .75) 2915 if x2 > -1: 2916 if not vs_time: 2917 ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5) 2918 x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session]) 2919# from xlrd import xldate_as_datetime 2920# print(session, xldate_as_datetime(x1, 
0), xldate_as_datetime(x2, 0)) 2921 if vs_time: 2922 ppl.axvline(x2, color = 'k', lw = .75) 2923 ppl.axvspan(x1,x2,color = 'k', zorder = -100, alpha = .15) 2924 ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8) 2925 2926 ppl.xticks([]) 2927 ppl.yticks([]) 2928 2929 if output is None: 2930 if not os.path.exists(dir): 2931 os.makedirs(dir) 2932 if filename is None: 2933 filename = f'D{self._4x}_distribution_of_analyses.pdf' 2934 ppl.savefig(f'{dir}/{filename}', dpi = dpi) 2935 ppl.close(fig) 2936 elif output == 'ax': 2937 return ppl.gca() 2938 elif output == 'fig': 2939 return fig 2940 2941 2942 def plot_bulk_compositions( 2943 self, 2944 samples = None, 2945 dir = 'output/bulk_compositions', 2946 figsize = (6,6), 2947 subplots_adjust = (0.15, 0.12, 0.95, 0.92), 2948 show = False, 2949 sample_color = (0,.5,1), 2950 analysis_color = (.7,.7,.7), 2951 labeldist = 0.3, 2952 radius = 0.05, 2953 ): 2954 ''' 2955 Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses. 2956 2957 By default, creates a directory `./output/bulk_compositions` where plots for 2958 each sample are saved. Another plot named `__all__.pdf` shows all analyses together. 2959 2960 2961 **Parameters** 2962 2963 + `samples`: Only these samples are processed (by default: all samples). 2964 + `dir`: where to save the plots 2965 + `figsize`: (width, height) of figure 2966 + `subplots_adjust`: passed to `subplots_adjust()` 2967 + `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples, 2968 allowing for interactive visualization/exploration in (δ13C, δ18O) space. 2969 + `sample_color`: color used for sample markers/labels 2970 + `analysis_color`: color used for replicate markers/labels 2971 + `labeldist`: distance (in inches) from replicate markers to replicate labels 2972 + `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`. 
2973 ''' 2974 2975 from matplotlib.patches import Ellipse 2976 2977 if samples is None: 2978 samples = [_ for _ in self.samples] 2979 2980 saved = {} 2981 2982 for s in samples: 2983 2984 fig = ppl.figure(figsize = figsize) 2985 fig.subplots_adjust(*subplots_adjust) 2986 ax = ppl.subplot(111) 2987 ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)') 2988 ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)') 2989 ppl.title(s) 2990 2991 2992 XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']]) 2993 UID = [_['UID'] for _ in self.samples[s]['data']] 2994 XY0 = XY.mean(0) 2995 2996 for xy in XY: 2997 ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color) 2998 2999 ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color) 3000 ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color) 3001 ppl.text(*XY0, f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold') 3002 saved[s] = [XY, XY0] 3003 3004 x1, x2, y1, y2 = ppl.axis() 3005 x0, dx = (x1+x2)/2, (x2-x1)/2 3006 y0, dy = (y1+y2)/2, (y2-y1)/2 3007 dx, dy = [max(max(dx, dy), radius)]*2 3008 3009 ppl.axis([ 3010 x0 - 1.2*dx, 3011 x0 + 1.2*dx, 3012 y0 - 1.2*dy, 3013 y0 + 1.2*dy, 3014 ]) 3015 3016 XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0)) 3017 3018 for xy, uid in zip(XY, UID): 3019 3020 xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy)) 3021 vector_in_display_space = xy_in_display_space - XY0_in_display_space 3022 3023 if (vector_in_display_space**2).sum() > 0: 3024 3025 unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5 3026 label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist 3027 label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space 3028 label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space)) 3029 3030 
ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color) 3031 3032 else: 3033 3034 ppl.text(*xy, f'{uid} ', va = 'center', ha = 'right', color = analysis_color) 3035 3036 if radius: 3037 ax.add_artist(Ellipse( 3038 xy = XY0, 3039 width = radius*2, 3040 height = radius*2, 3041 ls = (0, (2,2)), 3042 lw = .7, 3043 ec = analysis_color, 3044 fc = 'None', 3045 )) 3046 ppl.text( 3047 XY0[0], 3048 XY0[1]-radius, 3049 f'\n± {radius*1e3:.0f} ppm', 3050 color = analysis_color, 3051 va = 'top', 3052 ha = 'center', 3053 linespacing = 0.4, 3054 size = 8, 3055 ) 3056 3057 if not os.path.exists(dir): 3058 os.makedirs(dir) 3059 fig.savefig(f'{dir}/{s}.pdf') 3060 ppl.close(fig) 3061 3062 fig = ppl.figure(figsize = figsize) 3063 fig.subplots_adjust(*subplots_adjust) 3064 ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)') 3065 ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)') 3066 3067 for s in saved: 3068 for xy in saved[s][0]: 3069 ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color) 3070 ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color) 3071 ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color) 3072 ppl.text(*saved[s][1], f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold') 3073 3074 x1, x2, y1, y2 = ppl.axis() 3075 ppl.axis([ 3076 x1 - (x2-x1)/10, 3077 x2 + (x2-x1)/10, 3078 y1 - (y2-y1)/10, 3079 y2 + (y2-y1)/10, 3080 ]) 3081 3082 3083 if not os.path.exists(dir): 3084 os.makedirs(dir) 3085 fig.savefig(f'{dir}/__all__.pdf') 3086 if show: 3087 ppl.show() 3088 ppl.close(fig) 3089 3090 3091 def _save_D4x_correl( 3092 self, 3093 samples = None, 3094 dir = 'output', 3095 filename = None, 3096 D4x_precision = 4, 3097 correl_precision = 4, 3098 ): 3099 ''' 3100 Save D4x values along with their SE and correlation matrix. 3101 3102 **Parameters** 3103 3104 + `samples`: Only these samples are output (by default: all samples). 
3105 + `dir`: the directory in which to save the file (by default: `output`) 3106 + `filename`: the name of the csv file to write to (by default: `D4x_correl.csv`) 3107 + `D4x_precision`: the precision to use when writing `D4x` and `D4x_SE` values (by default: 4) 3108 + `correl_precision`: the precision to use when writing correlation factor values (by default: 4) 3109 ''' 3110 if samples is None: 3111 samples = sorted([s for s in self.unknowns]) 3112 3113 out = [['Sample']] + [[s] for s in samples] 3114 out[0] += [f'D{self._4x}', f'D{self._4x}_SE', f'D{self._4x}_correl'] 3115 for k,s in enumerate(samples): 3116 out[k+1] += [f'{self.samples[s][f"D{self._4x}"]:.{D4x_precision}f}', f'{self.samples[s][f"SE_D{self._4x}"]:.{D4x_precision}f}'] 3117 for s2 in samples: 3118 out[k+1] += [f'{self.sample_D4x_correl(s,s2):.{correl_precision}f}'] 3119 3120 if not os.path.exists(dir): 3121 os.makedirs(dir) 3122 if filename is None: 3123 filename = f'D{self._4x}_correl.csv' 3124 with open(f'{dir}/{filename}', 'w') as fid: 3125 fid.write(make_csv(out))
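The conversion performed by `sample_D4x_correl()` is simply the corresponding covariance normalized by the two standard errors. A minimal standalone sketch, with made-up numbers for illustration:

```python
def correl_from_covar(covar, se1, se2):
    # error correlation = covariance / (SE_1 * SE_2)
    return covar / (se1 * se2)

# hypothetical covariance and standard errors of two samples
print(correl_from_covar(2.0, 2.0, 4.0))  # 0.25
```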
Store and process data for a large set of Δ47 and/or Δ48 analyses, usually comprising more than one analytical session.
923 def __init__(self, l = [], mass = '47', logfile = '', session = 'mySession', verbose = False): 924 ''' 925 **Parameters** 926 927 + `l`: a list of dictionaries, with each dictionary including at least the keys 928 `Sample`, `d45`, `d46`, and `d47` or `d48`. 929 + `mass`: `'47'` or `'48'` 930 + `logfile`: if specified, write detailed logs to this file path when calling `D4xdata` methods. 931 + `session`: define session name for analyses without a `Session` key 932 + `verbose`: if `True`, print out detailed logs when calling `D4xdata` methods. 933 934 Returns a `D4xdata` object derived from `list`. 935 ''' 936 self._4x = mass 937 self.verbose = verbose 938 self.prefix = 'D4xdata' 939 self.logfile = logfile 940 list.__init__(self, l) 941 self.Nf = None 942 self.repeatability = {} 943 self.refresh(session = session)
Parameters
l
: a list of dictionaries, with each dictionary including at least the keysSample
,d45
,d46
, andd47
ord48
.mass
:'47'
or'48'
logfile
: if specified, write detailed logs to this file path when callingD4xdata
methods.session
: define session name for analyses without aSession
keyverbose
: ifTrue
, print out detailed logs when callingD4xdata
methods.
Returns a D4xdata
object derived from list
.
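As `list.__init__(self, l)` in the constructor above suggests, a `D4xdata` object is a `list` subclass whose elements are the analyses themselves. A stripped-down illustration of that pattern (`Analyses` is a hypothetical stand-in, not the real class):

```python
class Analyses(list):
    # minimal stand-in for the list-subclassing pattern used by D4xdata
    def __init__(self, l=None, session='mySession'):
        list.__init__(self, l or [])
        # fill in a default Session key for analyses that lack one,
        # analogous to what D4xdata.refresh() does
        for r in self:
            r.setdefault('Session', session)

data = Analyses([{'Sample': 'ETH-1', 'd45': 5.795}])
print(len(data), data[0]['Session'])  # 1 mySession
```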
Absolute (18O/16O) ratio of VSMOW. By default equal to 0.0020052 (Baertschi, 1976)
Mass-dependent exponent for triple oxygen isotopes. By default equal to 0.528 (Barkan & Luz, 2005)
Absolute (17O/16O) ratio of VSMOW.
By default equal to 0.00038475
(Assonov & Brenninkmeijer, 2003,
rescaled to R13_VPDB
)
Absolute (18O/16O) ratio of VPDB.
By definition equal to R18_VSMOW * 1.03092
.
Absolute (17O/16O) ratio of VPDB.
By definition equal to R17_VSMOW * 1.03092 ** LAMBDA_17
.
After the Δ4x standardization step, each sample is tested to assess whether the Δ4x variance within all analyses for that sample differs significantly from that observed for a given reference sample (using Levene's test, which yields a p-value corresponding to the null hypothesis that the underlying variances are equal).
LEVENE_REF_SAMPLE
(by default equal to 'ETH-3'
) specifies which
sample should be used as a reference for this test.
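Assuming `scipy` is available, a variance check of this kind can be run directly with `scipy.stats.levene` (the sample values below are invented for illustration):

```python
from scipy.stats import levene

# invented residual-like values for a reference sample and a test sample
ref = [0.001, -0.002, 0.003, -0.001, 0.000]
test = [0.002, -0.003, 0.001, -0.002, 0.002]

# null hypothesis: both samples have equal underlying variance
stat, p = levene(ref, test)
print(stat, p)
```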
Specifies the 18O/16O fractionation factor generally applicable
to acid reactions in the dataset. Currently used by D4xdata.wg()
,
D4xdata.standardize_d13C
, and D4xdata.standardize_d18O
.
By default equal to 1.008129 (calcite reacted at 90 °C, Kim et al., 2007).
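As a quick sanity check, this factor can be applied by hand to predict the δ18O of CO2 evolved from a carbonate of known composition (a back-of-the-envelope sketch, not a library call):

```python
ALPHA_18O_ACID = 1.008129  # calcite reacted at 90 °C (Kim et al., 2007)

def acid_reacted_d18O(d18O_carbonate):
    # delta values compose multiplicatively through the fractionation factor:
    # (1 + d_CO2/1000) = alpha * (1 + d_carbonate/1000)
    return 1000 * ((1 + d18O_carbonate / 1000) * ALPHA_18O_ACID - 1)

print(round(acid_reacted_d18O(0.0), 3))  # 8.129
```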
Nominal δ13CVPDB values assigned to carbonate standards, used by
D4xdata.standardize_d13C()
.
By default equal to {'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}
after
Bernasconi et al. (2018).
Nominal δ18OVPDB values assigned to carbonate standards, used by
D4xdata.standardize_d18O()
.
By default equal to {'ETH-1': -2.19, 'ETH-2': -18.69, 'ETH-3': -1.78}
after
Bernasconi et al. (2018).
Method by which to standardize δ13C values:
none
: do not apply any δ13C standardization.'1pt'
: within each session, offset all initial δ13C values so as to minimize the difference between final δ13CVPDB values andNominal_d13C_VPDB
(averaged over all analyses for whichNominal_d13C_VPDB
is defined).'2pt'
: within each session, apply an affine transformation to all δ13C values so as to minimize the difference between final δ13CVPDB values and Nominal_d13C_VPDB
(averaged over all analyses for whichNominal_d13C_VPDB
is defined).
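The '1pt' method amounts to a single additive offset per session, chosen so that the anchors match their nominal values on average. A self-contained sketch with invented measured values:

```python
Nominal_d13C_VPDB = {'ETH-1': 2.02, 'ETH-2': -10.17, 'ETH-3': 1.71}

# one session of measured d13C values (invented numbers)
session = {'ETH-1': 2.10, 'ETH-2': -10.05, 'MYSAMPLE-1': 0.50}

# average offset over the analyses for which a nominal value is defined
diffs = [Nominal_d13C_VPDB[s] - v for s, v in session.items() if s in Nominal_d13C_VPDB]
offset = sum(diffs) / len(diffs)

# apply the same offset to every analysis in the session
corrected = {s: v + offset for s, v in session.items()}
print(round(corrected['MYSAMPLE-1'], 6))
```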
Method by which to standardize δ18O values:
none
: do not apply any δ18O standardization.'1pt'
: within each session, offset all initial δ18O values so as to minimize the difference between final δ18OVPDB values andNominal_d18O_VPDB
(averaged over all analyses for whichNominal_d18O_VPDB
is defined).'2pt'
: within each session, apply an affine transformation to all δ18O values so as to minimize the difference between final δ18OVPDB values and Nominal_d18O_VPDB
(averaged over all analyses for whichNominal_d18O_VPDB
is defined).
946 def make_verbal(oldfun): 947 ''' 948 Decorator: allow temporarily changing `self.prefix` and overriding `self.verbose`. 949 ''' 950 @wraps(oldfun) 951 def newfun(*args, verbose = '', **kwargs): 952 myself = args[0] 953 oldprefix = myself.prefix 954 myself.prefix = oldfun.__name__ 955 if verbose != '': 956 oldverbose = myself.verbose 957 myself.verbose = verbose 958 out = oldfun(*args, **kwargs) 959 myself.prefix = oldprefix 960 if verbose != '': 961 myself.verbose = oldverbose 962 return out 963 return newfun
Decorator: allow temporarily changing self.prefix
and overriding self.verbose
.
966 def msg(self, txt): 967 ''' 968 Log a message to `self.logfile`, and print it out if `verbose = True` 969 ''' 970 self.log(txt) 971 if self.verbose: 972 print(f'{f"[{self.prefix}]":<16} {txt}')
Log a message to self.logfile
, and print it out if verbose = True
975 def vmsg(self, txt): 976 ''' 977 Log a message to `self.logfile` and print it out 978 ''' 979 self.log(txt) 980 print(txt)
Log a message to self.logfile
and print it out
983 def log(self, *txts): 984 ''' 985 Log a message to `self.logfile` 986 ''' 987 if self.logfile: 988 with open(self.logfile, 'a') as fid: 989 for txt in txts: 990 fid.write(f'\n{dt.now().strftime("%Y-%m-%d %H:%M:%S")} {f"[{self.prefix}]":<16} {txt}')
Log a message to self.logfile
993 def refresh(self, session = 'mySession'): 994 ''' 995 Update `self.sessions`, `self.samples`, `self.anchors`, and `self.unknowns`. 996 ''' 997 self.fill_in_missing_info(session = session) 998 self.refresh_sessions() 999 self.refresh_samples()
Update self.sessions
, self.samples
, self.anchors
, and self.unknowns
.
1002 def refresh_sessions(self): 1003 ''' 1004 Update `self.sessions` and set `scrambling_drift`, `slope_drift`, and `wg_drift` 1005 to `False` for all sessions. 1006 ''' 1007 self.sessions = { 1008 s: {'data': [r for r in self if r['Session'] == s]} 1009 for s in sorted({r['Session'] for r in self}) 1010 } 1011 for s in self.sessions: 1012 self.sessions[s]['scrambling_drift'] = False 1013 self.sessions[s]['slope_drift'] = False 1014 self.sessions[s]['wg_drift'] = False 1015 self.sessions[s]['d13C_standardization_method'] = self.d13C_STANDARDIZATION_METHOD 1016 self.sessions[s]['d18O_standardization_method'] = self.d18O_STANDARDIZATION_METHOD
Update self.sessions
and set scrambling_drift
, slope_drift
, and wg_drift
to False
for all sessions.
1019 def refresh_samples(self): 1020 ''' 1021 Define `self.samples`, `self.anchors`, and `self.unknowns`. 1022 ''' 1023 self.samples = { 1024 s: {'data': [r for r in self if r['Sample'] == s]} 1025 for s in sorted({r['Sample'] for r in self}) 1026 } 1027 self.anchors = {s: self.samples[s] for s in self.samples if s in self.Nominal_D4x} 1028 self.unknowns = {s: self.samples[s] for s in self.samples if s not in self.Nominal_D4x}
Define self.samples
, self.anchors
, and self.unknowns
.
1031 def read(self, filename, sep = '', session = ''): 1032 ''' 1033 Read file in csv format to load data into a `D47data` object. 1034 1035 In the csv file, spaces before and after field separators (`','` by default) 1036 are optional. Each line corresponds to a single analysis. 1037 1038 The required fields are: 1039 1040 + `UID`: a unique identifier 1041 + `Session`: an identifier for the analytical session 1042 + `Sample`: a sample identifier 1043 + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values 1044 1045 Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to 1046 VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` 1047 and `d49` are optional, and set to NaN by default. 1048 1049 **Parameters** 1050 1051 + `filename`: the path of the file to read 1052 + `sep`: csv separator delimiting the fields 1053 + `session`: set `Session` field to this string for all analyses 1054 ''' 1055 with open(filename) as fid: 1056 self.input(fid.read(), sep = sep, session = session)
Read file in csv format to load data into a D47data
object.
In the csv file, spaces before and after field separators (','
by default)
are optional. Each line corresponds to a single analysis.
The required fields are:
UID
: a unique identifierSession
: an identifier for the analytical sessionSample
: a sample identifierd45
,d46
, and at least one ofd47
ord48
: the working-gas delta values
Independently known oxygen-17 anomalies may be provided as D17O
(in ‰ relative to
VSMOW, λ = self.LAMBDA_17
), and are otherwise assumed to be zero. Working-gas deltas d47
, d48
and d49
are optional, and set to NaN by default.
Parameters
filename
: the path of the file to readsep
: csv separator delimiting the fieldssession
: setSession
field to this string for all analyses
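Internally, `read()` passes the file contents to `input()`, which splits the text and converts every field except `UID`, `Session` and `Sample` to a number. A simplified standalone version of that parsing step (`smart_type` is approximated here):

```python
def smart_type(v):
    # rough stand-in for D47crunch's smart_type: try int, then float, else keep the string
    for cast in (int, float):
        try:
            return cast(v)
        except ValueError:
            pass
    return v

def parse_csv(txt, sep=','):
    # split into stripped fields, one analysis per line
    lines = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()]
    return [
        {k: (v if k in ('UID', 'Session', 'Sample') else smart_type(v))
         for k, v in zip(lines[0], l) if v != ''}
        for l in lines[1:]
    ]

data = parse_csv('UID, Sample, d45\nA01, ETH-1, 5.79502')
print(data)  # [{'UID': 'A01', 'Sample': 'ETH-1', 'd45': 5.79502}]
```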
1059 def input(self, txt, sep = '', session = ''): 1060 ''' 1061 Read `txt` string in csv format to load analysis data into a `D47data` object. 1062 1063 In the csv string, spaces before and after field separators (`','` by default) 1064 are optional. Each line corresponds to a single analysis. 1065 1066 The required fields are: 1067 1068 + `UID`: a unique identifier 1069 + `Session`: an identifier for the analytical session 1070 + `Sample`: a sample identifier 1071 + `d45`, `d46`, and at least one of `d47` or `d48`: the working-gas delta values 1072 1073 Independently known oxygen-17 anomalies may be provided as `D17O` (in ‰ relative to 1074 VSMOW, λ = `self.LAMBDA_17`), and are otherwise assumed to be zero. Working-gas deltas `d47`, `d48` 1075 and `d49` are optional, and set to NaN by default. 1076 1077 **Parameters** 1078 1079 + `txt`: the csv string to read 1080 + `sep`: csv separator delimiting the fields. By default, use `,`, `;`, or `\t`, 1081 whichever appears most often in `txt`. 1082 + `session`: set `Session` field to this string for all analyses 1083 ''' 1084 if sep == '': 1085 sep = sorted(',;\t', key = lambda x: - txt.count(x))[0] 1086 txt = [[x.strip() for x in l.split(sep)] for l in txt.splitlines() if l.strip()] 1087 data = [{k: v if k in ['UID', 'Session', 'Sample'] else smart_type(v) for k,v in zip(txt[0], l) if v != ''} for l in txt[1:]] 1088 1089 if session != '': 1090 for r in data: 1091 r['Session'] = session 1092 1093 self += data 1094 self.refresh()
Read txt
string in csv format to load analysis data into a D47data
object.
In the csv string, spaces before and after field separators (','
by default)
are optional. Each line corresponds to a single analysis.
The required fields are:
UID
: a unique identifierSession
: an identifier for the analytical sessionSample
: a sample identifierd45
,d46
, and at least one ofd47
ord48
: the working-gas delta values
Independently known oxygen-17 anomalies may be provided as D17O
(in ‰ relative to
VSMOW, λ = self.LAMBDA_17
), and are otherwise assumed to be zero. Working-gas deltas d47
, d48
and d49
are optional, and set to NaN by default.
Parameters
txt
: the csv string to readsep
: csv separator delimiting the fields. By default, use ',', ';', or a tab character, whichever appears most often in
txt
.session
: setSession
field to this string for all analyses
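The automatic separator detection described above is a simple frequency count over the three candidate characters, as in this standalone sketch:

```python
def detect_sep(txt):
    # pick whichever of ',', ';', '\t' appears most often in the text
    return sorted(',;\t', key=lambda x: -txt.count(x))[0]

print(repr(detect_sep('UID;Sample;d45\nA01;ETH-1;5.795')))  # ';'
```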
1097 @make_verbal 1098 def wg(self, samples = None, a18_acid = None): 1099 ''' 1100 Compute bulk composition of the working gas for each session based on 1101 the carbonate standards defined in both `self.Nominal_d13C_VPDB` and 1102 `self.Nominal_d18O_VPDB`. 1103 ''' 1104 1105 self.msg('Computing WG composition:') 1106 1107 if a18_acid is None: 1108 a18_acid = self.ALPHA_18O_ACID_REACTION 1109 if samples is None: 1110 samples = [s for s in self.Nominal_d13C_VPDB if s in self.Nominal_d18O_VPDB] 1111 1112 assert a18_acid, f'Acid fractionation factor should not be zero.' 1113 1114 samples = [s for s in samples if s in self.Nominal_d13C_VPDB and s in self.Nominal_d18O_VPDB] 1115 R45R46_standards = {} 1116 for sample in samples: 1117 d13C_vpdb = self.Nominal_d13C_VPDB[sample] 1118 d18O_vpdb = self.Nominal_d18O_VPDB[sample] 1119 R13_s = self.R13_VPDB * (1 + d13C_vpdb / 1000) 1120 R17_s = self.R17_VPDB * ((1 + d18O_vpdb / 1000) * a18_acid) ** self.LAMBDA_17 1121 R18_s = self.R18_VPDB * (1 + d18O_vpdb / 1000) * a18_acid 1122 1123 C12_s = 1 / (1 + R13_s) 1124 C13_s = R13_s / (1 + R13_s) 1125 C16_s = 1 / (1 + R17_s + R18_s) 1126 C17_s = R17_s / (1 + R17_s + R18_s) 1127 C18_s = R18_s / (1 + R17_s + R18_s) 1128 1129 C626_s = C12_s * C16_s ** 2 1130 C627_s = 2 * C12_s * C16_s * C17_s 1131 C628_s = 2 * C12_s * C16_s * C18_s 1132 C636_s = C13_s * C16_s ** 2 1133 C637_s = 2 * C13_s * C16_s * C17_s 1134 C727_s = C12_s * C17_s ** 2 1135 1136 R45_s = (C627_s + C636_s) / C626_s 1137 R46_s = (C628_s + C637_s + C727_s) / C626_s 1138 R45R46_standards[sample] = (R45_s, R46_s) 1139 1140 for s in self.sessions: 1141 db = [r for r in self.sessions[s]['data'] if r['Sample'] in samples] 1142 assert db, f'No sample from {samples} found in session "{s}".' 
1143# dbsamples = sorted({r['Sample'] for r in db}) 1144 1145 X = [r['d45'] for r in db] 1146 Y = [R45R46_standards[r['Sample']][0] for r in db] 1147 x1, x2 = np.min(X), np.max(X) 1148 1149 if x1 < x2: 1150 wgcoord = x1/(x1-x2) 1151 else: 1152 wgcoord = 999 1153 1154 if wgcoord < -.5 or wgcoord > 1.5: 1155 # unreasonable to extrapolate to d45 = 0 1156 R45_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)]) 1157 else : 1158 # d45 = 0 is reasonably well bracketed 1159 R45_wg = np.polyfit(X, Y, 1)[1] 1160 1161 X = [r['d46'] for r in db] 1162 Y = [R45R46_standards[r['Sample']][1] for r in db] 1163 x1, x2 = np.min(X), np.max(X) 1164 1165 if x1 < x2: 1166 wgcoord = x1/(x1-x2) 1167 else: 1168 wgcoord = 999 1169 1170 if wgcoord < -.5 or wgcoord > 1.5: 1171 # unreasonable to extrapolate to d46 = 0 1172 R46_wg = np.mean([y/(1+x/1000) for x,y in zip(X,Y)]) 1173 else : 1174 # d46 = 0 is reasonably well bracketed 1175 R46_wg = np.polyfit(X, Y, 1)[1] 1176 1177 d13Cwg_VPDB, d18Owg_VSMOW = self.compute_bulk_delta(R45_wg, R46_wg) 1178 1179 self.msg(f'Session {s} WG: δ13C_VPDB = {d13Cwg_VPDB:.3f} δ18O_VSMOW = {d18Owg_VSMOW:.3f}') 1180 1181 self.sessions[s]['d13Cwg_VPDB'] = d13Cwg_VPDB 1182 self.sessions[s]['d18Owg_VSMOW'] = d18Owg_VSMOW 1183 for r in self.sessions[s]['data']: 1184 r['d13Cwg_VPDB'] = d13Cwg_VPDB 1185 r['d18Owg_VSMOW'] = d18Owg_VSMOW
Compute bulk composition of the working gas for each session based on the carbonate standards defined in both `self.Nominal_d13C_VPDB` and `self.Nominal_d18O_VPDB`.
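The bracketing test and intercept regression at the heart of `wg()` can be sketched in isolation. This is an illustration with made-up `d45` values and made-up "known" R45 ratios, not the library API; only the branching logic is taken from the method above:

```python
import numpy as np

# Made-up measured d45 values (vs the working gas) for a set of standards,
# and made-up 'known' R45 ratios computed from their nominal compositions
d45 = np.array([5.79, -6.06, 5.54, -6.07])
R45_known = np.array([0.01205, 0.01191, 0.01205, 0.01191])

# Where does d45 = 0 fall within the observed range?
x1, x2 = d45.min(), d45.max()
wgcoord = x1 / (x1 - x2)

if -0.5 <= wgcoord <= 1.5:
    # d45 = 0 is reasonably well bracketed: the intercept of the
    # (d45, R45) regression is the working gas's own R45
    R45_wg = np.polyfit(d45, R45_known, 1)[1]
else:
    # extrapolation would be unreasonable: fall back to averaging
    # the direct per-analysis estimates instead
    R45_wg = np.mean(R45_known / (1 + d45 / 1000))
```

With standards on both sides of `d45 = 0`, as here, the regression branch is taken.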
```python
def compute_bulk_delta(self, R45, R46, D17O = 0):
    '''
    Compute δ13C_VPDB and δ18O_VSMOW,
    by solving the generalized form of equation (17) from
    [Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05),
    assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and
    solving the corresponding second-order Taylor polynomial.
    (Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014))
    '''

    K = np.exp(D17O / 1000) * self.R17_VSMOW * self.R18_VSMOW ** -self.LAMBDA_17

    A = -3 * K ** 2 * self.R18_VSMOW ** (2 * self.LAMBDA_17)
    B = 2 * K * R45 * self.R18_VSMOW ** self.LAMBDA_17
    C = 2 * self.R18_VSMOW
    D = -R46

    aa = A * self.LAMBDA_17 * (2 * self.LAMBDA_17 - 1) + B * self.LAMBDA_17 * (self.LAMBDA_17 - 1) / 2
    bb = 2 * A * self.LAMBDA_17 + B * self.LAMBDA_17 + C
    cc = A + B + C + D

    d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)

    R18 = (1 + d18O_VSMOW / 1000) * self.R18_VSMOW
    R17 = K * R18 ** self.LAMBDA_17
    R13 = R45 - 2 * R17

    d13C_VPDB = 1000 * (R13 / self.R13_VPDB - 1)

    return d13C_VPDB, d18O_VSMOW
```
Compute δ13C_VPDB and δ18O_VSMOW by solving the generalized form of equation (17) from [Brand et al. (2010)](https://doi.org/10.1351/PAC-REP-09-01-05), assuming that δ18O_VSMOW is not too big (0 ± 50 ‰) and solving the corresponding second-order Taylor polynomial (Appendix A of [Daëron et al., 2016](https://doi.org/10.1016/j.chemgeo.2016.08.014)).
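For illustration, here is a self-contained sketch of that inversion (the standalone function name and the reference ratios are assumptions, not the library API). Round-tripping through the exact forward equations, R45 = R13 + 2·R17 and R46 = 2·R18 + 2·R13·R17 + R17², recovers the input deltas to well below analytical precision:

```python
import math

# Assumed reference constants (D47crunch-style defaults)
R13_VPDB = 0.01118
R18_VSMOW = 0.0020052
R17_VSMOW = 0.00038475
LAMBDA_17 = 0.528

def bulk_deltas_from_R45_R46(R45, R46, D17O = 0.):
    '''Invert (R45, R46) to (d13C_VPDB, d18O_VSMOW) via the 2nd-order Taylor polynomial.'''
    K = math.exp(D17O / 1000) * R17_VSMOW * R18_VSMOW ** -LAMBDA_17
    A = -3 * K ** 2 * R18_VSMOW ** (2 * LAMBDA_17)
    B = 2 * K * R45 * R18_VSMOW ** LAMBDA_17
    C = 2 * R18_VSMOW
    D = -R46
    # 2nd-order Taylor expansion of A(1+x)^2L + B(1+x)^L + C(1+x) + D = 0
    # in x = d18O_VSMOW / 1000
    aa = A * LAMBDA_17 * (2 * LAMBDA_17 - 1) + B * LAMBDA_17 * (LAMBDA_17 - 1) / 2
    bb = 2 * A * LAMBDA_17 + B * LAMBDA_17 + C
    cc = A + B + C + D
    d18O_VSMOW = 1000 * (-bb + (bb ** 2 - 4 * aa * cc) ** .5) / (2 * aa)
    R18 = (1 + d18O_VSMOW / 1000) * R18_VSMOW
    R17 = K * R18 ** LAMBDA_17
    R13 = R45 - 2 * R17
    d13C_VPDB = 1000 * (R13 / R13_VPDB - 1)
    return d13C_VPDB, d18O_VSMOW
```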
```python
@make_verbal
def crunch(self, verbose = ''):
    '''
    Compute bulk composition and raw clumped isotope anomalies for all analyses.
    '''
    for r in self:
        self.compute_bulk_and_clumping_deltas(r)
    self.standardize_d13C()
    self.standardize_d18O()
    self.msg(f"Crunched {len(self)} analyses.")
```
Compute bulk composition and raw clumped isotope anomalies for all analyses.
```python
def fill_in_missing_info(self, session = 'mySession'):
    '''
    Fill in optional fields with default values
    '''
    for i, r in enumerate(self):
        if 'D17O' not in r:
            r['D17O'] = 0.
        if 'UID' not in r:
            r['UID'] = f'{i+1}'
        if 'Session' not in r:
            r['Session'] = session
        for k in ['d47', 'd48', 'd49']:
            if k not in r:
                r[k] = np.nan
```
Fill in optional fields with default values
```python
def standardize_d13C(self):
    '''
    Perform δ13C standardization within each session `s` according to
    `self.sessions[s]['d13C_standardization_method']`, which is defined by default
    by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but
    may be redefined arbitrarily at a later stage.
    '''
    for s in self.sessions:
        if self.sessions[s]['d13C_standardization_method'] in ['1pt', '2pt']:
            XY = [(r['d13C_VPDB'], self.Nominal_d13C_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d13C_VPDB]
            X, Y = zip(*XY)
            if self.sessions[s]['d13C_standardization_method'] == '1pt':
                offset = np.mean(Y) - np.mean(X)
                for r in self.sessions[s]['data']:
                    r['d13C_VPDB'] += offset
            elif self.sessions[s]['d13C_standardization_method'] == '2pt':
                a, b = np.polyfit(X, Y, 1)
                for r in self.sessions[s]['data']:
                    r['d13C_VPDB'] = a * r['d13C_VPDB'] + b
```
Perform δ13C standardization within each session `s` according to `self.sessions[s]['d13C_standardization_method']`, which is defined by default by `D47data.refresh_sessions()` as equal to `self.d13C_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
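As a sketch of what the two methods do (made-up numbers, plain Python rather than the library internals): `'1pt'` applies a constant offset so that the anchor average matches the nominal average, while `'2pt'` applies an affine (slope + intercept) correction fitted by least squares:

```python
# Hypothetical measured vs. nominal d13C values for two anchors (made-up numbers)
measured = [2.10, -9.95, 2.08, -10.02]
nominal  = [2.02, -10.17, 2.02, -10.17]

n = len(measured)
mx = sum(measured) / n
my = sum(nominal) / n

# '1pt': constant offset so the anchor mean matches the nominal mean
offset = my - mx
corrected_1pt = [x + offset for x in measured]

# '2pt': least-squares affine correction (slope a, intercept b)
a = sum((x - mx) * (y - my) for x, y in zip(measured, nominal)) / sum((x - mx) ** 2 for x in measured)
b = my - a * mx
corrected_2pt = [a * x + b for x in measured]
```

Both corrections leave the session mean of the anchors exactly on the nominal mean; only `'2pt'` also rescales their spread.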
```python
def standardize_d18O(self):
    '''
    Perform δ18O standardization within each session `s` according to
    `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`,
    which is defined by default by `D47data.refresh_sessions()` as equal to
    `self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
    '''
    for s in self.sessions:
        if self.sessions[s]['d18O_standardization_method'] in ['1pt', '2pt']:
            XY = [(r['d18O_VSMOW'], self.Nominal_d18O_VPDB[r['Sample']]) for r in self.sessions[s]['data'] if r['Sample'] in self.Nominal_d18O_VPDB]
            X, Y = zip(*XY)
            Y = [(1000+y) * self.R18_VPDB * self.ALPHA_18O_ACID_REACTION / self.R18_VSMOW - 1000 for y in Y]
            if self.sessions[s]['d18O_standardization_method'] == '1pt':
                offset = np.mean(Y) - np.mean(X)
                for r in self.sessions[s]['data']:
                    r['d18O_VSMOW'] += offset
            elif self.sessions[s]['d18O_standardization_method'] == '2pt':
                a, b = np.polyfit(X, Y, 1)
                for r in self.sessions[s]['data']:
                    r['d18O_VSMOW'] = a * r['d18O_VSMOW'] + b
```
Perform δ18O standardization within each session `s` according to `self.ALPHA_18O_ACID_REACTION` and `self.sessions[s]['d18O_standardization_method']`, which is defined by default by `D47data.refresh_sessions()` as equal to `self.d18O_STANDARDIZATION_METHOD`, but may be redefined arbitrarily at a later stage.
```python
def compute_bulk_and_clumping_deltas(self, r):
    '''
    Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
    '''

    # Compute working gas R13, R18, and isobar ratios
    R13_wg = self.R13_VPDB * (1 + r['d13Cwg_VPDB'] / 1000)
    R18_wg = self.R18_VSMOW * (1 + r['d18Owg_VSMOW'] / 1000)
    R45_wg, R46_wg, R47_wg, R48_wg, R49_wg = self.compute_isobar_ratios(R13_wg, R18_wg)

    # Compute analyte isobar ratios
    R45 = (1 + r['d45'] / 1000) * R45_wg
    R46 = (1 + r['d46'] / 1000) * R46_wg
    R47 = (1 + r['d47'] / 1000) * R47_wg
    R48 = (1 + r['d48'] / 1000) * R48_wg
    R49 = (1 + r['d49'] / 1000) * R49_wg

    r['d13C_VPDB'], r['d18O_VSMOW'] = self.compute_bulk_delta(R45, R46, D17O = r['D17O'])
    R13 = (1 + r['d13C_VPDB'] / 1000) * self.R13_VPDB
    R18 = (1 + r['d18O_VSMOW'] / 1000) * self.R18_VSMOW

    # Compute stochastic isobar ratios of the analyte
    R45stoch, R46stoch, R47stoch, R48stoch, R49stoch = self.compute_isobar_ratios(
        R13, R18, D17O = r['D17O']
    )

    # Check that R45/R45stoch and R46/R46stoch are indistinguishable from 1,
    # and raise a warning if the corresponding anomalies exceed 0.02 ppm.
    if (R45 / R45stoch - 1) > 5e-8:
        self.vmsg(f'This is unexpected: R45/R45stoch - 1 = {1e6 * (R45 / R45stoch - 1):.3f} ppm')
    if (R46 / R46stoch - 1) > 5e-8:
        self.vmsg(f'This is unexpected: R46/R46stoch - 1 = {1e6 * (R46 / R46stoch - 1):.3f} ppm')

    # Compute raw clumped isotope anomalies
    r['D47raw'] = 1000 * (R47 / R47stoch - 1)
    r['D48raw'] = 1000 * (R48 / R48stoch - 1)
    r['D49raw'] = 1000 * (R49 / R49stoch - 1)
```
Compute δ13C_VPDB, δ18O_VSMOW, and raw Δ47, Δ48, Δ49 values for a single analysis `r`.
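The final step above defines a raw anomaly as the permil deviation of a measured isobar ratio from its stochastic counterpart. A minimal sketch with made-up ratios:

```python
# Made-up measured and stochastic 47/44 ratios for the same bulk composition
R47 = 4.73e-5
R47stoch = 4.70e-5

# Raw anomaly, in permil: deviation of the measured ratio from the stochastic one
D47raw = 1000 * (R47 / R47stoch - 1)
```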
```python
def compute_isobar_ratios(self, R13, R18, D17O = 0, D47 = 0, D48 = 0, D49 = 0):
    '''
    Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`,
    optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope
    anomalies (`D47`, `D48`, `D49`), all expressed in permil.
    '''

    # Compute R17
    R17 = self.R17_VSMOW * np.exp(D17O / 1000) * (R18 / self.R18_VSMOW) ** self.LAMBDA_17

    # Compute isotope concentrations
    C12 = (1 + R13) ** -1
    C13 = C12 * R13
    C16 = (1 + R17 + R18) ** -1
    C17 = C16 * R17
    C18 = C16 * R18

    # Compute stochastic isotopologue concentrations
    C626 = C16 * C12 * C16
    C627 = C16 * C12 * C17 * 2
    C628 = C16 * C12 * C18 * 2
    C636 = C16 * C13 * C16
    C637 = C16 * C13 * C17 * 2
    C638 = C16 * C13 * C18 * 2
    C727 = C17 * C12 * C17
    C728 = C17 * C12 * C18 * 2
    C737 = C17 * C13 * C17
    C738 = C17 * C13 * C18 * 2
    C828 = C18 * C12 * C18
    C838 = C18 * C13 * C18

    # Compute stochastic isobar ratios
    R45 = (C636 + C627) / C626
    R46 = (C628 + C637 + C727) / C626
    R47 = (C638 + C728 + C737) / C626
    R48 = (C738 + C828) / C626
    R49 = C838 / C626

    # Account for stochastic anomalies
    R47 *= 1 + D47 / 1000
    R48 *= 1 + D48 / 1000
    R49 *= 1 + D49 / 1000

    # Return isobar ratios
    return R45, R46, R47, R48, R49
```
Compute isobar ratios for a sample with isotopic ratios `R13` and `R18`, optionally accounting for non-zero values of Δ17O (`D17O`) and clumped isotope anomalies (`D47`, `D48`, `D49`), all expressed in permil.
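A standalone sketch of the same combinatorics (hypothetical function name, assumed default constants). It also makes explicit two identities implied by the stochastic isotopologue sums: R45 = R13 + 2·R17 and R46 = 2·R18 + 2·R13·R17 + R17²:

```python
import math

def stochastic_isobar_ratios(R13, R18, R17_VSMOW = 0.00038475, R18_VSMOW = 0.0020052,
                             LAMBDA_17 = 0.528, D17O = 0.):
    # R17 follows the mass-dependent triple-oxygen relation
    R17 = R17_VSMOW * math.exp(D17O / 1000) * (R18 / R18_VSMOW) ** LAMBDA_17
    C12 = 1 / (1 + R13); C13 = C12 * R13
    C16 = 1 / (1 + R17 + R18); C17 = C16 * R17; C18 = C16 * R18
    # Stochastic isotopologue concentrations (factor 2 for asymmetric species)
    C626 = C12 * C16 ** 2
    C627 = 2 * C12 * C16 * C17
    C628 = 2 * C12 * C16 * C18
    C636 = C13 * C16 ** 2
    C637 = 2 * C13 * C16 * C17
    C638 = 2 * C13 * C16 * C18
    C727 = C12 * C17 ** 2
    C728 = 2 * C12 * C17 * C18
    C737 = C13 * C17 ** 2
    C738 = 2 * C13 * C17 * C18
    C828 = C12 * C18 ** 2
    C838 = C13 * C18 ** 2
    R45 = (C636 + C627) / C626
    R46 = (C628 + C637 + C727) / C626
    R47 = (C638 + C728 + C737) / C626
    R48 = (C738 + C828) / C626
    R49 = C838 / C626
    return R45, R46, R47, R48, R49
```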
```python
def split_samples(self, samples_to_split = 'all', grouping = 'by_session'):
    '''
    Split unknown samples by UID (treat all analyses as different samples)
    or by session (treat analyses of a given sample in different sessions as
    different samples).

    **Parameters**

    + `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
    + `grouping`: `by_uid` | `by_session`
    '''
    if samples_to_split == 'all':
        samples_to_split = [s for s in self.unknowns]
    gkeys = {'by_uid': 'UID', 'by_session': 'Session'}
    self.grouping = grouping.lower()
    if self.grouping in gkeys:
        gkey = gkeys[self.grouping]
        for r in self:
            if r['Sample'] in samples_to_split:
                r['Sample_original'] = r['Sample']
                r['Sample'] = f"{r['Sample']}__{r[gkey]}"
            elif r['Sample'] in self.unknowns:
                r['Sample_original'] = r['Sample']
    self.refresh_samples()
```
Split unknown samples by UID (treat all analyses as different samples) or by session (treat analyses of a given sample in different sessions as different samples).

**Parameters**

+ `samples_to_split`: a list of samples to split, e.g., `['IAEA-C1', 'IAEA-C2']`
+ `grouping`: `by_uid` | `by_session`
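In practice, splitting simply rewrites the `Sample` field of each affected analysis, keeping the original name in `Sample_original`. A sketch of that renaming with a made-up record list:

```python
# Made-up analyses of one unknown sample, measured in two sessions
records = [
    {'UID': 'A02', 'Session': 'Session01', 'Sample': 'MYSAMPLE-1'},
    {'UID': 'B02', 'Session': 'Session02', 'Sample': 'MYSAMPLE-1'},
]

gkey = 'Session'   # grouping = 'by_session'; use 'UID' for grouping = 'by_uid'
for r in records:
    r['Sample_original'] = r['Sample']              # remember the original name
    r['Sample'] = f"{r['Sample']}__{r[gkey]}"       # e.g. 'MYSAMPLE-1__Session01'
```

After this, each session's analyses of `MYSAMPLE-1` are standardized as if they were distinct samples.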
```python
def unsplit_samples(self, tables = False):
    '''
    Reverse the effects of `D47data.split_samples()`.

    This should only be used after `D4xdata.standardize()` with `method='pooled'`.

    After `D4xdata.standardize()` with `method='indep_sessions'`, one should
    probably use `D4xdata.combine_samples()` instead to reverse the effects of
    `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the
    effects of `D47data.split_samples()` with `grouping='by_sessions'` (because in
    that case session-averaged Δ4x values are statistically independent).
    '''
    unknowns_old = sorted({s for s in self.unknowns})
    CM_old = self.standardization.covar[:,:]
    VD_old = self.standardization.params.valuesdict().copy()
    vars_old = self.standardization.var_names

    unknowns_new = sorted({r['Sample_original'] for r in self if 'Sample_original' in r})

    Ns = len(vars_old) - len(unknowns_old)
    vars_new = vars_old[:Ns] + [f'D{self._4x}_{pf(u)}' for u in unknowns_new]
    VD_new = {k: VD_old[k] for k in vars_old[:Ns]}

    W = np.zeros((len(vars_new), len(vars_old)))
    W[:Ns,:Ns] = np.eye(Ns)
    for u in unknowns_new:
        splits = sorted({r['Sample'] for r in self if 'Sample_original' in r and r['Sample_original'] == u})
        if self.grouping == 'by_session':
            weights = [self.samples[s][f'SE_D{self._4x}']**-2 for s in splits]
        elif self.grouping == 'by_uid':
            weights = [1 for s in splits]
        sw = sum(weights)
        weights = [w/sw for w in weights]
        W[vars_new.index(f'D{self._4x}_{pf(u)}'), [vars_old.index(f'D{self._4x}_{pf(s)}') for s in splits]] = weights[:]

    CM_new = W @ CM_old @ W.T
    V = W @ np.array([[VD_old[k]] for k in vars_old])
    VD_new = {k: v[0] for k, v in zip(vars_new, V)}

    self.standardization.covar = CM_new
    self.standardization.params.valuesdict = lambda: VD_new
    self.standardization.var_names = vars_new

    for r in self:
        if r['Sample'] in self.unknowns:
            r['Sample_split'] = r['Sample']
            r['Sample'] = r['Sample_original']

    self.refresh_samples()
    self.consolidate_samples()
    self.repeatabilities()

    if tables:
        self.table_of_analyses()
        self.table_of_samples()
```
Reverse the effects of `D47data.split_samples()`.

This should only be used after `D4xdata.standardize()` with `method='pooled'`.

After `D4xdata.standardize()` with `method='indep_sessions'`, one should probably use `D4xdata.combine_samples()` instead to reverse the effects of `D47data.split_samples()` with `grouping='by_uid'`, or `w_avg()` to reverse the effects of `D47data.split_samples()` with `grouping='by_sessions'` (because in that case session-averaged Δ4x values are statistically independent).
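The merging step behind `unsplit_samples()` is a weighted average expressed as a matrix `W`, so that the covariance matrix transforms as `W @ CM_old @ W.T`. A minimal numpy sketch with made-up session-level estimates (when grouping by session, the weights are the normalized inverse squared SEs):

```python
import numpy as np

# Made-up session-level estimates of the same split sample (SEs 0.010 and 0.020),
# assumed statistically independent, so the old covariance matrix is diagonal
se = np.array([0.010, 0.020])
CM_old = np.diag(se ** 2)
D_old = np.array([[0.612], [0.624]])

# grouping = 'by_session': inverse-variance weights, normalized to sum to 1
weights = se ** -2 / np.sum(se ** -2)
W = weights[np.newaxis, :]

D_new = W @ D_old          # pooled estimate
CM_new = W @ CM_old @ W.T  # pooled (co)variance
```

Here the pooled SE, `CM_new[0, 0] ** 0.5`, is smaller than either individual SE, as expected for independent estimates.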
```python
def assign_timestamps(self):
    '''
    Assign a time field `t` of type `float` to each analysis.

    If `TimeTag` is one of the data fields, `t` is equal within a given session
    to `TimeTag` minus the mean value of `TimeTag` for that session.
    Otherwise, `TimeTag` is by default equal to the index of each analysis
    in the dataset and `t` is defined as above.
    '''
    for session in self.sessions:
        sdata = self.sessions[session]['data']
        try:
            t0 = np.mean([r['TimeTag'] for r in sdata])
            for r in sdata:
                r['t'] = r['TimeTag'] - t0
        except KeyError:
            t0 = (len(sdata) - 1) / 2
            for t, r in enumerate(sdata):
                r['t'] = t - t0
```
Assign a time field `t` of type `float` to each analysis.

If `TimeTag` is one of the data fields, `t` is equal within a given session to `TimeTag` minus the mean value of `TimeTag` for that session. Otherwise, `TimeTag` is by default equal to the index of each analysis in the dataset and `t` is defined as above.
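A sketch of the fallback logic (made-up records without a `TimeTag` field): the index-based `t` is centered on the session mean, just like the `TimeTag`-based one:

```python
sdata = [{} for _ in range(5)]   # made-up analyses lacking a 'TimeTag' field
try:
    t0 = sum(r['TimeTag'] for r in sdata) / len(sdata)
    for r in sdata:
        r['t'] = r['TimeTag'] - t0
except KeyError:
    # fall back to the analysis index, centered on the session mean
    t0 = (len(sdata) - 1) / 2
    for t, r in enumerate(sdata):
        r['t'] = t - t0
```

Centering `t` on zero keeps the session-mean values of the drift parameters interpretable as mid-session values.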
```python
def report(self):
    '''
    Prints a report on the standardization fit.
    Only applicable after `D4xdata.standardize(method='pooled')`.
    '''
    report_fit(self.standardization)
```
Prints a report on the standardization fit. Only applicable after `D4xdata.standardize(method='pooled')`.
````python
def combine_samples(self, sample_groups):
    '''
    Combine analyses of different samples to compute weighted average Δ4x
    and new error (co)variances corresponding to the groups defined by the `sample_groups`
    dictionary.

    Caution: samples are weighted by number of replicate analyses, which is a
    reasonable default behavior but is not always optimal (e.g., in the case of strongly
    correlated analytical errors for one or more samples).

    Returns a tuple of:

    + the list of group names
    + an array of the corresponding Δ4x values
    + the corresponding (co)variance matrix

    **Parameters**

    + `sample_groups`: a dictionary of the form:
    ```py
    {'group1': ['sample_1', 'sample_2'],
     'group2': ['sample_3', 'sample_4', 'sample_5']}
    ```
    '''

    samples = [s for k in sorted(sample_groups.keys()) for s in sorted(sample_groups[k])]
    groups = sorted(sample_groups.keys())
    group_total_weights = {k: sum([self.samples[s]['N'] for s in sample_groups[k]]) for k in groups}
    D4x_old = np.array([[self.samples[x][f'D{self._4x}']] for x in samples])
    CM_old = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
    W = np.array([
        [self.samples[i]['N'] / group_total_weights[j] if i in sample_groups[j] else 0 for i in samples]
        for j in groups])
    D4x_new = W @ D4x_old
    CM_new = W @ CM_old @ W.T

    return groups, D4x_new[:, 0], CM_new
````
Combine analyses of different samples to compute weighted average Δ4x and new error (co)variances corresponding to the groups defined by the `sample_groups` dictionary.

Caution: samples are weighted by number of replicate analyses, which is a reasonable default behavior but is not always optimal (e.g., in the case of strongly correlated analytical errors for one or more samples).

Returns a tuple of:

- the list of group names
- an array of the corresponding Δ4x values
- the corresponding (co)variance matrix

**Parameters**

+ `sample_groups`: a dictionary of the form:

```py
{'group1': ['sample_1', 'sample_2'],
 'group2': ['sample_3', 'sample_4', 'sample_5']}
```
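A sketch of the replicate-count weighting with made-up values: the group average is `W @ D4x_old`, with each sample weighted by `N / group_total`:

```python
import numpy as np

# Made-up samples: replicate counts and D47 values
samples = ['sample_1', 'sample_2']
N = {'sample_1': 4, 'sample_2': 6}
D47 = {'sample_1': 0.300, 'sample_2': 0.400}

total = sum(N[s] for s in samples)
W = np.array([[N[s] / total for s in samples]])   # weights: 0.4 and 0.6
D_old = np.array([[D47[s]] for s in samples])
D_new = W @ D_old                                 # group-averaged D47
```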
```python
@make_verbal
def standardize(self,
    method = 'pooled',
    weighted_sessions = [],
    consolidate = True,
    consolidate_tables = False,
    consolidate_plots = False,
    constraints = {},
    ):
    '''
    Compute absolute Δ4x values for all replicate analyses and for sample averages.
    If `method` argument is set to `'pooled'`, the standardization processes all sessions
    in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous,
    i.e. that their true Δ4x value does not change between sessions
    ([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If `method` argument is set to
    `'indep_sessions'`, the standardization processes each session independently, based only
    on anchor analyses.
    '''

    self.standardization_method = method
    self.assign_timestamps()

    if method == 'pooled':
        if weighted_sessions:
            for session_group in weighted_sessions:
                if self._4x == '47':
                    X = D47data([r for r in self if r['Session'] in session_group])
                elif self._4x == '48':
                    X = D48data([r for r in self if r['Session'] in session_group])
                X.Nominal_D4x = self.Nominal_D4x.copy()
                X.refresh()
                result = X.standardize(method = 'pooled', weighted_sessions = [], consolidate = False)
                w = np.sqrt(result.redchi)
                self.msg(f'Session group {session_group} MRSWD = {w:.4f}')
                for r in X:
                    r[f'wD{self._4x}raw'] *= w
        else:
            self.msg(f'All D{self._4x}raw weights set to 1 ‰')
            for r in self:
                r[f'wD{self._4x}raw'] = 1.

        params = Parameters()
        for k, session in enumerate(self.sessions):
            self.msg(f"Session {session}: scrambling_drift is {self.sessions[session]['scrambling_drift']}.")
            self.msg(f"Session {session}: slope_drift is {self.sessions[session]['slope_drift']}.")
            self.msg(f"Session {session}: wg_drift is {self.sessions[session]['wg_drift']}.")
            s = pf(session)
            params.add(f'a_{s}', value = 0.9)
            params.add(f'b_{s}', value = 0.)
            params.add(f'c_{s}', value = -0.9)
            params.add(f'a2_{s}', value = 0.,
#               vary = self.sessions[session]['scrambling_drift'],
                )
            params.add(f'b2_{s}', value = 0.,
#               vary = self.sessions[session]['slope_drift'],
                )
            params.add(f'c2_{s}', value = 0.,
#               vary = self.sessions[session]['wg_drift'],
                )
            if not self.sessions[session]['scrambling_drift']:
                params[f'a2_{s}'].expr = '0'
            if not self.sessions[session]['slope_drift']:
                params[f'b2_{s}'].expr = '0'
            if not self.sessions[session]['wg_drift']:
                params[f'c2_{s}'].expr = '0'

        for sample in self.unknowns:
            params.add(f'D{self._4x}_{pf(sample)}', value = 0.5)

        for k in constraints:
            params[k].expr = constraints[k]

        def residuals(p):
            R = []
            for r in self:
                session = pf(r['Session'])
                sample = pf(r['Sample'])
                if r['Sample'] in self.Nominal_D4x:
                    R += [ (
                        r[f'D{self._4x}raw'] - (
                            p[f'a_{session}'] * self.Nominal_D4x[r['Sample']]
                            + p[f'b_{session}'] * r[f'd{self._4x}']
                            + p[f'c_{session}']
                            + r['t'] * (
                                p[f'a2_{session}'] * self.Nominal_D4x[r['Sample']]
                                + p[f'b2_{session}'] * r[f'd{self._4x}']
                                + p[f'c2_{session}']
                                )
                            )
                        ) / r[f'wD{self._4x}raw'] ]
                else:
                    R += [ (
                        r[f'D{self._4x}raw'] - (
                            p[f'a_{session}'] * p[f'D{self._4x}_{sample}']
                            + p[f'b_{session}'] * r[f'd{self._4x}']
                            + p[f'c_{session}']
                            + r['t'] * (
                                p[f'a2_{session}'] * p[f'D{self._4x}_{sample}']
                                + p[f'b2_{session}'] * r[f'd{self._4x}']
                                + p[f'c2_{session}']
                                )
                            )
                        ) / r[f'wD{self._4x}raw'] ]
            return R

        M = Minimizer(residuals, params)
        result = M.least_squares()
        self.Nf = result.nfree
        self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)
        new_names, new_covar, new_se = _fullcovar(result)[:3]
        result.var_names = new_names
        result.covar = new_covar

        for r in self:
            s = pf(r["Session"])
            a = result.params.valuesdict()[f'a_{s}']
            b = result.params.valuesdict()[f'b_{s}']
            c = result.params.valuesdict()[f'c_{s}']
            a2 = result.params.valuesdict()[f'a2_{s}']
            b2 = result.params.valuesdict()[f'b2_{s}']
            c2 = result.params.valuesdict()[f'c2_{s}']
            r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])

        self.standardization = result

        for session in self.sessions:
            self.sessions[session]['Np'] = 3
            for k in ['scrambling', 'slope', 'wg']:
                if self.sessions[session][f'{k}_drift']:
                    self.sessions[session]['Np'] += 1

        if consolidate:
            self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
        return result

    elif method == 'indep_sessions':

        if weighted_sessions:
            for session_group in weighted_sessions:
                X = D4xdata([r for r in self if r['Session'] in session_group], mass = self._4x)
                X.Nominal_D4x = self.Nominal_D4x.copy()
                X.refresh()
                # This is only done to assign r['wD47raw'] for r in X:
                X.standardize(method = method, weighted_sessions = [], consolidate = False)
                self.msg(f'D{self._4x}raw weights set to {1000*X[0][f"wD{self._4x}raw"]:.1f} ppm for sessions in {session_group}')
        else:
            self.msg('All weights set to 1 ‰')
            for r in self:
                r[f'wD{self._4x}raw'] = 1

        for session in self.sessions:
            s = self.sessions[session]
            p_names = ['a', 'b', 'c', 'a2', 'b2', 'c2']
            p_active = [True, True, True, s['scrambling_drift'], s['slope_drift'], s['wg_drift']]
            s['Np'] = sum(p_active)
            sdata = s['data']

            A = np.array([
                [
                    self.Nominal_D4x[r['Sample']] / r[f'wD{self._4x}raw'],
                    r[f'd{self._4x}'] / r[f'wD{self._4x}raw'],
                    1 / r[f'wD{self._4x}raw'],
                    self.Nominal_D4x[r['Sample']] * r['t'] / r[f'wD{self._4x}raw'],
                    r[f'd{self._4x}'] * r['t'] / r[f'wD{self._4x}raw'],
                    r['t'] / r[f'wD{self._4x}raw']
                ]
                for r in sdata if r['Sample'] in self.anchors
                ])[:, p_active]  # only keep columns for the active parameters
            Y = np.array([[r[f'D{self._4x}raw'] / r[f'wD{self._4x}raw']] for r in sdata if r['Sample'] in self.anchors])
            s['Na'] = Y.size
            CM = linalg.inv(A.T @ A)
            bf = (CM @ A.T @ Y).T[0, :]
            k = 0
            for n, a in zip(p_names, p_active):
                if a:
                    s[n] = bf[k]
#                   self.msg(f'{n} = {bf[k]}')
                    k += 1
                else:
                    s[n] = 0.
#                   self.msg(f'{n} = 0.0')

            for r in sdata:
                a, b, c, a2, b2, c2 = s['a'], s['b'], s['c'], s['a2'], s['b2'], s['c2']
                r[f'D{self._4x}'] = (r[f'D{self._4x}raw'] - c - b * r[f'd{self._4x}'] - c2 * r['t'] - b2 * r['t'] * r[f'd{self._4x}']) / (a + a2 * r['t'])
                r[f'wD{self._4x}'] = r[f'wD{self._4x}raw'] / (a + a2 * r['t'])

            s['CM'] = np.zeros((6, 6))
            i = 0
            k_active = [j for j, a in enumerate(p_active) if a]
            for j, a in enumerate(p_active):
                if a:
                    s['CM'][j, k_active] = CM[i, :]
                    i += 1

        if not weighted_sessions:
            w = self.rmswd()['rmswd']
            for r in self:
                r[f'wD{self._4x}'] *= w
                r[f'wD{self._4x}raw'] *= w
            for session in self.sessions:
                self.sessions[session]['CM'] *= w**2

        for session in self.sessions:
            s = self.sessions[session]
            s['SE_a'] = s['CM'][0, 0]**.5
            s['SE_b'] = s['CM'][1, 1]**.5
            s['SE_c'] = s['CM'][2, 2]**.5
            s['SE_a2'] = s['CM'][3, 3]**.5
            s['SE_b2'] = s['CM'][4, 4]**.5
            s['SE_c2'] = s['CM'][5, 5]**.5

        if not weighted_sessions:
            self.Nf = len(self) - len(self.unknowns) - np.sum([self.sessions[s]['Np'] for s in self.sessions])
        else:
            self.Nf = 0
            for sg in weighted_sessions:
                self.Nf += self.rmswd(sessions = sg)['Nf']

        self.t95 = tstudent.ppf(1 - 0.05/2, self.Nf)

        avgD4x = {
            sample: np.mean([r[f'D{self._4x}'] for r in self if r['Sample'] == sample])
            for sample in self.samples
            }
        chi2 = np.sum([(r[f'D{self._4x}'] - avgD4x[r['Sample']])**2 for r in self])
        rD4x = (chi2/self.Nf)**.5
        self.repeatability[f'sigma_{self._4x}'] = rD4x

        if consolidate:
            self.consolidate(tables = consolidate_tables, plots = consolidate_plots)
```
Compute absolute Δ4x values for all replicate analyses and for sample averages.

If the `method` argument is set to `'pooled'`, the standardization processes all sessions in a single step, assuming that all samples (anchors and unknowns alike) are homogeneous, i.e. that their true Δ4x value does not change between sessions ([Daëron, 2021](https://doi.org/10.1029/2020GC009592)). If the `method` argument is set to `'indep_sessions'`, the standardization processes each session independently, based only on anchor analyses.
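The per-session model being fitted relates raw to absolute values as Δ4x_raw = a·Δ4x + b·δ4x + c (plus the optional drift terms in `t`). This numpy sketch fits those three parameters from synthetic, noise-free anchor data (made-up session parameters, illustrative nominal values) and recovers them exactly:

```python
import numpy as np

# Made-up 'true' session parameters and anchor data (noise-free for illustration)
a_true, b_true, c_true = 1.02, 0.001, -0.90
Nominal_D47 = np.array([0.2052, 0.2085, 0.6132, 0.6132, 0.2052])  # illustrative anchor values
d47 = np.array([16.9, -11.6, 17.4, 17.2, 16.8])
D47raw = a_true * Nominal_D47 + b_true * d47 + c_true

# Least-squares fit of D47raw = a * D47 + b * d47 + c over the anchors
A = np.column_stack([Nominal_D47, d47, np.ones_like(d47)])
(a, b, c), *_ = np.linalg.lstsq(A, D47raw, rcond=None)
```

In the `'pooled'` case the same model is fitted jointly over all sessions, with the unknowns' Δ4x values added as free parameters.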
```python
def standardization_error(self, session, d4x, D4x, t = 0):
    '''
    Compute standardization error for a given session and
    (δ47, Δ47) composition.
    '''
    a = self.sessions[session]['a']
    b = self.sessions[session]['b']
    c = self.sessions[session]['c']
    a2 = self.sessions[session]['a2']
    b2 = self.sessions[session]['b2']
    c2 = self.sessions[session]['c2']
    CM = self.sessions[session]['CM']

    x, y = D4x, d4x
    z = a * x + b * y + c + a2 * x * t + b2 * y * t + c2 * t
#   x = (z - b*y - b2*y*t - c - c2*t) / (a+a2*t)
    dxdy = -(b+b2*t) / (a+a2*t)
    dxdz = 1. / (a+a2*t)
    dxda = -x / (a+a2*t)
    dxdb = -y / (a+a2*t)
    dxdc = -1. / (a+a2*t)
    dxda2 = -x * t / (a+a2*t)
    dxdb2 = -y * t / (a+a2*t)
    dxdc2 = -t / (a+a2*t)
    V = np.array([dxda, dxdb, dxdc, dxda2, dxdb2, dxdc2])
    sx = (V @ CM @ V.T) ** .5
    return sx
```
Compute standardization error for a given session and (δ47, Δ47) composition.
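The error propagation amounts to `V @ CM @ V.T`, where `V` holds the partial derivatives of the standardized value with respect to the session parameters. A sketch ignoring the drift terms, with made-up parameter (co)variances:

```python
import numpy as np

# Made-up session parameters and parameter (co)variance matrix (diagonal for simplicity)
a, b, c = 1.02, 0.001, -0.90
CM = np.diag([4e-4, 1e-8, 1e-4])

# Composition of interest: x = D47, y = d47; since x = (z - b*y - c) / a:
x, y = 0.6, 17.0
dxda = -x / a
dxdb = -y / a
dxdc = -1.0 / a

V = np.array([dxda, dxdb, dxdc])
sx = (V @ CM @ V) ** 0.5   # standardization SE at this (d47, D47) composition
```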
```python
@make_verbal
def summary(self,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    ):
    '''
    Print out and/or save to disk a summary of the standardization results.

    **Parameters**

    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    '''

    out = []
    out += [['N samples (anchors + unknowns)', f"{len(self.samples)} ({len(self.anchors)} + {len(self.unknowns)})"]]
    out += [['N analyses (anchors + unknowns)', f"{len(self)} ({len([r for r in self if r['Sample'] in self.anchors])} + {len([r for r in self if r['Sample'] in self.unknowns])})"]]
    out += [['Repeatability of δ13C_VPDB', f"{1000 * self.repeatability['r_d13C_VPDB']:.1f} ppm"]]
    out += [['Repeatability of δ18O_VSMOW', f"{1000 * self.repeatability['r_d18O_VSMOW']:.1f} ppm"]]
    out += [[f'Repeatability of Δ{self._4x} (anchors)', f"{1000 * self.repeatability[f'r_D{self._4x}a']:.1f} ppm"]]
    out += [[f'Repeatability of Δ{self._4x} (unknowns)', f"{1000 * self.repeatability[f'r_D{self._4x}u']:.1f} ppm"]]
    out += [[f'Repeatability of Δ{self._4x} (all)', f"{1000 * self.repeatability[f'r_D{self._4x}']:.1f} ppm"]]
    out += [['Model degrees of freedom', f"{self.Nf}"]]
    out += [['Student\'s 95% t-factor', f"{self.t95:.2f}"]]
    out += [['Standardization method', self.standardization_method]]

    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_summary.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n' + pretty_table(out, header = 0))
```
Print out and/or save to disk a summary of the standardization results.

**Parameters**

+ `dir`: the directory in which to save the table
+ `filename`: the name of the csv file to write to
+ `save_to_file`: whether to save the table to disk
+ `print_out`: whether to print out the table
```python
@make_verbal
def table_of_sessions(self,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out and/or save to disk a table of sessions.

    **Parameters**

    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
      if set to `'raw'`: return a list of list of strings
      (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    include_a2 = any([self.sessions[session]['scrambling_drift'] for session in self.sessions])
    include_b2 = any([self.sessions[session]['slope_drift'] for session in self.sessions])
    include_c2 = any([self.sessions[session]['wg_drift'] for session in self.sessions])

    out = [['Session', 'Na', 'Nu', 'd13Cwg_VPDB', 'd18Owg_VSMOW', 'r_d13C', 'r_d18O', f'r_D{self._4x}', 'a ± SE', '1e3 x b ± SE', 'c ± SE']]
    if include_a2:
        out[-1] += ['a2 ± SE']
    if include_b2:
        out[-1] += ['b2 ± SE']
    if include_c2:
        out[-1] += ['c2 ± SE']
    for session in self.sessions:
        out += [[
            session,
            f"{self.sessions[session]['Na']}",
            f"{self.sessions[session]['Nu']}",
            f"{self.sessions[session]['d13Cwg_VPDB']:.3f}",
            f"{self.sessions[session]['d18Owg_VSMOW']:.3f}",
            f"{self.sessions[session]['r_d13C_VPDB']:.4f}",
            f"{self.sessions[session]['r_d18O_VSMOW']:.4f}",
            f"{self.sessions[session][f'r_D{self._4x}']:.4f}",
            f"{self.sessions[session]['a']:.3f} ± {self.sessions[session]['SE_a']:.3f}",
            f"{1e3*self.sessions[session]['b']:.3f} ± {1e3*self.sessions[session]['SE_b']:.3f}",
            f"{self.sessions[session]['c']:.3f} ± {self.sessions[session]['SE_c']:.3f}",
            ]]
        if include_a2:
            if self.sessions[session]['scrambling_drift']:
                out[-1] += [f"{self.sessions[session]['a2']:.1e} ± {self.sessions[session]['SE_a2']:.1e}"]
            else:
                out[-1] += ['']
        if include_b2:
            if self.sessions[session]['slope_drift']:
                out[-1] += [f"{self.sessions[session]['b2']:.1e} ± {self.sessions[session]['SE_b2']:.1e}"]
            else:
                out[-1] += ['']
        if include_c2:
            if self.sessions[session]['wg_drift']:
                out[-1] += [f"{self.sessions[session]['c2']:.1e} ± {self.sessions[session]['SE_c2']:.1e}"]
            else:
                out[-1] += ['']

    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_sessions.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n' + pretty_table(out))
    if output == 'raw':
        return out
    elif output == 'pretty':
        return pretty_table(out)
```
Print out and/or save to disk a table of sessions.

**Parameters**

+ `dir`: the directory in which to save the table
+ `filename`: the name of the csv file to write to
+ `save_to_file`: whether to save the table to disk
+ `print_out`: whether to print out the table
+ `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`); if set to `'raw'`: return a list of list of strings (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
```python
@make_verbal
def table_of_analyses(
    self,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out and/or save to disk a table of analyses.

    **Parameters**

    + `dir`: the directory in which to save the table
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the table to disk
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
      if set to `'raw'`: return a list of list of strings
      (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''

    out = [['UID', 'Session', 'Sample']]
    extra_fields = [f for f in [('SampleMass', '.2f'), ('ColdFingerPressure', '.1f'), ('AcidReactionYield', '.3f')] if f[0] in {k for r in self for k in r}]
    for f in extra_fields:
        out[-1] += [f[0]]
    out[-1] += ['d13Cwg_VPDB', 'd18Owg_VSMOW', 'd45', 'd46', 'd47', 'd48', 'd49', 'd13C_VPDB', 'd18O_VSMOW', 'D47raw', 'D48raw', 'D49raw', f'D{self._4x}']
    for r in self:
        out += [[f"{r['UID']}", f"{r['Session']}", f"{r['Sample']}"]]
        for f in extra_fields:
            out[-1] += [f"{r[f[0]]:{f[1]}}"]
        out[-1] += [
            f"{r['d13Cwg_VPDB']:.3f}",
            f"{r['d18Owg_VSMOW']:.3f}",
            f"{r['d45']:.6f}",
            f"{r['d46']:.6f}",
            f"{r['d47']:.6f}",
            f"{r['d48']:.6f}",
            f"{r['d49']:.6f}",
            f"{r['d13C_VPDB']:.6f}",
            f"{r['d18O_VSMOW']:.6f}",
            f"{r['D47raw']:.6f}",
            f"{r['D48raw']:.6f}",
            f"{r['D49raw']:.6f}",
            f"{r[f'D{self._4x}']:.6f}"
            ]
    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_analyses.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n' + pretty_table(out))
    return out
```
```python
@make_verbal
def covar_table(
    self,
    correl = False,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return the variance-covariance matrix of D4x
    for all unknown samples.

    **Parameters**

    + `dir`: the directory in which to save the csv
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the csv
    + `print_out`: whether to print out the matrix
    + `output`: if set to `'pretty'`: return a pretty text matrix (see `pretty_table()`);
      if set to `'raw'`: return a list of lists of strings
      (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''
    samples = sorted([u for u in self.unknowns])
    out = [[''] + samples]
    for s1 in samples:
        out.append([s1])
        for s2 in samples:
            if correl:
                out[-1].append(f'{self.sample_D4x_correl(s1, s2):.6f}')
            else:
                out[-1].append(f'{self.sample_D4x_covar(s1, s2):.8e}')

    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            if correl:
                filename = f'D{self._4x}_correl.csv'
            else:
                filename = f'D{self._4x}_covar.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n'+pretty_table(out))
    if output == 'raw':
        return out
    elif output == 'pretty':
        return pretty_table(out)
```
```python
@make_verbal
def table_of_samples(
    self,
    dir = 'output',
    filename = None,
    save_to_file = True,
    print_out = True,
    output = None,
    ):
    '''
    Print out, save to disk and/or return a table of samples.

    **Parameters**

    + `dir`: the directory in which to save the csv
    + `filename`: the name of the csv file to write to
    + `save_to_file`: whether to save the csv
    + `print_out`: whether to print out the table
    + `output`: if set to `'pretty'`: return a pretty text table (see `pretty_table()`);
      if set to `'raw'`: return a list of lists of strings
      (e.g., `[['header1', 'header2'], ['0.1', '0.2']]`)
    '''

    out = [['Sample','N','d13C_VPDB','d18O_VSMOW',f'D{self._4x}','SE','95% CL','SD','p_Levene']]
    for sample in self.anchors:
        out += [[
            f"{sample}",
            f"{self.samples[sample]['N']}",
            f"{self.samples[sample]['d13C_VPDB']:.2f}",
            f"{self.samples[sample]['d18O_VSMOW']:.2f}",
            f"{self.samples[sample][f'D{self._4x}']:.4f}",'','',
            f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '', ''
            ]]
    for sample in self.unknowns:
        out += [[
            f"{sample}",
            f"{self.samples[sample]['N']}",
            f"{self.samples[sample]['d13C_VPDB']:.2f}",
            f"{self.samples[sample]['d18O_VSMOW']:.2f}",
            f"{self.samples[sample][f'D{self._4x}']:.4f}",
            f"{self.samples[sample][f'SE_D{self._4x}']:.4f}",
            f"± {self.samples[sample][f'SE_D{self._4x}'] * self.t95:.4f}",
            f"{self.samples[sample][f'SD_D{self._4x}']:.4f}" if self.samples[sample]['N'] > 1 else '',
            f"{self.samples[sample]['p_Levene']:.3f}" if self.samples[sample]['N'] > 2 else ''
            ]]
    if save_to_file:
        if not os.path.exists(dir):
            os.makedirs(dir)
        if filename is None:
            filename = f'D{self._4x}_samples.csv'
        with open(f'{dir}/{filename}', 'w') as fid:
            fid.write(make_csv(out))
    if print_out:
        self.msg('\n'+pretty_table(out))
    if output == 'raw':
        return out
    elif output == 'pretty':
        return pretty_table(out)
```
```python
def plot_sessions(self, dir = 'output', figsize = (8,8), filetype = 'pdf', dpi = 100):
    '''
    Generate session plots and save them to disk.

    **Parameters**

    + `dir`: the directory in which to save the plots
    + `figsize`: the width and height (in inches) of each plot
    + `filetype`: 'pdf' or 'png'
    + `dpi`: resolution for PNG output
    '''
    if not os.path.exists(dir):
        os.makedirs(dir)

    for session in self.sessions:
        sp = self.plot_single_session(session, xylimits = 'constant')
        ppl.savefig(f'{dir}/D{self._4x}_plot_{session}.{filetype}', **({'dpi': dpi} if filetype.lower() == 'png' else {}))
        ppl.close(sp.fig)
```
```python
@make_verbal
def consolidate_samples(self):
    '''
    Compile various statistics for each sample.

    For each anchor sample:

    + `D47` or `D48`: the nominal Δ4x value for this anchor, specified by `self.Nominal_D4x`
    + `SE_D47` or `SE_D48`: set to zero by definition

    For each unknown sample:

    + `D47` or `D48`: the standardized Δ4x value for this unknown
    + `SE_D47` or `SE_D48`: the standard error of Δ4x for this unknown

    For each anchor and unknown:

    + `N`: the total number of analyses of this sample
    + `SD_D47` or `SD_D48`: the “sample” (in the statistical sense) standard deviation for this sample
    + `d13C_VPDB`: the average δ13C_VPDB value for this sample
    + `d18O_VSMOW`: the average δ18O_VSMOW value for this sample (as CO2)
    + `p_Levene`: the p-value from a [Levene test](https://en.wikipedia.org/wiki/Levene%27s_test) of equal
      variance, indicating whether the Δ4x repeatability of this sample differs significantly from
      that observed for the reference sample specified by `self.LEVENE_REF_SAMPLE`.
    '''
    D4x_ref_pop = [r[f'D{self._4x}'] for r in self.samples[self.LEVENE_REF_SAMPLE]['data']]
    for sample in self.samples:
        self.samples[sample]['N'] = len(self.samples[sample]['data'])
        if self.samples[sample]['N'] > 1:
            self.samples[sample][f'SD_D{self._4x}'] = stdev([r[f'D{self._4x}'] for r in self.samples[sample]['data']])

        self.samples[sample]['d13C_VPDB'] = np.mean([r['d13C_VPDB'] for r in self.samples[sample]['data']])
        self.samples[sample]['d18O_VSMOW'] = np.mean([r['d18O_VSMOW'] for r in self.samples[sample]['data']])

        D4x_pop = [r[f'D{self._4x}'] for r in self.samples[sample]['data']]
        if len(D4x_pop) > 2:
            self.samples[sample]['p_Levene'] = levene(D4x_ref_pop, D4x_pop, center = 'median')[1]

    if self.standardization_method == 'pooled':
        for sample in self.anchors:
            self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
            self.samples[sample][f'SE_D{self._4x}'] = 0.
        for sample in self.unknowns:
            self.samples[sample][f'D{self._4x}'] = self.standardization.params.valuesdict()[f'D{self._4x}_{pf(sample)}']
            try:
                self.samples[sample][f'SE_D{self._4x}'] = self.sample_D4x_covar(sample)**.5
            except ValueError:
                # when `sample` is constrained by self.standardize(constraints = {...}),
                # it is no longer listed in self.standardization.var_names.
                # Temporary fix: define SE as zero for now
                self.samples[sample][f'SE_D{self._4x}'] = 0.

    elif self.standardization_method == 'indep_sessions':
        for sample in self.anchors:
            self.samples[sample][f'D{self._4x}'] = self.Nominal_D4x[sample]
            self.samples[sample][f'SE_D{self._4x}'] = 0.
        for sample in self.unknowns:
            self.msg(f'Consolidating sample {sample}')
            self.unknowns[sample][f'session_D{self._4x}'] = {}
            session_avg = []
            for session in self.sessions:
                sdata = [r for r in self.sessions[session]['data'] if r['Sample'] == sample]
                if sdata:
                    self.msg(f'{sample} found in session {session}')
                    avg_D4x = np.mean([r[f'D{self._4x}'] for r in sdata])
                    avg_d4x = np.mean([r[f'd{self._4x}'] for r in sdata])
                    # !! TODO: sigma_s below does not account for temporal changes in standardization error
                    sigma_s = self.standardization_error(session, avg_d4x, avg_D4x)
                    sigma_u = sdata[0][f'wD{self._4x}raw'] / self.sessions[session]['a'] / len(sdata)**.5
                    session_avg.append([avg_D4x, (sigma_u**2 + sigma_s**2)**.5])
                    self.unknowns[sample][f'session_D{self._4x}'][session] = session_avg[-1]
            self.samples[sample][f'D{self._4x}'], self.samples[sample][f'SE_D{self._4x}'] = w_avg(*zip(*session_avg))
            weights = {s: self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 for s in self.unknowns[sample][f'session_D{self._4x}']}
            wsum = sum([weights[s] for s in weights])
            for s in weights:
                self.unknowns[sample][f'session_D{self._4x}'][s] += [self.unknowns[sample][f'session_D{self._4x}'][s][1]**-2 / wsum]

    for r in self:
        r[f'D{self._4x}_residual'] = r[f'D{self._4x}'] - self.samples[r['Sample']][f'D{self._4x}']
```
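The `p_Levene` statistic above can be reproduced outside of D47crunch. The following is a minimal sketch using `scipy.stats.levene` with median centering (the same call made internally by `consolidate_samples()`), applied to two made-up populations of Δ47 replicates:

```python
import numpy as np
from scipy.stats import levene

# Made-up per-analysis D47 replicates for two samples:
ref_pop = [0.258, 0.262, 0.249, 0.267, 0.254]   # reference sample (cf. LEVENE_REF_SAMPLE)
test_pop = [0.301, 0.342, 0.275, 0.329]         # sample whose scatter we wish to test

# Brown-Forsythe variant of the Levene test (center = 'median');
# the second return value is the p-value stored as p_Levene.
W, p = levene(ref_pop, test_pop, center = 'median')
print(f'p_Levene = {p:.3f}')  # a small p suggests the two repeatabilities differ
```

A low p-value flags samples whose internal scatter is significantly larger (or smaller) than that of the reference sample.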
```python
def consolidate_sessions(self):
    '''
    Compute various statistics for each session.

    + `Na`: Number of anchor analyses in the session
    + `Nu`: Number of unknown analyses in the session
    + `r_d13C_VPDB`: δ13C_VPDB repeatability of analyses within the session
    + `r_d18O_VSMOW`: δ18O_VSMOW repeatability of analyses within the session
    + `r_D47` or `r_D48`: Δ4x repeatability of analyses within the session
    + `a`: scrambling factor
    + `b`: compositional slope
    + `c`: WG offset
    + `SE_a`: Model standard error of `a`
    + `SE_b`: Model standard error of `b`
    + `SE_c`: Model standard error of `c`
    + `scrambling_drift` (boolean): whether to allow a temporal drift in the scrambling factor (`a`)
    + `slope_drift` (boolean): whether to allow a temporal drift in the compositional slope (`b`)
    + `wg_drift` (boolean): whether to allow a temporal drift in the WG offset (`c`)
    + `a2`: scrambling factor drift
    + `b2`: compositional slope drift
    + `c2`: WG offset drift
    + `Np`: Number of standardization parameters to fit
    + `CM`: model covariance matrix for (`a`, `b`, `c`, `a2`, `b2`, `c2`)
    + `d13Cwg_VPDB`: δ13C_VPDB of WG
    + `d18Owg_VSMOW`: δ18O_VSMOW of WG
    '''
    for session in self.sessions:
        if 'd13Cwg_VPDB' not in self.sessions[session]:
            self.sessions[session]['d13Cwg_VPDB'] = self.sessions[session]['data'][0]['d13Cwg_VPDB']
        if 'd18Owg_VSMOW' not in self.sessions[session]:
            self.sessions[session]['d18Owg_VSMOW'] = self.sessions[session]['data'][0]['d18Owg_VSMOW']
        self.sessions[session]['Na'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.anchors])
        self.sessions[session]['Nu'] = len([r for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns])

        self.msg(f'Computing repeatabilities for session {session}')
        self.sessions[session]['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors', sessions = [session])
        self.sessions[session]['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors', sessions = [session])
        self.sessions[session][f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', sessions = [session])

    if self.standardization_method == 'pooled':
        for session in self.sessions:

            self.sessions[session]['a'] = self.standardization.params.valuesdict()[f'a_{pf(session)}']
            i = self.standardization.var_names.index(f'a_{pf(session)}')
            self.sessions[session]['SE_a'] = self.standardization.covar[i,i]**.5

            self.sessions[session]['b'] = self.standardization.params.valuesdict()[f'b_{pf(session)}']
            i = self.standardization.var_names.index(f'b_{pf(session)}')
            self.sessions[session]['SE_b'] = self.standardization.covar[i,i]**.5

            self.sessions[session]['c'] = self.standardization.params.valuesdict()[f'c_{pf(session)}']
            i = self.standardization.var_names.index(f'c_{pf(session)}')
            self.sessions[session]['SE_c'] = self.standardization.covar[i,i]**.5

            self.sessions[session]['a2'] = self.standardization.params.valuesdict()[f'a2_{pf(session)}']
            if self.sessions[session]['scrambling_drift']:
                i = self.standardization.var_names.index(f'a2_{pf(session)}')
                self.sessions[session]['SE_a2'] = self.standardization.covar[i,i]**.5
            else:
                self.sessions[session]['SE_a2'] = 0.

            self.sessions[session]['b2'] = self.standardization.params.valuesdict()[f'b2_{pf(session)}']
            if self.sessions[session]['slope_drift']:
                i = self.standardization.var_names.index(f'b2_{pf(session)}')
                self.sessions[session]['SE_b2'] = self.standardization.covar[i,i]**.5
            else:
                self.sessions[session]['SE_b2'] = 0.

            self.sessions[session]['c2'] = self.standardization.params.valuesdict()[f'c2_{pf(session)}']
            if self.sessions[session]['wg_drift']:
                i = self.standardization.var_names.index(f'c2_{pf(session)}')
                self.sessions[session]['SE_c2'] = self.standardization.covar[i,i]**.5
            else:
                self.sessions[session]['SE_c2'] = 0.

            i = self.standardization.var_names.index(f'a_{pf(session)}')
            j = self.standardization.var_names.index(f'b_{pf(session)}')
            k = self.standardization.var_names.index(f'c_{pf(session)}')
            CM = np.zeros((6,6))
            CM[:3,:3] = self.standardization.covar[[i,j,k],:][:,[i,j,k]]
            try:
                i2 = self.standardization.var_names.index(f'a2_{pf(session)}')
                CM[3,[0,1,2,3]] = self.standardization.covar[i2,[i,j,k,i2]]
                CM[[0,1,2,3],3] = self.standardization.covar[[i,j,k,i2],i2]
                try:
                    j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
                    CM[3,4] = self.standardization.covar[i2,j2]
                    CM[4,3] = self.standardization.covar[j2,i2]
                except ValueError:
                    pass
                try:
                    k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                    CM[3,5] = self.standardization.covar[i2,k2]
                    CM[5,3] = self.standardization.covar[k2,i2]
                except ValueError:
                    pass
            except ValueError:
                pass
            try:
                j2 = self.standardization.var_names.index(f'b2_{pf(session)}')
                CM[4,[0,1,2,4]] = self.standardization.covar[j2,[i,j,k,j2]]
                CM[[0,1,2,4],4] = self.standardization.covar[[i,j,k,j2],j2]
                try:
                    k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                    CM[4,5] = self.standardization.covar[j2,k2]
                    CM[5,4] = self.standardization.covar[k2,j2]
                except ValueError:
                    pass
            except ValueError:
                pass
            try:
                k2 = self.standardization.var_names.index(f'c2_{pf(session)}')
                CM[5,[0,1,2,5]] = self.standardization.covar[k2,[i,j,k,k2]]
                CM[[0,1,2,5],5] = self.standardization.covar[[i,j,k,k2],k2]
            except ValueError:
                pass

            self.sessions[session]['CM'] = CM

    elif self.standardization_method == 'indep_sessions':
        pass # Not implemented yet
```
```python
@make_verbal
def repeatabilities(self):
    '''
    Compute analytical repeatabilities for δ13C_VPDB, δ18O_VSMOW, Δ4x
    (for all samples, for anchors, and for unknowns).
    '''
    self.msg('Computing reproducibilities for all sessions')

    self.repeatability['r_d13C_VPDB'] = self.compute_r('d13C_VPDB', samples = 'anchors')
    self.repeatability['r_d18O_VSMOW'] = self.compute_r('d18O_VSMOW', samples = 'anchors')
    self.repeatability[f'r_D{self._4x}a'] = self.compute_r(f'D{self._4x}', samples = 'anchors')
    self.repeatability[f'r_D{self._4x}u'] = self.compute_r(f'D{self._4x}', samples = 'unknowns')
    self.repeatability[f'r_D{self._4x}'] = self.compute_r(f'D{self._4x}', samples = 'all samples')
```
```python
@make_verbal
def consolidate(self, tables = True, plots = True):
    '''
    Collect information about samples, sessions and repeatabilities.
    '''
    self.consolidate_samples()
    self.consolidate_sessions()
    self.repeatabilities()

    if tables:
        self.summary()
        self.table_of_sessions()
        self.table_of_analyses()
        self.table_of_samples()

    if plots:
        self.plot_sessions()
```
```python
@make_verbal
def rmswd(self,
    samples = 'all samples',
    sessions = 'all sessions',
    ):
    '''
    Compute the χ2, root mean squared weighted deviation
    (i.e. reduced χ2), and corresponding degrees of freedom of the
    Δ4x values for samples in `samples` and sessions in `sessions`.

    Only used in `D4xdata.standardize()` with `method='indep_sessions'`.
    '''
    if samples == 'all samples':
        mysamples = [k for k in self.samples]
    elif samples == 'anchors':
        mysamples = [k for k in self.anchors]
    elif samples == 'unknowns':
        mysamples = [k for k in self.unknowns]
    else:
        mysamples = samples

    if sessions == 'all sessions':
        sessions = [k for k in self.sessions]

    chisq, Nf = 0, 0
    for sample in mysamples:
        G = [r for r in self if r['Sample'] == sample and r['Session'] in sessions]
        if len(G) > 1:
            X, sX = w_avg([r[f'D{self._4x}'] for r in G], [r[f'wD{self._4x}'] for r in G])
            Nf += (len(G) - 1)
            chisq += np.sum([((r[f'D{self._4x}']-X)/r[f'wD{self._4x}'])**2 for r in G])
    r = (chisq / Nf)**.5 if Nf > 0 else 0
    self.msg(f'RMSWD of r["D{self._4x}"] is {r:.6f} for {samples}.')
    return {'rmswd': r, 'chisq': chisq, 'Nf': Nf}
```
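The RMSWD bookkeeping above reduces to simple arithmetic for a single replicated sample. This is a self-contained sketch (made-up numbers; the `rmswd` helper here is illustrative, not the library method) of the same χ² / Nf computation:

```python
import numpy as np

def rmswd(values, sigmas):
    '''
    Root mean squared weighted deviation of replicate measurements
    relative to their inverse-variance weighted mean.
    '''
    values = np.asarray(values, dtype = float)
    sigmas = np.asarray(sigmas, dtype = float)
    w = sigmas**-2
    X = (w * values).sum() / w.sum()               # inverse-variance weighted mean
    chisq = (((values - X) / sigmas)**2).sum()     # sum of squared weighted deviations
    Nf = values.size - 1                           # one dof consumed by the mean
    return (chisq / Nf)**.5, chisq, Nf

# three replicate D47 values with their analytical sigmas:
r, chisq, Nf = rmswd([0.301, 0.312, 0.295], [0.010, 0.012, 0.011])
```

An RMSWD close to 1 indicates that the scatter of replicates is consistent with the assigned analytical errors.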
```python
@make_verbal
def compute_r(self, key, samples = 'all samples', sessions = 'all sessions'):
    '''
    Compute the repeatability of `[r[key] for r in self]`
    '''

    if samples == 'all samples':
        mysamples = [k for k in self.samples]
    elif samples == 'anchors':
        mysamples = [k for k in self.anchors]
    elif samples == 'unknowns':
        mysamples = [k for k in self.unknowns]
    else:
        mysamples = samples

    if sessions == 'all sessions':
        sessions = [k for k in self.sessions]

    if key in ['D47', 'D48']:
        # Full disclosure: the definition of Nf is tricky/debatable
        G = [r for r in self if r['Sample'] in mysamples and r['Session'] in sessions]
        chisq = (np.array([r[f'{key}_residual'] for r in G])**2).sum()
        Nf = len(G)
        Nf -= len([s for s in mysamples if s in self.unknowns])
        for session in sessions:
            Np = len([
                _ for _ in self.standardization.params
                if (
                    self.standardization.params[_].expr is not None
                    and (
                        (_[0] in 'abc' and _[1] == '_' and _[2:] == pf(session))
                        or (_[0] in 'abc' and _[1:3] == '2_' and _[3:] == pf(session))
                        )
                    )
                ])
            Na = len({
                r['Sample'] for r in self.sessions[session]['data']
                if r['Sample'] in self.anchors and r['Sample'] in mysamples
                })
            Nf -= min(Np, Na)
        r = (chisq / Nf)**.5 if Nf > 0 else 0

    else: # if key not in ['D47', 'D48']
        chisq, Nf = 0, 0
        for sample in mysamples:
            X = [r[key] for r in self if r['Sample'] == sample and r['Session'] in sessions]
            if len(X) > 1:
                Nf += len(X) - 1
                chisq += np.sum([(x-np.mean(X))**2 for x in X])
        r = (chisq / Nf)**.5 if Nf > 0 else 0

    self.msg(f'Repeatability of r["{key}"] is {1000*r:.1f} ppm for {samples}.')
    return r
```
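For keys other than Δ47/Δ48, the repeatability above is a pooled standard deviation over all replicated samples. A self-contained sketch of that branch (the `pooled_repeatability` helper and the numbers are illustrative, not part of D47crunch):

```python
import numpy as np

def pooled_repeatability(populations):
    '''
    Pooled standard deviation: squared deviations from each sample's
    own mean, summed over all samples, divided by the pooled dof.
    '''
    chisq, Nf = 0.0, 0
    for X in populations:
        X = np.asarray(X, dtype = float)
        if X.size > 1:
            Nf += X.size - 1                 # (n - 1) dof per replicated sample
            chisq += ((X - X.mean())**2).sum()
    return (chisq / Nf)**.5 if Nf > 0 else 0.0

# two samples, measured in triplicate and in duplicate:
r = pooled_repeatability([[2.01, 1.99, 2.03], [5.10, 5.14]])
```

Pooling over samples rather than computing a single global SD avoids inflating the repeatability with genuine between-sample differences.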
````python
def sample_average(self, samples, weights = 'equal', normalize = True):
    '''
    Weighted average Δ4x value of a group of samples, accounting for covariance.

    Returns the weighted average Δ4x value and associated SE
    of a group of samples. Weights are equal by default. If `normalize` is
    true, `weights` will be rescaled so that their sum equals 1.

    **Examples**

    ```python
    self.sample_average(['X','Y'], [1, 2])
    ```

    returns the value and SE of [Δ4x(X) + 2 Δ4x(Y)]/3,
    where Δ4x(X) and Δ4x(Y) are the average Δ4x
    values of samples X and Y, respectively.

    ```python
    self.sample_average(['X','Y'], [1, -1], normalize = False)
    ```

    returns the value and SE of the difference Δ4x(X) - Δ4x(Y).
    '''
    if weights == 'equal':
        weights = [1/len(samples)] * len(samples)

    if normalize:
        s = sum(weights)
        if s:
            weights = [w/s for w in weights]

    try:
        C = np.array([[self.sample_D4x_covar(x, y) for x in samples] for y in samples])
        X = [self.samples[sample][f'D{self._4x}'] for sample in samples]
        return correlated_sum(X, C, weights)
    except ValueError:
        return (0., 0.)
```
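The covariance-aware averaging performed above amounts to the matrix arithmetic below: value = w·X and SE = √(wᵀCw). This numpy sketch is an illustration of the principle, not the library's own `correlated_sum()` implementation:

```python
import numpy as np

def weighted_sum_with_covar(X, C, weights):
    '''
    Weighted sum of correlated estimates:
    value = w . X, variance = w' C w, SE = sqrt(variance).
    '''
    w = np.asarray(weights, dtype = float)
    X = np.asarray(X, dtype = float)
    C = np.asarray(C, dtype = float)
    value = float(w @ X)
    se = float(w @ C @ w)**.5
    return value, se

# two sample values with positively correlated errors:
X = [0.30, 0.60]
C = [[1e-4, 4e-5], [4e-5, 1e-4]]   # variance-covariance matrix (made up)
v, se = weighted_sum_with_covar(X, C, [1/3, 2/3])
```

Because the off-diagonal covariance enters wᵀCw, the SE of a sum of positively correlated values is larger than the naive quadrature of the individual SEs, while the SE of a difference is smaller.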
```python
def sample_D4x_covar(self, sample1, sample2 = None):
    '''
    Covariance between Δ4x values of samples

    Returns the error covariance between the average Δ4x values of two
    samples. If only `sample1` is specified, or if `sample1 == sample2`,
    returns the Δ4x variance for that sample.
    '''
    if sample2 is None:
        sample2 = sample1
    if self.standardization_method == 'pooled':
        i = self.standardization.var_names.index(f'D{self._4x}_{pf(sample1)}')
        j = self.standardization.var_names.index(f'D{self._4x}_{pf(sample2)}')
        return self.standardization.covar[i, j]
    elif self.standardization_method == 'indep_sessions':
        if sample1 == sample2:
            return self.samples[sample1][f'SE_D{self._4x}']**2
        else:
            c = 0
            for session in self.sessions:
                sdata1 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample1]
                sdata2 = [r for r in self.sessions[session]['data'] if r['Sample'] == sample2]
                if sdata1 and sdata2:
                    a = self.sessions[session]['a']
                    # !! TODO: CM below does not account for temporal changes in standardization parameters
                    CM = self.sessions[session]['CM'][:3,:3]
                    avg_D4x_1 = np.mean([r[f'D{self._4x}'] for r in sdata1])
                    avg_d4x_1 = np.mean([r[f'd{self._4x}'] for r in sdata1])
                    avg_D4x_2 = np.mean([r[f'D{self._4x}'] for r in sdata2])
                    avg_d4x_2 = np.mean([r[f'd{self._4x}'] for r in sdata2])
                    c += (
                        self.unknowns[sample1][f'session_D{self._4x}'][session][2]
                        * self.unknowns[sample2][f'session_D{self._4x}'][session][2]
                        * np.array([[avg_D4x_1, avg_d4x_1, 1]])
                        @ CM
                        @ np.array([[avg_D4x_2, avg_d4x_2, 1]]).T
                        ) / a**2
            return float(c)
```
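In the `'indep_sessions'` branch, each session's contribution to the covariance is a quadratic form J₁·CM·J₂ᵀ / a², with Jᵢ = [Δ4x, δ4x, 1] evaluated at each sample's session averages. A minimal numpy sketch of that propagation for one session (the helper name and covariance values are made up for illustration):

```python
import numpy as np

def session_covar_contribution(CM, a, D1, d1, D2, d2, w1 = 1.0, w2 = 1.0):
    '''
    Covariance contribution of one session, propagated through the
    (a, b, c) standardization model: w1 * w2 * (J1 @ CM @ J2) / a**2,
    where Ji = [avg D4x, avg d4x, 1] for sample i.
    '''
    J1 = np.array([D1, d1, 1.0])
    J2 = np.array([D2, d2, 1.0])
    return w1 * w2 * float(J1 @ CM @ J2) / a**2

CM = np.diag([1e-6, 1e-8, 1e-6])   # made-up 3x3 covariance of (a, b, c)
c = session_covar_contribution(CM, a = 1.02, D1 = 0.30, d1 = 15.0, D2 = 0.55, d2 = -4.0)
```

Dividing by a² converts the covariance of the raw-space model prediction into the covariance of the standardized Δ4x values.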
```python
def sample_D4x_correl(self, sample1, sample2 = None):
    '''
    Correlation between Δ4x errors of samples

    Returns the error correlation between the average Δ4x values of two samples.
    '''
    if sample2 is None or sample2 == sample1:
        return 1.
    return (
        self.sample_D4x_covar(sample1, sample2)
        / self.unknowns[sample1][f'SE_D{self._4x}']
        / self.unknowns[sample2][f'SE_D{self._4x}']
        )
```
```python
def plot_single_session(self,
    session,
    kw_plot_anchors = dict(ls='None', marker='x', mec=(.75, 0, 0), mew = .75, ms = 4),
    kw_plot_unknowns = dict(ls='None', marker='x', mec=(0, 0, .75), mew = .75, ms = 4),
    kw_plot_anchor_avg = dict(ls='-', marker='None', color=(.75, 0, 0), lw = .75),
    kw_plot_unknown_avg = dict(ls='-', marker='None', color=(0, 0, .75), lw = .75),
    kw_contour_error = dict(colors = [[0, 0, 0]], alpha = .5, linewidths = 0.75),
    xylimits = 'free', # | 'constant'
    x_label = None,
    y_label = None,
    error_contour_interval = 'auto',
    fig = 'new',
    ):
    '''
    Generate plot for a single session
    '''
    if x_label is None:
        x_label = f'δ$_{{{self._4x}}}$ (‰)'
    if y_label is None:
        y_label = f'Δ$_{{{self._4x}}}$ (‰)'

    out = _SessionPlot()
    anchors = [a for a in self.anchors if [r for r in self.sessions[session]['data'] if r['Sample'] == a]]
    unknowns = [u for u in self.unknowns if [r for r in self.sessions[session]['data'] if r['Sample'] == u]]

    if fig == 'new':
        out.fig = ppl.figure(figsize = (6,6))
        ppl.subplots_adjust(.1,.1,.9,.9)

    out.anchor_analyses, = ppl.plot(
        [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors],
        [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.anchors],
        **kw_plot_anchors)
    out.unknown_analyses, = ppl.plot(
        [r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns],
        [r[f'D{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] in self.unknowns],
        **kw_plot_unknowns)
    out.anchor_avg = ppl.plot(
        np.array([ np.array([
            np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
            np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
            ]) for sample in anchors]).T,
        np.array([ np.array([0, 0]) + self.Nominal_D4x[sample] for sample in anchors]).T,
        **kw_plot_anchor_avg)
    out.unknown_avg = ppl.plot(
        np.array([ np.array([
            np.min([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) - 1,
            np.max([r[f'd{self._4x}'] for r in self.sessions[session]['data'] if r['Sample'] == sample]) + 1
            ]) for sample in unknowns]).T,
        np.array([ np.array([0, 0]) + self.unknowns[sample][f'D{self._4x}'] for sample in unknowns]).T,
        **kw_plot_unknown_avg)
    if xylimits == 'constant':
        x = [r[f'd{self._4x}'] for r in self]
        y = [r[f'D{self._4x}'] for r in self]
        x1, x2, y1, y2 = np.min(x), np.max(x), np.min(y), np.max(y)
        w, h = x2-x1, y2-y1
        x1 -= w/20
        x2 += w/20
        y1 -= h/20
        y2 += h/20
        ppl.axis([x1, x2, y1, y2])
    elif xylimits == 'free':
        x1, x2, y1, y2 = ppl.axis()
    else:
        x1, x2, y1, y2 = ppl.axis(xylimits)

    if error_contour_interval != 'none':
        xi, yi = np.linspace(x1, x2), np.linspace(y1, y2)
        XI, YI = np.meshgrid(xi, yi)
        SI = np.array([[self.standardization_error(session, x, y) for x in xi] for y in yi])
        if error_contour_interval == 'auto':
            rng = np.max(SI) - np.min(SI)
            if rng <= 0.01:
                cinterval = 0.001
            elif rng <= 0.03:
                cinterval = 0.004
            elif rng <= 0.1:
                cinterval = 0.01
            elif rng <= 0.3:
                cinterval = 0.03
            elif rng <= 1.:
                cinterval = 0.1
            else:
                cinterval = 0.5
        else:
            cinterval = error_contour_interval

        cval = np.arange(np.ceil(SI.min() / .001) * .001, np.ceil(SI.max() / .001 + 1) * .001, cinterval)
        out.contour = ppl.contour(XI, YI, SI, cval, **kw_contour_error)
        out.clabel = ppl.clabel(out.contour)

    ppl.xlabel(x_label)
    ppl.ylabel(y_label)
    ppl.title(session, weight = 'bold')
    ppl.grid(alpha = .2)
    out.ax = ppl.gca()

    return out
```
```py
def plot_residuals(
	self,
	kde = False,
	hist = False,
	binwidth = 2/3,
	dir = 'output',
	filename = None,
	highlight = [],
	colors = None,
	figsize = None,
	dpi = 100,
	yspan = None,
	):
	'''
	Plot residuals of each analysis as a function of time (actually, as a function of
	the order of analyses in the `D4xdata` object)

	+ `kde`: whether to add a kernel density estimate of residuals
	+ `hist`: whether to add a histogram of residuals (incompatible with `kde`)
	+ `binwidth`: bin width of the histogram, expressed as a fraction of the analytical repeatability
	+ `dir`: the directory in which to save the plot
	+ `highlight`: a list of samples to highlight
	+ `colors`: a dict of `{<sample>: <color>}` for all samples
	+ `figsize`: (width, height) of figure
	+ `dpi`: resolution for PNG output
	+ `yspan`: factor controlling the range of y values shown in plot
	(by default: `yspan = 1.5 if kde else 1.0`)
	'''

	from matplotlib import ticker

	if yspan is None:
		if kde:
			yspan = 1.5
		else:
			yspan = 1.0

	# Layout
	fig = ppl.figure(figsize = (8,4) if figsize is None else figsize)
	if hist or kde:
		ppl.subplots_adjust(left = .08, bottom = .05, right = .98, top = .8, wspace = -0.72)
		ax1, ax2 = ppl.subplot(121), ppl.subplot(1,15,15)
	else:
		ppl.subplots_adjust(.08,.05,.78,.8)
		ax1 = ppl.subplot(111)

	# Colors
	N = len(self.anchors)
	if colors is None:
		if len(highlight) > 0:
			Nh = len(highlight)
			if Nh == 1:
				colors = {highlight[0]: (0,0,0)}
			elif Nh == 3:
				colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0)])}
			elif Nh == 4:
				colors = {a: c for a,c in zip(highlight, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
			else:
				colors = {a: hls_to_rgb(k/Nh, .4, 1) for k,a in enumerate(highlight)}
		else:
			if N == 3:
				colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0)])}
			elif N == 4:
				colors = {a: c for a,c in zip(self.anchors, [(0,0,1), (1,0,0), (0,2/3,0), (.75,0,.75)])}
			else:
				colors = {a: hls_to_rgb(k/N, .4, 1) for k,a in enumerate(self.anchors)}

	ppl.sca(ax1)

	ppl.axhline(0, color = 'k', alpha = .25, lw = 0.75)

	ax1.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: f'${x:+.0f}$' if x else '$0$'))

	session = self[0]['Session']
	x1 = 0
	x_sessions = {}
	one_or_more_singlets = False
	one_or_more_multiplets = False
	multiplets = set()
	for k,r in enumerate(self):
		if r['Session'] != session:
			x2 = k-1
			x_sessions[session] = (x1+x2)/2
			ppl.axvline(k - 0.5, color = 'k', lw = .5)
			session = r['Session']
			x1 = k
		singlet = len(self.samples[r['Sample']]['data']) == 1
		if not singlet:
			multiplets.add(r['Sample'])
		if r['Sample'] in self.unknowns:
			if singlet:
				one_or_more_singlets = True
			else:
				one_or_more_multiplets = True
		kw = dict(
			marker = 'x' if singlet else '+',
			ms = 4 if singlet else 5,
			ls = 'None',
			mec = colors[r['Sample']] if r['Sample'] in colors else (0,0,0),
			mew = 1,
			alpha = 0.2 if singlet else 1,
			)
		if highlight and r['Sample'] not in highlight:
			kw['alpha'] = 0.2
		ppl.plot(k, 1e3 * r[f'D{self._4x}_residual'], **kw)
	x2 = k
	x_sessions[session] = (x1+x2)/2

	ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000, self.repeatability[f'r_D{self._4x}']*1000, color = 'k', alpha = .05, lw = 1)
	ppl.axhspan(-self.repeatability[f'r_D{self._4x}']*1000*self.t95, self.repeatability[f'r_D{self._4x}']*1000*self.t95, color = 'k', alpha = .05, lw = 1)
	if not (hist or kde):
		ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000, f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm", size = 9, alpha = 1, va = 'center')
		ppl.text(len(self), self.repeatability[f'r_D{self._4x}']*1000*self.t95, f" 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm", size = 9, alpha = 1, va = 'center')

	xmin, xmax, ymin, ymax = ppl.axis()
	if yspan != 1:
		ymin, ymax = (ymin + ymax)/2 - yspan * (ymax - ymin)/2, (ymin + ymax)/2 + yspan * (ymax - ymin)/2
	for s in x_sessions:
		ppl.text(
			x_sessions[s],
			ymax + 1,
			s,
			va = 'bottom',
			**(
				dict(ha = 'center')
				if len(self.sessions[s]['data']) > (0.15 * len(self))
				else dict(ha = 'left', rotation = 45)
				)
			)

	if hist or kde:
		ppl.sca(ax2)

	for s in colors:
		kw['marker'] = '+'
		kw['ms'] = 5
		kw['mec'] = colors[s]
		kw['label'] = s
		kw['alpha'] = 1
		ppl.plot([], [], **kw)

	kw['mec'] = (0,0,0)

	if one_or_more_singlets:
		kw['marker'] = 'x'
		kw['ms'] = 4
		kw['alpha'] = .2
		kw['label'] = 'other (N$\\,$=$\\,$1)' if one_or_more_multiplets else 'other'
		ppl.plot([], [], **kw)

	if one_or_more_multiplets:
		kw['marker'] = '+'
		kw['ms'] = 4
		kw['alpha'] = 1
		kw['label'] = 'other (N$\\,$>$\\,$1)' if one_or_more_singlets else 'other'
		ppl.plot([], [], **kw)

	if hist or kde:
		leg = ppl.legend(loc = 'upper right', bbox_to_anchor = (1, 1), bbox_transform = fig.transFigure, borderaxespad = 1.5, fontsize = 9)
	else:
		leg = ppl.legend(loc = 'lower right', bbox_to_anchor = (1, 0), bbox_transform = fig.transFigure, borderaxespad = 1.5)
	leg.set_zorder(-1000)

	ppl.sca(ax1)

	ppl.ylabel(f'Δ$_{{{self._4x}}}$ residuals (ppm)')
	ppl.xticks([])
	ppl.axis([-1, len(self), None, None])

	if hist or kde:
		ppl.sca(ax2)
		X = 1e3 * np.array([r[f'D{self._4x}_residual'] for r in self if r['Sample'] in multiplets or r['Sample'] in self.anchors])

		if kde:
			from scipy.stats import gaussian_kde
			yi = np.linspace(ymin, ymax, 201)
			xi = gaussian_kde(X).evaluate(yi)
			ppl.fill_betweenx(yi, xi, xi*0, fc = (0,0,0,.15), lw = 1, ec = (.75,.75,.75,1))
		elif hist:
			ppl.hist(
				X,
				orientation = 'horizontal',
				histtype = 'stepfilled',
				ec = [.4]*3,
				fc = [.25]*3,
				alpha = .25,
				bins = np.linspace(-9e3*self.repeatability[f'r_D{self._4x}'], 9e3*self.repeatability[f'r_D{self._4x}'], int(18/binwidth+1)),
				)
		ppl.text(0, 0,
			f" SD = {self.repeatability[f'r_D{self._4x}']*1000:.1f} ppm\n 95% CL = ± {self.repeatability[f'r_D{self._4x}']*1000*self.t95:.1f} ppm",
			size = 7.5,
			alpha = 1,
			va = 'center',
			ha = 'left',
			)

		ppl.axis([0, None, ymin, ymax])
		ppl.xticks([])
		ppl.yticks([])
		ax2.spines['right'].set_visible(False)
		ax2.spines['top'].set_visible(False)
		ax2.spines['bottom'].set_visible(False)

	ax1.axis([None, None, ymin, ymax])

	if not os.path.exists(dir):
		os.makedirs(dir)
	if filename is None:
		return fig
	elif filename == '':
		filename = f'D{self._4x}_residuals.pdf'
	ppl.savefig(f'{dir}/{filename}', dpi = dpi)
	ppl.close(fig)
```
Plot residuals of each analysis as a function of time (actually, as a function of the order of analyses in the `D4xdata` object).

**Parameters**

- `kde`: whether to add a kernel density estimate of residuals
- `hist`: whether to add a histogram of residuals (incompatible with `kde`)
- `binwidth`: bin width of the histogram, expressed as a fraction of the analytical repeatability
- `dir`: the directory in which to save the plot
- `highlight`: a list of samples to highlight
- `colors`: a dict of `{<sample>: <color>}` for all samples
- `figsize`: (width, height) of figure
- `dpi`: resolution for PNG output
- `yspan`: factor controlling the range of y values shown in plot (by default: `yspan = 1.5 if kde else 1.0`)
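When `kde = True`, the side panel shows a kernel density estimate of the residuals, computed with `scipy.stats.gaussian_kde` over the y-range of the main plot. A minimal sketch of that computation, using synthetic residuals (the values below are made up for illustration):

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical residuals in ppm, i.e. 1e3 * (D47 - sample mean D47):
rng = np.random.default_rng(0)
X = rng.normal(0.0, 10.0, size = 200)

# Evaluate the KDE on a regular grid spanning the residual range,
# as plot_residuals() does for its side panel:
yi = np.linspace(X.min(), X.max(), 201)
xi = gaussian_kde(X).evaluate(yi)
```

The resulting `(xi, yi)` curve is what `plot_residuals()` fills horizontally next to the main axes.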
```py
def simulate(self, *args, **kwargs):
	'''
	Legacy function with warning message pointing to `virtual_data()`
	'''
	raise DeprecationWarning('D4xdata.simulate is deprecated and has been replaced by virtual_data()')
```
Legacy function with warning message pointing to virtual_data()
```py
def plot_distribution_of_analyses(
	self,
	dir = 'output',
	filename = None,
	vs_time = False,
	figsize = (6,4),
	subplots_adjust = (0.02, 0.13, 0.85, 0.8),
	output = None,
	dpi = 100,
	):
	'''
	Plot temporal distribution of all analyses in the data set.

	**Parameters**

	+ `dir`: the directory in which to save the plot
	+ `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
	+ `figsize`: (width, height) of figure
	+ `dpi`: resolution for PNG output
	'''

	asamples = [s for s in self.anchors]
	usamples = [s for s in self.unknowns]
	if output is None or output == 'fig':
		fig = ppl.figure(figsize = figsize)
		ppl.subplots_adjust(*subplots_adjust)
	Xmin = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
	Xmax = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self)])
	Xmax += (Xmax-Xmin)/40
	Xmin -= (Xmax-Xmin)/41
	for k, s in enumerate(asamples + usamples):
		if vs_time:
			X = [r['TimeTag'] for r in self if r['Sample'] == s]
		else:
			X = [x for x,r in enumerate(self) if r['Sample'] == s]
		Y = [-k for x in X]
		ppl.plot(X, Y, 'o', mec = None, mew = 0, mfc = 'b' if s in usamples else 'r', ms = 3, alpha = .75)
		ppl.axhline(-k, color = 'b' if s in usamples else 'r', lw = .5, alpha = .25)
		ppl.text(Xmax, -k, f' {s}', va = 'center', ha = 'left', size = 7, color = 'b' if s in usamples else 'r')
	ppl.axis([Xmin, Xmax, -k-1, 1])
	ppl.xlabel('\ntime')
	ppl.gca().annotate('',
		xy = (0.6, -0.02),
		xycoords = 'axes fraction',
		xytext = (.4, -0.02),
		arrowprops = dict(arrowstyle = "->", color = 'k'),
		)

	x2 = -1
	for session in self.sessions:
		x1 = min([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
		if vs_time:
			ppl.axvline(x1, color = 'k', lw = .75)
		if x2 > -1:
			if not vs_time:
				ppl.axvline((x1+x2)/2, color = 'k', lw = .75, alpha = .5)
		x2 = max([r['TimeTag'] if vs_time else j for j,r in enumerate(self) if r['Session'] == session])
		if vs_time:
			ppl.axvline(x2, color = 'k', lw = .75)
			ppl.axvspan(x1, x2, color = 'k', zorder = -100, alpha = .15)
		ppl.text((x1+x2)/2, 1, f' {session}', ha = 'left', va = 'bottom', rotation = 45, size = 8)

	ppl.xticks([])
	ppl.yticks([])

	if output is None:
		if not os.path.exists(dir):
			os.makedirs(dir)
		if filename == None:
			filename = f'D{self._4x}_distribution_of_analyses.pdf'
		ppl.savefig(f'{dir}/{filename}', dpi = dpi)
		ppl.close(fig)
	elif output == 'ax':
		return ppl.gca()
	elif output == 'fig':
		return fig
```
Plot temporal distribution of all analyses in the data set.

**Parameters**

- `dir`: the directory in which to save the plot
- `vs_time`: if `True`, plot as a function of `TimeTag` rather than sequentially.
- `figsize`: (width, height) of figure
- `dpi`: resolution for PNG output
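The `vs_time` switch only changes the x coordinate assigned to each analysis: its `TimeTag` field when `True`, its position in the analysis list otherwise. A standalone sketch of that choice, with hypothetical records (the `TimeTag` values below are made up):

```python
# Hypothetical analysis records, mimicking entries of a D4xdata object:
records = [
	{'Sample': 'ETH-1', 'TimeTag': 10.5},
	{'Sample': 'MYSAMPLE-1', 'TimeTag': 11.0},
	{'Sample': 'ETH-1', 'TimeTag': 12.25},
]

def xcoords(records, sample, vs_time):
	# x coordinates for one sample, as computed in plot_distribution_of_analyses()
	if vs_time:
		return [r['TimeTag'] for r in records if r['Sample'] == sample]
	return [j for j, r in enumerate(records) if r['Sample'] == sample]
```

With `vs_time = False`, `'ETH-1'` plots at positions 0 and 2; with `vs_time = True`, at 10.5 and 12.25.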
```py
def plot_bulk_compositions(
	self,
	samples = None,
	dir = 'output/bulk_compositions',
	figsize = (6,6),
	subplots_adjust = (0.15, 0.12, 0.95, 0.92),
	show = False,
	sample_color = (0,.5,1),
	analysis_color = (.7,.7,.7),
	labeldist = 0.3,
	radius = 0.05,
	):
	'''
	Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

	By default, creates a directory `./output/bulk_compositions` where plots for
	each sample are saved. Another plot named `__all__.pdf` shows all analyses together.

	**Parameters**

	+ `samples`: Only these samples are processed (by default: all samples).
	+ `dir`: where to save the plots
	+ `figsize`: (width, height) of figure
	+ `subplots_adjust`: passed to `subplots_adjust()`
	+ `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples,
	allowing for interactive visualization/exploration in (δ13C, δ18O) space.
	+ `sample_color`: color used for sample markers/labels
	+ `analysis_color`: color used for replicate markers/labels
	+ `labeldist`: distance (in inches) from replicate markers to replicate labels
	+ `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
	'''

	from matplotlib.patches import Ellipse

	if samples is None:
		samples = [_ for _ in self.samples]

	saved = {}

	for s in samples:

		fig = ppl.figure(figsize = figsize)
		fig.subplots_adjust(*subplots_adjust)
		ax = ppl.subplot(111)
		ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
		ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')
		ppl.title(s)

		XY = np.array([[_['d18O_VSMOW'], _['d13C_VPDB']] for _ in self.samples[s]['data']])
		UID = [_['UID'] for _ in self.samples[s]['data']]
		XY0 = XY.mean(0)

		for xy in XY:
			ppl.plot([xy[0], XY0[0]], [xy[1], XY0[1]], '-', lw = 1, color = analysis_color)

		ppl.plot(*XY.T, 'wo', mew = 1, mec = analysis_color)
		ppl.plot(*XY0, 'wo', mew = 2, mec = sample_color)
		ppl.text(*XY0, f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')
		saved[s] = [XY, XY0]

		x1, x2, y1, y2 = ppl.axis()
		x0, dx = (x1+x2)/2, (x2-x1)/2
		y0, dy = (y1+y2)/2, (y2-y1)/2
		dx, dy = [max(max(dx, dy), radius)]*2

		ppl.axis([
			x0 - 1.2*dx,
			x0 + 1.2*dx,
			y0 - 1.2*dy,
			y0 + 1.2*dy,
			])

		XY0_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(XY0))

		for xy, uid in zip(XY, UID):

			xy_in_display_space = fig.dpi_scale_trans.inverted().transform(ax.transData.transform(xy))
			vector_in_display_space = xy_in_display_space - XY0_in_display_space

			if (vector_in_display_space**2).sum() > 0:

				unit_vector_in_display_space = vector_in_display_space / ((vector_in_display_space**2).sum())**0.5
				label_vector_in_display_space = vector_in_display_space + unit_vector_in_display_space * labeldist
				label_xy_in_display_space = XY0_in_display_space + label_vector_in_display_space
				label_xy_in_data_space = ax.transData.inverted().transform(fig.dpi_scale_trans.transform(label_xy_in_display_space))

				ppl.text(*label_xy_in_data_space, uid, va = 'center', ha = 'center', color = analysis_color)

			else:

				ppl.text(*xy, f'{uid} ', va = 'center', ha = 'right', color = analysis_color)

		if radius:
			ax.add_artist(Ellipse(
				xy = XY0,
				width = radius*2,
				height = radius*2,
				ls = (0, (2,2)),
				lw = .7,
				ec = analysis_color,
				fc = 'None',
				))
			ppl.text(
				XY0[0],
				XY0[1]-radius,
				f'\n± {radius*1e3:.0f} ppm',
				color = analysis_color,
				va = 'top',
				ha = 'center',
				linespacing = 0.4,
				size = 8,
				)

		if not os.path.exists(dir):
			os.makedirs(dir)
		fig.savefig(f'{dir}/{s}.pdf')
		ppl.close(fig)

	fig = ppl.figure(figsize = figsize)
	fig.subplots_adjust(*subplots_adjust)
	ppl.xlabel('$δ^{18}O_{VSMOW}$ of $CO_2$ (‰)')
	ppl.ylabel('$δ^{13}C_{VPDB}$ (‰)')

	for s in saved:
		for xy in saved[s][0]:
			ppl.plot([xy[0], saved[s][1][0]], [xy[1], saved[s][1][1]], '-', lw = 1, color = analysis_color)
		ppl.plot(*saved[s][0].T, 'wo', mew = 1, mec = analysis_color)
		ppl.plot(*saved[s][1], 'wo', mew = 1.5, mec = sample_color)
		ppl.text(*saved[s][1], f' {s}', va = 'center', ha = 'left', color = sample_color, weight = 'bold')

	x1, x2, y1, y2 = ppl.axis()
	ppl.axis([
		x1 - (x2-x1)/10,
		x2 + (x2-x1)/10,
		y1 - (y2-y1)/10,
		y2 + (y2-y1)/10,
		])

	if not os.path.exists(dir):
		os.makedirs(dir)
	fig.savefig(f'{dir}/__all__.pdf')
	if show:
		ppl.show()
	ppl.close(fig)
```
Plot δ13C_VPDB vs δ18O_VSMOW (of CO2) for all analyses.

By default, creates a directory `./output/bulk_compositions` where plots for each sample are saved. Another plot named `__all__.pdf` shows all analyses together.

**Parameters**

- `samples`: Only these samples are processed (by default: all samples).
- `dir`: where to save the plots
- `figsize`: (width, height) of figure
- `subplots_adjust`: passed to `subplots_adjust()`
- `show`: whether to call `matplotlib.pyplot.show()` on the plot with all samples, allowing for interactive visualization/exploration in (δ13C, δ18O) space.
- `sample_color`: color used for sample markers/labels
- `analysis_color`: color used for replicate markers/labels
- `labeldist`: distance (in inches) from replicate markers to replicate labels
- `radius`: radius of the dashed circle providing scale. No circle if `radius = 0`.
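Each replicate label is pushed away from the sample mean along the replicate's own direction, by a fixed `labeldist`. The actual method does this in display (inch) coordinates via matplotlib transforms so the offset is uniform on paper; the sketch below shows only the underlying geometry, with a hypothetical `offset_label` helper:

```python
import numpy as np

def offset_label(xy, xy0, labeldist):
	# Push a label away from the sample mean xy0, along the direction
	# from xy0 to the replicate xy, by a fixed extra distance labeldist.
	xy, xy0 = np.asarray(xy, dtype = float), np.asarray(xy0, dtype = float)
	v = xy - xy0
	norm = np.hypot(*v)
	if norm == 0:
		return xy  # degenerate case: replicate coincides with the mean
	return xy0 + v + v/norm * labeldist
```

For example, a replicate at (3, 4) relative to the mean, with `labeldist = 5`, gets its label at (6, 8): the same direction, 5 units further out.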
Inherited Members
- builtins.list
- clear
- copy
- append
- insert
- extend
- pop
- remove
- index
- count
- reverse
- sort
````py
class D47data(D4xdata):
	'''
	Store and process data for a large set of Δ47 analyses,
	usually comprising more than one analytical session.
	'''

	Nominal_D4x = {
		'ETH-1': 0.2052,
		'ETH-2': 0.2085,
		'ETH-3': 0.6132,
		'ETH-4': 0.4511,
		'IAEA-C1': 0.3018,
		'IAEA-C2': 0.6409,
		'MERCK': 0.5135,
		} # I-CDES (Bernasconi et al., 2021)
	'''
	Nominal Δ47 values assigned to the Δ47 anchor samples, used by
	`D47data.standardize()` to normalize unknown samples to an absolute Δ47
	reference frame.

	By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):
	```py
	{
		'ETH-1'   : 0.2052,
		'ETH-2'   : 0.2085,
		'ETH-3'   : 0.6132,
		'ETH-4'   : 0.4511,
		'IAEA-C1' : 0.3018,
		'IAEA-C2' : 0.6409,
		'MERCK'   : 0.5135,
	}
	```
	'''

	@property
	def Nominal_D47(self):
		return self.Nominal_D4x

	@Nominal_D47.setter
	def Nominal_D47(self, new):
		self.Nominal_D4x = dict(**new)
		self.refresh()

	def __init__(self, l = [], **kwargs):
		'''
		**Parameters:** same as `D4xdata.__init__()`
		'''
		D4xdata.__init__(self, l = l, mass = '47', **kwargs)

	def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
		'''
		Find all samples for which `Teq` is specified, compute equilibrium Δ47
		value for that temperature, and treat these samples as additional anchors.

		**Parameters**

		+ `fCo2eqD47`: Which CO2 equilibrium law to use
		(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
		`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
		+ `priority`: if `replace`: forget old anchors and only use the new ones;
		if `new`: keep pre-existing anchors but update them in case of conflict
		between old and new Δ47 values;
		if `old`: keep pre-existing anchors but preserve their original Δ47
		values in case of conflict.
		'''
		f = {
			'petersen': fCO2eqD47_Petersen,
			'wang': fCO2eqD47_Wang,
			}[fCo2eqD47]
		foo = {}
		for r in self:
			if 'Teq' in r:
				if r['Sample'] in foo:
					assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
				else:
					foo[r['Sample']] = f(r['Teq'])
			else:
				assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'

		if priority == 'replace':
			self.Nominal_D47 = {}
		for s in foo:
			if priority != 'old' or s not in self.Nominal_D47:
				self.Nominal_D47[s] = foo[s]

	def save_D47_correl(self, *args, **kwargs):
		return self._save_D4x_correl(*args, **kwargs)

	save_D47_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D47')
````
Store and process data for a large set of Δ47 analyses, usually comprising more than one analytical session.
```py
def __init__(self, l = [], **kwargs):
	'''
	**Parameters:** same as `D4xdata.__init__()`
	'''
	D4xdata.__init__(self, l = l, mass = '47', **kwargs)
```
Parameters: same as D4xdata.__init__()
Nominal Δ47 values assigned to the Δ47 anchor samples, used by `D47data.standardize()` to normalize unknown samples to an absolute Δ47 reference frame.

By default equal to (after [Bernasconi et al. (2021)](https://doi.org/10.1029/2020GC009588)):

```py
{
	'ETH-1'   : 0.2052,
	'ETH-2'   : 0.2085,
	'ETH-3'   : 0.6132,
	'ETH-4'   : 0.4511,
	'IAEA-C1' : 0.3018,
	'IAEA-C2' : 0.6409,
	'MERCK'   : 0.5135,
}
```
```py
def D47fromTeq(self, fCo2eqD47 = 'petersen', priority = 'new'):
	'''
	Find all samples for which `Teq` is specified, compute equilibrium Δ47
	value for that temperature, and treat these samples as additional anchors.

	**Parameters**

	+ `fCo2eqD47`: Which CO2 equilibrium law to use
	(`petersen`: [Petersen et al. (2019)](https://doi.org/10.1029/2018GC008127);
	`wang`: [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)).
	+ `priority`: if `replace`: forget old anchors and only use the new ones;
	if `new`: keep pre-existing anchors but update them in case of conflict
	between old and new Δ47 values;
	if `old`: keep pre-existing anchors but preserve their original Δ47
	values in case of conflict.
	'''
	f = {
		'petersen': fCO2eqD47_Petersen,
		'wang': fCO2eqD47_Wang,
		}[fCo2eqD47]
	foo = {}
	for r in self:
		if 'Teq' in r:
			if r['Sample'] in foo:
				assert foo[r['Sample']] == f(r['Teq']), f'Different values of `Teq` provided for sample `{r["Sample"]}`.'
			else:
				foo[r['Sample']] = f(r['Teq'])
		else:
			assert r['Sample'] not in foo, f'`Teq` is inconsistently specified for sample `{r["Sample"]}`.'

	if priority == 'replace':
		self.Nominal_D47 = {}
	for s in foo:
		if priority != 'old' or s not in self.Nominal_D47:
			self.Nominal_D47[s] = foo[s]
```
Find all samples for which `Teq` is specified, compute equilibrium Δ47 value for that temperature, and treat these samples as additional anchors.

**Parameters**

- `fCo2eqD47`: Which CO2 equilibrium law to use (`petersen`: Petersen et al. (2019); `wang`: Wang et al. (2004)).
- `priority`: if `replace`: forget old anchors and only use the new ones; if `new`: keep pre-existing anchors but update them in case of conflict between old and new Δ47 values; if `old`: keep pre-existing anchors but preserve their original Δ47 values in case of conflict.
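The three `priority` modes amount to a simple dictionary merge between the pre-existing nominal anchors and the new `Teq`-derived values. A standalone sketch of that merge logic, with a hypothetical `merge_anchors` helper and made-up values:

```python
def merge_anchors(old, new, priority = 'new'):
	# 'replace': discard pre-existing anchors entirely;
	# 'new':     keep pre-existing anchors, new values win on conflict;
	# 'old':     keep pre-existing anchors, old values win on conflict.
	if priority == 'replace':
		return dict(new)
	if priority == 'new':
		return {**old, **new}
	return {**new, **old}  # priority == 'old'

old = {'ETH-1': 0.2052, 'ETH-2': 0.2085}
new = {'ETH-2': 0.2100, 'EQ-25C': 0.9196}  # hypothetical Teq-derived anchors
```

With `priority = 'new'`, `ETH-2` takes the new value 0.2100; with `priority = 'old'`, it keeps 0.2085; with `priority = 'replace'`, `ETH-1` is dropped altogether.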
Save D47 values along with their SE and correlation matrix.

**Parameters**

- `samples`: Only these samples are output (by default: all samples).
- `dir`: the directory in which to save the file (by default: `output`)
- `filename`: the name of the csv file to write to (by default: `D47_correl.csv`)
- `D47_precision`: the precision to use when writing `D47` and `D47_SE` values (by default: 4)
- `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
Inherited Members
- D4xdata
- R13_VPDB
- R18_VSMOW
- LAMBDA_17
- R17_VSMOW
- R18_VPDB
- R17_VPDB
- LEVENE_REF_SAMPLE
- ALPHA_18O_ACID_REACTION
- Nominal_d13C_VPDB
- Nominal_d18O_VPDB
- d13C_STANDARDIZATION_METHOD
- d18O_STANDARDIZATION_METHOD
- make_verbal
- msg
- vmsg
- log
- refresh
- refresh_sessions
- refresh_samples
- read
- input
- wg
- compute_bulk_delta
- crunch
- fill_in_missing_info
- standardize_d13C
- standardize_d18O
- compute_bulk_and_clumping_deltas
- compute_isobar_ratios
- split_samples
- unsplit_samples
- assign_timestamps
- report
- combine_samples
- standardize
- standardization_error
- summary
- table_of_sessions
- table_of_analyses
- covar_table
- table_of_samples
- plot_sessions
- consolidate_samples
- consolidate_sessions
- repeatabilities
- consolidate
- rmswd
- compute_r
- sample_average
- sample_D4x_covar
- sample_D4x_correl
- plot_single_session
- plot_residuals
- simulate
- plot_distribution_of_analyses
- plot_bulk_compositions
````py
class D48data(D4xdata):
	'''
	Store and process data for a large set of Δ48 analyses,
	usually comprising more than one analytical session.
	'''

	Nominal_D4x = {
		'ETH-1': 0.138,
		'ETH-2': 0.138,
		'ETH-3': 0.270,
		'ETH-4': 0.223,
		'GU-1': -0.419,
		} # (Fiebig et al., 2019, 2021)
	'''
	Nominal Δ48 values assigned to the Δ48 anchor samples, used by
	`D48data.standardize()` to normalize unknown samples to an absolute Δ48
	reference frame.

	By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019),
	[Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)):

	```py
	{
		'ETH-1' : 0.138,
		'ETH-2' : 0.138,
		'ETH-3' : 0.270,
		'ETH-4' : 0.223,
		'GU-1'  : -0.419,
	}
	```
	'''

	@property
	def Nominal_D48(self):
		return self.Nominal_D4x

	@Nominal_D48.setter
	def Nominal_D48(self, new):
		self.Nominal_D4x = dict(**new)
		self.refresh()

	def __init__(self, l = [], **kwargs):
		'''
		**Parameters:** same as `D4xdata.__init__()`
		'''
		D4xdata.__init__(self, l = l, mass = '48', **kwargs)

	def save_D48_correl(self, *args, **kwargs):
		return self._save_D4x_correl(*args, **kwargs)

	save_D48_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D48')
````
Store and process data for a large set of Δ48 analyses, usually comprising more than one analytical session.
```py
def __init__(self, l = [], **kwargs):
	'''
	**Parameters:** same as `D4xdata.__init__()`
	'''
	D4xdata.__init__(self, l = l, mass = '48', **kwargs)
```
Parameters: same as D4xdata.__init__()
Nominal Δ48 values assigned to the Δ48 anchor samples, used by `D48data.standardize()` to normalize unknown samples to an absolute Δ48 reference frame.

By default equal to (after [Fiebig et al. (2019)](https://doi.org/10.1016/j.chemgeo.2019.05.019), [Fiebig et al. (2021)](https://doi.org/10.1016/j.gca.2021.07.012)):

```py
{
	'ETH-1' : 0.138,
	'ETH-2' : 0.138,
	'ETH-3' : 0.270,
	'ETH-4' : 0.223,
	'GU-1'  : -0.419,
}
```
Save D48 values along with their SE and correlation matrix.

**Parameters**

- `samples`: Only these samples are output (by default: all samples).
- `dir`: the directory in which to save the file (by default: `output`)
- `filename`: the name of the csv file to write to (by default: `D48_correl.csv`)
- `D48_precision`: the precision to use when writing `D48` and `D48_SE` values (by default: 4)
- `correl_precision`: the precision to use when writing correlation factor values (by default: 4)
````py
class D49data(D4xdata):
	'''
	Store and process data for a large set of Δ49 analyses,
	usually comprising more than one analytical session.
	'''

	Nominal_D4x = {"1000C": 0.0, "25C": 2.228} # Wang et al. (2004)
	'''
	Nominal Δ49 values assigned to the Δ49 anchor samples, used by
	`D49data.standardize()` to normalize unknown samples to an absolute Δ49
	reference frame.

	By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)):

	```py
	{
		"1000C": 0.0,
		"25C": 2.228
	}
	```
	'''

	@property
	def Nominal_D49(self):
		return self.Nominal_D4x

	@Nominal_D49.setter
	def Nominal_D49(self, new):
		self.Nominal_D4x = dict(**new)
		self.refresh()

	def __init__(self, l=[], **kwargs):
		'''
		**Parameters:** same as `D4xdata.__init__()`
		'''
		D4xdata.__init__(self, l=l, mass='49', **kwargs)

	def save_D49_correl(self, *args, **kwargs):
		return self._save_D4x_correl(*args, **kwargs)

	save_D49_correl.__doc__ = D4xdata._save_D4x_correl.__doc__.replace('D4x', 'D49')
````
Store and process data for a large set of Δ49 analyses, usually comprising more than one analytical session.
```py
def __init__(self, l=[], **kwargs):
	'''
	**Parameters:** same as `D4xdata.__init__()`
	'''
	D4xdata.__init__(self, l=l, mass='49', **kwargs)
```
Parameters: same as D4xdata.__init__()
Nominal Δ49 values assigned to the Δ49 anchor samples, used by `D49data.standardize()` to normalize unknown samples to an absolute Δ49 reference frame.

By default equal to (after [Wang et al. (2004)](https://doi.org/10.1016/j.gca.2004.05.039)):

```py
{
	"1000C": 0.0,
	"25C": 2.228
}
```
Save D49 values along with their SE and correlation matrix.

**Parameters**

- `samples`: Only these samples are output (by default: all samples).
- `dir`: the directory in which to save the file (by default: `output`)
- `filename`: the name of the csv file to write to (by default: `D49_correl.csv`)
- `D49_precision`: the precision to use when writing `D49` and `D49_SE` values (by default: 4)
- `correl_precision`: the precision to use when writing correlation factor values (by default: 4)