.. meta:: :description lang=en: kafe - a general, Python-based approach to fit a model function to two-dimensional data points with correlated uncertainties in both dimensions :robots: index, follow **************************************** Fit examples, utilities, tips and tricks **************************************** A wide range of applications of the *kafe* core and the usage of the helper functions is exemplified below. All of them are contained in the sub-directory ``examples/`` of the *kafe* distribution and are intended to serve as a basis for user projects. Example 1 - model comparison ============================ To decide whether a model is adequate to describe a given set of data, typically several models have to be fit to the same data. Here is the code for a comparison of a data set to two models, namely a linear and an exponential function: .. code-block:: python # import everything we need from kafe import kafe # additionally, import the two model functions we want to fit: from kafe.function_library import linear_2par, exp_2par ############ # Initialize Dataset my_dataset = kafe.Dataset(title="Example Dataset") # Load the Dataset from the file my_dataset.read_from_file('dataset.dat') ### Create the Fits my_fits = [kafe.Fit(my_dataset, exp_2par), kafe.Fit(my_dataset, linear_2par)] ### Do the Fits for fit in my_fits: fit.do_fit() ### Create a plot my_plot = kafe.Plot(my_fits[0], my_fits[1]) my_plot.plot_all(show_data_for=0) # only show data once (it's the same data) ### Save and show output my_plot.save('kafe_example1.pdf') my_plot.show() The file `dataset.dat` contains x and y data in the standard *kafe* data format, where values and errors (and optionally also correlation coefficients) are given for each axis separately. `#` indicates a comment line, which is ignored when reading the data:: # axis 0: x # datapoints uncor. err. 0.957426 3.0e-01 2.262212 3.0e-01 3.061167 3.0e-01 3.607280 3.0e-01 4.933100 3.0e-01 5.992332 3.0e-01 7.021234 3.0e-01 8.272489 3.0e-01 9.250817 3.0e-01 9.757758 3.0e-01 # axis 1: y # datapoints uncor. err. 1.672481 2.0e-01 1.743410 2.0e-01 1.805217 2.0e-01 2.147802 2.0e-01 2.679615 2.0e-01 3.110055 2.0e-01 3.723173 2.0e-01 4.430122 2.0e-01 4.944116 2.0e-01 5.698063 2.0e-01 The resulting output is shown below. As can be seen already from the graph, the exponential model better describes the data. The χ² probability in the printed output shows, however, that the linear model would be marginally acceptable as well:: linear_2par chi2prob 0.052 HYPTEST accepted (sig. 5%) exp_2par chi2prob 0.957 HYPTEST accepted (sig. 5%) .. figure:: _static/img/kafe_example1.png :height: 300px :width: 600px :scale: 100 % :alt: image not found :align: center `Output of example 1 - compare two models` The contour curves of the two fits are shown below and reflect the large correlations between the fit parameters. The right plot of the profile χ² curve shows that there is a slight deviation from the parabolic curve in the fist fit of a non-linear (exponential) function. For more details on the profiled χ² curve see the discussion of example 3, where the difference is more prominent. .. figure:: _static/img/kafe_example1_contours.png :height: 300px :width: 900px :alt: image not found :align: center `Contour curves and a profile χ² curve for the fits in example 1` Example 2 - two fits and models =============================== Another typical use case consists of comparing two sets of measurements and the models derived from them. This is very similar to the previous example with minor modifications: .. code-block:: python # import everything we need from kafe import kafe # additionally, import the model function we want to fit: from kafe.function_library import linear_2par # Initialize the Datasets my_datasets = [kafe.Dataset(title="Example Dataset 1"), kafe.Dataset(title="Example Dataset 2")] # Load the Datasets from files my_datasets[0].read_from_file(input_file='dataset1.dat') my_datasets[1].read_from_file(input_file='dataset2.dat') # Create the Fits my_fits = [kafe.Fit(dataset, linear_2par, fit_label="Linear regression " + dataset.data_label[-1]) for dataset in my_datasets] # Do the Fits for fit in my_fits: fit.do_fit() # Create the Plots my_plot = kafe.Plot(my_fits[0], my_fits[1]) my_plot.plot_all() # this time without any arguments, i.e. show everything # Save the plots my_plot.save('kafe_example2.pdf') # Show the plots my_plot.show() This results in the following output: .. figure:: _static/img/kafe_example2.png :height: 300px :width: 600px :scale: 100 % :alt: image not found :align: center `Output of example 2 - comparison of two linear fits.` Although the parameters extracted from the two data sets agree within errors, the uncertainty bands of the two functions do not overlap in the region where the data of Dataset 2 are located, so the data are most probably incompatible with the assumption of an underlying single linear model. Example 3 - non-linear fit with non-parabolic errors ===================================================== Very often, when the fit model is a non-linear function of the parameters, the χ² function is not parabolic around the minimum. A very common example of such a case is an exponential function parametrized as shown in the code fragment below. `Minuit` contains a special algorithm called ``MINOS``, which returns the correct errors in this case as well. Instead of using the curvature at the minimum, ``MINOS`` follows the χ² function from the minimum to the point where it crosses the the value `minimum+up`, where `up=1` corresponds to one standard deviation in χ² fits. During the scan of the χ² function at different values of each parameter, the minimum with respect to all other parameters in the fit is determined, thus making sure that all correlations among the parameters are taken into account. In case of a parabolic χ² function, the ``MINOS`` errors are identical to those obtained by the ``HESSE`` algorithm, but are typically larger or asymmetric in other cases. The method :py:meth:`~kafe.fit.Fit.do_fit` executes the ``MINOS`` algorithm after completion of a fit and prints the ``MINOS`` errors if the deviation from the parabolic result is larger than 5% . A graphical visualization is provided by the method :py:meth:`~kafe.fit.Fit.plot_profile` , which displays the profile χ² curve for the parameter with name or index passed as an argument to the method. The relevant code fragments and the usage of the method :py:meth:`~kafe.fit.Fit.plot_profile` are illustrated here: .. code-block:: python import kafe ################################## # Definition of the fit function # ################################## # Set an ASCII expression for this function @ASCII(x_name="t", expression="A0*exp(-t/tau)") # Set some LaTeX-related parameters for this function @LaTeX(name='A', x_name="t", parameter_names=('A_0', '\\tau{}'), expression="A_0\\,\\exp(\\frac{-t}{\\tau})") @FitFunction def exponential(t, A0=1, tau=1): return A0 * exp(-t/tau) # Initialize the Dataset and load data from a file my_dataset = kafe.Dataset(title="Example dataset", axis_labels=['t', 'A']) my_dataset.read_from_file(input_file='dataset.dat') # Perform a Fit my_fit = Fit(my_dataset, exponential) my_fit.do_fit() # Plot the results my_plot = Plot(my_fit) my_plot.plot_all() # Create plots of the contours and profile chi-squared contour = my_fit.plot_contour(0, 1, dchi2=[1.0, 2.3]) profile1 = my_fit.plot_profile(0) profile2 = my_fit.plot_profile(1) # Show the plots my_plot.show() The data points were generated using a normalization factor of `A`\ :sub:`0` = 1 and a lifetime `τ` = 1. The resulting fit output below demonstrates that this is well reproduced within uncertainties: .. figure:: _static/img/kafe_example3.png :height: 300px :width: 600px :scale: 100% :alt: image not found :align: center `Output of example 3 - Fit of an exponential` The contour `A`\ :sub:`0` vs `τ` is, however, not an ellipse, as shown in the figure below. The profiled χ² curves are also shown; they deviate significantly from parabolas. The proper one-sigma uncertainty in the sense of a 68% confidence interval is read from these curves by determining the parameter values where the χ² curves cross the horizontal lines at a value of Δχ²=1 above the minimum. The two-sigma uncertainties correspond to the intersections with the horizontal line at Δχ²=4. .. figure:: _static/img/kafe_example3_contours.png :height: 300px :width: 900px :scale: 100% :alt: image not found :align: center `Contour and profile χ² curves of example 3` .. note:: A more parabolic behavior can be achieved by using the width parameter λ=1/τ in the parametrization of the exponential function. Example 4 - average of correlated measurements ============================================== The average of a set of measurements can be considered as a fit of a constant to a set of input data. This example illustrates how correlated errors are handled in *kafe*. Measurements can have a common error, which may be absolute or relative, i. e. depend on the input value. In more complicated cases the full covariance matrix must be constructed. *kafe* has a helper function, :py:func:`~kafe.dataset_tools.build_dataset` (in module :py:mod:`~kafe.dataset_tools`), which aids in setting up the covariance matrix and transforming the input to the default format used by the :py:class:`~kafe.dataset.Dataset` and :py:class:`~kafe.fit.Fit` classes. Two further helper functions in module :py:mod:`~kafe.file_tools` aid in reading the appropriate information from data files. 1. The function :py:func:`~kafe.file_tools.parse_column_data` reads the input values and their independent errors from one file, and optionally covariance matrices for the x and y axes from additional files. The field ordering is defined by a control string. 2. Another helper function, :py:func:`~kafe.file_tools.buildDataset_fromFile`, specifies input values or blocks of input data from a single file with keywords. The second version needs only very minimal additional user code, as illustrated here: .. code-block:: python import kafe from kafe.function_library import constant_1par from kafe.file_tools import buildDataset_fromFile # build Dataset from input file fname = 'WData.dat' curDataset = buildDataset_fromFile(fname) # perform the Fit curFit = kafe.Fit(curDataset, constant_1par) curFit.do_fit() # Plot the results myPlot = kafe.Plot(curFit) myPlot.plot_all() myPlot.save("plot.pdf") myPlot.show() The input file is necessarily more complicated, but holds the full information on the data set in one place. Refer to the documentation of the function :py:func:`~kafe.file_tools.parse_general_inputfile` in module :py:mod:`~kafe.file_tools` for a full description of the currently implemented keywords. The input file for the averaging example is here:: # Measurements of W boson mass (combined LEP2, 2013) # ================================================== # example to use parse_general_inputfile from kafe; # covariance matrix build from common errors # == # Metadata for plotting *BASENAME WData *TITLE measurements of the W boson mass *xLabel number of measurement *yLabel $m_\matrhm{W}$ *yUnit GeV/$c^2$ # x data need not be given for averaging # ============================================================ # Measurements of W mass by ALEPH, DELPHI, L3 and OPAL # from from LEP2 Report Feb. 2013 # common errors within channels # 2q2l: 0.021 GeV, # 4q: 0.044 GeV, # and between channels: 0.025 GeV # ============================================================ *yData_SCOV # W_mass err syst sqrt of covariance matrix elements # (as lower triangular matrix) # 2q2l channel 80.429 0.055 0.021 80.339 0.073 0.021 0.021 80.217 0.068 0.021 0.021 0.021 80.449 0.058 0.021 0.021 0.021 0.021 # 4q channel 80.477 0.069 0.044 0.025 0.025 0.025 0.025 0.044 80.310 0.091 0.044 0.025 0.025 0.025 0.025 0.044 0.044 80.324 0.078 0.044 0.025 0.025 0.025 0.025 0.044 0.044 0.044 80.353 0.068 0.044 0.025 0.025 0.025 0.025 0.044 0.044 0.044 0.044 Example 5 - non-linear multi-parameter fit (damped oscillation) =============================================================== This example shows the fitting of a more complicated model function to data collected from a damped harmonic oscillator. In such non-linear fits, stetting the initial values is sometimes crucial to let the fit converge at the global minimum. The :py:class:`~kafe.fit.Fit` object provides the method :py:meth:`~kafe.fit.Fit.set_parameters` for this purpose. As the fit function for this problem is not a standard one, it is defined explicitly making use of the decorator functions available in *kafe* to provide nice type setting of the parameters. This time, the function :py:func:`~kafe.file_tools.parse_column_data` is used to read the input, which is given as separate columns with the fields ``