cern-physics's Issues

Migrate drawing to pandas

Currently I use constructions like this for drawing:

xs = np.linspace(x.min(), x.max(), 1000)
# fitted Breit-Wigner and polynomial background components
bw, = ax.plot(xs, fit.bw(x=xs, M=result.best_values['bw_M'],
                         G=result.best_values['bw_G'],
                         amp=result.best_values['bw_amp']),
              'g--', label=r'Breit-Wigner')
bkg, = ax.plot(xs, fit.polyn(xs, result.best_values['bckg_c0'],
                             result.best_values['bckg_c1'],
                             result.best_values['bckg_c2'],
                             result.best_values['bckg_c3'],
                             result.best_values['bckg_c4'],
                             0, 0, 0, result.best_values['bckg_amp']),
               'b--', label='Bckgr fit')
# plt.plot(bcg_x, bcg_y, 'r+', label='bkg data')
# data points and total fit from lmfit
result.plot_fit(datafmt='ko', fitfmt='r-', numpoints=100, data_kws={'xerr': 0.022})
ax.set_title('')
ax.set_ylabel(r'Counts per 20 MeV/$c^2$')
ax.set_xlabel(r'M$(\pi^+\pi^-\pi^+\pi^-)$ (GeV/$c^2$)')
plt.legend([dat, ft, bw, bkg], ['Data', 'Fit', 'Breit-Wigner', 'Bkg'])

Instead, I can combine the data into DataFrames and use the pandas plotting API, e.g. as in the sketch below.
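
A minimal sketch of what that could look like, assuming the curves are evaluated on a common mass grid; the grid range and the curve expressions here are placeholders, not the real fit:

import numpy as np
import pandas as pd

# Hypothetical sketch: gather the curves into one DataFrame indexed by mass,
# so a single .plot() call replaces the separate ax.plot() constructions.
xs = np.linspace(1.0, 3.0, 1000)                   # assumed mass grid (GeV/c^2)
curves = pd.DataFrame({
    'Breit-Wigner': np.exp(-(xs - 1.7) ** 2),      # placeholder for fit.bw(...)
    'Bckgr fit':    0.1 * xs,                      # placeholder for fit.polyn(...)
}, index=pd.Index(xs, name=r'M$(\pi^+\pi^-\pi^+\pi^-)$ (GeV/$c^2$)'))

ax = curves.plot(style=['g--', 'b--'])
ax.set_ylabel(r'Counts per 20 MeV/$c^2$')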

Cross section

Now I have the number of $\rho'$ events for each run, and I can get the luminosity of those runs.

This means I should be able to get the $\rho'$ cross section as

$$\sigma=\frac{N}{L}$$

In fact this turns out not to be true, but why?

Plotted run by run, this should obviously be flat, but with this definition I can't confirm that:

[figure]

Moreover, if we take the relative error as $\frac{1}{\sqrt{N_{ev}}}$, the calculation looks strange, because the scales don't match.

Perhaps we have to normalize the data; in that case we do see a flat curve (last picture):

[figure]
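
A minimal sketch of the per-run calculation with Poisson errors; the event counts and luminosities here are made-up stand-ins:

import numpy as np

# Hypothetical sketch: per-run cross section sigma = N / L with Poisson errors.
n_ev = np.array([120, 95, 143, 101])      # rho' events per run (stand-in numbers)
lumi = np.array([2.1, 1.7, 2.6, 1.9])     # integrated luminosity per run (assumed units)

sigma     = n_ev / lumi
sigma_err = np.sqrt(n_ev) / lumi          # absolute error, i.e. sigma / sqrt(N)

# If all runs sample the same physics, sigma should be flat within sigma_err.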

Overlapping problem

The same track can be triggered under several different trigger names (CCUP2, CCUP9, and so on).

I have to estimate the influence of this overlap.

Unbinned fitting (Likelihood)

For the fitting in #44 I use the standard $\chi^2$ test to determine the quality of the fit, but in regions of small statistics (the Poisson regime) this is wrong.
E.g. I've calculated the error as $\sqrt{N}$ for each bin, but for bins with a small number of events this produces absurd situations, such as a bin with 2 events and an uncertainty of 1.4.

Here is a fragment from a lecture by the Cambridge professor Mark Thomson on this question:

[figure]
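
A minimal sketch of the unbinned alternative, assuming the raw mass values are available as an array (here generated as a stand-in); the non-relativistic Breit-Wigner is the Cauchy distribution, so scipy's cauchy can stand in for the line shape:

import numpy as np
from scipy.optimize import minimize
from scipy.stats import cauchy

# Hypothetical sketch: `masses` stands in for the measured masses.
masses = cauchy.rvs(loc=1.72, scale=0.125, size=500, random_state=0)

def nll(params):
    """Negative log-likelihood of a Breit-Wigner (Cauchy) peak."""
    m0, gamma = params
    if gamma <= 0:
        return np.inf                      # keep the width physical
    return -np.sum(cauchy.logpdf(masses, loc=m0, scale=gamma))

result = minimize(nll, x0=[1.7, 0.1], method='Nelder-Mead')
m0_fit, gamma_fit = result.x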

Influence of ITS-only tracks on 4-track events

Let's see how the following track parameters are distributed for TPC+ITS versus ITS-only tracks (a sketch follows the list):

  • full momentum
  • pt
  • pseudorapidity

The mass based on ITS-only tracks should be biased toward the soft side.

[figure]
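
A minimal sketch of the comparison; the momentum components and the ITS-only flag are stand-ins, not the analysis data:

import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-ins: track momentum components and an ITS-only flag.
px, py, pz = rng.normal(0, 0.3, (3, 1000))
its_only   = rng.random(1000) < 0.2

p   = np.sqrt(px**2 + py**2 + pz**2)        # full momentum
pt  = np.sqrt(px**2 + py**2)                # transverse momentum
eta = 0.5 * np.log((p + pz) / (p - pz))     # pseudorapidity

for name, val in [('p', p), ('pt', pt), ('eta', eta)]:
    print(f'{name}: TPC+ITS mean {val[~its_only].mean():.3f}, ITS-only mean {val[its_only].mean():.3f}')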

Event losses after splitting by trigger

I've hit a problem: after splitting events into triggered and untriggered, the original total event count differs from the sum of triggered plus untriggered:

ind  nTPC  total_cnt  trig_cnt  untrig_cnt  trig %  untrig %  trig_cnt + untrig_cnt  diff
0    0     19194      10349     8631        53.9    45.0      18980                  214
1    1     17641      10138     7334        57.5    41.6      17472                  169
2    2     16506      9572      6780        58.0    41.1      16352                  154
3    3     12952      7583      5249        58.5    40.5      12832                  120
4    4     5916       3506      2359        59.3    39.9      5865                   51

This is definitely wrong; the split should be a partition.
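
A minimal sketch of the sanity check I'd expect to pass, assuming a DataFrame events with a boolean triggered column (both names are placeholders):

import pandas as pd

# Hypothetical sketch: every event must be exactly one of triggered/untriggered.
events = pd.DataFrame({'triggered': [True, False, True]})   # stand-in data

trig_cnt   = events['triggered'].sum()
untrig_cnt = (~events['triggered']).sum()
assert trig_cnt + untrig_cnt == len(events)

# If the real split uses two independent selections, the deficit suggests some
# events satisfy neither selection; their trigger flags are worth inspecting.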

What is the best way to access UPC events?

In the slides of one presentation I've seen:

We have two types of UPC events: triggered by MUON, or triggered in the Central Barrel. In both cases the events are almost empty. The average size of AOD UPC events in the central barrel for 2015 PbPb is around 0.5 kB. We have some 30 M triggered events, which means some 15 GB of data. The purity of this sample is small, meaning that a simple pre-selection may reduce this number by a large factor (between 2 and 3).

In the past we have used a LEGO train to access the events and produced trees to perform local analyses. The code in the LEGO train is quite stable. Latchezar commented that this approach may not be optimal for such a small data set distributed over the whole PbPb data set, and recommended that we have a look at nanoAODs.

The code to select only the branches we use and create 'a nanoAOD' with them is ready. Since it is so general, we do not expect the code to change frequently. Latchezar mentioned that there exist AOD trains in the central framework.

Could it be a good option to use them to produce UPC AODs?

In summary, what is the best way to run over the UPC triggers in the PbPb sample, so that we do not have to run over all the PbPb events to find our few GB of data?

Comparison with Valeri's data

Valeri and I currently have discrepancies in our data.

He has processed fewer runs, but has more statistics than I do.

What is the reason for these discrepancies?

He'll provide me a file with event info and we can find out the difference; a sketch of that comparison is below.
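
A minimal sketch of lining up the two event lists, assuming both files carry (run, event) keys; the file and column names are placeholders:

import pandas as pd

# Hypothetical sketch: outer-join the two event lists and inspect the leftovers.
mine   = pd.read_csv('my_events.csv')          # assumed columns: run, event, ...
valeri = pd.read_csv('valeri_events.csv')

merged = mine.merge(valeri, on=['run', 'event'], how='outer', indicator=True)
print(merged['_merge'].value_counts())         # both / left_only / right_only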

Mass from pairs

I've got the mass from 4 tracks, but what if I calculate it from the pion pair subsystems? A pair-mass sketch is below.
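
A minimal sketch of the pair invariant mass, assuming length-3 numpy arrays of pion momenta:

import numpy as np

M_PI = 0.13957  # charged pion mass, GeV/c^2

def pair_mass(p1, p2):
    """Hypothetical sketch: invariant mass of a pi+pi- pair from 3-momenta (GeV/c)."""
    e1 = np.sqrt(M_PI**2 + p1 @ p1)
    e2 = np.sqrt(M_PI**2 + p2 @ p2)
    p = p1 + p2
    return np.sqrt((e1 + e2)**2 - p @ p)

# Example: two back-to-back 0.3 GeV/c pions
print(pair_mass(np.array([0.3, 0, 0]), np.array([-0.3, 0, 0])))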

4-track mass distribution

Plot the mass for 4-prong events with zero and non-zero total charge, for different $p_T$ intervals:

  • (0,150]
  • (150,800]
  • (800,2000)

ZDC energy distribution

Plot the histogram of the ZDC energy distribution.

I expect to see something like $|\mathrm{sinc}|$...

Also demonstrate that the basic part (say, the first half of the plot) contains the events that form the mass spectrum, while the second half contains background events.

Run quality comparison

I've processed 125 runs, but only ~80 were recommended by DPG.

I can compare the quality of the recommended runs against the rest via the visible cross section.

Histogramming problems

ROOT has excellent histogram tools: different visualization modes (error bars, markers) and histogram operations such as ratio or difference.

What I need is:

  • A stats table: entries, mean, std, overflow, integral
  • Operations on histograms, such as division and difference
  • Histograms with error bars
  • Visualization for comparisons: n histograms on one plot, subplots, and so on

I've looked at many histogramming tools, e.g.:

  • scikit-hep/hist
  • scikit-hep/aghast
  • scikit-hep/boost-histogram
  • numpy.histogram

Perhaps many of these points could easily be solved with standard packages; e.g. the stats could be provided by pandas' describe (except overflow and integral). A sketch follows.
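
A minimal sketch of a ROOT-like stats table from standard packages; the data and the histogram range are stand-ins:

import numpy as np
import pandas as pd

# Hypothetical sketch: entries/mean/std from describe(), the rest by hand.
values = pd.Series(np.random.default_rng(0).normal(1.0, 0.3, 10_000))
lo, hi = 0.0, 2.0                               # assumed histogram range

print(values.describe())                        # count (entries), mean, std, ...
underflow = (values < lo).sum()
overflow  = (values > hi).sum()
integral  = len(values) - underflow - overflow  # entries inside the range
print(f'underflow={underflow}, overflow={overflow}, integral={integral}')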

Trigger influence investigation

Estimate the triggers' influence. The selected triggers are CCUP9, CCUP2, CCUP4, C1ZED, and CCUP11. What is the difference between them for my events?

Move data preparation from pandas to numpy

Until now I've used uproot4 for reading ROOT files, with a pandas DataFrame as the output.
Now my file has grown from 500 MB to 1.5 GB, and the current version of the script no longer works: it takes forever and eats all the memory.

I've tested parsing to numpy arrays and it works fine.

Now I have to move the main script from pandas to numpy; a sketch of the uproot-to-numpy read is below.
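
A minimal sketch of reading branches straight into numpy arrays with uproot4; the file, tree, and branch names are placeholders:

import uproot

# Hypothetical sketch: skip the DataFrame and ask uproot for numpy arrays.
with uproot.open('4pi_data.root') as f:
    tree = f['events']                                   # assumed tree name
    arrays = tree.arrays(['T_Px', 'T_Py', 'T_Q'], library='np')

px, py, q = arrays['T_Px'], arrays['T_Py'], arrays['T_Q']

# For files that don't fit in memory, uproot can also read in chunks:
# for chunk in tree.iterate(['T_Px'], library='np', step_size='100 MB'): ...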

Reorganizing project dir

Right now everything in my project directory sits together.

I'd prefer to split the notebooks and the C++ project, and the README should be linked to the main notebook.

What is the $\eta$ distribution for pion subsystems?

The idea is to check the $\eta$ distribution for the pairs: perhaps the lightest-mass condition for selecting a pair is too hard, and we could increase the accuracy with a condition on $\eta$ instead. A pair-$\eta$ sketch is below.
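
A minimal sketch; the inputs are length-3 momentum arrays and the |η| window is an arbitrary placeholder:

import numpy as np

def pair_eta(p1, p2):
    """Hypothetical sketch: pseudorapidity of a pion pair from its summed 3-momentum."""
    px, py, pz = p1 + p2
    p = np.sqrt(px**2 + py**2 + pz**2)
    return 0.5 * np.log((p + pz) / (p - pz))

# e.g. keep pairs inside an assumed central window instead of the lightest-mass rule
accept = abs(pair_eta(np.array([0.1, 0.2, 0.3]), np.array([0.0, -0.1, 0.2]))) < 0.9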

Track duplicates with ITS-only criteria

I see plenty of duplicated tracks in my events.

They appear only in the case of the ITS selection:

# ITS-only selection: a hit on ITS layer 0 or 1 ("+" acts as logical OR on the
# boolean arrays), ITS refit, and ITS pion PID within 3 sigma ("*" acts as AND)
(data['T_HasPointOnITSLayer0'] + data['T_HasPointOnITSLayer1']) * \
    data['T_ITSRefit'] * (np.abs(data['T_NumberOfSigmaITSPion']) < 3)

[figure]

After adding the TPC criteria it looks like such tracks are thrown away (I can no longer find them)...
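
A minimal sketch of how the duplicates could be flagged, assuming per-event numpy arrays of track momenta; the tolerance is arbitrary:

import numpy as np

def duplicate_pairs(px, py, pz, tol=1e-6):
    """Hypothetical sketch: indices of track pairs with (almost) identical momenta."""
    pts = np.stack([px, py, pz], axis=1)                       # (n_tracks, 3)
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    i, j = np.where(np.triu(d < tol, k=1))                     # upper triangle only
    return list(zip(i, j))

print(duplicate_pairs(np.array([0.3, 0.3, -0.1]),
                      np.array([0.1, 0.1, 0.4]),
                      np.array([0.0, 0.0, 0.2])))              # -> [(0, 1)]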

Influence of ITSNCls >=3 or >3

I've seen that relaxing the ITSNCls condition from >3 to >=3 adds about 600 events to the statistics.

What are these events, background or signal?

Produce Monte Carlo data

Ask Michal Broz to run a simulation with the following settings:

  • Run anchored: the run number with the maximum of 'good' events
  • Period: LHC15o
  • Process: rho(1720) → rho pi+ pi- → 4pi
  • Number of generated events: 100 000

Generator code:

AliGenerator* GeneratorCustom() 
{
  AliGenCocktail* genCocktail = GeneratorCocktail("StarlightEvtGen");
  
  AliGenStarLight* genStarLight = (AliGenStarLight*)GeneratorStarlight();
  genStarLight->SetParameter("PROD_PID     =   999    #Channel of interest (not relevant for photonuclear processes)");
  genStarLight->SetParameter("rho0PrimeMass     =   1.720");
  genStarLight->SetParameter("rho0PrimeWidth     =   0.249");
  genStarLight->SetParameter("W_MAX     =   5.0");
  genCocktail->AddGenerator(genStarLight, "Starlight", 1.);

  AliGenSLEvtGen *genEvtGen = new AliGenSLEvtGen();
  TString decayTable="RHOPRIME.RHOPIPI.DEC"; // default
  genEvtGen->SetUserDecayTable(gSystem->ExpandPathName(Form("$ALICE_ROOT/STARLIGHT/AliStarLight/DecayTables/%s",decayTable.Data())));
  genEvtGen->SetEvtGenPartNumber(30113);

  genEvtGen->SetPolarization(1);           // Set polarization: transversal(1),longitudinal(-1),unpolarized(0)

  genCocktail->AddGenerator(genEvtGen,"EvtGen",1.);
  return genCocktail;
}

There is a question about the decay channel. It is now specified as 999, which corresponds to the 4-pion final state, but in STARlight one more channel exists, 913, which produces a final state consisting of two pions; it also adds the amplitudes (with interference) for photoproduction of the rho and for photoproduction of direct $\pi^+\pi^-$, but again the final state is only two pions.

Investigate other decay modes for $\rho'$

Until now I've studied only one decay mode:

$\rho' \rightarrow 4 \pi$

In the PDG I've also seen other interesting modes for the $\rho'$:

  1. $\rho' \rightarrow \eta_0 \rho_0$ | ?
     • 1.1. $\rho_0 \rightarrow 4 \pi$ | $2\cdot10^{-5}\%$
     • 1.2. $\rho_0 \rightarrow \pi^+ \pi^-$ | $10^{-2}\%$
     • 1.3. $\eta_0' \rightarrow \pi^+ \pi^- \gamma$ | $4\%$
     • 1.4. $\eta_0' \rightarrow \pi^+ \pi^- \pi^0$ | $23\%$
  2. $\rho' \rightarrow 4 \pi$ | ?

And what about $\rho' \rightarrow \rho_0 \rho_0$: is it possible?

Fit the resonance mass

Here is the ρ' resonance:

[figure]

According to the PDG description: "The best fit could be obtained with the hypothesis of ρ-like resonances at 1420 and 1770 MeV, with widths of about 250 MeV."

I can build a model from two Breit-Wigner functions and fit the resonance with it; a sketch follows.
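
A minimal sketch with lmfit, where x (mass) and y (bin counts) stand in for the histogram data; the line shape and start values just follow the PDG hint:

import numpy as np
from lmfit import Model

def bw(x, M, G, amp):
    """Non-relativistic Breit-Wigner line shape (hypothetical sketch)."""
    return amp * G**2 / ((x - M)**2 + G**2 / 4)

# Stand-in data: two overlapping peaks per the PDG hint (1.42 and 1.77 GeV)
x = np.linspace(1.0, 2.5, 300)
y = bw(x, 1.42, 0.25, 80) + bw(x, 1.77, 0.25, 60) \
    + np.random.default_rng(0).normal(0, 2, x.size)

model = Model(bw, prefix='bw1_') + Model(bw, prefix='bw2_')
params = model.make_params(bw1_M=1.42, bw1_G=0.25, bw1_amp=100,
                           bw2_M=1.77, bw2_G=0.25, bw2_amp=100)
result = model.fit(y, params, x=x)
print(result.fit_report())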

$\rho_0$ decay angle in the $\rho_0$ rest frame

4tracks_analysis

In 4$\pi$ photoproduction via the $\rho'$ decay

$$\rho'\rightarrow \pi^+\pi^-\pi^+\pi^-$$

there is an intermediate two-body decay, $\rho_0\rightarrow \pi^+\pi^-$.

This lets us check the $\rho_0$ by determining the decay angle in the $\rho_0$ rest frame; a sketch follows.
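
A minimal sketch, assuming 4-vectors (E, px, py, pz) as numpy arrays and the helicity-frame convention (angle measured against the ρ0 flight direction in the lab); the example vectors are made up:

import numpy as np

def boost_to_rest(p4, parent):
    """Boost 4-vector p4 = (E, px, py, pz) into the rest frame of `parent`."""
    beta = parent[1:] / parent[0]
    gamma = 1.0 / np.sqrt(1.0 - beta @ beta)
    bp = beta @ p4[1:]
    e = gamma * (p4[0] - bp)
    p = p4[1:] + beta * (gamma**2 / (gamma + 1.0) * bp - gamma * p4[0])
    return np.concatenate([[e], p])

def cos_decay_angle(pion, rho):
    """cos(theta*) of the pion in the rho0 rest frame, vs. the rho0 lab direction."""
    pion_rest = boost_to_rest(pion, rho)
    n = rho[1:] / np.linalg.norm(rho[1:])
    return (pion_rest[1:] @ n) / np.linalg.norm(pion_rest[1:])

# Example with made-up 4-vectors (GeV)
rho  = np.array([1.3, 0.0, 0.0, 1.0])
pion = np.array([0.7, 0.2, 0.0, 0.6])
print(cos_decay_angle(pion, rho))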

Aliases for the most common visualization scenarios

I have many common data-visualization scenarios.

E.g.

  • vertical or horizontal comparison on separate plots
  • adding data to the same plot with labels
  • histogramming with error bars
  • custom styles for histograms

How can I simplify all of this?

First of all, I know that in matplotlib I can specify default parameters for some functions; I can use this, e.g., to change the default histogram type to 'step', change the default colors, or apply the ROOT style by default.

Then I can write wrapper functions for the common scenarios described above, with additional parameters like arrays of colors and labels, and pass the standard matplotlib arguments through via *args and **kwargs. A sketch of one such wrapper follows.
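
A minimal sketch of one wrapper, for the "n histograms on one plot" scenario; the function name and defaults are illustrative, and extra matplotlib options (range, linewidth, ...) pass through **kwargs:

import numpy as np
import matplotlib.pyplot as plt

def hist_compare(datasets, labels, colors=None, bins=100, ax=None, **kwargs):
    """Hypothetical sketch: n 'step' histograms on one plot with a legend."""
    ax = ax or plt.gca()
    colors = colors or [None] * len(datasets)
    for d, lab, c in zip(datasets, labels, colors):
        ax.hist(d, bins=bins, histtype='step', label=lab, color=c, **kwargs)
    ax.legend()
    return ax

# Usage with stand-in data:
rng = np.random.default_rng(0)
hist_compare([rng.normal(0.5, 0.2, 1000), rng.normal(0.6, 0.25, 800)],
             ['standard', '4 TPC tracks'], colors=['black', 'red'], range=(0, 2))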

Discrepancies in the data

In my analysis, the same criteria expressed in two different ways give different results:

selectOnlyStandard = data['T_TPCRefit'] * (data['T_TPCNCls'] > 50) * \
                     (data['T_ITSNCls'] > 3) * \
                     (~newT_ITSsa) * (np.abs(data['T_NumberOfSigmaTPCPion']) < 3)
# events with exactly 4 good tracks
GoodEventsStandard = np.argwhere(selectOnlyStandard.sum() == 4).flatten()
# of those, keep only events with zero total charge
GoodEventsStandard = GoodEventsStandard[np.argwhere(
    data['T_Q'][selectOnlyStandard][GoodEventsStandard].sum() == 0).flatten()].flatten()

pxstd = data['T_Px'][selectOnlyStandard][GoodEventsStandard]
pystd = data['T_Py'][selectOnlyStandard][GoodEventsStandard]

ptstd = np.sqrt(pxstd.sum()**2 + pystd.sum()**2)

fig = plt.figure(figsize=(15, 7))
ax = fig.add_axes([0, 0, 1, 1])
fig.suptitle('4pr pt standard criteria', fontsize=32)
plt.style.use(hep.style.ROOT)
counts, bins = np.histogram(ptstd, bins=100, range=(0, 2))
_ = ax.hist(ptstd, bins=bins, color='black', histtype='step',
            label=f'Q = 0; Entries {np.sum(counts)}')
plt.xlabel('Pt, GeV')
plt.ylabel('# events')
ax.legend()

GoodEvents = GetGoodEvents(WithGoodNTpcTracks=4)
counts, bins = np.histogram(GetPt(Draw=False), bins=100, range=(0, 2))
_ = ax.hist(GetPt(Draw=False), bins=bins, color='red', histtype='step',
            label=f'4 TPC tracks; Entries {np.sum(counts)}', linewidth=2)
plt.xlabel('Pt, GeV')
plt.ylabel('# events')
ax.legend()

gives:

The first set of criteria is a direct recombination of the second, so the results should match.

Apparently the total-charge condition does not work for the directly written criteria: we see the sum of zero-charge and non-zero-charge events.

(The picture shows the second case, but the meaning is clear.)

Generalize computations

Right now, when I want to plot the mass for total charge zero and for non-zero charge, I can't do it without repeating code, because I have no functions or objects that parametrize that state. Repeating the code is a bad solution.

I have to parametrize the calculations and their state; a sketch follows.
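
A minimal sketch that parametrizes the charge requirement, mirroring the selection pattern from the issue above (the names are placeholders, and the non-zero case is handled with a simple flag):

import numpy as np

def good_events(data, good_tracks, n_tracks=4, zero_charge=True):
    """Hypothetical sketch: events with n_tracks good tracks and the requested total charge."""
    ev = np.argwhere(good_tracks.sum() == n_tracks).flatten()
    q = data['T_Q'][good_tracks][ev].sum()
    keep = (q == 0) if zero_charge else (q != 0)
    return ev[np.argwhere(keep).flatten()].flatten()

# The two plots then differ by a single argument:
# mass_q0  = mass(data, good_events(data, mask, zero_charge=True))
# mass_qn0 = mass(data, good_events(data, mask, zero_charge=False))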

Move code from cell to scripts

Keeping code in Jupyter notebook cells makes the notebook oversized.
I think a good solution is to keep scripts in a 'scripts' folder and use them as modules.

Influence of the limit on the number of tracks

In the selection script, PVN limits the number of good tracks (those passing certain criteria) and fills predefined fixed-size arrays.

I have chosen another strategy: filling dynamically sized vectors.

It would be interesting to compare the data losses in the PVN case.

New structure for analysis scripts

While solving #22 I realized that the structural problem is not only the oversized notebook. Re-initializing variables after a kernel restart can also be a headache, and jumping from cell to cell without tags is really slow.

I plan to create a package for my analysis based on Python modules, so that the notebooks just import the modules and call functions; a layout sketch is below.
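
A minimal sketch of the intended layout; the package and module names are placeholders:

# analysis/               hypothetical package layout
#   __init__.py
#   selection.py          # track/event selection masks
#   plotting.py           # histogram wrappers

# A notebook then stays thin:
from analysis.selection import good_events
from analysis.plotting import hist_compare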
