hep's Issues

fwk: detect double-put in dflowsvc

From @sbinet on September 13, 2014 8:49

make sure consistency is upheld in the face of a double-put:

store := ctx.Store()
data := 42
// store data
err := store.Put("my-data", &data)

// oops, another time, under the same key
err = store.Put("my-data", &data)

otherwise, this could wreak havoc in the sync'ing of achan-s.
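
As an illustration only, here is a minimal sketch of a store that rejects a second Put under the same key. The `store` type, `newStore` and `errDuplicateKey` are hypothetical names for this example and are not the fwk implementation; how dflowsvc should actually report the condition is left open by the issue.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errDuplicateKey is a hypothetical sentinel error for this sketch.
var errDuplicateKey = errors.New("store: key already present")

// store is a toy stand-in for an event store, not the fwk one.
type store struct {
	mu   sync.Mutex
	data map[string]interface{}
}

func newStore() *store {
	return &store{data: make(map[string]interface{})}
}

// Put stores v under key and fails if key has already been put.
func (s *store) Put(key string, v interface{}) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, dup := s.data[key]; dup {
		return fmt.Errorf("%w: %q", errDuplicateKey, key)
	}
	s.data[key] = v
	return nil
}

func main() {
	s := newStore()
	data := 42
	fmt.Println(s.Put("my-data", &data)) // first put is accepted
	fmt.Println(s.Put("my-data", &data)) // second put is rejected
}
```

Returning an error (rather than silently overwriting) lets the caller decide whether a double-put aborts the event or the whole job.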

Copied from original issue: go-hep/fwk#39

hbook/yoda

From @sbinet on December 12, 2016 16:23

add a helper package to read/write "archives" of YODA files, so one can read/write a complete collection of hbook values (H1D, H2D, Scatter2D, ...) from/to a single YODA file.

Copied from original issue: go-hep/hbook#26

fwk: implement a job-server REST API

From @sbinet on August 21, 2014 11:56

using job configurations from issue go-hep/fwk#25, it should be possible to submit jobs to a web server with simple GET/POST queries.

Copied from original issue: go-hep/fwk#26

fwk: investigate HCL as a high-level configuration language

From @sbinet on August 20, 2014 14:55

people might cringe a bit at configuring a fwk job in Go, even if the github.com/go-hep/fwk/job.Job interface can be considered quite nice.

also, providing a "Turing-complete" language as a configuration language introduces problems of its own (one has to debug the program which creates the program one wants to run.)
cf. all the issues with e.g. the Athena/Gaudi python joboptions.

Using a declarative language like HCL might fit the bill.

Copied from original issue: go-hep/fwk#24

fwk: investigate concurrent start-up of components

From @sbinet on August 22, 2014 16:19

Gaudi had issues with the initialization of components (especially cycles in services/tools: ToolA -> ToolB -> ToolC -> ... -> ToolA)

the lack of resource dependency declarations (component A needs component B) prevented such issues from being discovered efficiently.
it also prevented components from being initialized in parallel.

investigate whether fwk couldn't do better here.
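
One possible direction, sketched under the assumption that each component declares the names of the components it needs: group components into "levels" with a topological sort (Kahn's algorithm). Components in the same level have no dependencies on each other and could be initialized in parallel, and any component left over indicates a cycle. The `initLevels` function and the `deps` map are hypothetical and not part of fwk.

```go
package main

import (
	"fmt"
	"sort"
)

// initLevels groups components into levels: every component only depends
// on components from earlier levels. deps maps a component name to the
// names it needs. It returns an error if the dependencies contain a cycle.
func initLevels(deps map[string][]string) ([][]string, error) {
	pending := make(map[string]int)    // number of unresolved dependencies
	users := make(map[string][]string) // dependency -> components needing it
	for name, needs := range deps {
		if _, ok := pending[name]; !ok {
			pending[name] = 0
		}
		for _, need := range needs {
			if _, ok := pending[need]; !ok {
				pending[need] = 0
			}
			pending[name]++
			users[need] = append(users[need], name)
		}
	}

	var ready []string
	for name, n := range pending {
		if n == 0 {
			ready = append(ready, name)
		}
	}

	var levels [][]string
	done := 0
	for len(ready) > 0 {
		sort.Strings(ready) // deterministic output
		levels = append(levels, ready)
		done += len(ready)
		var next []string
		for _, name := range ready {
			for _, user := range users[name] {
				pending[user]--
				if pending[user] == 0 {
					next = append(next, user)
				}
			}
		}
		ready = next
	}
	if done != len(pending) {
		return nil, fmt.Errorf("dependency cycle among %d component(s)", len(pending)-done)
	}
	return levels, nil
}

func main() {
	levels, err := initLevels(map[string][]string{
		"ToolA": {"SvcX"},
		"ToolB": {"SvcX", "ToolA"},
		"SvcX":  nil,
	})
	fmt.Println(levels, err)

	// The cycle from the description above is reported as an error.
	_, err = initLevels(map[string][]string{
		"ToolA": {"ToolB"},
		"ToolB": {"ToolC"},
		"ToolC": {"ToolA"},
	})
	fmt.Println(err)
}
```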

Copied from original issue: go-hep/fwk#28

rio: define a better on-disk format

From @sbinet on September 16, 2014 9:44

  • define a format friendlier to concurrent I/O
  • inspiration from Linux/BSD VFS ?
    • ZFS
    • Ext4
    • BTRFS
  • inspiration from HDF5 ?

Requirements:

  • allow streaming from http ?
  • optimize for main usage: write once, read many
  • optimize for read-speed
  • use cases: user analysis, bulk reconstruction
  • MPI I/O compliant ?

BTRFS

http://git.kernel.org/cgit/linux/kernel/git/mason/btrfs-progs.git/tree/
https://btrfs.wiki.kernel.org/index.php/Btrfs_design
http://www.cs.ucdavis.edu/~pandey/Teaching/ECS150/Lects/07fs.pdf

Ext4

https://ext4.wiki.kernel.org/index.php/Main_Page
http://foss.in/2006/cfp/slides/ext4-foss.pdf
https://ols.fedoraproject.org/OLS/Reprints-2008/kumar-reprint.pdf
https://github.com/torvalds/linux/tree/master/fs/ext4
http://atrey.karlin.mff.cuni.cz/~jack/papers/lk2009-ext4-btrfs.pdf

ZFS

http://maczfs.googlecode.com/files/ZFSOnDiskFormat.pdf
https://blogs.oracle.com/bonwick/en_US/entry/zfs_block_allocation
https://blogs.oracle.com/bonwick/entry/space_maps
https://github.com/zfsonlinux/zfs/

XFS

http://oss.sgi.com/projects/xfs/papers/xfs_filesystem_structure.pdf

HDF5

http://www.hdfgroup.org/HDF5/doc/H5.format.html

ROOT

https://github.com/root-mirror/root/tree/master/io/doc/TFile
http://root.cern.ch/root/htmldoc/TFile.html
ftp://root.cern.ch/root/doc/11InputOutput.pdf

A4

https://github.com/a4/a4
http://arxiv.org/abs/1208.1600

blosc

http://www.blosc.org/
http://www.blosc.org/blosc-in-depth.html

FUSE

http://bazil.org/fuse/

LTFS

http://en.m.wikipedia.org/wiki/Linear_Tape_File_System

XZ

http://tukaani.org/xz/xz-file-format.txt

ACQt

https://indico.in2p3.fr/event/10875/session/7/material/0/3.pdf

Compression

Brotli

http://www.ietf.org/id/draft-alakuijala-brotli-07.txt
https://github.com/google/brotli/

Copied from original issue: go-hep/rio#1

rio: data type codecs

From @sbinet on September 16, 2014 13:12

  • investigate whether encoding/decoding of types could be stored into rio files
  • assuming the proposal for Go plugins API goes in, it would be possible to dynamically load code. => just write the package name ?
  • fwd/bwd compat of types. write the AST or some serialized version of the data type for versioning (instead of just the package name + data type name) (AKA, schema evolution)

Copied from original issue: go-hep/rio#2

csvutil/csvdriver: add ability to name columns

From @sbinet on November 16, 2016 9:15

right now, opening a CSV connection is done like so:

func foo() {
	// Open the CSV file as a database table.
	db, err := csvdriver.Conn{
		File:    "../data/iris.csv",
		Comment: '#',
		Comma:   ',',
	}.Open()
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Start a database transaction.
	tx, err := db.Begin()
	if err != nil {
		log.Fatal(err)
	}
	defer tx.Commit()

	// Define a SQL query that we will execute against the CSV file.
	query := "SELECT var3, var4, var5 FROM csv WHERE var5 = \"Iris-versicolor\""

	// etc...
}

this makes a query rather cumbersome to read.

it should be possible to open a CSV connection with an optional Columns []string field:

// Open the CSV file as a database table.
db, err := csvdriver.Conn{
	File:    "../data/iris.csv",
	Comment: '#',
	Comma:   ',',
	Columns: []string{"sepal_length", "sepal_width", "petal_length", "petal_width", "species"},
}.Open()

such that the query reads:

// Define a SQL query that we will execute against the CSV file.
query := `SELECT petal_length, petal_width, species FROM csv WHERE species = "Iris-versicolor"`

What do you think @dwhitena ?

Copied from original issue: go-hep/csvutil#2

fwk: evt-level timeout

From @sbinet on August 13, 2014 6:45

fwk should support the ability to abort the processing of an event when it reaches a given timeout

Copied from original issue: go-hep/fwk#5

rootio/cmd/root-srv: do not keep files in memory

right now, when a user uploads a file, root-srv gobbles it up all in memory and keeps it there (until the user session expires).

  • introduce a local mode where the file is opened directly from disk
  • when remote or on appengine, store the file on some temporary storage (a tmp directory keyed on the user session name)

rootio: add support for TChain

it should be possible to combine multiple rootio.Trees (with the same layout) from multiple ROOT files into one rootio.Tree.

right now, to get at a rootio.Tree, one does:

func main() {
	f, err := rootio.Open("testdata/chain.1.root")
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()

	obj, err := f.Get("tree")
	if err != nil {
		log.Fatal(err)
	}
	tree := obj.(rootio.Tree)
	fmt.Printf("entries= %v\n", tree.Entries())
}

the rootio.Tree interface is:

// Tree is a collection of branches of data.
type Tree interface {
	Named
	Entries() int64
	TotBytes() int64
	ZipBytes() int64
	Branch(name string) Branch
	Branches() []Branch
	Leaves() []Leaf

	getFile() *File
	loadEntry(i int64) error
}

it should be possible to have a type that implements the rootio.Tree interface but logically concatenates multiple rootio.Trees together.
Something like the io.MultiReader function: https://godoc.org/io#MultiReader

Initially, one could just require the user to pass already created/retrieved rootio.Tree to a, e.g., Chain function:

// Chain returns a new Tree that is the logical concatenation of all the input Trees.
func Chain(trees ...Tree) Tree { ... }

one should think about resource ownership:

  • who should close the ROOT file containing a Tree when that Tree has been completely read ?
  • how to integrate the chain type with the Scanners ?
  • is it better to have a Chain function that takes a tree name and a list of file names ? (so resource ownership is clearer). it's less composable...
  • perhaps introduce a ChainScanner that opens and closes files as needed when each Tree has been consumed ?
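
To make the "logical concatenation" concrete, here is a sketch of the bookkeeping only: mapping a global entry index onto a (tree, local entry) pair. It deliberately uses a hypothetical `entryCounter` interface with a single method instead of rootio.Tree, and it does not address file ownership or Scanner integration, which are the open questions listed above.

```go
package main

import "fmt"

// entryCounter is the small subset of a tree that this sketch needs.
// It is hypothetical; rootio.Tree has more methods (see above).
type entryCounter interface {
	Entries() int64
}

// chain logically concatenates several trees.
type chain struct {
	trees   []entryCounter
	offsets []int64 // offsets[i]: global index of the first entry of trees[i]
	total   int64
}

func newChain(trees ...entryCounter) *chain {
	c := &chain{trees: trees, offsets: make([]int64, len(trees))}
	for i, t := range trees {
		c.offsets[i] = c.total
		c.total += t.Entries()
	}
	return c
}

// Entries returns the total number of entries across all trees.
func (c *chain) Entries() int64 { return c.total }

// locate maps a global entry index to (tree index, local entry index).
func (c *chain) locate(i int64) (int, int64, error) {
	if i < 0 || i >= c.total {
		return 0, 0, fmt.Errorf("chain: entry %d out of range [0, %d)", i, c.total)
	}
	// Linear scan from the end; a binary search would also work.
	for k := len(c.trees) - 1; k >= 0; k-- {
		if i >= c.offsets[k] {
			return k, i - c.offsets[k], nil
		}
	}
	return 0, 0, fmt.Errorf("chain: entry %d not found", i)
}

// fakeTree is a stand-in tree that only knows its number of entries.
type fakeTree int64

func (t fakeTree) Entries() int64 { return int64(t) }

func main() {
	c := newChain(fakeTree(3), fakeTree(2))
	k, j, err := c.locate(4)
	fmt.Println(c.Entries(), k, j, err)
}
```

A ChainScanner along the lines of the last bullet could use the same mapping to decide when to close the current file and open the next one.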

hbook: retrieve x position of maximum within given range for H1D

From @ebusato on October 12, 2016 15:10

Hi,

I'd like to retrieve the position on the x axis of the y maximum of the histogram within a certain range provided by the user. Right now you have the Max() method that returns the y maximum. What do you think would be the best option to implement what I need:

  1. Change the Max() method by adding two parameters to specify the range and one return value which is the maximum position on the x axis
  2. Create a new method and leave Max() untouched in order to not impact clients.

?

Let me know and I'll send a PR.

Emmanuel
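
For illustration, option 2 could look roughly like the sketch below. The `bin` type and the `maxInRange` function are hypothetical simplifications and do not reflect hbook's actual types or method names; the choice of the bin center as the reported x position is also an assumption.

```go
package main

import "fmt"

// bin is a hypothetical, simplified 1D histogram bin.
type bin struct {
	xmin, xmax float64
	y          float64
}

// maxInRange returns the center x and the value y of the highest bin whose
// center lies in [lo, hi]. ok is false when no bin center is in the range.
func maxInRange(bins []bin, lo, hi float64) (x, y float64, ok bool) {
	for _, b := range bins {
		c := 0.5 * (b.xmin + b.xmax)
		if c < lo || c > hi {
			continue
		}
		if !ok || b.y > y {
			x, y, ok = c, b.y, true
		}
	}
	return x, y, ok
}

func main() {
	bins := []bin{{0, 1, 5}, {1, 2, 9}, {2, 3, 7}}
	fmt.Println(maxInRange(bins, 0, 3))
	fmt.Println(maxInRange(bins, 2, 3))
}
```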

Copied from original issue: go-hep/hbook#8

fwk: task-level timeout

From @sbinet on August 13, 2014 6:46

fwk should (perhaps) support task-level timeout (and then abort event) when the task takes too much time to process.

Copied from original issue: go-hep/fwk#6

fwk: implement a property documentation system

From @sbinet on September 11, 2014 13:27

concrete components may have properties.
should we provide a way to expose/enforce/attach documentation for them ?
godoc is useless for them.

Copied from original issue: go-hep/fwk#37

fwk: implement a job-configuration dumper+loader for user-defined types

From @sbinet on September 10, 2014 15:32

it should be possible to dump a fwk.App configuration into:

  • a human-readable format
  • a machine-readable format (serialization/gob/json/...)

for later re-use, comparison or regression-tests purposes.

it should also work for user-defined types. (currently, the JSON based format only works for builtins)
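
One way to handle user-defined types, sketched here with encoding/json: store the type name next to the value and let the loader look up a factory for that name. The `property` record, the `registry`, and the `register`, `dump` and `load` helpers are all hypothetical and do not describe fwk's current format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// property is a hypothetical on-disk record: the type name travels with
// the value so that the loader knows what to decode into.
type property struct {
	Type  string          `json:"type"`
	Value json.RawMessage `json:"value"`
}

// registry maps type names to factories returning a pointer to a new value.
var registry = map[string]func() interface{}{}

func register(name string, fct func() interface{}) { registry[name] = fct }

// dump serializes v together with its registered type name.
func dump(name string, v interface{}) ([]byte, error) {
	raw, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}
	return json.MarshalIndent(property{Type: name, Value: raw}, "", "  ")
}

// load decodes data into a fresh value of the recorded type.
func load(data []byte) (interface{}, error) {
	var p property
	if err := json.Unmarshal(data, &p); err != nil {
		return nil, err
	}
	fct, ok := registry[p.Type]
	if !ok {
		return nil, fmt.Errorf("unknown type %q", p.Type)
	}
	v := fct()
	if err := json.Unmarshal(p.Value, v); err != nil {
		return nil, err
	}
	return v, nil
}

// cut is an example user-defined property type.
type cut struct {
	Name     string
	Min, Max float64
}

func main() {
	register("cut", func() interface{} { return new(cut) })

	data, err := dump("cut", cut{Name: "pt", Min: 10, Max: 50})
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(string(data)) // human-readable and machine-readable

	v, err := load(data)
	fmt.Println(v, err)
}
```

The dumped text can be compared between runs for regression tests, and the registry keeps the loader independent of any particular user package.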

Copied from original issue: go-hep/fwk#34
