ltsm's People

Contributors

inkdot7, ironmann, joergbehrendt, munken, tstibor

ltsm's Issues

Store files in TSM based on a UUID

Modify this tool so that files can be stored in TSM under a UUID that is attached to the file via an extended attribute. There are issues with using the file path, as well as the FID, as the ID in TSM, and the recommended approach seems to be to use an xattr UUID to track files between HSM levels.

Robinhood already has support for this method.

[tsmapi] dsmSendData: tx and data_sent checking

I think we're not handling two aspects of dsmSendData() API#141:

  1. stopping the tx if DSM_RC_WILL_ABORT is returned
  2. checking whether the whole buffer was transmitted by looking at data_blk.numBytes after the call. I think it's just a socket call underneath, so it might not take all the data at once.

What do you think?

I've started to convert tsm_archive_generic() into an FSM, as described in API#68. Hopefully it'll make it easier to support batched transactions.
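
The two checks above could be handled along the following lines. This is only a sketch: the send_fn callback, the RC_* codes, and mock_send are hypothetical stand-ins so that the example compiles without the TSM headers, and whether dsmSendData() really performs partial sends is exactly the open question of this issue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return codes standing in for the TSM API values. */
enum { RC_OK = 0, RC_WILL_ABORT = 1, RC_ERROR = 2 };

/* Hypothetical sender: tries to send up to len bytes from buf, stores the
 * number of bytes actually taken in *sent, and returns one of the codes. */
typedef int (*send_fn)(const char *buf, size_t len, size_t *sent, void *ctx);

/* Send the whole buffer, retrying after partial sends. Stops at the first
 * non-OK code so the caller can end the transaction. *total_sent receives
 * the number of bytes accepted so far. */
int send_all(send_fn send, void *ctx, const char *buf, size_t len,
	     size_t *total_sent)
{
	assert(send != NULL && total_sent != NULL);
	*total_sent = 0;
	while (*total_sent < len) {
		size_t sent = 0;
		int rc = send(buf + *total_sent, len - *total_sent, &sent, ctx);

		if (rc != RC_OK)
			return rc;	/* e.g. RC_WILL_ABORT: stop the tx */
		if (sent == 0)
			return RC_ERROR; /* no progress, avoid looping forever */
		*total_sent += sent;
	}
	return RC_OK;
}

/* Mock sender for illustration: takes at most 4 bytes per call and reports
 * RC_WILL_ABORT once its byte budget is used up. */
struct mock_ctx {
	size_t budget;
};

int mock_send(const char *buf, size_t len, size_t *sent, void *ctx)
{
	struct mock_ctx *m = ctx;
	size_t n = len < 4 ? len : 4;

	(void)buf;
	if (m->budget == 0) {
		*sent = 0;
		return RC_WILL_ABORT;
	}
	if (n > m->budget)
		n = m->budget;
	m->budget -= n;
	*sent = n;
	return RC_OK;
}
```

In the real code the callback would wrap dsmSendData() and report data_blk.numBytes as the number of bytes taken.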

Working with archive/released stub files

Currently we have a Lustre subdirectory where users can archive data to TSM. Once a file is archived, and especially once it is released, it can't be restored if it is moved, as TSM doesn't know about the file's new path.

I'm looking for a way to properly handle stub files that are archived/released to TSM. I understand that, due to the way TSM stores files, stub files that are moved or renamed get "lost". Is there a way to handle this that makes for a good user experience, possibly through Robinhood or through changes to the way we move files in and out of TSM?

[copytool] limit number of requests in work-queue

Lustre MDT has the max_requests HSM parameter, but it only applies to that MDT's HSM coordinator. This makes it tricky to configure properly, because it is not connected to the number of registered copytools. Ideally, it should be set to some value above the total number of threads in all of the registered copytools.

Since there's a good chance of misconfiguration, I propose we limit the number of requests we take for HSM. How about setting the limit to 2 x thread_count?
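
A minimal sketch of the proposed cap, assuming a simple counter of accepted but unfinished items. The struct and function names are hypothetical and not taken from the copytool; a real implementation would also need to protect the counter with the queue's lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for the copytool work queue. */
struct work_queue {
	size_t n_threads;	/* number of consumer threads */
	size_t n_queued;	/* items accepted but not yet finished */
};

/* Proposed cap: twice the number of consumer threads. */
size_t queue_limit(const struct work_queue *q)
{
	return 2 * q->n_threads;
}

/* Returns 1 if a new request was accepted, 0 if the cap is reached. */
int queue_try_accept(struct work_queue *q)
{
	assert(q != NULL);
	if (q->n_queued >= queue_limit(q))
		return 0;
	q->n_queued++;
	return 1;
}

/* Called when a consumer thread has finished an item. */
void queue_item_done(struct work_queue *q)
{
	assert(q != NULL && q->n_queued > 0);
	q->n_queued--;
}
```

With n_threads = 2 the queue accepts four requests and refuses the fifth until one item is done.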

[qtable] deduplication not working on objects with same date

If two TSM objects with the same fs/hl/ll key and the same insertion date are processed by the remove_older_obj function in qtable.c, both are kept in the hashtable.
This causes the fileapi and my sanity script added in 3abba03 to fail, because I always expect exactly one returned object.

Simply changing the comparison from 'new.date > old.date' to greater than or equal fixed this problem for me, but it leads to errors in the test_qtable test suite.
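
The effect of the comparison on equal dates can be illustrated with a simplified table. This is not the qtable.c code: struct obj, its fields, and insert_dedup are hypothetical, and a flat array replaces the hashtable to keep the sketch short.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified object: key stands for fs/hl/ll, date for the
 * insertion date, id only distinguishes the objects in this example. */
struct obj {
	const char *key;
	long date;
	int id;
};

/* Insert into a small table that holds at most one object per key.
 * With '>=', an object with an equal date replaces the stored one, so
 * exactly one object per key remains. Returns the new entry count. */
size_t insert_dedup(struct obj *table, size_t n, size_t cap,
		    struct obj new_obj)
{
	for (size_t i = 0; i < n; i++) {
		if (strcmp(table[i].key, new_obj.key) == 0) {
			if (new_obj.date >= table[i].date)
				table[i] = new_obj;
			return n;
		}
	}
	assert(n < cap);
	table[n] = new_obj;
	return n + 1;
}
```

Which of two equally dated objects should win is a separate decision; here the later insertion does.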

[tsmapi] Get rid of qarray and use chained hash tables

Currently, an array (called qarray) is employed to store and retrieve qryRespArchiveData. The qarray data structure is a simple, trivial version of a C++ vector:

/* Increase length (capacity) by factor of 2 when qarray is full. */
if ((*qarray)->N >= (*qarray)->capacity) {
	(*qarray)->capacity *= 2;
	(*qarray)->data = realloc((*qarray)->data,
				  sizeof(qryRespArchiveData) *
				  (*qarray)->capacity);
	if ((*qarray)->data == NULL) {
		CT_ERROR(errno, "realloc");
		return DSM_RC_UNSUCCESSFUL;
	}
}

where the capacity is simply doubled when the array has no space left. This design was chosen to straightforwardly enable a sort operation on the restore order field:

int cmp_restore_order(const void *a, const void *b)
{
	const qryRespArchiveData *query_data_a = (qryRespArchiveData *)a;
	const qryRespArchiveData *query_data_b = (qryRespArchiveData *)b;

	if (query_data_a->restoreOrderExt.top > query_data_b->restoreOrderExt.top)
		return(DS_GREATERTHAN);
	else if (query_data_a->restoreOrderExt.top < query_data_b->restoreOrderExt.top)
		return(DS_LESSTHAN);
	else if (query_data_a->restoreOrderExt.hi_hi > query_data_b->restoreOrderExt.hi_hi)
		return(DS_GREATERTHAN);
	else if (query_data_a->restoreOrderExt.hi_hi < query_data_b->restoreOrderExt.hi_hi)
		return(DS_LESSTHAN);
	else if (query_data_a->restoreOrderExt.hi_lo > query_data_b->restoreOrderExt.hi_lo)
		return(DS_GREATERTHAN);
	else if (query_data_a->restoreOrderExt.hi_lo < query_data_b->restoreOrderExt.hi_lo)
		return(DS_LESSTHAN);
	else if (query_data_a->restoreOrderExt.lo_hi > query_data_b->restoreOrderExt.lo_hi)
		return(DS_GREATERTHAN);
	else if (query_data_a->restoreOrderExt.lo_hi < query_data_b->restoreOrderExt.lo_hi)
		return(DS_LESSTHAN);
	else if (query_data_a->restoreOrderExt.lo_lo > query_data_b->restoreOrderExt.lo_lo)
		return(DS_GREATERTHAN);
	else if (query_data_a->restoreOrderExt.lo_lo < query_data_b->restoreOrderExt.lo_lo)
		return(DS_LESSTHAN);
	else
		return(DS_EQUAL);
}

The TSM server allows data to be archived multiple times with an identical "fs/hl/ll". In the query operation (when no date information is provided and the latest flag is set), we fill qarray with the most recent qryRespArchiveData. For this reason a hashtable is currently also used, to look up which version was already queried. It makes more sense to store the query results directly in chained hash tables (buckets of linked lists). To enable sorting, one can convert the chained hash table either into a fixed array and sort that, or into a single linked list and sort the linked list (e.g. as described in LinkedListProblems). Note that another, less efficient, method is to query each time before archiving; however, that is certainly not an elegant solution.
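
The first option (flatten the chained hash table into a fixed array, then sort it) could look roughly like this. It is a sketch under simplifying assumptions: struct node, struct chashtable, and the cht_* functions are hypothetical, and a single integer field stands in for the multi-part restore order.

```c
#include <assert.h>
#include <stdlib.h>

#define N_BUCKETS 8

/* Hypothetical node: 'order' stands in for the restore order field. */
struct node {
	unsigned long order;
	struct node *next;
};

/* Chained hash table: each bucket is a singly linked list. */
struct chashtable {
	struct node *buckets[N_BUCKETS];
	size_t n_items;
};

/* qsort comparator over an array of node pointers. */
static int cmp_order(const void *a, const void *b)
{
	const struct node *na = *(const struct node *const *)a;
	const struct node *nb = *(const struct node *const *)b;

	return (na->order > nb->order) - (na->order < nb->order);
}

/* Push a node onto the front of the bucket selected by hash. */
void cht_insert(struct chashtable *t, struct node *n, unsigned long hash)
{
	struct node **head = &t->buckets[hash % N_BUCKETS];

	n->next = *head;
	*head = n;
	t->n_items++;
}

/* Return a malloc'ed array of all node pointers sorted by order, or NULL
 * if the table is empty or allocation fails. The caller frees the array. */
struct node **cht_sorted(const struct chashtable *t)
{
	struct node **arr;
	size_t k = 0;

	assert(t != NULL);
	if (t->n_items == 0)
		return NULL;
	arr = malloc(t->n_items * sizeof(*arr));
	if (arr == NULL)
		return NULL;
	for (size_t b = 0; b < N_BUCKETS; b++)
		for (struct node *n = t->buckets[b]; n != NULL; n = n->next)
			arr[k++] = n;
	qsort(arr, k, sizeof(*arr), cmp_order);
	return arr;
}
```

Sorting an array of pointers leaves the nodes in their buckets, so the hash table stays usable for lookups after the sort.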

[copytool] Cleanup the consumer threads appropriately

When the consumer threads are processing requests such as archiving/retrieving/deleting
and the daemon receives a shutdown/Ctrl-C signal (from init.d/systemd/user), one should close
the work queue so that consumer threads can no longer dequeue new HSM action items.
In addition, one is supposed to wait for a certain period of time so that the consumer threads can
finish their current HSM action items. If the time has passed and they are
still not done, one could do a tsm_disconnect(...), cleanup, etc.
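
The close-then-wait sequence might be sketched with a condition variable and a deadline. All names here (struct pool_state and the pool_* functions) are hypothetical and not part of the copytool; what happens after a timeout, such as tsm_disconnect(...), is left to the caller.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical shared state between the daemon and its consumer threads. */
struct pool_state {
	pthread_mutex_t lock;
	pthread_cond_t idle;	/* signalled when n_active drops to zero */
	int closed;		/* queue no longer hands out items */
	int n_active;		/* consumers currently processing an item */
};

void pool_init(struct pool_state *p)
{
	pthread_mutex_init(&p->lock, NULL);
	pthread_cond_init(&p->idle, NULL);
	p->closed = 0;
	p->n_active = 0;
}

/* Called by a consumer before dequeueing. Returns 0 if the queue is closed. */
int pool_begin_item(struct pool_state *p)
{
	int ok;

	pthread_mutex_lock(&p->lock);
	ok = !p->closed;
	if (ok)
		p->n_active++;
	pthread_mutex_unlock(&p->lock);
	return ok;
}

/* Called by a consumer after finishing its item. */
void pool_end_item(struct pool_state *p)
{
	pthread_mutex_lock(&p->lock);
	assert(p->n_active > 0);
	if (--p->n_active == 0)
		pthread_cond_broadcast(&p->idle);
	pthread_mutex_unlock(&p->lock);
}

/* Close the queue and wait up to timeout_sec seconds for running items.
 * Returns 0 if all consumers finished, ETIMEDOUT otherwise. */
int pool_shutdown(struct pool_state *p, int timeout_sec)
{
	struct timespec deadline;
	int rc = 0;

	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_sec;

	pthread_mutex_lock(&p->lock);
	p->closed = 1;
	while (p->n_active > 0 && rc != ETIMEDOUT)
		rc = pthread_cond_timedwait(&p->idle, &p->lock, &deadline);
	rc = p->n_active > 0 ? ETIMEDOUT : 0;
	pthread_mutex_unlock(&p->lock);
	return rc;
}
```

A signal handler would typically only set a flag; the main thread then calls pool_shutdown() and decides how to clean up if ETIMEDOUT is returned.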