
gboudreau / greyhole

258 stars · 19 watchers · 35 forks · 8.43 MB

Greyhole uses Samba to create a storage pool of all your available hard drives, and allows you to create redundant copies of the files you store.

Home Page: http://www.greyhole.net

License: GNU General Public License v3.0

Languages: Shell 5.03%, Makefile 1.81%, PHP 64.93%, C 22.81%, JavaScript 4.73%, CSS 0.39%, Hack 0.15%, Dockerfile 0.16%
Topics: storage-pool, samba, linux, php, prevent-data-loss, redundant-copies

greyhole's Introduction

Greyhole

Greyhole is an application that uses Samba to create a storage pool of all your available hard drives (whatever their size, however they're connected), and allows you to create redundant copies of the files you store, in order to prevent data loss when part of your hardware fails.

Installation

  1. Using apt (Ubuntu, Debian) or yum (CentOS, Fedora, RHEL):

    curl -Ls https://bit.ly/greyhole-package | sudo bash

  2. Follow the instructions in the USAGE file. A copy of this file is also installed at /usr/share/greyhole/USAGE.


Features

JBOD concatenation storage pool

Configure as many hard drives as you'd like to include in your pool. Your storage pool size will be the sum of the free space of all the hard drives you include. Your hard drives can be internal, external (USB, eSATA, FireWire...), or even mounts of remote file systems, and you can include hard drives of any size in your pool.

Per-share redundancy

For each of your shares that use the space of your storage pool, indicate how many copies of each file you want to keep. Each of those copies will be stored on a different hard drive, in order to prevent data loss when one or more hard drives fail. For very important files, you can even specify that you'd like to keep copies on all available hard drives.

Easily recoverable files

Greyhole file copies are regular files, visible on any machine, with no special hardware or software required. If you take one hard drive out of your pool and mount it anywhere else, you'll be able to see all the files that Greyhole stored on it. They will have the same filenames, and they'll be in the same directories you'd expect them to be.

Documentation

The GitHub Wiki contains the Greyhole documentation.

greyhole's People

Contributors

akartmann, ebinans, fsironman, gboudreau, janecker, jult, karakal, panthar, r3vxx, sparticuz, tylerstraub, vena, zarmstrong, zefie


greyhole's Issues

Support empty drive_selection_groups

When some drive_selection_groups definitions are empty and are used in drive_selection_algorithm, the user can receive a "No metadata files could be created." error message.

Example:

drive_selection_groups = OK:
    NEW: /mnt/lsi0/gh, /mnt/lsi1/gh, /mnt/lsi2/gh, /mnt/lsi3/gh
    BROKEN:
    REMOTE:

drive_selection_algorithm = forced (1xOK, 1xNEW, 1xBROKEN, 1xREMOTE) most_available_space
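
One possible guard, sketched below (hypothetical; not the project's actual code), would be to drop empty groups before the forced algorithm runs, so empty BROKEN and REMOTE groups can't make metadata creation fail:

    // Hypothetical sketch, in PHP: $groups maps group names to arrays of
    // storage pool directories, as parsed from greyhole.conf.
    $groups = array(
        'OK'     => array(),
        'NEW'    => array('/mnt/lsi0/gh', '/mnt/lsi1/gh', '/mnt/lsi2/gh', '/mnt/lsi3/gh'),
        'BROKEN' => array(),
        'REMOTE' => array(),
    );

    // Keep only the groups that actually contain drives; the forced
    // algorithm would then iterate those, instead of failing on empty ones.
    $non_empty_groups = array_filter($groups, function ($drives) {
        return count($drives) > 0;
    });
    // array_keys($non_empty_groups) === array('NEW')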

Files with long names/paths remain in the landing zone

What steps will reproduce the problem?

  1. Create this path in samba share:
    "_Clipart/_HOLIDAYS/Bol_shaja_kollektsija_novogodnih_oboev_ot_Banka_Dizajna_designbank.ru/Bol_shaja_kollektsija_novogodnih_oboev_ot_Banka_Dizajna_designbank.ru/Bol_shaja_kollektsija_novogodnih_oboev_ot_Banka_Dizajna_designbank.ru/"
  2. Place some file with name "Bol_shaja_kollektsija_novogodnih_oboev_ot_Banka_Dizajna_designbank.ru.jpg" into this folder.

What is the expected behaviour? What do you see instead?
The file remains in the landing zone, without being copied to the Greyhole pool and replaced by a symlink in the landing zone.

What version of the product are you using? On what operating system?
Greyhole 0.9.1 on Fedora 14 (Amahi 6)

Please provide any additional information below.

logging to syslog

Using 0.9.3.
Is it possible to send logs to syslog, in order to collect them through a syslog server?
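
PHP can write to syslog directly; a minimal sketch of what such an option might do (the behaviour is hypothetical, not an existing Greyhole feature):

    // Hypothetical sketch: route Greyhole log lines to the system logger,
    // so a remote syslog server can collect them.
    // openlog()/syslog()/closelog() are standard PHP functions.
    openlog('greyhole', LOG_PID, LOG_DAEMON);
    syslog(LOG_INFO, 'daemon: Greyhole daemon started.');
    closelog();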

ubuntu maverick deb install failure

Hi there,
The deb package on Ubuntu Maverick (and all Upstart variants, I suspect) fails on initial install because the service isn't running. The affected section of code in the postinst script is as follows:

    # Using Upstart instead of SYSV init.d
    if [ -f /etc/init.d/greyhole ]; then
            rm /etc/init.d/greyhole
            stop greyhole 2> /dev/null > /dev/null
            start greyhole 2> /dev/null > /dev/null
    fi

The SYSV script always gets installed, so this section of code always runs, during both upgrade and install. On upgrade, the greyhole service is likely running, so there's no problem. On initial install it isn't, so the stop command causes the script to stop, and dpkg to throw an error. When the package is installed again, it works fine, because /etc/init.d/greyhole has already been removed.

The simplest fix for this would be to append || true to the end of the stop command, so that it doesn't fail when the service isn't running.

Great project btw! Love it.

Cheers,
Colin.

--going on upstart clients doesn't restart service

Hi there,
The following section of code in the greyhole PHP doesn't work on Upstart clients because it tries to use the service command, so the service isn't restarted before the fsck when using the --going command to remove a drive.

    // Remove $options['dir'] from config file and restart (if it was running)
    $escaped_dir = str_replace('/', '\/', $going_dir);
    exec("/bin/sed -ie 's/^.*storage_pool_directory.*$escaped_dir.*$//' /etc/greyhole.conf");
    exec("/sbin/service greyhole condrestart");

Would it be possible to change it to test for upstart and use the restart command instead?

Cheers,
Colin.
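
For illustration, one possible shape for that test (a sketch only, assuming the presence of initctl indicates Upstart; not the project's actual code):

    // Hypothetical sketch: prefer Upstart's initctl when available,
    // fall back to the SYSV service command otherwise.
    if (is_executable('/sbin/initctl')) {
        // Upstart's restart fails if the job is stopped, so errors are discarded.
        exec("/sbin/initctl restart greyhole 2> /dev/null > /dev/null");
    } else {
        exec("/sbin/service greyhole condrestart");
    }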

Need to manually remove files from attic

I found what I believe is a bug in Greyhole. The case is as follows; my current setup:

1 x 80 GB drive mounted as /
1 x 250 GB drive mounted as /mnt/data01
1 x 500 GB drive mounted as /mnt/data02

and my shares are created on /shares/music, /shares/movies and /shares/documents. The "bug" is this: my pool had about 22 GB of free space left. I was moving some files around, trying to move about 25 GB to my pool, when I realized it was too big for my pool. So I thought: let's just finish the file transfer, then remove some files from my pool, and I'd be done. But here's where the bug comes in: Greyhole would only move my "removed" files to the attic, and not really delete them, because it couldn't complete the first task (writing the 25 GB) to my pool. So I had to manually remove the files from my attic before it would write the 25 GB.

I hope I've made myself clear; if you want, I can post logs or other info.

32bit integer cut off for large drives

I stumbled across a bug when I added a 3 TB drive to my pool recently. While the free space is initially captured as a float, it is then used as a key in an array map, and PHP automatically converts keys from float to int, so 2.7 TB of free bytes in a float suddenly became -1.4 TB of free bytes in an int. The solution is to flip the arrays inside out so that the free space is stored as a value; values can be floats, unlike keys. Then, instead of krsort, arsort can be used. Later on, where an array_merge is done, the arrays being merged need to be wrapped in array_keys calls first, so that the actual drives are added to the resulting array, and not the free/available space.

A side benefit of this is that all the while loops where the space variables are decremented by .01 until they have a unique value can be removed.
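
For illustration, a minimal sketch of the flipped structure (variable names hypothetical):

    // Hypothetical sketch of the proposed fix: store free space as float
    // *values* keyed by drive, instead of using free space as array keys
    // (PHP silently casts float keys to int, which overflows on 32-bit builds).
    $free_space = array(
        '/mnt/hdd0/gh' => 2.7e12, // ~2.7 TB free; safe as a float value
        '/mnt/hdd1/gh' => 4.1e11,
    );

    // Sort drives by available space, descending (replaces krsort on keys).
    arsort($free_space);

    // Where arrays used to be merged, merge the drive names instead.
    $ordered_drives = array_keys($free_space);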

I've got this fix in place on my server already, I'll work on getting it into my forked repo and do a pull request this weekend.

It's probably not an issue for anyone with drives smaller than 3 TB, or for those on a 64-bit distro, but it's worth fixing regardless.

drive_selection_groups for specific shares

I had a problem defining drive_selection_groups for specific shares. When I put the following in greyhole.conf:

drive_selection_groups[Documents]

I get an error that the share "ocuments" cannot be found in the Samba configuration. So I then put the following in greyhole.conf instead:

drive_selection_groups[DDocuments]

But I don't know if this works, or what causes the problem.

Thanks for looking into it.

Removed drive can't be re-added as-is

When removing a drive from the pool using --going, the .greyhole_uses_this file is renamed to .greyhole_used_this.
When re-adding that drive to the pool later (after an fsck.ext, for example), the drive will simply be ignored, without any error.

Issue #25 should fix this, since it will deprecate .greyhole_uses_this files.
But we still need to make sure removed drives can be re-added as-is later.

Cannot get "most_available_space" to work.

Hi.

I posted this in #greyhole, but I believe it was Thanksgiving, and I didn't manage to get an answer. I figured this would be a good place to find out what I'm doing wrong.

I started with one 2TB drive, mounted as "/var/storage/drives/hdd0", and added it to greyhole.conf as "storage_pool_directory = /var/storage/drives/hdd0/gh, min_free: 10gb". I recently added a second drive, mounted as hdd1, but Greyhole keeps filling the first drive, which is now almost full, while the new 1TB drive is almost empty.

The new drive has the "gh" dir in it, with the metastore stuff inside, so Greyhole recognizes it, and I have "dir_selection_algorithm = most_available_space" in the config. I've even tried dir_selection_groups, to see if it would make any difference, but it doesn't. I also tried setting df_cache_time = 0; that doesn't work either.

Running greyhole --stats outputs this: http://pastebin.com/cWtexGuJ

Even if I try to add a file to a share that is configured for 100 copies, the new drive doesn't get the file.

Here are the relevant parts of greyhole.conf -> http://pastebin.com/kztdijgk

(note: the dir_selection_group line got cropped, but the rest of it is for hdd1)

Can someone help me fix this?

Duplication issue with other drives full except landing zone

What steps will reproduce the problem?

  1. Set drives like this
storage_pool_directory = /mnt/pool1/gh, min_free: 500gb
storage_pool_directory = /mnt/pool2/gh, min_free: 10gb
storage_pool_directory = /mnt/pool3/gh, min_free: 10gb
storage_pool_directory = /mnt/pool4/gh, min_free: 10gb
storage_pool_directory = /mnt/pool5/gh, min_free: 10gb
storage_pool_directory = /mnt/pool6/gh, min_free: 10gb
storage_pool_directory = /mnt/pool7/gh, min_free: 10gb

pool1 and pool2 are both 1TB drives others are 500GB drives.
I know that something like 200gb would have been more sane.

What is the expected behaviour? What do you see instead?

I would expect Greyhole to keep the most free space on pool1, because it's also used as the landing zone.
But when all the other drives get full (except pool1), files don't get duplicated; I don't think they even get moved to the pool1/gh directory. In a situation like this:

Storage Pool
Total - Used = Free + Attic = Possible
/mnt/pool1/gh: 917G - 704G = 213G + 0G = 213G
/mnt/pool2/gh: 917G - 917G = 0G + 0G = 0G
/mnt/pool3/gh: 458G - 458G = 0G + 0G = 0G
/mnt/pool4/gh: 458G - 458G = 0G + 0G = 0G
/mnt/pool5/gh: 458G - 458G = 0G + 0G = 0G
/mnt/pool6/gh: 458G - 458G = 0G + 0G = 0G
/mnt/pool7/gh: 458G - 458G = 0G + 0G = 0G

I think this should work as follows: if all the other drives are full and you copy something to a Samba share handled by Greyhole, it should move data from the other drives to pool1 to make space, so it can duplicate those files correctly.

What version of the product are you using? On what operating system?
Ubuntu 10.10 and r383.

Please provide any additional information below.
none

Write man pages

Write man pages for greyhole, /etc/greyhole.conf and greyhole-ui
(and maybe for greyhole-dfree & greyhole-ui-server, which aren't user-facing but are still available in /usr/bin, so maybe just a quick man page to explain their use...).

Files appearing in storage pool drives that are not in drive_selection_groups

I have a share, Documents, which is set as num_copies[Documents] = max. I have four storage pool drives in total, and have Documents configured to use the following drive selection groups:

drive_selection_groups[Documents] = FIRST: /media/2TB-1/gh
SECOND: /media/2TB-2/gh
THIRD: /ShareBackup/gh

The thing is, Documents replicates to /media/1TB-1 as well. This may well be by design, since I am specifying max redundancy; however, I would have assumed that since I haven't explicitly listed that device as a selection drive for Documents, it wouldn't store files there.

Greyhole says that the start was ok, even with DB Errors

I tried to start the Greyhole daemon. It seemed to have started, but it wasn't running.

I found this within the logs:

cat /var/log/greyhole.log
Jan 21 11:01:44 4 stats: PHP Warning [2]: mysql_connect(): Access denied for user 'root'@'localhost' (using password: NO) in /usr/bin/greyhole on line 386; BT: greyhole[L1852] db_connect() => greyhole[L386] mysql_connect(,,)
Jan 21 11:01:44 2 stats: Can't connect to database.

Maybe Greyhole should check this on start.
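
For illustration, a minimal sketch of such a startup check (variable names hypothetical; mysql_connect matches the API in the log excerpt above):

    // Hypothetical sketch: verify the database connection during startup,
    // and abort with a clear error instead of reporting a successful start.
    $link = @mysql_connect($db_host, $db_user, $db_pass);
    if ($link === FALSE) {
        error_log("Can't connect to database: " . mysql_error());
        exit(1); // fail fast; don't pretend the daemon started OK
    }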

Use GLib GIO to log filesystem events

This is a suggestion for enhancing the Greyhole project, with the goal of making it more robust. At the moment, files are only mirrored by Greyhole when you exclusively use Samba to access them, which imposes serious limitations: for example, FTP and other protocols cannot be used. It is also not possible to modify files locally on the server (unless you go through a local Samba mount).
Monitoring changes at the filesystem level, and acting upon those events, would solve this problem.

I propose, instead of reacting to filesystem events via the Samba event log, to monitor the filesystem using GIO, which is part of the GLib library
(http://library.gnome.org/devel/platform-overview/stable/gio.html.en).
This is the same approach the UbuntuOne client uses to react to filesystem events (https://bugs.launchpad.net/ubuntuone-client/+bug/382889).

For those that are not familiar with UbuntuOne, it is a program that monitors certain directories, and when files are added, deleted, or modified, it mirrors these changes on an online backup.

The GIO library can, for example, be used with the Python GObject wrappers, pygobject (http://git.gnome.org/browse/pygobject/tree/README). A Python daemon could be written to monitor filesystem events and put them in the existing Greyhole queue.
This is, I believe, how the Ubuntu developers would do it. The ubuntuone-client source code could also serve as an example of how this should be implemented.

Some other alternatives to GIO are FAM, GAM, inotify/incron and fileschanged, which could also be used, though I believe GIO to be a good choice. I refer to the discussion on the Launchpad GIO suggestion for UbuntuOne for clarification (and to the fact that UbuntuOne now uses GIO instead of inotify).

The reason I am writing this is that I would need this feature for my personal usage. I am planning on experimenting with this technology because I want a daemon on my file server that mirrors specified directories (recursively).
If there is interest in this approach for greyhole, I will share my progress whenever I make any. I might even try including this code in the greyhole project.

MEMORY-table for tasks

Using MEMORY as the engine for the tasks table would increase performance and decrease the number of disk writes required.

I am experimenting with defining the tasks table like this:

mysql> show create table tasks\G
*************************** 1. row ***************************
       Table: tasks
Create Table: CREATE TABLE `tasks` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `action` varchar(10) NOT NULL,
  `share` varchar(255) DEFAULT NULL,
  `full_path` varchar(255) DEFAULT NULL,
  `additional_info` varchar(255) DEFAULT NULL,
  `complete` enum('yes','no','frozen','thawed','idle') NOT NULL,
  `event_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`id`),
  KEY `find_next_task` (`complete`,`share`(64),`id`)
) ENGINE=MEMORY AUTO_INCREMENT=2459208 DEFAULT CHARSET=latin1
1 row in set (0.00 sec)

I had to redefine share, full_path and additional_info as varchar(255), which would be the equivalent of TINYTEXT (max 255 characters).

The downside is that the table would be emptied on reboot, but that could be handled: use a secondary, persistent table to write any pending tasks to before shutting down, then copy them back into tasks on startup.
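
A sketch of that shutdown/startup copy, assuming an open MySQL connection and a persistent InnoDB twin table named tasks_persist (hypothetical):

    // Hypothetical sketch: persist pending tasks from the MEMORY table
    // across reboots, using a durable twin table.
    // On shutdown:
    mysql_query("INSERT INTO tasks_persist SELECT * FROM tasks WHERE complete = 'no'");
    // On startup:
    mysql_query("INSERT INTO tasks SELECT * FROM tasks_persist");
    mysql_query("TRUNCATE tasks_persist");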

fix_symlinks() called for every renamed then deleted files

If you rename a big folder (full of files), and immediately afterwards delete the renamed folder, GH will call fix_symlinks() during gh_rename() for each file that was renamed and then deleted.
On big shares, this will take a lot of time, and the user will be stuck for DAYS on the rename operation!
We need to find a better way to find lost symlinks...
Maybe try fix_symlinks only once per operation, or something...
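
One possible shape for "only once per operation" (a sketch; names are hypothetical, and fix_symlinks() is assumed to be the existing function):

    // Hypothetical sketch: remember which (share, dir) pairs fix_symlinks()
    // already handled during the current operation, and skip repeats.
    $fixed_dirs = array();
    function fix_symlinks_once($share, $dir) {
        global $fixed_dirs;
        $key = "$share:$dir";
        if (!isset($fixed_dirs[$key])) {
            $fixed_dirs[$key] = TRUE;
            fix_symlinks($share, $dir); // assumed existing function
        }
    }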

php include files not in download archive

When downloading via greyhole.net or the Downloads tab on GitHub, includes/common.php and includes/sql.php are not present. They are present when pulling a tarball/zip from the tags list, but for people installing from source, I think it would be easier if they were included in the normal download files. This would also make things simpler for package maintainers of other distributions (for example, I am making a package for Arch Linux and have to do some extra steps to get those PEAR files).

Find orphan files doesn't work when --dir is an LZ folder

Doesn't find orphans:
greyhole -fo --dir=/mnt/lz/share_name

Does find orphans:
greyhole -fo --dir=/mnt/drive1/gh/share_name

Both should find the orphans.

Note: the 2nd command will only find orphans in /mnt/drive1; the first should find orphans on all storage pool drives.

init.d script doesn't work with sudo on CentOS

What steps will reproduce the problem?

  1. sudo /sbin/service greyhole

What is the expected output? What do you see instead?
It says it's started, but really isn't!

This seems to be caused by an environment variable (like PATH) not being transferred when using sudo, since issuing the same command as root works fine.
The init.d script might be missing the full path to an executable somewhere...

Hook external scripts when Greyhole processes events

Suggestion: add hooks to Greyhole to enable calling external scripts when events occur. E.g., I have a cron job that goes through my picture directory and produces lower-resolution versions in another share (for display on remote machines). It would be nice if I could fork a script on an add or a modify, to keep these in sync without having to walk the whole pictures directory structure and do date comparisons.
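
A minimal sketch of what such a hook mechanism could look like (the hooks.d path and function name are hypothetical):

    // Hypothetical sketch: after Greyhole processes an event, run any
    // executable script found in a hooks directory, passing the event
    // details as arguments.
    function run_hooks($event_type, $share, $path) {
        foreach (glob('/etc/greyhole/hooks.d/*') as $hook) {
            if (is_executable($hook)) {
                exec($hook . ' ' . escapeshellarg($event_type) . ' '
                           . escapeshellarg($share) . ' '
                           . escapeshellarg($path));
            }
        }
    }

    // e.g. after handling a file write:
    run_hooks('write', 'Pictures', 'vacation/IMG_0001.jpg');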

Greyhole cannot start with "duplicate" smb shares

I got this error:
Jan 28 19:45:02 4 daemon: Found a share (gh003) defined in /etc/greyhole.conf with no path in /etc/samba/smb.conf. Either add this share in /etc/samba/smb.conf, or remove it from /etc/greyhole.conf, then restart Greyhole.
Jan 28 19:45:02 2 daemon: Config file parsing failed. Exiting.

smb.conf: https://gist.github.com/1695401
greyhole.conf: https://gist.github.com/1695403

Every share which has Greyhole enabled exists twice within smb.conf.
For Samba this is not a problem; it seems to merge both definitions.

Offer Greyhole as an easy-to-install package

Greyhole, including the management UI, should come as an easy-to-install package for the most-used distributions (Fedora/CentOS, Ubuntu/Debian, others?).

Installing and using Greyhole should be as easy as installing the package, and launching the web administration UI.

Change Greyhole terminology

Change Greyhole terminology to something else than attic, graveyard, tombstones...

The documentation will also need to be adapted, and the new terms be documented, with their corresponding old terms.

Pending --going operation could still allow new files to go on that drive

What steps will reproduce the problem?

  1. greyhole --going=/mnt/hdd1/gh
  2. Add new files to a share, sticky on /mnt/hdd1/gh, or that would end up on that drive

What is the expected output? What do you see instead?
New files should never end up on hdd1, since that drive is pending removal by the --going process.

Warning in system log

Hi,

Just installed the latest Greyhole version, 0.9.17-3, on OpenMediaVault (aka Debian 6.0.3), and got this warning when starting the daemon.

Greyhole[23243]: Dec 08 10:29:34 4 daemon: PHP Warning [2]: array_shift() expects parameter 1 to be array, boolean given in /usr/bin/greyhole on line 5649; BT: greyhole[L2485] set_metastore_backup() => greyhole[L5649] array_shift()

I don't think it's critical, but it's better without it ;)

greyhole throws a PHP warning

When the greyhole daemon is started, or any time a greyhole command is executed, the following PHP warning is generated.

PHP Warning: date_default_timezone_get(): It is not safe to rely on the system's timezone settings. You are required to use the date.timezone setting or the date_default_timezone_set() function. In case you used any of those methods and you are still getting this warning, you most likely misspelled the timezone identifier. We selected 'America/Chicago' for 'CST/-6.0/no DST' instead in /usr/bin/greyhole on line 32
greyhole, version 0.9.16, for linux-gnu (noarch)

Here is my PHP info

$ php -v
PHP 5.3.8 (cli) (built: Sep 24 2011 20:31:11)
Copyright (c) 1997-2011 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2011 Zend Technologies
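
The warning itself points at the fix: define date.timezone in php.ini, or call date_default_timezone_set() before any date function runs. A sketch of the latter:

    // Set an explicit timezone early to silence the warning; only needed
    // when php.ini doesn't already define date.timezone.
    if (!ini_get('date.timezone')) {
        date_default_timezone_set('America/Chicago'); // use your actual timezone
    }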

PHP Fatal Error - Greyhole crashes

Excerpt from the log:

Nov 03 02:18:54 7 fsck: Entering /mnt/d_nas/gh/Movies/Anger Management (2003)
Nov 03 02:18:54 3 fsck: PHP Fatal Error: Object of class stdClass could not be converted to string; BT: greyhole[L647]

Nov 03 07:56:01 7 unlink: Loading metafiles for Series/MythBusters/Season 6/MythBusters - S06E14 - Blind Driving.avi ...
Nov 03 07:56:02 7 unlink: Got 1 metadata files.
Nov 03 07:56:02 3 unlink: PHP Fatal Error: Object of class stdClass could not be converted to string; BT: greyhole[L647]

So, it happens at seemingly random times, often in the same places.
I then have to start Greyhole again, hoping that it will get past the place where it encounters the error. It might take some tries, but it eventually moves on.

I'm running:
Ubuntu Server 11.04 32-bit
PHP 5.3.5-1ubuntu7.3 with Suhosin-Patch (cli) (built: Oct 13 2011 21:56:07)
Greyhole 0.9.16

Renames fail with "failed to open stream: Value too large for defined data type"

May 26 11:11:03 7 balance:   Working on file: The.Shawshank.Redemption.1994.720p.x264/The.Shawshank.Redemption.1994.720p.x264.mkv (4.37GB)
May 26 11:11:03 7 balance:   Drives with available space: /var/hda/files/drives/drive4/gh (1.38TB avail) - /var/hda/files/drives/drive2/gh (1.18TB avail) - /var/hda/files/gh (1,017GB avail) 
May 26 11:11:03 7 balance:   Target drive: /var/hda/files/drives/drive4/gh (1.38TB available)
May 26 11:11:03 7 balance:   Moving file copy...
May 26 11:11:03 4 balance: PHP Warning [2]: rename(/var/hda/files/gh/Movies/The.Shawshank.Redemption.1994.720p.x264/The.Shawshank.Redemption.1994.720p.x264.mkv): failed to open stream: Value too large for defined data type in /usr/bin/greyhole on line 3481
May 26 11:11:03 4 balance: PHP Warning [2]: rename(/var/hda/files/gh/Movies/The.Shawshank.Redemption.1994.720p.x264/The.Shawshank.Redemption.1994.720p.x264.mkv,/var/hda/files/drives/drive4/gh/Movies/The.Shawshank.Redemption.1994.720p.x264/.The.Shawshank.Redemption.1994.720p.x264.mkv.6ca75): Value too large for defined data type in /usr/bin/greyhole on line 3481
May 26 11:11:03 4 balance:     Failed file copy. Skipping.

Deprecate .gh_uses_this files

.gh_uses_this files should become a thing of the past.
Instead, GH should use stat to check the mounted filesystem ID, and compare it to what it knows is the correct filesystem that should be there.
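
For illustration, a minimal sketch of the stat-based check (the expected-ID value would come from Greyhole's saved settings; hypothetical here):

    // Hypothetical sketch: compare the device ID of the filesystem mounted
    // at a storage pool directory against the ID Greyhole saved for it.
    $dir = '/mnt/hdd1/gh';
    $expected_dev_id = 2065; // would be loaded from Greyhole's saved settings
    $st = @stat($dir);
    if ($st === FALSE || $st['dev'] != $expected_dev_id) {
        // Wrong (or missing) filesystem mounted here; ignore this drive.
    }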

Related changes required:

  • stop using those files to store the randomly generated ID used for weekly stats reports.
  • stop using those during --going operations (where they are renamed to .greyhole_used_this before the operation starts; we probably want to do something similar, but with the saved filesystem ID of the going directory instead).
  • handle the cases where the user wants to change the ID for a specific directory (they either changed the file system that mounts there, or forgot to mount it there the first time GH started, so GH has an incorrect ID saved). This would replace the USAGE step where we ask the user to "touch .gh_uses_this".
  • We probably want GH to complain when two directories have the same filesystem ID. I think GH should die when this happens, and optionally just log a warning and continue, if a specific configuration option is added in greyhole.conf (allow_multiple_dirs_per_partition = true, or something).

New config option: Greyhole daemon niceness

I'm doing a big transfer of files over SMB. The CPU is maxed at 100%. I noticed that the landing zone partition slowly fills up. When I renice the daemon to -5, the landing zone gets empty again. Maybe this should be the default?
The CPU is a slow VIA C7 at 1.2 GHz.
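
PHP's proc_nice() could implement such an option; a sketch (the daemon_niceness option is hypothetical):

    // Hypothetical sketch: adjust the daemon's own priority at startup.
    // proc_nice() changes the current process priority by the given
    // increment; a negative value (higher priority) requires root.
    $daemon_niceness = -5; // would come from a new greyhole.conf option
    proc_nice($daemon_niceness);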

PHP Warnings when Fsck is running

Running a "greyhole --fsck -k -e" command and got this warning :

Feb 14 09:36:08 6 fsck: Starting fsck for /media/31060235-d0a9-4406-9b84-831781661010/.greyhole
Feb 14 09:36:08 4 md5-worker: PHP Warning [8]: Undefined index: drive in /usr/bin/greyhole on line 2488; BT:
Feb 14 09:36:08 4 md5-worker: PHP Warning [8]: Undefined index: drive in /usr/bin/greyhole on line 2491; BT:
Feb 14 09:36:09 4 md5-worker: PHP Warning [8]: Undefined index: drive in /usr/bin/greyhole on line 2488; BT:
Feb 14 09:36:09 4 md5-worker: PHP Warning [8]: Undefined index: drive in /usr/bin/greyhole on line 2491; BT:
Feb 14 09:36:09 4 md5-worker: PHP Warning [8]: Undefined index: drive in /usr/bin/greyhole on line 2488; BT:
Feb 14 09:36:09 4 md5-worker: PHP Warning [8]: Undefined index: drive in /usr/bin/greyhole on line 2491; BT:
Feb 14 09:36:09 6 fsck: fsck for /media/31060235-d0a9-4406-9b84-831781661010/.greyhole completed.

Not a big deal, as the fsck seems to work, but I just wanted to report it...

Complete Management UI

Create a complete UI to manage Samba shares, users, and the Greyhole configuration.
The UI should allow end-users to easily manage hard drives & partitions, and shares & users.
Easy-to-follow wizards should be created for operations that are more involved (e.g. adding a new drive).

Add some kind of going SMB Share feature

Is it possible to add an option to tell Greyhole that an SMB share is going away?
Greyhole should stop managing files from this SMB share, and should start copying all files back into the real SMB share directory.

  • Greyhole must check that there is enough free space for this.

Add retention policy to attic

Enhancement: add a retention policy to the Attic/Recycle Bin.
Suggestion (a sketch of the purge step follows the list):

  1. Add a retention_days value to the conf file. Default value of -1 = unlimited.
  2. Add an optional parameter to empty-attic for the age of files in days; 0 = no age limit, the default behaviour. Empty files older than that number of days.
  3. Add a daily cron job which runs empty-attic with the retention_days value.
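
For illustration, a sketch of the purge step (the .gh_attic path and the retention_days option are hypothetical):

    // Hypothetical sketch: delete attic files older than retention_days,
    // recursing through one drive's attic directory.
    $retention_days = 30; // would come from greyhole.conf; -1 = unlimited
    $cutoff = time() - $retention_days * 86400;
    $it = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator('/mnt/hdd0/gh/.gh_attic'));
    foreach ($it as $file) {
        if ($file->isFile() && $file->getMTime() < $cutoff) {
            unlink($file->getPathname());
        }
    }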

Renames high up in a folder hierarchy with large numbers of files can lead to recursion exhaustion

What steps will reproduce the problem?

  1. Create a folder hierarchy at least three folders deep with 100,000+ files scattered throughout the bottom of the hierarchy.
  2. Rename the root folder of the hierarchy.

After a while loading tombstones, the Greyhole daemon crashes due to recursion exhaustion, spitting out an error like the one below:

PHP Fatal error: Allowed memory size of 134217728 bytes exhausted (tried to allocate 8192 bytes) in /home/fileserver/greyhole/greyhole on line 1597

If you start the daemon back up, it will repeat the same process. The tombstone loading needs to be rewritten in a way that either cuts the recursion stack or performs the same task iteratively.
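
For illustration, an iterative walk with an explicit stack avoids the recursion limit (a sketch; the root path is hypothetical):

    // Hypothetical sketch: walk the tree iteratively with an explicit
    // stack instead of recursing once per subdirectory, so deep or huge
    // hierarchies can't exhaust PHP's memory with stack frames.
    $stack = array('/path/to/.gh_metastore/share');
    while (count($stack) > 0) {
        $dir = array_pop($stack);
        foreach (scandir($dir) as $entry) {
            if ($entry == '.' || $entry == '..') { continue; }
            $full = "$dir/$entry";
            if (is_dir($full)) {
                $stack[] = $full; // visit later; no recursion
            } else {
                // ...load the tombstone at $full here...
            }
        }
    }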

fsck should check that file copies are NOT symlinks

Example --debug output for a file copy that is a symlink.

Debugging file operations for file named "Yes Man/Yes Man.avi"

...
From DB
=======
  [2012-01-28 17:49:51] Task ID 27109: fsck_file Films/Yes Man/Yes Man.avi

From logs
=========
Feb 20 22:48:58 7 fsck:     Saving metadata in /var/hda/files/drives/drive2/gh/.gh_metastore/Films/Yes Man/Yes Man.nfo
Feb 20 22:48:58 7 fsck: Found /var/hda/files/drives/drive1/gh/Films/Yes Man/Yes Man.avi
Feb 20 22:48:58 7 fsck: Found /var/hda/files/drives/drive2/gh/Films/Yes Man/Yes Man.avi
Feb 20 22:48:58 7 fsck: Loading metafiles for Films/Yes Man/Yes Man.avi ...
Feb 20 22:48:58 7 fsck:   Got 2 metadata files.
Feb 20 22:48:58 6 fsck:   Missing file copies. Expected 2, got 1. Will create more copies using /var/hda/files/drives/drive1/gh/Films/Yes Man/Yes Man.avi
Feb 20 22:48:58 7 fsck:   Updating symlink at /var/LZ/Films/Yes Man/Yes Man.avi to point to /var/hda/files/drives/drive1/gh/Films/Yes Man/Yes Man.avi
Feb 20 22:48:58 7 fsck:   Saving 2 metadata files for Films/Yes Man/Yes Man.avi
Feb 20 22:48:58 7 fsck:     Saving metadata in /var/hda/files/drives/drive1/gh/.gh_metastore/Films/Yes Man/Yes Man.avi
Feb 20 22:48:58 7 fsck:     Saving metadata in /var/hda/files/drives/drive2/gh/.gh_metastore/Films/Yes Man/Yes Man.avi
Feb 20 22:48:58 7 fsck: Starting metastores fsck for /Films/Yes Man

From filesystem
===============
Landing Zone:
  lrwxrwxrwx 1 root root 57 Feb 20 22:48 /var/LZ/Films/Yes Man/Yes Man.avi -> /var/hda/files/drives/drive1/gh/Films/Yes Man/Yes Man.avi

Metadata Store:
  -rwxrwxrwx 1 TertiaryAdjunct users 286 Feb 20 22:48 /var/hda/files/drives/drive1/gh/.gh_metastore/Films/Yes Man/Yes Man.avi
    array (
      0 => 
      stdClass::__set_state(array(
         'path' => '/var/hda/files/drives/drive1/gh/Films/Yes Man/Yes Man.avi',
         'is_linked' => true,
         'state' => 'OK',
      )),
      1 => 
      stdClass::__set_state(array(
         'path' => '/var/hda/files/drives/drive2/gh/Films/Yes Man/Yes Man.avi',
         'is_linked' => false,
         'state' => 'OK',
      )),
    )
  -rw-rw-rw- 1 root root 286 Feb 20 22:48 /var/hda/files/drives/drive2/gh/.gh_metastore/Films/Yes Man/Yes Man.avi
    array (
      0 => 
      stdClass::__set_state(array(
         'path' => '/var/hda/files/drives/drive1/gh/Films/Yes Man/Yes Man.avi',
         'is_linked' => true,
         'state' => 'OK',
      )),
      1 => 
      stdClass::__set_state(array(
         'path' => '/var/hda/files/drives/drive2/gh/Films/Yes Man/Yes Man.avi',
         'is_linked' => false,
         'state' => 'OK',
      )),
    )

File copies:
  -rwxrwxrwx 1 TertiaryAdjunct users 734025728 Jan 24 21:21 /var/hda/files/drives/drive1/gh/Films/Yes Man/Yes Man.avi
  lrwxrwxrwx 1 TertiaryAdjunct users 57 Jan 26 03:46 /var/hda/files/drives/drive2/gh/Films/Yes Man/Yes Man.avi -> /var/hda/files/drives/drive1/gh/Films/Yes Man/Yes Man.avi
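
The check itself is cheap with is_link(); a sketch of how fsck could treat such copies (the handling shown is hypothetical):

    // Hypothetical sketch: a symlink should never count as a valid file copy.
    $copy = '/var/hda/files/drives/drive2/gh/Films/Yes Man/Yes Man.avi';
    if (is_link($copy)) {
        // Not a real copy: remove the symlink, then re-create a full copy
        // from a known-good copy on another drive.
        unlink($copy);
    }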

New options to prioritize specific drives

Sometimes, one would want to use specific drives, from a list, in priority.
For example, if you have SATA-connected drives, USB-connected drives, and remotely mounted Samba shares all included in your pool, you might want to use the SATA drives first, then the USB-connected drives, then the remote Samba shares.
When creating multiple copies, one copy should go on each 'type' of drive, i.e. the first copy should go on a SATA drive, the second on a USB drive, and the third on a remote Samba share. This would make the whole system more resilient to failures.
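
Using the drive_selection_groups syntax shown in earlier issues, such a setup might look like this (group names and paths hypothetical):

drive_selection_groups = SATA: /mnt/sata0/gh, /mnt/sata1/gh
    USB: /mnt/usb0/gh, /mnt/usb1/gh
    REMOTE: /mnt/remote0/gh

drive_selection_algorithm = forced (1xSATA, 1xUSB, 1xREMOTE) most_available_space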

LZ Free Space Check

A lot of Amahi users use Greyhole, and by default Amahi uses the root partition as the LZ, causing many, many problems for users; I've seen this issue on my own custom installs too. It would be useful if GH checked the LZ for free space, the same way (or similar to how) it checks free space on the pool drives, and maybe added a min_free=xx option for it to greyhole.conf.

Ignore files option

I want to be able to add some files to an ignore list.

Example: I don't want to copy Thumbs.db every time the file changes; Windows always recreates it anyway. I want to ignore this file name in every share, and just not copy it into the gh folder on the local drives.

Jim
