Policy on saving NSL data
The NSL array of nine 75 GB SCSI drives keeps filling up. We keep saving data to DVD, leaving the FITS files from the last two months on disk, but the drives are too frequently 90% full or more. What should we do?
1. Bigger disk drives. Switch to large IDE drives. Pricegrabber.com lists external 500 GB disk drives for under $1 a gig. For perhaps $5,000, we could increase the NSL data saving capacity by a factor of a few. One problem is that our system administrator and general computer guru is a big fan of SCSI drives, and upgrading them is much more expensive.
2. Save less data. Specifically, keep less than the 2 months of old FITS files that we keep now. This will help a little bit but be annoying for people wishing to see a FITS file from the month before last.
3. Faster DVD burner. The old one we have now was purchased two years ago, and even though we can't find the actual speed, surely a faster one is available today. This will likely cost around $500.
4. Ring buffer. Start automatically deleting FITS and moving GIF data that is older than two months. JPGs can stay, for now. On the down side, this means losing real data and possibly re-defining the work-study job for one of our key undergraduates. On the up side, this solution automates everything and no humans will be needed to intervene on a regular basis.
Any thoughts would be much appreciated!
- RJN
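The ring buffer in option 4 could be sketched roughly as below. This is only an illustration: the function name `prune`, the directory arguments, the 61-day reading of "two months", and the use of file modification time as the age are all assumptions, not a description of how the NSL scripts work. It also assumes the GIF archive directory lives outside the data directory and that a flat archive layout is acceptable.

```python
import shutil
import time
from pathlib import Path

# Assumption: "two months" is approximated as 61 days.
TWO_MONTHS_S = 61 * 24 * 3600


def prune(data_dir, gif_archive_dir, now=None, max_age_s=TWO_MONTHS_S, dry_run=True):
    """Delete old FITS files and move old GIFs; JPGs are left alone.

    Returns (deleted, moved) lists of paths so a dry run can be reviewed first.
    """
    now = time.time() if now is None else now
    data_dir = Path(data_dir)
    gif_archive_dir = Path(gif_archive_dir)
    deleted, moved = [], []
    # sorted() materializes the listing before anything is deleted or moved.
    for path in sorted(data_dir.rglob("*")):
        if not path.is_file():
            continue
        if now - path.stat().st_mtime <= max_age_s:
            continue  # still inside the two-month window
        suffix = path.suffix.lower()
        if suffix in (".fits", ".fit"):
            deleted.append(path)
            if not dry_run:
                path.unlink()
        elif suffix == ".gif":
            moved.append(path)
            if not dry_run:
                gif_archive_dir.mkdir(parents=True, exist_ok=True)
                shutil.move(str(path), str(gif_archive_dir / path.name))
    return deleted, moved
```

With `dry_run=True` (the default here) nothing is touched, so the lists can be checked by a person before the first real run, which matters given that this option means losing real data.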
Thoughts on data-saving
I wouldn't say I'm against IDE drives, just that whatever system we choose should make sense in terms of scalability and reliability. The problem with IDE isn't the speed or capacity, but rather the limit on the number of devices -- namely 4. The system drive takes one spot, leaving 3. I'm not sure the case allows space for 3 more drives, and I'm not aware of an external IDE case/mounting solution for a Sun -- hence the SCSI array.
Ideally, we would look at some sort of "real" disk array -- probably attached to the server via SCSI, but probably running with IDE disks inside. The drives that we have are not really in an array, rather a string of 9 individual disks on a SCSI bus. Real arrays, properly implemented, scale quite well -- either through the addition of drives until all slots are populated, or through the connection of an additional drive chassis.
It's all a matter of balancing cost against requirements. More expensive solutions are often more flexible, while cheaper ones (backup tapes come to mind) don't allow fast access to archived data. I'd recommend keeping these principles in mind, regardless of technical limitations like the number of IDE devices.
Size: how much do you want to keep?
Lifetime: how long do you want to keep it?
Access: how much trouble are you willing to go to in order to look at archived data?
Alternately, how immediate should access be: online, nearline, or offline?
Cost: what is it worth to do this?
TJ
Dave Torrey
Changing the hardware is a long-term solution. In the short term we will probably have to pay on-line data. I was thinking about writing a short program that compresses the older files in the archives. On second thought, the JPGs are already compressed, so I need to check whether the additional disk space (if any) gained by using gzip is worth changing our web site.
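One way to run that check before touching the web site is to measure what gzip would actually save for each file type. The sketch below is an assumption-laden illustration: the function names, the suffix list, and the idea of compressing each file in memory are mine, and it reads whole files at once, which should be fine for images of a few MB.

```python
import gzip
from pathlib import Path


def gzip_savings(path, level=6):
    """Return (original_bytes, compressed_bytes) for one file, compressed in memory."""
    raw = Path(path).read_bytes()
    return len(raw), len(gzip.compress(raw, compresslevel=level))


def savings_by_type(data_dir, suffixes=(".fits", ".gif", ".jpg")):
    """Total original and gzipped size per file suffix under data_dir."""
    totals = {s: [0, 0] for s in suffixes}
    for path in Path(data_dir).rglob("*"):
        s = path.suffix.lower()
        if path.is_file() and s in totals:
            orig, comp = gzip_savings(path)
            totals[s][0] += orig
            totals[s][1] += comp
    return totals
```

Nothing is written to disk, so this can be pointed at a sample directory safely. If the JPG totals barely change, as expected for already-compressed data, only the FITS files would be worth gzipping.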
Can we do SATA?
I have this hard drive from newegg
http://www.newegg.com/app/ViewProductDe ... 59&depa=1I
63 cents per gig.
It may just be my imagination but I believe my SATA drive responds much faster than my IDE drive.
I also recommend this DVD burner.
http://www.newegg.com/app/ViewProductDe ... 962&depa=1
$63.50
Maybe with careful shopping we could employ both solutions, thereby best preparing the system for future expansion.
What is the speed of the current burner?
-
- Ensign
- Posts: 78
- Joined: Tue Jul 27, 2004 1:45 pm
- Location: Back at Tel Aviv University after a sabbatical
Storage
I suggest NOT going SCSI, but using the cheaper alternative. We are regularly using 200GB ATA disks and are pretty happy. Note though that we are doubling the data up on two disks, so that a crash does not destroy irreplaceable observation data.
Noah Brosch
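The "doubling up on two disks" idea can be sketched as a copy to two destinations followed by a checksum comparison. This is a minimal illustration, not the procedure used at Wise Observatory; `save_mirrored`, `sha256_of`, and the two directory arguments are hypothetical names standing in for mount points on separate physical disks.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large images need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def save_mirrored(src, disk_a, disk_b):
    """Copy src into two directories and verify both copies against the source."""
    src = Path(src)
    want = sha256_of(src)
    copies = []
    for disk in (disk_a, disk_b):
        dest = Path(disk) / src.name
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)  # copy2 also preserves timestamps
        if sha256_of(dest) != want:
            raise IOError(f"checksum mismatch on {dest}")
        copies.append(dest)
    return copies
```

The protection only holds if the two directories really are on different drives; a hardware or software mirror would do the same job without a script.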
Disks vs. DVDs
The advantage of disks is the on-line availability of the images and the reduced hassle of not having to mount DVDs. One can now have 4x500 GB disks in a single enclosure, for 2 TB of storage (1 TB with redundancy). However, images have grown as well. The CONCAM IV now in testing at the Wise Observatory produces ~13 MB images, and we will probably go to a faster cadence than the lower CONCAM models. This implies data generation of some 6 GB per night, so even 1 TB of storage will fill up in about six months...
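The arithmetic behind those figures can be checked in a few lines. The numbers are the ones quoted above; the helper names and the use of decimal units (1 GB = 1000 MB, 1 TB = 1000 GB) are my assumptions.

```python
def images_per_night(gb_per_night, mb_per_image):
    """Number of images implied by a nightly data volume (decimal units)."""
    return gb_per_night * 1000 / mb_per_image


def nights_until_full(capacity_gb, gb_per_night):
    """Nights of observing before a disk of the given capacity is full."""
    return capacity_gb / gb_per_night


print(round(images_per_night(6, 13)))     # 462 images per night
print(round(nights_until_full(1000, 6)))  # 167 nights, about five and a half months
```

The same call with `capacity_gb=2000` gives roughly 333 nights, so even the full non-redundant 2 TB enclosure would last less than a year at that rate.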