Enhancement #117

closed

Plans for the future of HQ server/NAS/...

Added by Marc Dequènes over 13 years ago. Updated over 13 years ago.

Status: Resolved
Priority: High
Category: System :: Hardware
Start date: 2010-08-02
Due date:
% Done: 100%
Estimated time:
Patch Available: No
Confirmed: No
Branch:
Entity: DuckLand
Security: No
Help Needed:

Description

One idea would be to:
  • use the new restrained (non-RAID) disks bought for Elwing (see #104) in desktops (Annael, Alienor-NG, ...)
  • move the 2*1TB disks from Daneel to Elwing, which would stop acting as a NAS but keep storage for internal or specific usage
  • transform Daneel into a NAS for both stufz and backup:
    • buy some kind of 3- or 4-bay external drive rack for Daneel (with very few features: PSU, quiet fan, and the possibility of direct eSATA connections to the drives, or of bypassing its own system) (not too expensive, <100€)
    • buy 3*2TB or 4*1.5TB disks and plug 2 inside directly and the other 1 or 2 via the eSATA ports
    • ensure drives are properly cooled, both inside (see #110), and in the external box
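
To compare the two purchase options, here is a rough sketch of the usable capacity each would give. It assumes the new disks end up in a single RAID 5 array, which this description does not state (RAID 5 only appears in the later notes), and the function name is purely illustrative.

```python
def raid5_usable_tb(disk_count: int, disk_size_tb: float) -> float:
    """Usable capacity of a RAID 5 array: one disk's worth of space holds parity."""
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (disk_count - 1) * disk_size_tb


# The two options from the list above (assumption: all disks in one RAID 5).
options = {"3*2TB": (3, 2.0), "4*1.5TB": (4, 1.5)}
for label, (count, size_tb) in options.items():
    print(f"{label}: {raid5_usable_tb(count, size_tb):.1f} TB usable")
```

Under that assumption the 4-disk option gives slightly more usable space, at the cost of one more bay or eSATA port.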

Related issues 5 (0 open, 5 closed)

Related to DuckCorp Infrastructure - Bug #104: High io wait on Elwing (Rejected, Marc Dequènes, 2010-06-17)
Related to DuckCorp Infrastructure - Enhancement #110: Disks on Elwing and Daneel are a bit hot (Resolved, Marc Dequènes, 2010-06-26)
Related to DuckCorp Infrastructure - Bug #102: Elwing is Borken (Resolved, Marc Dequènes, 2010-06-14)
Blocked by DuckCorp Infrastructure - Tracking #118: Command at Pearl Diffusion: Cables and tools (Resolved, Marc Dequènes, 2010-08-04)
Blocked by DuckCorp Infrastructure - Tracking #119: Command at Materiel.net: HD for NAS and HD cooling (Resolved, Marc Dequènes, 2010-08-04)

Actions #1

Updated by Marc Dequènes over 13 years ago

External rack mount ideas (even if the exact feature and price may not suit us well):
Actions #2

Updated by Marc Dequènes over 13 years ago

  • Priority changed from Normal to High
Actions #3

Updated by Marc Dequènes over 13 years ago

Idea: perhaps use a rack extension like one of:

and then plug it into an old spare case with a 300W PSU (probably Enermax). With a hack to start the PSU, it could be a cheap solution.

Actions #4

Updated by Marc Dequènes over 13 years ago

The plan:
  • buy 4 * Hitachi Deskstar 7K2000
  • buy the cables needed for eSATA<->SATA conversion and for plugging things into the external box
  • buy what is necessary for #110
  • perhaps buy one rack for external cooling
Actions #5

Updated by Marc Dequènes over 13 years ago

Notes for RAID5. The following seem to give the best performance:
  • chunk size 256KB
  • block size 4kB

stride = chunk / block = 256kB / 4kB = 64 (a count of filesystem blocks, not kB)
stripe-width = stride * ( (n disks in raid5) - 1 ) = 64 * 3 = 192 (also in blocks)

RAID creation: mdadm --create /dev/md0 --verbose --chunk=256 --level=5 --raid-devices=4 /dev/sda /dev/sdb /dev/sdc /dev/sdd
fs creation: mkfs.ext3 -v -m .1 -b 4096 -E stride=64,stripe-width=192 /dev/md0
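
As a cross-check of the numbers passed to mkfs, here is a small sketch that derives both values from the chunk size, the block size and the number of disks. Both results are counts of filesystem blocks. The function name and its parameters are illustrative, not part of any tool.

```python
def ext_raid_geometry(chunk_kib: int, block_kib: int, raid_disks: int,
                      parity_disks: int = 1) -> tuple[int, int]:
    """Return (stride, stripe_width), both expressed in filesystem blocks."""
    if chunk_kib % block_kib != 0:
        raise ValueError("chunk size must be a multiple of the block size")
    stride = chunk_kib // block_kib                      # blocks per chunk
    stripe_width = stride * (raid_disks - parity_disks)  # blocks per full data stripe
    return stride, stripe_width


# Values from the notes above: 256KB chunk, 4kB block, 4 disks in RAID 5.
stride, stripe_width = ext_raid_geometry(chunk_kib=256, block_kib=4, raid_disks=4)
print(f"-E stride={stride},stripe-width={stripe_width}")
```

With these inputs the sketch prints the same `-E` values as the mkfs command above.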

Should we stay on ext3? ext4 seems quite stable on Orfeo, and XFS seems to be OK again. Not sure yet...

Doc:
Actions #6

Updated by Marc Dequènes over 13 years ago

As for the bootloader, GRUB2 should work well, but take care of the bios_grub partition: http://grub.enbug.org/BIOS_Boot_Partition

We need to find a solution for:
  • a mirrored loader, to be able to boot a degraded but not broken RAID 5 (see this page for RAID1 as an example)
  • mirrored bios_grub partition (no idea)
Doc:
Actions #7

Updated by Marc Dequènes over 13 years ago

  • % Done changed from 10 to 20

The hardware is installed, and an image of the Squeeze alpha1 d-i is ready.

Everything seems fine, except that only 3 of the 4 disks were detected. After lots of tests to rule out components, it seems one of the eSATA cables makes a bad contact, so the 4th disk only shows up sometimes. I need to replace it and retest the stability.

Actions #8

Updated by Marc Dequènes over 13 years ago

  • % Done changed from 20 to 30

I bought new eSATA cables and everything is now alright. I was able (with some difficulties with parted and GRUB2) to install a system with /boot in RAID1 on all disks, and a big LVM space in RAID5 on all disks in the remaining space. The RAID5 is building and the system seems to work fine. I'll do a few other tests tomorrow.

Actions #9

Updated by Marc Dequènes over 13 years ago

I wonder if this new NAS should be merged with Daneel. Having more backup space is great, but it defeats the concept of backup for the important data on the NAS, and backup runs would probably hurt NAS performance (and not all backups occur while we are sleeping). Perhaps optimizing the backup on Toushirou to exclude not-so-important things could let Daneel stay as it is, and be reinstalled later on 2TB disks?

Actions #10

Updated by Marc Dequènes over 13 years ago

I read LVM is now able to align containers on RAID properly, but I don't know how to check it.
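
One possible way to check, as a sketch only: `pvs -o +pe_start` shows the offset at which the first physical extent starts on each PV, and that offset can be compared with the RAID geometry. The helper below is illustrative, and the 1024 KiB offset used in the example is a made-up value, not a measurement from Daneel.

```python
def alignment_report(pe_start_kib: int, chunk_kib: int, data_disks: int) -> dict:
    """Compare the PV data offset with the RAID chunk and full-stripe sizes."""
    full_stripe_kib = chunk_kib * data_disks  # data written per full stripe
    return {
        "chunk_aligned": pe_start_kib % chunk_kib == 0,
        "stripe_aligned": pe_start_kib % full_stripe_kib == 0,
    }


# Hypothetical pe_start of 1024 KiB on a 4-disk RAID 5 (3 data disks) with 256 KiB chunks.
report = alignment_report(pe_start_kib=1024, chunk_kib=256, data_disks=3)
print(report)
```

In this made-up case the data area starts on a chunk boundary but not on a full-stripe boundary, which is the kind of partial alignment worth looking for.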

According to my tests, ext4 is able to calculate stride and stripe-width automatically.

Unfortunately, it seems there is a bug in PV size autodetection, and only 5.46TB out of 6TB are assigned. That's quite a lot of space lost, so I'm investigating. mdadm --detail /dev/md1 gives:

     Array Size : 5860071168 (5588.60 GiB 6000.71 GB)

but pvs gives:
  PV         VG          Fmt  Attr PSize PFree  
  /dev/md1   Daneel_main lvm2 a-   5.46t 449.10g

I reported the problem: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=592729
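
A quick unit check on the two figures, as a sketch: it assumes that the mdadm "Array Size" is in KiB (consistent with the GiB and GB values it prints alongside) and that pvs uses binary units by default, so that "5.46t" means TiB. If both assumptions hold, the two tools may be describing the same size in different units; this has not been checked against the bug report.

```python
array_size_kib = 5860071168  # "Array Size" from mdadm --detail (assumed to be KiB)

size_bytes = array_size_kib * 1024
size_tb = size_bytes / 1000**4   # decimal terabytes, as printed on disk labels
size_tib = size_bytes / 1024**4  # binary tebibytes (assumed unit of pvs "t")

print(f"{size_tb:.2f} TB (decimal) = {size_tib:.2f} TiB (binary)")
```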

Actions #11

Updated by Marc Dequènes over 13 years ago

Stats (bonnie++):

Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
Daneel           4G   370  98 150080  28 96812  20  1655  98 358202  35 628.0  11
Latency             22100us    1002ms    1124ms   12510us   37295us   76125us
Version  1.96       ------Sequential Create------ --------Random Create--------
Daneel              -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
files:max            /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
       128:131072:0   877  13   112   1 11067  47   565   8   106   1  5034  23
Latency              5566ms    2576ms     824ms   67313ms    3054ms    1264ms
1.96,1.96,Daneel,1,1281615311,4G,,370,98,150080,28,96812,20,1655,98,358202,35,628.0,11,128,131072,,,,877,13,112,1,11067,47,565,8,106,1,5034,23,22100us,1002ms,1124ms,12510us,37295us,76125us,5566ms,2576ms,824ms,67313ms,3054ms,1264ms
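
For easier reading, here is a sketch that pulls the three block-level throughput figures out of the CSV line and converts them to MiB/s. The field positions were found by matching values against the table above, not taken from the bonnie++ documentation, and "K" is treated as 1024 bytes, which is an assumption. Only the sequential part of the CSV line is reproduced.

```python
# Start of the CSV line above (sequential I/O fields only).
csv_line = ("1.96,1.96,Daneel,1,1281615311,4G,,370,98,150080,28,"
            "96812,20,1655,98,358202,35,628.0,11")
fields = csv_line.split(",")

# Positions matched by value against the human-readable table (K/sec).
block_rates_kps = {
    "block write": int(fields[9]),
    "rewrite": int(fields[11]),
    "block read": int(fields[15]),
}

for name, kps in block_rates_kps.items():
    print(f"{name}: {kps / 1024:.1f} MiB/s")
```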

Actions #12

Updated by Marc Dequènes over 13 years ago

Using the mkfs largefile option may be interesting for the partition dedicated to stuffZ.

Actions #13

Updated by Marc Dequènes over 13 years ago

It seems the default mdadm chunk size is now 512KB, and the LVM VG extent size matches, so I guess special options are no longer needed. Nevertheless, I'm not sure about the exact alignments.
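
A small sketch of what "matches" can mean here, assuming the LVM default extent size of 4 MiB (the actual value on Daneel is not given in this ticket) and a 4-disk RAID 5, hence 3 data chunks per stripe. It only checks divisibility, not where the first extent starts.

```python
chunk_kib = 512          # mdadm default chunk size mentioned above
data_disks = 3           # 4-disk RAID 5: one chunk per stripe holds parity
extent_kib = 4 * 1024    # assumed LVM extent size (4 MiB default)

full_stripe_kib = chunk_kib * data_disks

extent_is_chunk_multiple = extent_kib % chunk_kib == 0
extent_is_stripe_multiple = extent_kib % full_stripe_kib == 0

print(f"extent is a multiple of the chunk size: {extent_is_chunk_multiple}")
print(f"extent is a multiple of the full stripe: {extent_is_stripe_multiple}")
```

Under these assumptions, extents line up with chunk boundaries but not with full stripes, which fits the remaining doubt about exact alignment.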

Actions #14

Updated by Marc Dequènes over 13 years ago

  • % Done changed from 30 to 80

I can confirm the chunk size and LVM extent size. I decided to forget about alignments (too complex), and as I'm using default chunk/extent/... sizes, most things should match.

RAID 5 tests were done, and the wiki was updated with tips on the repair process. The layout is: a cross double RAID 1 for root + sys, RAID 5 for data only, and biosgrub partitions on the 2 disks with root.

So, Elwing-NG is on the way (progress can be seen in #102). Daneel is still asleep during the process.

Actions #15

Updated by Marc Dequènes over 13 years ago

  • % Done changed from 80 to 90

Elwing is OK. Latest checks are on the way.

Daneel has been rebuilt successfully, but had trouble during migration to ext4 (see #128). I guess GRUB2 is not happy with the separate /boot, and the RAID 1 is now split. I decided to rebuild the system on the second disk with GPT+biosgrub, better LVM organization and partition sizes, including ext4 and largefile4 for /backup.

Actions #16

Updated by Marc Dequènes over 13 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 90 to 100

Everything's working well now.

Actions #17

Updated by Marc Dequènes over 13 years ago

  • Category changed from System :: Base to System :: Hardware