Enhancement #537

Toushirou would like a brand new body

Added by Marc Dequènes over 2 years ago. Updated 6 months ago.

Status: Resolved
Priority: Normal
Assignee: DC Admins
Category: System :: Hardware
Start date: 2018-03-10
Due date:
% Done: 100%
Estimated time: (Total: 0:00 h)
Patch Available:
Confirmed: No
Branch: Toushirou-NG
Entity: DuckCorp
Security:
Help Needed: Yes

Description

The hardware is quite old (2007-07), and with the number of services it hosts, many are getting slow (even if some should be better coded…). The storage is also slow, mostly due to a budget choice (RAID 5); for example, when MariaDB is under load, high I/O drags the whole machine down. We still have enough storage capacity for some time, and a cleanup would help, but usage is clearly increasing with the success of StuffCloud and its many photos and other media content.

As for needs, I would envision new services such as MDA replication to improve the availability of this important service, but we would need to replicate the whole filter stack, which takes quite a lot of resources. PG could also be replicated, and there are other reliability improvements to consider. We may also want to split things across several VMs, for instance to better separate user and system services, and maybe provide raw VMs (but we may leave this to Hivane, which has more resources).

First, let's agree on the topic: should we invest in a new machine?
If the result is YES, then we can draft and discuss the specs and feasibility (budget, hosting…).


Files

2017_06_01 inventaire.ods (14.8 KB) - List of servers - Pierre-Louis Bonicoli, 2017-06-01 16:58

Subtasks

Enhancement #612: new Toushirou: check disks (Resolved, Pierre-Louis Bonicoli)
Enhancement #614: new Toushirou: Install system disks (Resolved, Pierre-Louis Bonicoli)
Enhancement #616: Configure /etc/udev/rules.d/70-persistent-net.rules (Resolved, Marc Dequènes)
Review #632: dropbear in initramfs: ansibilize (Resolved, Pierre-Louis Bonicoli)

Related issues

Has duplicate DuckCorp Infrastructure - Enhancement #615: new Toushirou: configuration migration (Rejected, 2018-04-23)
Blocks DuckCorp Infrastructure - Enhancement #576: Experiment with webphone solutions (Resolved, 2017-07-20)
Precedes DuckCorp Infrastructure - Enhancement #652: Orfeo would like a brand new body (New, 2018-03-12, 2018-03-12)

History

#1

Updated by Pierre-Louis Bonicoli over 2 years ago

Marc Dequènes wrote:

First, let's agree on the topic: should we invest in a new machine?
If the result is YES, then we can draft and discuss the specs and feasibility (budget, hosting…).

Yes.

#2

Updated by Pierre-Louis Bonicoli over 2 years ago

Misc might be able to fetch some servers; he needs to check first.

#3

Updated by Pierre-Louis Bonicoli over 2 years ago

List of servers attached.

#4

Updated by Marc Dequènes over 2 years ago

  • Status changed from New to In Progress

So, I'm not sure if we can/want to use VMs on this machine. Containers seem a better path to use fewer resources, but for security reasons we would probably not do that just yet. Migrating services into containers later should not be a big deal. If we do not have a very powerful machine, the number of VMs it could host would be very low, so I'm not sure it is worth it. What do you think?

As for the basics I have in mind (not ideal, but they should do fine):
  • form factor: it is safe to stay on the 1U format to be sure the current housing will not be impacted, but we may wish to gamble on having more
  • CPU: I think we need at least twice what we have (if the model can still be found we may invest in filling out the slots for more power)
  • RAM: I think we need at least twice what we have (but we can add some ourselves to complement)
  • storage:
    • SCSI is a no go: the technology is too old and difficult to obtain
    • SAS is nice and powerful but finding drives may be more complicated and they are clearly much more expensive
    • SATA is cheaper and works well enough, I think this is a safe choice
    • I would avoid proprietary and often not real hardware RAID; also RAID 5 is such a pain in the ass for write performance that it is a no go, RAID 10 is fine and we could use software RAID

So at first glance I think this model could be a good fit: ProLiant DL360p G8, the one with 8 × 1 TB SATA drives.
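To make the RAID 5 versus RAID 10 trade-off above concrete, here is a rough Python sketch for an eight-drive, 1 TB-per-drive layout. The function and class names are made up for illustration, and the write cost figures are the usual rule of thumb for small random writes (4 disk I/Os on RAID 5, 2 on RAID 10), not measurements from Toushirou.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RaidEstimate:
    level: str
    usable_tb: float          # capacity left after redundancy
    ios_per_small_write: int  # rule-of-thumb disk I/Os per small random write


def estimate(level: str, drives: int, drive_tb: float) -> RaidEstimate:
    """Back-of-the-envelope figures for one array (hypothetical helper)."""
    if level == "raid5":
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        # One drive's worth of capacity goes to parity; a small write costs
        # read data + read parity + write data + write parity.
        return RaidEstimate(level, (drives - 1) * drive_tb, 4)
    if level == "raid10":
        if drives < 4 or drives % 2:
            raise ValueError("RAID 10 needs an even number of drives, 4 or more")
        # Half the drives mirror the other half; a write goes to both mirrors.
        return RaidEstimate(level, drives / 2 * drive_tb, 2)
    raise ValueError(f"unsupported level: {level}")


if __name__ == "__main__":
    for level in ("raid5", "raid10"):
        print(estimate(level, drives=8, drive_tb=1.0))
```

With eight 1 TB drives this gives 7 TB usable on RAID 5 against 4 TB on RAID 10, so the better write behaviour of RAID 10 is paid for in capacity.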

IIUC the storage backplane can be changed but you cannot mix SAS and SATA, so we cannot choose a SAS-equipped model and later buy SATA replacements to save on cost.

What do you think?

#5

Updated by Marc Dequènes over 2 years ago

So, IIUC the specs, the backplane does support either SAS or SATA, and nothing needs to be changed in this area to switch from one to the other. I guess the drive case would have to be changed though. Maybe we should ask someone with more experience just to be sure :-).

What I mean here is that if it's possible to mix easily, then we can choose whatever model better fits us or is more powerful.

#6

Updated by Marc Dequènes over 2 years ago

  • Subject changed from Toushirou would like a brand new body to Toushirou and Orfeo would like a brand new body

So Orfeo is jealous and would like a new body too.

#7

Updated by Marc Dequènes over 2 years ago

Pilou suggested a second G8 for Orfeo, or the most powerful G7. We just need to check the CPU and RAM to meet the criteria above.

#8

Updated by Marc Dequènes over 2 years ago

Pilou suggested also trying to house the small Dell (I guess it is a PowerEdge R210 II) for supervision and other small things. It's a nice idea: we could have a bastion and run checks like PKI cert obsolescence on it. The only difficult part is the housing but this can be dealt with later. The small size and power consumption should make this easier.
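As an illustration of the kind of check such a supervision box could run, here is a minimal Python sketch of a certificate expiry test. It only uses the standard library; the function names and the 30-day warning threshold are assumptions for the example, not an existing DuckCorp check.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate's notAfter date.

    `not_after` uses the text format found in certificates,
    e.g. "Jun  1 00:00:00 2019 GMT".
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def needs_renewal(not_after: str, now: datetime, warn_days: int = 30) -> bool:
    """True when the certificate expires within `warn_days` (assumed threshold)."""
    return days_until_expiry(not_after, now) <= warn_days
```

For a live service, the `notAfter` string could be taken from the dictionary returned by `ssl.SSLSocket.getpeercert()`; the sketch leaves the network part out so it can be tried offline.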

#9

Updated by Marc Dequènes about 2 years ago

  • Priority changed from High to Urgent

It was not possible to acquire new servers via Misc.

#10

Updated by Marc Dequènes about 2 years ago

  • Help Needed set to Yes

#11

Updated by Marc Dequènes almost 2 years ago

In the they are selling them. Unfortunately the most powerful vanished already. We asked for a HP 380 G7 and it has been put aside for us. This will improve Toushirou a bit to last some more time.

#12

Updated by Marc Dequènes over 1 year ago

As discussed on IRC during and after Orfeo's crash, I'm exploring moving Orfeo into an LXD container on Elwing. Extra RAM is coming soon. The machine's CPU should be fine and there is more available space than in the past. I'm looking into LXD parameters to control resource consumption. I'm also thinking about the network configuration and looking for a new ISP with better quality (see #550) and fixed IPs. If this proves OK, I'll open tickets for the migration bits.
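As a sketch of what controlling resource consumption could look like, the Python snippet below builds the `lxc config set` commands for LXD's `limits.cpu` and `limits.memory` keys. The instance name `orfeo` and the values (2 CPUs, 4GiB) are placeholders for illustration only, not the settings that were actually chosen; the commands are printed rather than executed.

```python
import shlex
from typing import List


def lxd_limit_commands(instance: str, cpus: int, memory: str) -> List[List[str]]:
    """Build (but do not run) the commands that cap CPU and RAM for an instance."""
    return [
        ["lxc", "config", "set", instance, "limits.cpu", str(cpus)],
        ["lxc", "config", "set", instance, "limits.memory", memory],
    ]


if __name__ == "__main__":
    # Hypothetical instance name and limits, to be adapted to the real host.
    for cmd in lxd_limit_commands("orfeo", cpus=2, memory="4GiB"):
        print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the commands as argument lists makes it easy to review them first and pass them to `subprocess.run` later if they look right.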

#13

Updated by Marc Dequènes over 1 year ago

#14

Updated by Marc Dequènes 12 months ago

  • Status changed from In Progress to Resolved

#15

Updated by Marc Dequènes 12 months ago

  • Status changed from Resolved to In Progress

#16

Updated by Marc Dequènes 12 months ago

  • Blocked by Bug #639: dc-ldap role: fails to initialize the MP database during initial installation added

#17

Updated by Marc Dequènes 12 months ago

  • Blocked by deleted (Bug #639: dc-ldap role: fails to initialize the MP database during initial installation)

#18

Updated by Marc Dequènes 12 months ago

apt-listbugs is causing many problems because new bugs were discovered in the meantime. The information is nice to have, but it would be good to be able to override it with some kind of switch.

#19

Updated by Marc Dequènes 12 months ago

#20

Updated by Marc Dequènes 7 months ago

So Toushirou-NG is ready. I tested a few things and it worked well.

The sync script as well as the procedure need to be reviewed, so I pinged Pilou.

Of course the recent deployments (like #645) need to be applied as well; I'll do it soon. The network config needs to be prepared too (#616), also soon.

#21

Updated by Marc Dequènes 6 months ago

  • Branch set to Toushirou-NG

We're ready for the migration (scheduled 2019-05-04). The Toushirou-NG branch contains the new configuration.

#22

Updated by Marc Dequènes 6 months ago

#23

Updated by Marc Dequènes 6 months ago

  • Subject changed from Toushirou and Orfeo would like a brand new body to Toushirou would like a brand new body
  • Status changed from In Progress to Resolved

The migration went fine and all remaining deployment problems were solved.

I opened #652 in order to discuss the same topic for Orfeo.
