Good morning everyone,

I have set up a cluster with heartbeat and drbd; the configuration files are posted below.

drbd.conf

global { usage-count no; }
common { syncer { rate 100M; } }
resource r0 {
  protocol C;
  handlers {
    pri-on-incon-degr "echo o > /proc/sysrq-trigger ; /etc/init.d/heartbeat stop";
    pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
    local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
    outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater -t 5";
    pri-lost "echo pri-lost. Have a look at the log files. | mail -s 'DRBD Alert' root";
    split-brain "echo split-brain. drbdadm -- --discard-my-data connect $DRBD_RESOURCE ? | mail -s 'DRBD Alert' root";
  }
  startup {
    wfc-timeout 60;
    degr-wfc-timeout 120; # 2 minutes.
  }
  disk { on-io-error detach; }
  net {
    rr-conflict disconnect;
    after-sb-0pri discard-younger-primary;
    after-sb-1pri consensus;
    after-sb-2pri disconnect;
  }
  syncer { rate 100M; al-extents 257; }
  ## cache
  on alessandra {
    device /dev/drbd0;
    disk /dev/hda10;
    address 192.168.1.245:7788;
    meta-disk internal;
  }
  on lalla {
    device /dev/drbd0;
    disk /dev/hda9;
    address 192.168.1.15:7788;
    meta-disk internal;
  }
}
resource "r1" {
  protocol C;
  startup {
    wfc-timeout 60; ## Infinite!
    degr-wfc-timeout 120; ## 2 minutes.
  }
  disk { on-io-error detach; }
  net { }
  syncer { rate 100M; }
  on alessandra {
    device /dev/drbd1;
    disk /dev/hda11;
    address 192.168.1.245:7789;
    meta-disk internal;
  }
  on lalla {
    device /dev/drbd1;
    disk /dev/hda10;
    address 192.168.1.15:7789;
    meta-disk internal;
  }
}

ha.cf

debugfile /var/log/ha-debug
logfile /var/log/ha-log
logfacility local0
keepalive 2
deadtime 30
initdead 120
udpport 694
ucast eth1 192.168.2.246
auto_failback on
node alessandra
node lalla

authkeys

auth 2
2 sha1 unabellapasswordlungaecomplicata

haresources

alessandra IPaddr::192.168.2.247/24/eth1 \
  drbddisk::r0 Filesystem::/dev/drbd0::/cache::ext3::defaults \
  drbddisk::r1 Filesystem::/dev/drbd1::/jumper::ext3::defaults \
  bind9 samba winbind bla bla bla \
  MailTo::jclark@xxxxxxxxxx::Transizione_alessandra_lalla

These files are identical on both machines except for the ucast parameter, which on alessandra is of course the eth1 IP of lalla, and on lalla that of alessandra. The drbdX devices are listed in /etc/fstab with the noauto option.

If I physically power off the master (alessandra) or stop heartbeat by hand, lalla immediately takes over and grabs everything without a word of complaint, and drbd switches roles without any problem. The same holds on reboot or when heartbeat is restarted.

However... if I simulate a network failure (i.e. a switch getting fried) by unplugging the eth1 network card on alessandra, this is what shows up in the logs on lalla:

lalla:/#
May 30 09:32:16 lalla heartbeat: [3018]: WARN: node alessandra: is dead
May 30 09:32:16 lalla heartbeat: [3018]: WARN: No STONITH device configured.
May 30 09:32:16 lalla heartbeat: [3018]: WARN: Shared disks are not protected.
May 30 09:32:16 lalla heartbeat: [3018]: info: Resources being acquired from alessandra.
May 30 09:32:16 lalla heartbeat: [3018]: info: Link alessandra:eth1 dead.
May 30 09:32:16 lalla heartbeat: [8063]: debug: notify_world: setting SIGCHLD Handler to SIG_DFL
May 30 09:32:16 lalla harc[8063]: info: Running /etc/ha.d/rc.d/status status
May 30 09:32:16 lalla heartbeat: [8064]: info: No local resources [/usr/share/heartbeat/ResourceManager listkeys lalla] to acquire.
May 30 09:32:16 lalla heartbeat: [3018]: debug: StartNextRemoteRscReq(): child count 1
May 30 09:32:16 lalla mach_down[8092]: info: Taking over resource group IPaddr::192.168.2.247/24/eth1
May 30 09:32:16 lalla ResourceManager[8118]: info: Acquiring resource group: alessandra IPaddr::192.168.2.247/24/eth1 drbddisk::r0 Filesystem::/dev/drbd0::/cache::ext3::defaults drbddisk::r1 Filesystem::/dev/drbd1::/jumper::ext3::defaults bind9 MailTo::mario.guenzi@xxxxxxxxxx::Transizione_lalla_alessandra
May 30 09:32:16 lalla IPaddr[8145]: INFO: Resource is stopped
May 30 09:32:16 lalla ResourceManager[8118]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.2.247/24/eth1 start
May 30 09:32:16 lalla ResourceManager[8118]: debug: Starting /etc/ha.d/resource.d/IPaddr 192.168.2.247/24/eth1 start
May 30 09:32:17 lalla IPaddr[8243]: INFO: Using calculated netmask for 192.168.2.247: 255.255.255.0
May 30 09:32:17 lalla IPaddr[8243]: DEBUG: Using calculated broadcast for 192.168.2.247: 192.168.2.255
May 30 09:32:17 lalla IPaddr[8243]: INFO: eval ifconfig eth1:0 192.168.2.247 netmask 255.255.255.0 broadcast 192.168.2.255
May 30 09:32:17 lalla IPaddr[8243]: DEBUG: Sending Gratuitous Arp for 192.168.2.247 on eth1:0 [eth1]
May 30 09:32:17 lalla IPaddr[8214]: INFO: Success
May 30 09:32:17 lalla ResourceManager[8118]: debug: /etc/ha.d/resource.d/IPaddr 192.168.2.247/24/eth1 start done. RC=0
May 30 09:32:17 lalla ResourceManager[8118]: info: Running /etc/ha.d/resource.d/drbddisk r0 start
May 30 09:32:17 lalla ResourceManager[8118]: debug: Starting /etc/ha.d/resource.d/drbddisk r0 start
May 30 09:32:29 lalla ResourceManager[8118]: debug: /etc/ha.d/resource.d/drbddisk r0 start done. RC=1
May 30 09:32:29 lalla ResourceManager[8118]: ERROR: Return code 1 from /etc/ha.d/resource.d/drbddisk
May 30 09:32:29 lalla ResourceManager[8118]: CRIT: Giving up resources due to failure of drbddisk::r0
May 30 09:32:29 lalla ResourceManager[8118]: info: Releasing resource group: alessandra IPaddr::192.168.2.247/24/eth1 drbddisk::r0 Filesystem::/dev/drbd0::/cache::ext3::defaults drbddisk::r1 Filesystem::/dev/drbd1::/jumper::ext3::defaults bind9 MailTo::mario.guenzi@xxxxxxxxxx::Transizione_lalla_alessandra
May 30 09:32:30 lalla ResourceManager[8118]: info: Running /etc/ha.d/resource.d/MailTo mario.guenzi@xxxxxxxxxx Transizione_lalla_alessandra stop
May 30 09:32:30 lalla ResourceManager[8118]: debug: Starting /etc/ha.d/resource.d/MailTo mario.guenzi@xxxxxxxxxx Transizione_lalla_alessandra stop
May 30 09:32:30 lalla MailTo[8431]: INFO: Success
May 30 09:32:30 lalla ResourceManager[8118]: debug: /etc/ha.d/resource.d/MailTo mario.guenzi@xxxxxxxxxx Transizione_lalla_alessandra stop done. RC=0
May 30 09:32:30 lalla ResourceManager[8118]: info: Running /etc/init.d/bind9 stop
May 30 09:32:30 lalla ResourceManager[8118]: debug: Starting /etc/init.d/bind9 stop
May 30 09:32:30 lalla ResourceManager[8118]: debug: /etc/init.d/bind9 stop done. RC=0
May 30 09:32:30 lalla ResourceManager[8118]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /jumper ext3 defaults stop
May 30 09:32:30 lalla ResourceManager[8118]: debug: Starting /etc/ha.d/resource.d/Filesystem /dev/drbd1 /jumper ext3 defaults stop
May 30 09:32:30 lalla Filesystem[8531]: INFO: Running stop for /dev/drbd1 on /jumper
May 30 09:32:31 lalla Filesystem[8520]: INFO: Success
May 30 09:32:31 lalla ResourceManager[8118]: debug: /etc/ha.d/resource.d/Filesystem /dev/drbd1 /jumper ext3 defaults stop done. RC=0
May 30 09:32:31 lalla ResourceManager[8118]: info: Running /etc/ha.d/resource.d/drbddisk r1 stop
May 30 09:32:31 lalla ResourceManager[8118]: debug: Starting /etc/ha.d/resource.d/drbddisk r1 stop
May 30 09:32:31 lalla ResourceManager[8118]: debug: /etc/ha.d/resource.d/drbddisk r1 stop done. RC=0
May 30 09:32:31 lalla ResourceManager[8118]: info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /cache ext3 defaults stop
May 30 09:32:31 lalla ResourceManager[8118]: debug: Starting /etc/ha.d/resource.d/Filesystem /dev/drbd0 /cache ext3 defaults stop
May 30 09:32:31 lalla Filesystem[8641]: INFO: Running stop for /dev/drbd0 on /cache
May 30 09:32:31 lalla Filesystem[8630]: INFO: Success
May 30 09:32:31 lalla ResourceManager[8118]: debug: /etc/ha.d/resource.d/Filesystem /dev/drbd0 /cache ext3 defaults stop done. RC=0
May 30 09:32:31 lalla ResourceManager[8118]: info: Running /etc/ha.d/resource.d/drbddisk r0 stop
May 30 09:32:31 lalla ResourceManager[8118]: debug: Starting /etc/ha.d/resource.d/drbddisk r0 stop
May 30 09:32:31 lalla ResourceManager[8118]: debug: /etc/ha.d/resource.d/drbddisk r0 stop done. RC=0
May 30 09:32:31 lalla ResourceManager[8118]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.2.247/24/eth1 stop
May 30 09:32:31 lalla ResourceManager[8118]: debug: Starting /etc/ha.d/resource.d/IPaddr 192.168.2.247/24/eth1 stop
May 30 09:32:32 lalla IPaddr[8769]: INFO: ifconfig eth1:0 down
May 30 09:32:32 lalla IPaddr[8740]: INFO: Success
May 30 09:32:32 lalla ResourceManager[8118]: debug: /etc/ha.d/resource.d/IPaddr 192.168.2.247/24/eth1 stop done. RC=0
May 30 09:32:32 lalla mach_down[8092]: info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired
May 30 09:32:32 lalla mach_down[8092]: info: mach_down takeover complete for node alessandra.
May 30 09:32:32 lalla heartbeat: [3018]: info: mach_down takeover complete.
May 30 09:33:02 lalla hb_standby[8823]: Going standby [foreign].
May 30 09:33:02 lalla heartbeat: [3018]: info: lalla wants to go standby [foreign]
May 30 09:33:13 lalla heartbeat: [3018]: WARN: No reply to standby request. Standby request cancelled.

and of course nothing comes up, neither the IP nor the services.

Since drbddisk returns error 1, I tried searching on Google, but either I am a complete fool and understand nothing, or nobody knows what this error 1 is; it is described only as a generic error. Regardless of how it is defined, though, I would like to understand what I am doing wrong.

As additional information:

distribution: etch with backport packages
drbd 8.0.11, compiled statically into the kernel
kernel 2.6.25
heartbeat 2.1.3

Any ideas? Thanks in advance, and sorry for the cross-posting.

-- 
Mario Vittorio Guenzi
http://clark.tipistrani.it
Si vis pacem para bellum