Saturday, June 9, 2007

Oracle and Amanda Both Like to Mess With People

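This Monday I really got hit. I woke up coughing nonstop, with a bit of phlegm, so naturally I slept in a while longer and then went to see a doctor to be safe. I got to the office around three, and a colleague told me we needed to load last Friday's backup, because the Oracle data seemed to have a problem. That gave me a jolt right away, and the reason is simple: I had been running backups for a long time (over a year) and had never once done a restore, so of course I couldn't remember how. Still, it had to be done sooner or later, so without another word I started working out how.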

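My company uses Amanda for backups. The first thing I tried was “amtape show”, to see which slot held the tape with last Friday's data. It turned out to be the incremental on Tape 3. So I typed “amtape slot 3” to switch to Tape 3 and then ran “amrecover DailyBack”. Inside the amrecover console, I still remembered that the three commands sethost, setdisk and setdate identify which backup to restore from: sethost points at the DB server, setdisk points at the DB partition, and setdate points at the day you want to restore.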

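After typing those three, I realised I had been dumb earlier: amrecover is smart enough to switch tapes for me by itself. Thinking about it a bit more, of course it should be; if it couldn't even do that, what would be the point of using it? Next I used “cd” to navigate to the target I wanted to restore, then “add” to put the target on the extraction list, and finally “extract” to execute the restore. And this is what the amrecover console said:-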

EOF, check amidxtaped..debug file on localhost.
amrecover: short block 0 bytes
UNKNOWN file
amrecover: Can't read file header
extract_list - child returned non-zero status: 1
Continue [?/Y/n/r]? Y

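Dude, what's going on? When you don't know what to do, you Google it. One search turned up a hit in Amanda's FAQ-O-Matic, and I was dizzy with joy, thinking my bad luck was finally over. Then I read it and got dizzy for real: two people had reported the same problem, and nobody had an answer.... Shit!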

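Sigh! At least they gave me a hint: go and look at the messages in /var/log/amanda/amidx.*. One look and my eyes nearly popped out. It said the data on the tape was from June 4??? That's Monday, the very day I was typing the commands! Of course I didn't believe it; how could that make any sense? So I ran “amadmin find myhost myfile” to see what was what. Hey! Tape 7 is clearly the one for Monday, are you kidding me! I'm the type who flatly refuses to accept this kind of thing.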

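Besides being free, the reason I use Amanda is that it does its backups with GNU tar or ufsdump. The upside is that if something happens to the OS, I don't have to panic about how to restore. tar and ufsdump are programs bundled with the OS, so as long as there is an OS I can restore right away. Even with no OS at all, Knoppix has me covered: the CD boots a whole OS, and GNU tar is there as soon as it boots.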

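I went to the manual on Amanda.org to learn how to do a manual restore. I read it over and over and still didn't get it, but I've always believed that “nothing in the world is hard if you're willing to try”! After some trial and error I found that the concept is quite simple: each day's backup is stored as a file on the tape, and the first 32 KB of that file is a file header recording which backup it is, which host it came from, and which disk.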
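To make the "32 KB header, then the data" layout concrete, here is a minimal sketch that works on an ordinary file instead of a tape. Everything in it is a stand-in: the header line, the host name dbhost, the disk /u01 and the file names are made up for illustration, and the helper names read_header and list_payload are hypothetical, not Amanda commands.

```shell
#!/bin/bash
# Sketch: one 32 KB header block followed by a tar payload, simulated on disk.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# Stand-in payload: a small tar archive (file names are made up).
mkdir -p data
echo "hello" > data/example.dbf
tar -cf payload.tar data

# Stand-in header: one text line padded with NULs to exactly 32 KB.
# The header text is illustrative, not copied from a real tape.
printf 'AMANDA: FILE 20070601 dbhost /u01 lev 1 comp N program /bin/tar\n' > header.txt
dd if=header.txt of=header.blk bs=32k conv=sync 2>/dev/null   # conv=sync pads the block

cat header.blk payload.tar > dumpfile

# Read only the header: first 32 KB block, first line.
read_header() { dd if="$1" bs=32k count=1 2>/dev/null | head -n 1; }

# List the payload: skip the 32 KB header and hand the rest to tar.
list_payload() { dd if="$1" bs=32k skip=1 2>/dev/null | tar -tf -; }

read_header dumpfile
list_payload dumpfile
```

On a real tape the same two dd calls would read from the tape device instead of dumpfile, which is why plain OS tools are enough to get the data back.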

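Given that, I first checked which file number my backup was. The amidx.* log said the backup I wanted was file 72, so I used the standard Linux command “mt -f /dev/tapedevice fsf 72” to skip to that file, then another standard Linux command to read the first 512 bytes and check that it was what I wanted. It wasn't; I should have guessed it wouldn't be that kind to me. I figured there might just be a few more or fewer files in front of it, so I only needed to see which file number amidx.* gave for the file I had actually landed on, and then add or subtract the difference from my target of 72.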

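That didn't work either, and even the sequence didn't match. I mean the files listed in amidx.* and the files actually on the tape were not the same at all; they had nothing to do with each other. I began to suspect the index was corrupt, most likely because a new cycle had started just last Sunday and Amanda had dropped the previous cycle's index.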

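Don't think I give up that easily. I pretended Amanda didn't exist and did a full manual restore, meaning no Amanda commands at all, only the commands that come with standard Linux. The first step was to make a listing of the tape, so I wrote a very simple script:-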

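#!/bin/bash
# List every file on the tape by saving the first 32 KB block of each one.
# /dev/tapedevice should be a non-rewinding tape device.
count=1
mt -f /dev/tapedevice rewind
# Keep skipping forward one file at a time; stop when mt reports an error
# (normally at the end of the recorded data).
while mt -f /dev/tapedevice fsf 1; do
    dd if=/dev/tapedevice bs=32k count=1 of="idx.$count" || break
    echo "$count" >> tape.index
    head -1 "idx.$count" >> tape.index
    count=$((count + 1))
done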
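The script appends pairs of lines to tape.index: a position, then the first header line found at that position. Finding a particular backup is then a text search. A minimal sketch, assuming the file strictly alternates position and header lines; the function name find_position and the sample entries are made up for illustration:

```shell
#!/bin/bash
# Sketch: look up tape positions in an index of alternating
# "position" / "header" lines.

# find_position INDEX_FILE TEXT
# Prints the position of every entry whose header line contains TEXT.
find_position() {
    awk -v pat="$2" '
        NR % 2 == 1 { pos = $0; next }      # odd lines: tape file position
        index($0, pat) > 0 { print pos }    # even lines: header text
    ' "$1"
}

# Made-up sample data, for illustration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
1
AMANDA: FILE 20070601 webhost /home lev 1 comp N program /bin/tar
2
AMANDA: FILE 20070601 dbhost /u01 lev 1 comp N program /bin/tar
EOF

find_position "$sample" "dbhost /u01"
```

If a header read ever came back empty, the pairs would drift out of step, so it is worth eyeballing the index before trusting a lookup like this.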

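When it finishes, the file list ends up in a file called tape.index. One look confirmed my theory that the index was corrupt, because the list was completely different from the content of amidx.*. In no time I found that my file was actually at position 78, so I used the following commands to read it back:-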

mt -f /dev/tapedevice rewind
mt -f /dev/tapedevice fsf 78
dd if=/dev/tapedevice bs=32k count=1

That confirmed the list was right, so I loaded the file onto the OS with the following command:-
dd if=/dev/tapedevice bs=32k of=oracle.restore.tar

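Then I ran “tar -xvf oracle.restore.tar app/oracle/oradata/SID” to restore the Oracle data files. Wow, no kidding: copying the file from tape down to the OS took half an hour, but restoring the Oracle data files from that disk file took well over an hour. How dumb of me to forget that tape is sequential, so it streams very fast, while disk is random access, so it ended up slower than the tape.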
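For anyone who wants to try that last step without a tape, here is a minimal sketch of pulling a single directory tree out of a tar archive. The archive is built on the spot, and the names users01.dbf and notes.txt are made up; only the app/oracle/oradata/SID path mirrors the one above.

```shell
#!/bin/bash
# Sketch: extract only one directory tree from a tar archive.
set -e
work=$(mktemp -d)
cd "$work"

# Build a small stand-in archive (contents are made up).
mkdir -p src/app/oracle/oradata/SID src/app/other
echo "datafile" > src/app/oracle/oradata/SID/users01.dbf
echo "unrelated" > src/app/other/notes.txt
tar -cf sample.tar -C src app

# List what the archive holds, then restore only the oradata/SID subtree.
tar -tf sample.tar
mkdir restore
tar -xf sample.tar -C restore app/oracle/oradata/SID

find restore -type f
```

Note that the archive name comes right after -f and the member path comes last; naming a directory extracts everything beneath it.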

After that it was Oracle's turn to mess with my colleague, but I'll save that for the next blog post.
