
NFS Performance Tuning

2010.10.15

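I recently ran into poor NFS performance that put heavy CPU and network load on both the server and the clients. A quick look with nfsstat -c showed an excessive number of getattr calls: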

link@w52:~$ nfsstat -c
Client rpc stats:
calls      retrans    authrefrsh
892716385   2          0      

Client nfs v3:
null            getattr         setattr         lookup          access          readlink
0         0%    3227823701 62%  802078     0%   67272335   1%   1849213494 35%  0         0%
read         write        create       mkdir        symlink      mknod       
1687732   0% 31008188  0% 754065    0% 61324     0% 0         0% 0         0%
remove       rmdir        rename       link         readdir      readdirplus 
243230    0% 0         0% 1524458   0% 0         0% 8934360   0% 105       0%
fsstat       fsinfo       pathconf     commit      
25911     0% 18        0% 9         0% 0         0%


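getattr accounts for more than 60% of all calls. Tuning the actimeo option of the NFS mount on the client side reduces this effectively. The mount originally used noac to turn off the attribute cache; I removed noac and set actimeo=120 instead.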
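To check the per-operation share without reading the columns by eye, the nfsstat -c output can be parsed with a few lines of Python. This is only a rough sketch: it assumes the Linux layout shown above (a row of operation names followed by a row of "count pct%" pairs), the whitespace in the sample is collapsed, and the names parse_nfsstat and shares are made up for this example.

```python
import re

# Sample copied from the nfsstat -c output above (whitespace collapsed).
SAMPLE = """\
Client nfs v3:
null getattr setattr lookup access readlink
0 0% 3227823701 62% 802078 0% 67272335 1% 1849213494 35% 0 0%
read write create mkdir symlink mknod
1687732 0% 31008188 0% 754065 0% 61324 0% 0 0% 0 0%
remove rmdir rename link readdir readdirplus
243230 0% 0 0% 1524458 0% 0 0% 8934360 0% 105 0%
fsstat fsinfo pathconf commit
25911 0% 18 0% 9 0% 0 0%
"""


def parse_nfsstat(text):
    """Return {operation: call count} from alternating header/value rows."""
    counts = {}
    lines = [ln for ln in text.splitlines() if ln.strip()]
    for header, values in zip(lines, lines[1:]):
        names = header.split()
        pairs = re.findall(r"(\d+)\s+(\d+)%", values)
        # A real header/value pair has one "count pct%" pair per column name.
        if pairs and len(names) == len(pairs):
            for name, (count, _pct) in zip(names, pairs):
                counts[name] = int(count)
    return counts


def shares(counts):
    """Return each operation's fraction of the total call count."""
    total = sum(counts.values())
    return {op: n / total for op, n in counts.items()} if total else {}


counts = parse_nfsstat(SAMPLE)
# Print the three busiest operations with their share of all calls.
for op, share in sorted(shares(counts).items(), key=lambda kv: -kv[1])[:3]:
    print(f"{op:10s} {share:6.1%}")
```

If getattr (or access) dominates the list, the attribute cache settings are the first thing worth looking at.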

ubuntu> umount /home/img

ubuntu> vi /etc/fstab
x.x.x.x:/vol/source   /home/img     nfs     rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=120,vers=3,timeo=600

ubuntu> mount /home/img

If many NFS clients frequently update the directories and files on the NFS share, set actimeo to a small value such as 1 or 3. If the data rarely changes, set actimeo=120 or higher.
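When the same change has to be applied to many clients, editing the option string by hand gets error-prone. A minimal sketch of the rewrite, assuming the options are a plain comma-separated string as in /etc/fstab; set_actimeo is a hypothetical helper, not part of any NFS tool:

```python
# Options that conflict with, or are overridden by, a new actimeo value.
CACHE_OPTIONS = {"noac", "actimeo", "acregmin", "acregmax", "acdirmin", "acdirmax"}


def set_actimeo(options, seconds):
    """Drop noac and older cache timeouts, then append actimeo=<seconds>."""
    kept = []
    for opt in options.split(","):
        name = opt.split("=", 1)[0]
        if name in CACHE_OPTIONS:
            continue
        kept.append(opt)
    kept.append(f"actimeo={seconds}")
    return ",".join(kept)


old = "rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,noac,vers=3,timeo=600"
print(set_actimeo(old, 120))  # rarely changing data
print(set_actimeo(old, 3))    # frequently updated share
```

The new options only take effect after the file system is unmounted and mounted again, as in the steps above.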



The graph below shows the NFS server CPU load dropping sharply after noac was removed and actimeo=120 was set:


Network traffic on both the client and the server also dropped:




noac

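Use this for critical workloads, such as an Oracle database, whose changes must be visible on the NFS server immediately. noac disables the NFS client's attribute cache, and on Linux it also forces application writes to be synchronous. Network and host CPU load will be much higher than with caching enabled.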

(default: not specified)

If specified, this option prevents the NFS client from caching attributes for the mounted directory.

Specify noac for a directory that will be used frequently by many NFS clients. The noac option ensures that the file and directory attributes on the server are up to date, because no changes are cached on the clients. However, if many NFS clients using the same NFS server all disable attribute caching, the server may become overloaded with attribute requests and updates. You can also use the actimeo option to set all the caching timeouts to a small number of seconds, like 1 or 3.

If you specify noac, do not specify the other caching options.


actimeo=n

Sets how long the attribute cache entries for files and directories are kept, to n seconds.

Set min and max times for regular files and directories to n seconds. See "File Attributes," below, for a description of the effect of setting this option to 0.
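To get a feel for why the timeout matters so much, here is a toy model of the attribute cache. It assumes a single file, one stat per second for ten minutes, and a fixed timeout (actimeo sets the minimum and maximum to the same value). A real client has more ways to revalidate or invalidate attributes, so the numbers illustrate the trend and predict nothing exact; count_getattr is a made-up name.

```python
def count_getattr(stat_times, actimeo):
    """Count GETATTR calls needed when cached attributes live `actimeo` seconds."""
    calls = 0
    fetched_at = None  # time of the last GETATTR sent to the server
    for t in stat_times:
        if fetched_at is None or t - fetched_at >= actimeo:
            calls += 1       # cache empty or expired: ask the server
            fetched_at = t
        # otherwise the stat is answered from the local attribute cache
    return calls


one_stat_per_second = range(600)  # ten minutes of activity
for timeout in (0, 3, 120):
    calls = count_getattr(one_stat_per_second, timeout)
    print(f"actimeo={timeout:<3d} -> {calls} GETATTR calls")
```

The trade-off is staleness: with a timeout of n seconds, a client may keep using attributes that another client changed up to n seconds earlier.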



2010.12.06 update

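Heavy CPU and disk load caused by NFS cannot always be relieved by tuning NFS server/client parameters alone. Changing the front-end application to cut unnecessary NFS access is often more effective. In a recent real-world case, tcpdump revealed that a program with no need to touch NFS files was accessing them constantly. After the programmers fixed it, the average load on the NFS server dropped from 60% to 20%.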

These are the tcpdump commands I used:

> sudo tcpdump -s0 -Nt host 192.168.100.100              # -s0            capture whole packets so the full contents are visible
                                                         # host           the IP address of the back-end NFS server
Part of the captured packets:
IP www.2518351231 > nfsserver.nfs: 112 access fh Unknown/40000000548CDFC920000000001F33C2E480E82F0699270540000000548CDF00 001f
IP nfsserver.nfs > www.2518351231: reply ok 124 access c 001f
IP www.kerberos4 > nfsserver.nfs: . ack 3550744470 win 2815 <nop,nop,timestamp 44347615 46855275>
IP www.2535128447 > nfsserver.nfs: 132 lookup fh Unknown/40000000548CDFC9200000000095FCA96AE0BFB80699270540000000548CDF00 "229715781291624639.jpg"    # lookup is one of the NFS operations; here it checks whether 229715781291624639.jpg exists on the NFS server
IP nfsserver.nfs > www.2535128447: reply ok 120 lookup ERROR: No such file or directory
IP www.kerberos4 > nfsserver.nfs: . ack 121 win 2815 <nop,nop,timestamp 44347634 46855294>
IP www.2551905663 > nfsserver.nfs: 112 access fh Unknown/40000000548CDFC9200000000095FCA96AE0BFB80699270540000000548CDF00 001f
IP nfsserver.nfs > www.2551905663: reply ok 124 access c 001f


> sudo tcpdump -s0 -c100 -Nt tcp port 2049                               # capture 100 full packets on tcp port 2049 (NFS) and print them
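Reading a long capture line by line is tedious, so a small script that tallies the NFS operations helps to spot the noisy ones. The sketch below assumes the text format tcpdump printed above (requests go to a destination ending in ".nfs", followed by a length and the operation name) and uses the captured lines as sample input; count_operations and lookup_names are made-up names.

```python
import re
from collections import Counter

# Sample lines copied from the capture above.
CAPTURE = """\
IP www.2518351231 > nfsserver.nfs: 112 access fh Unknown/40000000548CDFC920000000001F33C2E480E82F0699270540000000548CDF00 001f
IP nfsserver.nfs > www.2518351231: reply ok 124 access c 001f
IP www.kerberos4 > nfsserver.nfs: . ack 3550744470 win 2815 <nop,nop,timestamp 44347615 46855275>
IP www.2535128447 > nfsserver.nfs: 132 lookup fh Unknown/40000000548CDFC9200000000095FCA96AE0BFB80699270540000000548CDF00 "229715781291624639.jpg"
IP nfsserver.nfs > www.2535128447: reply ok 120 lookup ERROR: No such file or directory
IP www.2551905663 > nfsserver.nfs: 112 access fh Unknown/40000000548CDFC9200000000095FCA96AE0BFB80699270540000000548CDF00 001f
"""

# A request line: "... > <server>.nfs: <length> <operation> ..."
REQUEST = re.compile(r"> \S+\.nfs: \d+ (\w+)")
# The file name that a lookup request asks for.
LOOKUP = re.compile(r'lookup fh \S+ "([^"]+)"')


def count_operations(text):
    """Count NFS requests per operation; replies and bare ACKs are skipped."""
    ops = Counter()
    for line in text.splitlines():
        match = REQUEST.search(line)
        if match:
            ops[match.group(1)] += 1
    return ops


def lookup_names(text):
    """Return the file names requested by lookup calls, in capture order."""
    return LOOKUP.findall(text)


ops = count_operations(CAPTURE)
names = lookup_names(CAPTURE)
print(ops.most_common())
print(names)
```

On a real capture, save the tcpdump output to a file and pass its contents in place of CAPTURE. Repeated lookups for the same missing file are a strong hint that the application is probing NFS unnecessarily.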


The NFS client uses the RPC (remote procedure call) protocol to call the corresponding procedures on the NFS server. The NFSv3 procedures are:
Procedure 0:  NULL - Do nothing 
Procedure 1: GETATTR - Get file attributes
Procedure 2: SETATTR - Set file attributes
Procedure 3: LOOKUP - Lookup filename
Procedure 4: ACCESS - Check Access Permission
Procedure 5: READLINK - Read from symbolic link
Procedure 6: READ - Read From file
Procedure 7: WRITE - Write to file
Procedure 8: CREATE - Create a file
Procedure 9: MKDIR - Create a directory
Procedure 10: SYMLINK - Create a symbolic link
Procedure 11: MKNOD - Create a special device
Procedure 12: REMOVE - Remove a File
Procedure 13: RMDIR - Remove a Directory
Procedure 14: RENAME - Rename a File or Directory
Procedure 15: LINK - Create Link to an object
Procedure 16: READDIR - Read From Directory
Procedure 17: READDIRPLUS - Extended read from directory
Procedure 18: FSSTAT - Get dynamic file system information
Procedure 19: FSINFO - Get static file system Information
Procedure 20: PATHCONF - Retrieve POSIX information
Procedure 21: COMMIT - Commit cached data on a server to stable storage
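When decoding raw RPC traces that only show procedure numbers, the list above can be kept as a small lookup table. A minimal sketch, with the tuple index serving as the procedure number; procedure_name is a made-up helper:

```python
# NFSv3 procedures in the order listed above; the index is the procedure number.
NFS3_PROCEDURES = (
    "NULL", "GETATTR", "SETATTR", "LOOKUP", "ACCESS", "READLINK",
    "READ", "WRITE", "CREATE", "MKDIR", "SYMLINK", "MKNOD",
    "REMOVE", "RMDIR", "RENAME", "LINK", "READDIR", "READDIRPLUS",
    "FSSTAT", "FSINFO", "PATHCONF", "COMMIT",
)


def procedure_name(number):
    """Translate an NFSv3 procedure number into its name."""
    if 0 <= number < len(NFS3_PROCEDURES):
        return NFS3_PROCEDURES[number]
    raise ValueError(f"unknown NFSv3 procedure number: {number}")


print(procedure_name(1), procedure_name(4))
```

The lowercase forms of these names are the column headings in the nfsstat output.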


Commonly used nfsstat options

> nfsstat --help
Usage: nfsstat [OPTION]...

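  -m          show the mounted NFS file systems and their mount options
  -c          show NFS client statistics
  -s          show NFS server statistics
  -v          show all NFS-related statistics for this machine
  --help      show usage information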


2010.12.21

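    1. The PHP functions file_exists() and is_file() put heavy load on an NFS file system accessed over the network. Use them as sparingly as possible in web applications.
    2. The directory structure and the number of files and directories inside a single directory also add to the disk load.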
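One way to cut down on existence checks is to remember the answer for a short time in the application. The sketch below shows the idea in Python rather than PHP, so treat it as an illustration only: ExistsCache is a made-up class, the 60-second lifetime is arbitrary, and the temporary file stands in for a file on an NFS mount.

```python
import os
import tempfile
import time


class ExistsCache:
    """Remember os.path.exists() results for ttl_seconds to avoid repeat checks."""

    def __init__(self, ttl_seconds=60.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}    # path -> (expires_at, exists)
        self.real_checks = 0  # how many times the file system was really asked

    def exists(self, path):
        now = self.clock()
        entry = self._entries.get(path)
        if entry is not None and now < entry[0]:
            return entry[1]               # still fresh: no file system access
        self.real_checks += 1
        result = os.path.exists(path)     # on an NFS mount this can cost a LOOKUP/GETATTR
        self._entries[path] = (now + self.ttl, result)
        return result


# Demo with a fake clock so the expiry can be shown without waiting.
fake_now = [0.0]
cache = ExistsCache(ttl_seconds=60, clock=lambda: fake_now[0])

fd, path = tempfile.mkstemp()
os.close(fd)

first = cache.exists(path)    # real check, file is there
os.remove(path)
second = cache.exists(path)   # served from the cache, so still reports the old answer
fake_now[0] = 61.0
third = cache.exists(path)    # entry expired, real check sees the file is gone
print(first, second, third, cache.real_checks)
```

As with actimeo, the price is staleness: a cached answer can be wrong for up to the lifetime of the entry, so this only suits files that rarely appear or disappear.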

References
http://docstore.mik.ua/orelly/networking_2ndEd/nfs/ch18_06.htm
http://docs.hp.com/en/B1031-90043/ch02s03.html
http://www.experts-exchange.com/OS/Unix/Setup/Q_21497744.html

http://www.lincoln.edu/math/rmyrick/ComputerNetworks/InetReference/115.htm