2015-11-07

Replacing the PostgreSQL server directory with an encrypted filesystem: creating the logical volume

CentOS Linux PostgreSQL

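Preface

What I want to do is the same as in my earlier post,

Replacing the PostgreSQL server directory with an encrypted filesystem - HHeLiBeXの日記 正道編

but this time with the constraint that

no HDD is added; instead, the logical volume lv_root is split and one part becomes an encrypted filesystem as the logical volume lv_pgsql

(in fact, this is what I wanted to do in the first place).

Preliminary check

First, check the current state of lv_root.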

# df -m
Filesystem           1M-blocks  Used Available Use% Mounted on
/dev/mapper/VolGroup-lv_root
                         14023  1005     12300   8% /
tmpfs                      372     0       372   0% /dev/shm
/dev/sda1                  477    57       395  13% /boot
# lvdisplay --units m /dev/mapper/VolGroup-lv_root
  --- Logical volume ---
  LV Path                /dev/VolGroup/lv_root
  LV Name                lv_root
  VG Name                VolGroup
  LV UUID                SHPvKC-d2fE-Rb2D-5lpT-eqly-QWAB-rGLf3z
  LV Write Access        read/write
  LV Creation host, time localhost.localdomain, 2015-11-07 11:53:33 +0900
  LV Status              available
  # open                 1
  LV Size                14376.00 MiB
  Current LE             3594
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0
   
#

Preparation

The PostgreSQL server has to be stopped in any case, and lv_root cannot be shrunk while the OS is running from it, so run the following.

# chkconfig postgresql off
# sync;sync;sync;shutdown -h now

Once the OS has shut down, insert the installation media and reboot into Rescue mode in order to split lv_root.

After it boots, start a shell with the following steps.

Select "Rescue installed system" and press Enter
Choose a Language ⇒ English
Keyboard Type ⇒ jp106
Setup Networking ⇒ No
Rescue ⇒ Skip
Select "shell Start shell"

Splitting lv_root

bash-4.1# lvm vgchange -a y
  2 logical volume(s) in volume group "VolGroup" now active
bash-4.1# fsck.ext4 -f /dev/mapper/VolGroup-lv_root
e2fsck 1.41.12 (17-May-2010)
Pass 1: Checking inodes, blocks, and sizes
    (omitted)
bash-4.1# resize2fs /dev/mapper/VolGroup-lv_root 6184M
resize2fs 1.41.12 (17-May-2010)
Resizing the filesystem on /dev/mapper/VolGroup-lv_root to 1583104 (4k) blocks.
The filesystem on /dev/mapper/VolGroup-lv_root is now 1583104 blocks long.

bash-4.1# lvm lvreduce -L6184M /dev/mapper/VolGroup-lv_root
  WARNING: Reducing active logical volume to 6.04GiB
  THIS MAY DESTROY YOUR DATA (filesystem etc.)
Do you really want to reduce lv_root? [y/n]: y
  Size of logical volume VolGroup/lv_root changed from 14.04 GiB (3594 extents) to 6.04 GiB (1546 extents).
  Logical volume lv_root successfully resized
bash-4.1# exit

Back at the menu, remove the installation media, then select reboot to restart.
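The 6184M passed to resize2fs and lvreduce has to be a whole number of physical extents, and it should agree with the block count that resize2fs reports. The following is only a sketch of that arithmetic, assuming the 4 MiB PE size and 4 KiB filesystem block size that appear in the outputs in this post; the function name lv_shrink_plan is made up for illustration.

```shell
# Hypothetical helper: sanity-check a target LV size before shrinking.
# Arguments: target size in MiB, PE size in MiB, filesystem block size in KiB.
lv_shrink_plan() {
    local target_mib=$1 pe_mib=$2 block_kib=$3
    # lvreduce works in whole extents, so the target should divide evenly.
    if [ $(( target_mib % pe_mib )) -ne 0 ]; then
        echo "target ${target_mib}MiB is not a multiple of the ${pe_mib}MiB PE size" >&2
        return 1
    fi
    local extents=$(( target_mib / pe_mib ))
    local blocks=$(( target_mib * 1024 / block_kib ))
    echo "extents=${extents} fs_blocks=${blocks}"
}

lv_shrink_plan 6184 4 4
```

For 6184 MiB this prints extents=1546 fs_blocks=1583104, which matches both the resize2fs message and the extent count reported by lvreduce above.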

Checking the free space

Confirm that the space was actually freed.

# lvdisplay --units m /dev/mapper/VolGroup-lv_root
  --- Logical volume ---
  LV Path                /dev/VolGroup/lv_root
  LV Name                lv_root
  VG Name                VolGroup
  LV UUID                SHPvKC-d2fE-Rb2D-5lpT-eqly-QWAB-rGLf3z
  LV Write Access        read/write
  LV Creation host, time localhost.localdomain, 2015-11-07 11:53:33 +0900
  LV Status              available
  # open                 1
  LV Size                6184.00 MiB
  Current LE             1546
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0
   
# vgdisplay --units m
  --- Volume group ---
  VG Name               VolGroup
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  4
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               15880.00 MiB
  PE Size               4.00 MiB
  Total PE              3970
  Alloc PE / Size       1922 / 7688.00 MiB
  Free  PE / Size       2048 / 8192.00 MiB
  VG UUID               n0127i-W87x-Y9GI-CAT9-60yD-1ZQz-J9KQ9b
   
#

This confirms that lv_root has been shrunk and that 8 GiB of free space is now available.
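The free size can also be pulled out of the vgdisplay output rather than read by eye. A minimal sketch, assuming the "Free  PE / Size" line has the layout shown above with `--units m`; the helper name vg_free_mib is hypothetical, and the snippet only prints an lvcreate command instead of running it.

```shell
# Hypothetical helper: read `vgdisplay --units m` output on stdin and print
# the free size in whole MiB, taken from the "Free  PE / Size" line.
vg_free_mib() {
    awk '/Free +PE \/ Size/ { printf "%d\n", $(NF-1) }'
}

# Fed with two lines copied from the output above.
free=$(vg_free_mib <<'EOF'
  Alloc PE / Size       1922 / 7688.00 MiB
  Free  PE / Size       2048 / 8192.00 MiB
EOF
)
echo "lvcreate -L ${free}M -n lv_pgsql VolGroup"
```

On a live system the input would come from `vgdisplay --units m VolGroup | vg_free_mib`. If the intent is simply "use everything that is left", `lvcreate -l 100%FREE` avoids the size calculation altogether.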

Creating the encrypted filesystem and auto-connecting it

Creating the logical volume

Since it will hold the PostgreSQL data directory, create the logical volume as lv_pgsql.

# lvcreate -L 8192M -n lv_pgsql VolGroup
  Logical volume "lv_pgsql" created.
# lvdisplay --units m
  --- Logical volume ---
  LV Path                /dev/VolGroup/lv_root
  LV Name                lv_root
  VG Name                VolGroup
  LV UUID                SHPvKC-d2fE-Rb2D-5lpT-eqly-QWAB-rGLf3z
  LV Write Access        read/write
  LV Creation host, time localhost.localdomain, 2015-11-07 11:53:33 +0900
  LV Status              available
  # open                 1
  LV Size                6184.00 MiB
  Current LE             1546
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:0
   
  --- Logical volume ---
  LV Path                /dev/VolGroup/lv_swap
  LV Name                lv_swap
  VG Name                VolGroup
  LV UUID                221tmz-bcIA-Tafg-kUvM-xxwv-8uID-qiJSpr
  LV Write Access        read/write
  LV Creation host, time localhost.localdomain, 2015-11-07 11:53:40 +0900
  LV Status              available
  # open                 1
  LV Size                1504.00 MiB
  Current LE             376
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:1
   
  --- Logical volume ---
  LV Path                /dev/VolGroup/lv_pgsql
  LV Name                lv_pgsql
  VG Name                VolGroup
  LV UUID                WrnmIg-daJ1-Xlnr-AJ0e-ciCZ-QeXT-GxaZSq
  LV Write Access        read/write
  LV Creation host, time CentOS6, 2015-11-07 13:54:04 +0900
  LV Status              available
  # open                 0
  LV Size                8192.00 MiB
  Current LE             2048
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:2
   
#

Creating the encrypted filesystem

This is almost the same as last time, but repeated here.

# cryptsetup luksFormat -h sha256 /dev/mapper/VolGroup-lv_pgsql

WARNING!
========
This will overwrite data on /dev/mapper/VolGroup-lv_pgsql irrevocably.

Are you sure? (Type uppercase yes): YES
Enter LUKS passphrase: 
Verify passphrase: 
# cryptsetup luksOpen /dev/mapper/VolGroup-lv_pgsql lv_pgsql_crypt
Enter passphrase for /dev/mapper/VolGroup-lv_pgsql: 
# mkfs -t ext4 /dev/mapper/lv_pgsql_crypt
mke2fs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
524288 inodes, 2096640 blocks
104832 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=2147483648
64 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks: 
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632

Writing inode tables: done                            
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done

This filesystem will be automatically checked every 30 mounts or
180 days, whichever comes first.  Use tune2fs -c or -i to override.
# cryptsetup luksClose lv_pgsql_crypt
#

Trial mount

# cryptsetup luksOpen /dev/mapper/VolGroup-lv_pgsql lv_pgsql_crypt
Enter passphrase for /dev/mapper/VolGroup-lv_pgsql: 
# mount /dev/mapper/lv_pgsql_crypt /mnt/
# ls /mnt
lost+found
# umount /mnt
# cryptsetup luksClose lv_pgsql_crypt
#

Configuring auto-connection

This is almost the same as last time, but repeated here.

# dd if=/dev/random of=/etc/lvm/lvm.seckey bs=1 count=32
32+0 records in
32+0 records out
32 bytes (32 B) copied, 0.0398393 s, 0.8 kB/s
# chmod 400 /etc/lvm/lvm.seckey
# cryptsetup luksAddKey /dev/mapper/VolGroup-lv_pgsql /etc/lvm/lvm.seckey
Enter any passphrase: 
# cryptsetup --key-file /etc/lvm/lvm.seckey luksOpen /dev/mapper/VolGroup-lv_pgsql lv_pgsql_crypt
# cryptsetup luksDump /dev/mapper/VolGroup-lv_pgsql
(snip)
UUID:           6069746a-025f-4158-963b-c729fe047281
(snip)
# vi /etc/crypttab
+ lv_pgsql_crypt UUID=6069746a-025f-4158-963b-c729fe047281 /etc/lvm/lvm.seckey luks
# vi /etc/fstab
+ /dev/mapper/lv_pgsql_crypt /var/lib/pgsql ext4 defaults 1 1
#
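Editing /etc/crypttab and /etc/fstab by hand works, but when the steps are scripted it helps to make the append idempotent so that a rerun does not duplicate entries. A minimal sketch, run against scratch files rather than the real ones; the helper name append_once is hypothetical.

```shell
# Hypothetical helper: append a line to a file only if it is not already there.
append_once() {
    local line=$1 file=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstrated on scratch files; point these at /etc/crypttab and /etc/fstab
# only after checking the generated lines.
crypttab=$(mktemp)
fstab=$(mktemp)
uuid=6069746a-025f-4158-963b-c729fe047281   # value from the luksDump output above

append_once "lv_pgsql_crypt UUID=${uuid} /etc/lvm/lvm.seckey luks" "$crypttab"
append_once "/dev/mapper/lv_pgsql_crypt /var/lib/pgsql ext4 defaults 1 1" "$fstab"
# Running it a second time must not add a duplicate entry.
append_once "/dev/mapper/lv_pgsql_crypt /var/lib/pgsql ext4 defaults 1 1" "$fstab"

cat "$crypttab" "$fstab"
```

The UUID is hard-coded here from the dump above; `cryptsetup luksUUID <device>` prints the same value and could replace the copy and paste.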

Migrating the PostgreSQL directory

# mv /var/lib/pgsql /var/lib/pgsql_original.yyyy-mm-dd
# mkdir /var/lib/pgsql
# mount /var/lib/pgsql
# chown postgres:postgres /var/lib/pgsql
# chmod go-rwx /var/lib/pgsql
# ( cd /var/lib/pgsql_original.yyyy-mm-dd ; tar czf - . ) | ( cd /var/lib/pgsql ; tar xzf - )
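The tar pipe above can be tried on scratch directories first. The sketch below is a variant, not the exact command used in this post: it drops `z` (compression buys nothing when both ends are on the same host), uses `&&` so that tar never runs in the wrong directory if `cd` fails, and verifies the copy with `diff -r`. The function name copy_tree and the sample file are made up.

```shell
# Hypothetical helper: copy a directory tree with tar, preserving permissions.
copy_tree() {
    local src=$1 dst=$2
    ( cd "$src" && tar cf - . ) | ( cd "$dst" && tar xpf - )
}

# Try it on scratch directories before touching /var/lib/pgsql.
src=$(mktemp -d)
dst=$(mktemp -d)
mkdir -p "$src/data/base"
echo "8.4" > "$src/data/PG_VERSION"

copy_tree "$src" "$dst"
# diff -r exits non-zero if any file is missing or differs.
diff -r "$src" "$dst" && echo "copy verified"
```

`diff -r` compares contents only; ownership and modes would still need a separate check (for example with `ls -lR`) on the real data directory.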

Starting the PostgreSQL server

# service postgresql start
# chkconfig postgresql on


2015-10-05

Converting between characters and ASCII values

PHP

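I always forget even the function names and end up struggling, so I wrote a toy as a memo.

The relevant functions are ord() and chr().

As toy code, I wrote functions that encode and decode a string with a home-grown scheme.

Encode
- Returns a string in which the bits of each character's byte value are inverted
Decode
- Returns the original string from an encoded one

The functions are named rot1, but what they do is XOR each byte with 0xff. bin2hex is applied only to make the result printable.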

<?php

function rot1_encode($str) {
    $res = '';

    $len = strlen($str);
    for ($i = 0; $i < $len ; ++$i) {
        $res .= chr(ord($str[$i])^0xff);
    }

    return bin2hex($res);
}
if (!function_exists('hex2bin')) {
function hex2bin($str) {
    $res = '';

    $len = strlen($str);
    for ($i = 0; $i < $len ; $i += 2) {
        $res .= pack("c", intval(substr($str, $i, 2), 16));
    }

    return $res;
}
}
function rot1_decode($str) {
    $str = hex2bin($str);

    $res = '';

    $len = strlen($str);
    for ($i = 0; $i < $len ; ++$i) {
        $res .= chr(ord($str[$i])^0xff);
    }

    return $res;
}

$strs = array(
    'こんにちは、ord()とchr()を使ったお遊びです。',
    'お遊び',
);

foreach ($strs as $str) {
    var_dump(rot1_encode($str));
    var_dump(rot1_decode(rot1_encode($str)));
}

Running this gives the following.

$ php test.php
string(122) "1c7e6c1c7d6c1c7e541c7e5e1c7e501c7f7e908d9bd7d61c7e579c978dd7d61c7d6d1b42401c7e5c1c7e601c7e75167e751c7e4c1c7e581c7e661c7f7d"
string(61) "こんにちは、ord()とchr()を使ったお遊びです。"
string(18) "1c7e75167e751c7e4c"
string(9) "お遊び"
$
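The hex output above can be cross-checked without PHP. The following is a rough shell equivalent of the encode step, offered as a sketch only: it reads bytes with od, XORs each with 255 (0xff), and prints two hex digits per byte. The function name invert_hex is made up.

```shell
# Hypothetical helper: read bytes on stdin, XOR each with 0xff, print as hex.
invert_hex() {
    # od prints the bytes as unsigned decimals; tr puts one value per line.
    od -An -v -tu1 | tr -s ' ' '\n' | while read -r byte; do
        if [ -n "$byte" ]; then
            printf '%02x' $(( byte ^ 255 ))
        fi
    done
    echo
}

printf 'ord' | invert_hex
```

This prints 908d9b, the same three bytes that appear in the middle of the first var_dump line above where the input contains "ord".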

2015-10-03

Replacing the PostgreSQL server directory with an encrypted filesystem

Linux CentOS PostgreSQL

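Preface

The following environment is assumed.

The OS is CentOS 6 (created as a VirtualBox VM)
PostgreSQL 8.4 is installed from packages
- The data directory is /var/lib/pgsql/data
Everything, including the PostgreSQL data directory, is on a single partition
- The approach is the same if it is on a separate partition

The idea is to replace the unencrypted tree under /var/lib/pgsql with a separately added HDD formatted as an encrypted filesystem. (Note: a separate HDD is not required; the same steps apply if the existing HDD has free space in which a new partition can be created.)

Before the work, there is only one HDD, /dev/sda, as shown below.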

# fdisk -l

Disk /dev/sda: 8589 MB, 8589934592 bytes
255 heads, 63 sectors/track, 1044 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0006435d

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          64      512000   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2              64        1045     7875584   8e  Linux LVM


Disk /dev/mapper/VolGroup-lv_root: 5947 MB, 5947523072 bytes
255 heads, 63 sectors/track, 723 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/mapper/VolGroup-lv_swap: 2113 MB, 2113929216 bytes
255 heads, 63 sectors/track, 257 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

#

Preparation

The PostgreSQL server has to be stopped, and since an HDD has to be added, the OS is shut down as well (though this depends on the environment).

# chkconfig postgresql off
# sync;sync;sync;shutdown -h now

Once the OS has shut down, add one HDD and boot the OS again.

The added HDD is /dev/sdb.

# fdisk -l

Disk /dev/sda: 8589 MB, 8589934592 bytes
255 heads, 63 sectors/track, 1044 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0006435d

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          64      512000   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2              64        1045     7875584   8e  Linux LVM

Disk /dev/sdb: 8589 MB, 8589934592 bytes
255 heads, 63 sectors/track, 1044 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/mapper/VolGroup-lv_root: 5947 MB, 5947523072 bytes
255 heads, 63 sectors/track, 723 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000


Disk /dev/mapper/VolGroup-lv_swap: 2113 MB, 2113929216 bytes
255 heads, 63 sectors/track, 257 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

#

Creating the encrypted filesystem and auto-connecting it

Creation

# cryptsetup luksFormat -h sha256 /dev/sdb

WARNING!
========
This will overwrite data on /dev/sdb irrevocably.

Are you sure? (Type uppercase yes): YES
Enter LUKS passphrase: 
Verify passphrase: 
# cryptsetup luksOpen /dev/sdb sdb_crypt
Enter passphrase for /dev/sdb:
# mkfs -t ext4 /dev/mapper/sdb_crypt
# cryptsetup luksClose sdb_crypt

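Configuring auto-connection

Naturally, this is a trade-off between security risk and convenience.

With auto-connection, anyone who steals all the HDDs and boots the OS can read the data without knowing the passphrase.
Without auto-connection, everything from luksOpen and entering the passphrase to mounting and starting the PostgreSQL server has to be done by hand.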
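For the manual route, the three steps can at least be collected in one place. The sketch below assumes the device and mapping names used in this post; the `run` wrapper and the DRY_RUN variable are hypothetical, and by default the commands are only printed so the sequence can be reviewed before it is run for real.

```shell
# Hypothetical wrapper: print each command instead of running it unless
# DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Manual start-up without auto-connection. luksOpen prompts for the
# passphrase; each later step runs only if the previous one succeeded.
start_pgsql_manually() {
    run cryptsetup luksOpen /dev/sdb sdb_crypt &&
    run mount /dev/mapper/sdb_crypt /var/lib/pgsql &&
    run service postgresql start
}

start_pgsql_manually
```

Running it with DRY_RUN=0 as root would execute the same three commands in order.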

# dd if=/dev/random of=/etc/lvm/lvm.seckey bs=1 count=32
32+0 records in
32+0 records out
32 bytes (32 B) copied, 58.6724 s, 0.0 kB/s
# chmod 400 /etc/lvm/lvm.seckey
# cryptsetup luksAddKey /dev/sdb /etc/lvm/lvm.seckey
Enter any passphrase: 
# cryptsetup --key-file /etc/lvm/lvm.seckey luksOpen /dev/sdb sdb_crypt
# cryptsetup luksDump /dev/sdb
(snip)
UUID:           xxxxxxxx-yyyy-zzzz-aaaa-bbbbbbbbbbbb
(snip)
# vi /etc/crypttab
+ sdb_crypt UUID=xxxxxxxx-yyyy-zzzz-aaaa-bbbbbbbbbbbb /etc/lvm/lvm.seckey luks
# vi /etc/fstab
+ /dev/mapper/sdb_crypt /var/lib/pgsql ext4 errors=remount-ro 0 1
#

Migrating the PostgreSQL directory

# mv /var/lib/pgsql /var/lib/pgsql_original.yyyy-mm-dd
# mkdir /var/lib/pgsql
# chown postgres:postgres /var/lib/pgsql
# chmod go-rwx /var/lib/pgsql
# mount /var/lib/pgsql
# ( cd /var/lib/pgsql_original.yyyy-mm-dd ; tar czf - . ) | ( cd /var/lib/pgsql ; tar xzf - )

Starting the PostgreSQL server

# service postgresql start
# chkconfig postgresql on

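Connect to the PostgreSQL server with psql or a similar client and confirm that data can be retrieved.

(Aside 1) Double-checking

Confirm that the volume is auto-connected when the OS is rebooted.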

# umount /var/lib/pgsql
# cryptsetup luksClose sdb_crypt
# sync;sync;sync;shutdown -r now
(after the reboot...)
# ls /var/lib/pgsql
backups  data  lost+found  pgstartup.log
#
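Listing the directory shows the files, but it does not prove they come from the encrypted volume rather than from an old copy under the mount point. A minimal sketch of a more direct check against the kernel's mount table; the helper name is_mounted is made up, and the second argument exists only so that the function can be exercised with a sample table.

```shell
# Hypothetical helper: succeed if something is mounted on the given directory.
# The mount table defaults to /proc/mounts; a second argument overrides it.
is_mounted() {
    local dir=$1 table=${2:-/proc/mounts}
    awk -v dir="$dir" '$2 == dir { found = 1 } END { exit !found }' "$table"
}

if is_mounted /var/lib/pgsql; then
    echo "/var/lib/pgsql is mounted"
else
    echo "/var/lib/pgsql is NOT mounted" >&2
fi
```

The first field of the matching line in /proc/mounts should then be /dev/mapper/sdb_crypt.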

(Aside 2) Detaching the encrypted filesystem

Stopping the processes that use it

In this case, of course, the PostgreSQL server that uses this filesystem has to be stopped first.

# service postgresql stop
# chkconfig postgresql off

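Detaching

The auto-connection settings must be removed as well. (If the volume will be attached again later, comment the entries out in each configuration file instead of deleting them.)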

# umount /var/lib/pgsql
# cryptsetup luksClose sdb_crypt
# vi /etc/fstab
- /dev/mapper/sdb_crypt /var/lib/pgsql ext4 errors=remount-ro 0 1
# vi /etc/crypttab
- sdb_crypt UUID=xxxxxxxx-yyyy-zzzz-aaaa-bbbbbbbbbbbb /etc/lvm/lvm.seckey luks


2015-08-01

A challenge: using the Zend_Date class efficiently

PHP

Introduction

The Zend_Date class included in Zend Framework is very expensive to instantiate.

Comparing the following three programs gives a rough idea of how expensive.

test0-1.php: instantiate Zend_Date n times

<?php

$n = 10000;
if ($argc >= 2) {
    $n = (int)$argv[1];
}

set_include_path(dirname(__FILE__) . DIRECTORY_SEPARATOR . 'library');
require_once('Zend/Date.php');

date_default_timezone_set('Asia/Tokyo');

$start = microtime(true);
for ($i = 0; $i < $n; ++$i) {
    $zd = new Zend_Date('2015-08-01 03:04:05');
}
$end = microtime(true);
printf("%8.5lf\n", $end - $start);

test0-2.php: instantiate DateTime n times

<?php

$n = 10000;
if ($argc >= 2) {
    $n = (int)$argv[1];
}

set_include_path(dirname(__FILE__) . DIRECTORY_SEPARATOR . 'library');
require_once('Zend/Date.php');

date_default_timezone_set('Asia/Tokyo');

$start = microtime(true);
for ($i = 0; $i < $n; ++$i) {
    $zd = new DateTime('2015-08-01 03:04:05');
}
$end = microtime(true);
printf("%8.5lf\n", $end - $start);

test0-3.php: cache one Zend_Date object and update it with strtotime and setTimestamp

<?php

$n = 10000;
if ($argc >= 2) {
    $n = (int)$argv[1];
}

set_include_path(dirname(__FILE__) . DIRECTORY_SEPARATOR . 'library');
require_once('Zend/Date.php');

date_default_timezone_set('Asia/Tokyo');

$zd = new Zend_Date();
$start = microtime(true);
for ($i = 0; $i < $n; ++$i) {
    $time = strtotime('2015-08-01 03:04:05');
    $zd->setTimestamp($time);
}
$end = microtime(true);
printf("%8.5lf\n", $end - $start);

Run them.

$ for i in {1,2,5,8}{,00,000} ; do
>     printf "%d" $i
>     for t in test0-1.php test0-2.php test0-3.php ; do
>         php ${t} ${i} | awk '{printf(",%s", $0);}'
>     done
>     printf "\n"
> done
1, 0.03093, 0.00011, 0.00004
100, 0.23281, 0.00048, 0.00074
1000, 2.32891, 0.00825, 0.01088
2, 0.01160, 0.00012, 0.00005
200, 0.48956, 0.00336, 0.00224
2000, 5.51129, 0.01458, 0.02203
5, 0.02411, 0.00018, 0.00012
500, 1.08754, 0.00409, 0.00425
5000,11.39618, 0.03983, 0.06424
8, 0.03674, 0.00031, 0.00000
800, 2.20668, 0.00417, 0.00931
8000,19.13784, 0.07557, 0.11488
$

Sorted and plotted, the results look like this.

[Graph: elapsed time of test0-1.php, test0-2.php, and test0-3.php against n]
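The loop prints rows in brace-expansion order (1, 100, 1000, 2, 200, ...), so they need a numeric sort on the first column before plotting. A minimal sketch, assuming the rows were saved to a file; the helper name sort_results is made up, and the sample rows are copied from the output above.

```shell
# Hypothetical helper: sort benchmark rows numerically by the first CSV column.
sort_results() {
    sort -t, -k1,1n "$@"
}

results=$(mktemp)
cat > "$results" <<'EOF'
1, 0.03093, 0.00011, 0.00004
100, 0.23281, 0.00048, 0.00074
1000, 2.32891, 0.00825, 0.01088
2, 0.01160, 0.00012, 0.00005
200, 0.48956, 0.00336, 0.00224
EOF

sort_results "$results"
```

The rows come out in the order 1, 2, 100, 200, 1000, ready to paste into a spreadsheet or plotting tool.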

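Main part

From here on, the goal is taken to be "do some date/time arithmetic and get the result as a date/time string".

Caching and reusing a Zend_Date object looks promising, so the following measurements build on test0-3.php above.

test1-1.php: Zend_Date's toString method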

<?php

$n = 10000;
if ($argc >= 2) {
    $n = (int)$argv[1];
}

set_include_path(dirname(__FILE__) . DIRECTORY_SEPARATOR . 'library');
require_once('Zend/Date.php');

date_default_timezone_set('Asia/Tokyo');

function _test(Zend_Date $zd) {
    // 'yyyy' is the calendar year; 'YYYY' would be the ISO 8601 week-based year.
    return $zd->toString('yyyy-MM-dd');
}

$zd = new Zend_Date('2015-08-01 03:04:05');
//var_dump(_test($zd));
$start = microtime(true);
for ($i = 0; $i < $n; ++$i) {
    _test($zd);
}
$end = microtime(true);
printf("%8.5lf\n", $end - $start);

test1-2.php: Zend_Date's getTimestamp method + the date function

<?php

$n = 10000;
if ($argc >= 2) {
    $n = (int)$argv[1];
}

set_include_path(dirname(__FILE__) . DIRECTORY_SEPARATOR . 'library');
require_once('Zend/Date.php');

date_default_timezone_set('Asia/Tokyo');

function _test(Zend_Date $zd) {
    $ts = $zd->getTimestamp();
    return date('Y-m-d', $ts);
}

$zd = new Zend_Date('2015-08-01 03:04:05');
//var_dump(_test($zd));
$start = microtime(true);
for ($i = 0; $i < $n; ++$i) {
    _test($zd);
}
$end = microtime(true);
printf("%8.5lf\n", $end - $start);

test1-3.php: fetch each field with Zend_Date's get method + the sprintf function

<?php

$n = 10000;
if ($argc >= 2) {
    $n = (int)$argv[1];
}

set_include_path(dirname(__FILE__) . DIRECTORY_SEPARATOR . 'library');
require_once('Zend/Date.php');

date_default_timezone_set('Asia/Tokyo');

function _test(Zend_Date $zd) {
    $year = $zd->get(Zend_Date::YEAR);
    $month = $zd->get(Zend_Date::MONTH);
    $day = $zd->get(Zend_Date::DAY);
    return sprintf("%04d-%02d-%02d", $year, $month, $day);
}

$zd = new Zend_Date('2015-08-01 03:04:05');
$start = microtime(true);
for ($i = 0; $i < $n; ++$i) {
    _test($zd);
}
$end = microtime(true);
printf("%8.5lf\n", $end - $start);

Now, run them.

$ for i in {1,2,5,8}{,00,000,0000} ; do
>     printf "%d" $i
>     for t in test1-1.php test1-2.php test1-3.php ; do
>         php ${t} ${i} | awk '{printf(",%s", $0);}'
>     done
>     printf "\n"
> done

Results.

1, 0.00020, 0.00002, 0.00009
100, 0.01489, 0.00010, 0.01203
1000, 0.15603, 0.00634, 0.17088
10000, 1.28451, 0.04945, 1.37297
2, 0.00009, 0.00002, 0.00063
200, 0.02691, 0.00093, 0.04130
2000, 0.30510, 0.01436, 0.27488
20000, 2.93000, 0.12693, 3.37226
5, 0.00057, 0.00008, 0.00116
500, 0.08533, 0.00256, 0.07560
5000, 0.85879, 0.03124, 0.98879
50000,11.33206, 0.39485, 7.20063
8, 0.00063, 0.00009, 0.00074
800, 0.10203, 0.00917, 0.09432
8000, 0.86644, 0.03868, 0.81726
80000,12.23691, 0.57610,14.96780

[Graph: elapsed time of the toString, getTimestamp + date, and get + sprintf variants against n]

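As expected, fetching each field separately is expensive, but the test1-2.php approach is fast enough to be usable.

The code is omitted here, but test2-1.php, test2-2.php, and test2-3.php are the same cases with hours, minutes, and seconds included (by appending "HH:mm:ss", "H:i:s", and so on to the format).

test1-2.php and test2-2.php perform well regardless of the format, whereas the Zend_Date-based variants get correspondingly slower once hours, minutes, and seconds are included.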
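To put a number on the gap, the second column (toString) can be divided by the third (getTimestamp + date) for each row. A minimal awk sketch; the helper name slowdown is made up, and the two input rows are copied from the results above.

```shell
# Hypothetical helper: for each CSV row "n,t1,t2,t3", print n and how many
# times slower column 2 (toString) is than column 3 (getTimestamp + date).
# Rows where column 3 is zero are skipped to avoid dividing by zero.
slowdown() {
    awk -F, '$3 > 0 { printf "%d %.1f\n", $1, $2 / $3 }'
}

printf '%s\n' \
    '10000, 1.28451, 0.04945, 1.37297' \
    '80000,12.23691, 0.57610,14.96780' | slowdown
```

For these rows the ratio comes out at a little over 20, so the difference is more than an order of magnitude rather than a few percent.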

Conclusion

It seems best to keep a moderate distance from the Zend_Date class.

2015-07-30

Support for the type attribute of the HTML5 input tag

HTML5

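I wanted to write code that uses the browser's built-in calendar component, so while I was at it I wrote a simple HTML page to test support.

The browsers tested were the following.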

Windows 7 Professional SP1
- Firefox 39.0
- Google Chrome 44.0.2403.107 m (64-bit)
- Opera 30.0.1835.125
- Safari 5.1.7
- Internet Explorer 11.0.9600.17914
Android 4.4.2
- Stock browser

The HTML I wrote is the following.

<!DOCTYPE html>
<html>
<head></head>
<body>
<form action="#" method="post">
<table>
<tbody>
    <tr>
        <th>text</th>
        <td><input type="text" name="f_text" /></td>
    </tr>
    <tr>
        <th>search</th>
        <td><input type="search" name="f_search" /></td>
    </tr>
    <tr>
        <th>tel</th>
        <td><input type="tel" name="f_tel" /></td>
    </tr>
    <tr>
        <th>url</th>
        <td><input type="url" name="f_url" /></td>
    </tr>
    <tr>
        <th>email</th>
        <td><input type="email" name="f_email" /></td>
    </tr>
    <tr>
        <th>datetime</th>
        <td><input type="datetime" name="f_datetime" /></td>
    </tr>
    <tr>
        <th>date</th>
        <td><input type="date" name="f_date" /></td>
    </tr>
    <tr>
        <th>month</th>
        <td><input type="month" name="f_month" /></td>
    </tr>
    <tr>
        <th>week</th>
        <td><input type="week" name="f_week" /></td>
    </tr>
    <tr>
        <th>time</th>
        <td><input type="time" name="f_time" /></td>
    </tr>
    <tr>
        <th>datetime-local</th>
        <td><input type="datetime-local" name="f_datetime-local" /></td>
    </tr>
    <tr>
        <th>number</th>
        <td><input type="number" name="f_number" /></td>
    </tr>
    <tr>
        <th>range</th>
        <td><input type="range" name="f_range" /></td>
    </tr>
    <tr>
        <th>color</th>
        <td><input type="color" name="f_color" /></td>
    </tr>
</tbody>
</table>
<input type="submit" />
</form>
</body>
</html>

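I served this through a real web server, displayed it in each browser, and looked at the behavior.

A type counts as supported if its behavior differs from type="text". To test, I first typed something arbitrary such as aaa; some browsers reject it on the spot, while others report a validation error when the submit button is pressed.

The support status is roughly as follows.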

	Firefox	Chrome	Opera	Safari	IE 11	Android
text	○	○	○	○	○	○
search			○	○		○
tel						○
url	○	○	○		○	○
email	○	○	○		○	○
datetime				○		○
date		○	○	○		○
month		○	○	○		○
week		○	○	○		○(*1)
time		○	○	○		○
datetime-local		○	○	○		○
number	○	○	○	○		○
range	○	○	○	○	○	○
color	○	○	○			○

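(*1): It is rendered as if supported, but it did not work.

This is bad, bad enough that I cannot be bothered to paste screenshots.

I will just split year, month, and day into ordinary text boxes.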

2015-07-21

PHP's require_once is slow

PHP

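This is probably old news among experts, but I have only recently started paying attention to it and felt like measuring it, so here are the measurements.

After all, even the Zend Framework site covers it in its performance guide.

Class Loading - Zend Framework Performance Guide - Zend Framework

Measuring on a VM on a laptop would not be very convincing (although the ratios would still show the trend), so the measurements were taken on a VPS.

Note that these measurements apply to code written under the rule "name classes following the same naming convention as Zend Framework".

Test environment

The environment looks like this.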

$ cat /etc/redhat-release
CentOS release 6.6 (Final)
$ uname -r
2.6.32-504.16.2.el6.x86_64
$ free
             total       used       free     shared    buffers     cached
Mem:       1922192    1480608     441584        536     202304     748092
-/+ buffers/cache:     530212    1391980
Swap:      2097148       6472    2090676
$ php -v
PHP 5.3.3 (cli) (built: Jul  9 2015 17:39:00) 
Copyright (c) 1997-2010 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2010 Zend Technologies
$ php -r 'var_dump(get_include_path());'
string(32) ".:/usr/share/pear:/usr/share/php"
$

The CPU is reportedly a 3-core "Intel Xeon E312xx (Sandy Bridge)".

Programs used

App/Hoge0.php

<?php
class App_Hoge0 {
    public static function test() {}
}

test0.php
- Calls require_once only once.

<?php
printf("%s\n", __FILE__);

$n = 1000000;
if ($argc >= 2) {
    $n = (int)$argv[1];
}

$startTime = microtime(true);

require_once('App/Hoge0.php');
for ($i = 0; $i < $n; ++$i) {
    App_Hoge0::test();
}

$endTime = microtime(true);
printf("%8.5lf\n", $endTime - $startTime);

test1.php
- Calls require_once on every loop iteration.

<?php
printf("%s\n", __FILE__);

$n = 1000000;
if ($argc >= 2) {
    $n = (int)$argv[1];
}

$startTime = microtime(true);

for ($i = 0; $i < $n; ++$i) {
    require_once('App/Hoge0.php');
    App_Hoge0::test();
}

$endTime = microtime(true);
printf("%8.5lf\n", $endTime - $startTime);

test2.php
- Registers a makeshift class loader with spl_autoload_register.

<?php
printf("%s\n", __FILE__);

$n = 1000000;
if ($argc >= 2) {
    $n = (int)$argv[1];
}

$startTime = microtime(true);

function _myLoader($className) {
    return @include(str_replace('_', DIRECTORY_SEPARATOR, $className) . '.php');
}
spl_autoload_register('_myLoader');

for ($i = 0; $i < $n; ++$i) {
    App_Hoge0::test();
}

$endTime = microtime(true);
printf("%8.5lf\n", $endTime - $startTime);

Running

To save effort, I wrote the following script and ran it.

for i in {1,2,5}{,00,000,0000,00000,000000} ; do
    printf "%d" $i
    for t in test0.php test1.php test2.php ; do
        php ${t} ${i} | grep -v /home/ | awk '{printf(",%s", $0);}'
    done
    printf "\n"
done

Results (raw data)

1, 0.00011, 0.00011, 0.00014
100, 0.00013, 0.00030, 0.00016
1000, 0.00037, 0.00217, 0.00040
10000, 0.00264, 0.01997, 0.00298
100000, 0.02665, 0.20718, 0.02599
1000000, 0.24950, 1.90784, 0.25750
2, 0.00012, 0.00011, 0.00020
200, 0.00019, 0.00051, 0.00019
2000, 0.00062, 0.00419, 0.00065
20000, 0.00557, 0.04184, 0.00566
200000, 0.05527, 0.41456, 0.05114
2000000, 0.51434, 3.97734, 0.54890
5, 0.00014, 0.00013, 0.00015
500, 0.00025, 0.00227, 0.00036
5000, 0.00145, 0.00972, 0.00154
50000, 0.01421, 0.11185, 0.01292
500000, 0.13399, 0.98989, 0.13024
5000000, 1.27753,10.18646, 1.28340

Results (graph)

The raw data already gives the picture, but here it is sorted and plotted anyway.

[Graph: elapsed time of test0.php, test1.php, and test2.php against n]

Well, it is plain as day. The home-made class loader version (test2) holds up surprisingly well.

So next I plan to strip the require_once calls out of Zend Framework itself (though that is a single shell one-liner).
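Such a bulk edit might look like the sketch below. It is an illustration under assumptions, not the exact command used for this post: it comments out every require_once under a directory while skipping Loader/Autoloader.php and Application.php, which still need their own requires once an autoloader takes over. The function name strip_require_once and the scratch files are made up.

```shell
# Hypothetical helper: comment out require_once calls in every .php file under
# a directory, except the autoloader and the application bootstrap class.
strip_require_once() {
    local dir=$1 f
    find "$dir" -name '*.php' \
        ! -path '*/Loader/Autoloader.php' ! -path '*/Application.php' |
    while IFS= read -r f; do
        # "&" is the matched text, so this turns "require_once" into "// require_once".
        sed 's|require_once|// &|g' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    done
}

# Demonstrated on a scratch tree rather than a real library directory.
lib=$(mktemp -d)
mkdir -p "$lib/Zend/Loader"
printf '%s\n' '<?php' "require_once 'Zend/Exception.php';" > "$lib/Zend/Date.php"
printf '%s\n' '<?php' "require_once 'Zend/Loader.php';" > "$lib/Zend/Loader/Autoloader.php"

strip_require_once "$lib"
cat "$lib/Zend/Date.php"
```

A plain text substitution like this breaks statements of the form `if (...) require_once ...;` written without braces, so the result is worth checking with `php -l` before use, and an autoloader has to be registered for the classes to load at all.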

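Addendum, 2015/07/23

On reflection, the test above does not really prove that require_once is slow. The number of statements executed differs, so of course it is slower.

So I changed test0.php as follows.

test0.php
- Does "some arbitrary work" on every iteration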
<?php
printf("%s\n", __FILE__);

$n = 1000000;
if ($argc >= 2) {
    $n = (int)$argv[1];
}

$startTime = microtime(true);

require_once('App/Hoge0.php');
for ($i = 0; $i < $n; ++$i) {
    /* some arbitrary work: one of the two statements listed below */
    App_Hoge0::test();
}

$endTime = microtime(true);
printf("%8.5lf\n", $endTime - $startTime);

For "some arbitrary work", I tried the following two statements.

require('./dummy.php'); (dummy.php is an empty file)
defined("___{$i}___");

The results were as follows.

require('./dummy.php');

[Graph: timings with require('./dummy.php'); as the extra work in test0.php]

defined("___{$i}___");

[Graph: timings with defined("___{$i}___"); as the extra work in test0.php]

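Looking at the Zend Framework source, there are indeed methods that call require_once('Zend/Exception.php'); right before throwing an exception, for example (and there are many such places). That is wasteful, but it does not seem to be the root of all evil.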

2015-07-11

First-wins and last-wins when merging arrays

PHP

Introduction

I wanted to merge two arrays $a and $b and get something like $expected.

<?php

$a = array(
            '11',
    'k2' => '22',
    'k4' => '44',
    'k6' => '66',
    'k8' => '88',
);
$b = array(
            '111',
    'k2' => '222',
    'k3' => '333',
    'k5' => '555',
    'k7' => '777',
);
$expected = array (
            '111',
    'k2' => '222',
    'k4' => '44',
    'k6' => '66',
    'k8' => '88',
    'k3' => '333',
    'k5' => '555',
    'k7' => '777',
);

I tried the + operator and the array_merge function, but...

<?php

$a = array(
            '11',
    'k2' => '22',
    'k4' => '44',
    'k6' => '66',
    'k8' => '88',
);
$b = array(
            '111',
    'k2' => '222',
    'k3' => '333',
    'k5' => '555',
    'k7' => '777',
);
$expected = array (
            '111',
    'k2' => '222',
    'k4' => '44',
    'k6' => '66',
    'k8' => '88',
    'k3' => '333',
    'k5' => '555',
    'k7' => '777',
);

printf("===%s===\n", 'expected');
var_dump($expected);
printf("===%s===\n", '$a + $b');
var_dump($a + $b);
printf("===%s===\n", '$b + $a');
var_dump($b + $a);
printf("===%s===\n", 'array_merge($a, $b)');
var_dump(array_merge($a, $b));
printf("===%s===\n", 'array_merge($b, $a)');
var_dump(array_merge($b, $a));

Something is off.

===expected===
array(8) {
  [0]=>
  string(3) "111"
  ["k2"]=>
  string(3) "222"
  ["k4"]=>
  string(2) "44"
  ["k6"]=>
  string(2) "66"
  ["k8"]=>
  string(2) "88"
  ["k3"]=>
  string(3) "333"
  ["k5"]=>
  string(3) "555"
  ["k7"]=>
  string(3) "777"
}
===$a + $b===
array(8) {
  [0]=>
  string(2) "11"
  ["k2"]=>
  string(2) "22"
  ["k4"]=>
  string(2) "44"
  ["k6"]=>
  string(2) "66"
  ["k8"]=>
  string(2) "88"
  ["k3"]=>
  string(3) "333"
  ["k5"]=>
  string(3) "555"
  ["k7"]=>
  string(3) "777"
}
===$b + $a===
array(8) {
  [0]=>
  string(3) "111"
  ["k2"]=>
  string(3) "222"
  ["k3"]=>
  string(3) "333"
  ["k5"]=>
  string(3) "555"
  ["k7"]=>
  string(3) "777"
  ["k4"]=>
  string(2) "44"
  ["k6"]=>
  string(2) "66"
  ["k8"]=>
  string(2) "88"
}
===array_merge($a, $b)===
array(9) {
  [0]=>
  string(2) "11"
  ["k2"]=>
  string(3) "222"
  ["k4"]=>
  string(2) "44"
  ["k6"]=>
  string(2) "66"
  ["k8"]=>
  string(2) "88"
  [1]=>
  string(3) "111"
  ["k3"]=>
  string(3) "333"
  ["k5"]=>
  string(3) "555"
  ["k7"]=>
  string(3) "777"
}
===array_merge($b, $a)===
array(9) {
  [0]=>
  string(3) "111"
  ["k2"]=>
  string(2) "22"
  ["k3"]=>
  string(3) "333"
  ["k5"]=>
  string(3) "555"
  ["k7"]=>
  string(3) "777"
  [1]=>
  string(2) "11"
  ["k4"]=>
  string(2) "44"
  ["k6"]=>
  string(2) "66"
  ["k8"]=>
  string(2) "88"
}

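$b + $a has the right contents, but I want $a as the base with later values winning, so it feels a little off.

array_merge is close, but not quite.

Development

I wondered whether I could get a result with the same key order as expected. Semantically it should be this: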

<?php

$a = array(
            '11',
    'k2' => '22',
    'k4' => '44',
    'k6' => '66',
    'k8' => '88',
);
$b = array(
            '111',
    'k2' => '222',
    'k3' => '333',
    'k5' => '555',
    'k7' => '777',
);

printf("===%s===\n", 'foreach');
$r = $a;
foreach ($b as $k => $v) {
    $r[$k] = $v;
}
var_dump($r);

and produce a result like this.

===foreach===
array(8) {
  [0]=>
  string(3) "111"
  ["k2"]=>
  string(3) "222"
  ["k4"]=>
  string(2) "44"
  ["k6"]=>
  string(2) "66"
  ["k8"]=>
  string(2) "88"
  ["k3"]=>
  string(3) "333"
  ["k5"]=>
  string(3) "555"
  ["k7"]=>
  string(3) "777"
}

Twist

After some trial and error,

<?php

$a = array(
            '11',
    'k2' => '22',
    'k4' => '44',
    'k6' => '66',
    'k8' => '88',
);
$b = array(
            '111',
    'k2' => '222',
    'k3' => '333',
    'k5' => '555',
    'k7' => '777',
);

printf("===%s===\n", 'array_merge(array_diff($a, array_intersect_key($a, $b)), $b)');
$r = array_merge(array_diff($a, array_intersect_key($a, $b)), $b);
var_dump($r);

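gave the same contents, although the order differs. Note that array_diff compares values rather than keys, so this only works because no value in $a is repeated; array_diff_key($a, $b) expresses the intent directly.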

array(8) {
  ["k4"]=>
  string(2) "44"
  ["k6"]=>
  string(2) "66"
  ["k8"]=>
  string(2) "88"
  [0]=>
  string(3) "111"
  ["k2"]=>
  string(3) "222"
  ["k3"]=>
  string(3) "333"
  ["k5"]=>
  string(3) "555"
  ["k7"]=>
  string(3) "777"
}

Conclusion

Just as I was feeling pleased, it turned out there is a function called array_replace.

<?php

$a = array(
            '11',
    'k2' => '22',
    'k4' => '44',
    'k6' => '66',
    'k8' => '88',
);
$b = array(
            '111',
    'k2' => '222',
    'k3' => '333',
    'k5' => '555',
    'k7' => '777',
);

printf("===%s===\n", 'array_replace($a, $b)');
$r = array_replace($a, $b);
var_dump($r);

Even the key order is exactly as expected.

===array_replace($a, $b)===
array(8) {
  [0]=>
  string(3) "111"
  ["k2"]=>
  string(3) "222"
  ["k4"]=>
  string(2) "44"
  ["k6"]=>
  string(2) "66"
  ["k8"]=>
  string(2) "88"
  ["k3"]=>
  string(3) "333"
  ["k5"]=>
  string(3) "555"
  ["k7"]=>
  string(3) "777"
}

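Summary

A brief summary of the differences between the array "addition" operators and functions that I know of.

Operator/function	Numeric keys	String keys
`$a + $b`	first wins	first wins
`array_merge($a, $b)`	renumbered and appended	last wins
`array_replace($a, $b)`	last wins	last wins

There is still a lot I do not know.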