Update: fulin.org is back in use

Updates here have stopped; fulin.org is back in use!

Permanent blog address: http://blog.fulin.org

About me (my résumé)

Chinese: http://blog.fulin.org/aboutme-cn

English: http://blog.fulin.org/about-author

My Google Profile: http://www.google.com/profiles/tangfulin

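Self-introduction (draft)

Born in Yongzhou, Hunan, of "the wilds of Yongzhou breed strange snakes" fame, and a graduate of Beijing Normal University, whose motto is "Learn, so as to instruct others; act, to serve as example to all". From NOIP in high school to ACM at university, I have been a programming-contest participant and an algorithm enthusiast who enjoys pushing the performance of a program or system to its limit. I know PHP, some C on Linux, and a little Java, and I like CentOS and Ubuntu. I hope that through my own effort I will someday be called an "architect". I took part in developing Sina iAsk video search, Sina Podcast, Sina Pay, and the Sina Simple Storage System, and in implementing some features of Tencent 城市达人 (City Talent). I was responsible for developing and continuously improving the Lucene-based site search at 手机之家, and I led the team that built music search for the 12530 Migu (咪咕) player. I am currently leading the team building Sina Mail full-text search, the Sina Mail network disk, and the file transfer station project. I believe in open source, so I am happy to discuss anything related to any of the names mentioned above.

To learn more, you are welcome to visit my blog: http://blog.fulin.org

Or follow me on Twitter: https://twitter.com/tangfl (this currently takes some technical effort)

Or find me on Sina Weibo: 唐福林@新浪微博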

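Self-introduction (final)

Tang Fulin (唐福林), born in 1984 in the wilds of Yongzhou, which "breed strange snakes", and a graduate of Beijing Normal University, whose motto is "Learn, so as to instruct others; act, to serve as example to all".

From NOIP in high school to ACM at university, a programming-contest participant and algorithm enthusiast who enjoys pushing the performance of a program or system to its limit.

Knows PHP, some C on Linux, and a little Java; believes in open source; likes CentOS and Ubuntu.

Took part in developing Sina iAsk video search, Sina Podcast, Sina Pay, and the Sina Simple Storage System, and in implementing some features of Tencent 城市达人 (City Talent).

Currently works at 手机之家, responsible for developing and continuously improving its Lucene-based site search.

Was responsible for developing and continuously improving the Lucene-based site search at 手机之家, and led the team that built music search for the 12530 Migu (咪咕) player.

Currently leading the team building Sina Mail full-text search, the Sina Mail network disk, and the file transfer station project.

You are welcome to visit my blog at http://blog.fulin.org, follow me on Twitter at https://twitter.com/tangfl, or find me on Sina Weibo: 唐福林@新浪微博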

Contact

  • qq: 36716628 (not for technical discussion; I generally only add people I know)
  • msn: tangfulin#msn.com (technical discussion)
  • gtalk: tangfulin#gmail.com (technical discussion)
  • email: tangfulin#[gmail.com,126.com,qq.com,yahoo.com.cn]

How I grew up

My blog


A collection of mail-related RFCs

RFC | Description | Status
RFC 1939 | POP3 protocol | Updated by RFC 2449
RFC 2449 | POP3 Extension Mechanism |
RFC 822 | STANDARD FOR THE FORMAT OF ARPA INTERNET TEXT MESSAGES | Obsoleted by RFC 2822
RFC 2822 | Internet Message Format |
RFC 2045 | MIME Part One: Format of Internet Message Bodies | Updated by RFC 2231
RFC 2046 | MIME Part Two: Media Types |
RFC 2047 | MIME Part Three: Message Header Extensions for Non-ASCII Text | Updated by RFC 2231
RFC 2183 | The Content-Disposition Header Field | Updated by RFC 2231
RFC 2231 | MIME Parameter Value and Encoded Word Extensions: Character Sets,... |
RFC 2387 | The MIME Multipart/Related Content-type |
RFC 3462 | The MIME Multipart/Report Content-type |
RFC 2111 | Content-ID and Message-ID Uniform Resource Locators |
RFC 2632 | S/MIME Version 3 Certificate Handling |
RFC 2633 | S/MIME Version 3 Message Specification |
RFC 2821 | Simple Mail Transfer Protocol |

Bookmarked: a quick-reference diagram of design patterns

(Image: quick-reference diagram of design patterns)


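Music search system deployment guide

Work log. When reposting, please keep the attribution: Tang Fulin (唐福林), 博客雨, "Music search system deployment guide", http://blog.fulin.org/2009/11/pcsearch_deploy.html

                       PC Client Search System Deployment Guide
Tang Fulin <tangfulin AT gmail.com>
The PC client search system consists of two parts: IndexServer, which builds the index, and SearchServer, which serves search requests.

IndexServer (currently host 98) receives the raw XML files sent by the resource library, parses them, updates the index, and pushes the updated index to a designated directory on SearchServer for searching. In the current architecture IndexServer supports neither distribution nor load balancing and can only be deployed on a single machine. Because index updates are infrequent, this does not create a performance bottleneck. To remove the single point of failure, a backup process can be started on another machine to receive the raw XML files from the resource library while the primary IndexServer is down; to keep the index consistent, however, the backup should not perform actual index updates. The only effect of an IndexServer failure is then delayed index updates, and with the necessary monitoring in place that delay can be kept within acceptable limits. IndexServer mainly consumes disk I/O, plus some CPU and memory.

SearchServer (currently host 221) serves search requests to the outside. It currently runs in the Servlet environment provided by Resin and offers a REST-style service over HTTP. SearchServer periodically checks its working directory for a newly arrived index; if one is found, it opens the new index and closes the old one. Closed old indexes are later deleted by a script. SearchServer mainly consumes memory and network I/O.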
I. Server checks before deployment
1. Check the OS release:
     cat /etc/redhat-release
   Red Hat Enterprise Linux Server release 5.3 (Tikanga) or later
2. Check the installed Java version:
     java -version
   java version "1.5.0_15" or later
3. Check the installed rsync version:
     rsync --version
   rsync version 3.0.5 protocol version 30 or later
4. Check disk space:
     df -h
   Make sure more than 2G is free; reserving 5G is recommended to leave room for logs. If /home is short on space, a symbolic link can be used.
5. Check the svn client version:
     svn --version
   Make sure it includes the https module:
     ra_dav : Module for accessing a repository via WebDAV (DeltaV) protocol.
       - handles 'http' scheme
       - handles 'https' scheme
6. Check the ant version:
     ant -version
   Apache Ant version 1.6.2 or later
7. Check the resin version: 3.1.9 or later
II. Create the directory structure
The project root is /home/yyjd. Unless stated otherwise, all relative paths below are relative to the project root.
     mkdir /home/yyjd && cd /home/yyjd && mkdir code data software && cd data && mkdir -p backup dict indexes indexserver logs pids webapps/search xml
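III. Obtain files and data
1. Obtain the required software
   If the checks in section I fail, or if you want to install Hudson or JIRA, the software can be copied from /home/yyjd/software on host 98 and installed. An rpm package can be installed with rpm -ivh xxx.rpm; a zip or tgz archive only needs to be unpacked. Hudson is a war file that can be run directly; see the run commands documented in code/PCSearcher/trunk/shell/server-start.sh.
   Note: installing an rpm package may raise dependency problems; install the dependencies first as prompted. If the rpm installation reports a conflict, add --force to the command line.
2. Obtain the code
     cd /home/yyjd/code && svn checkout https://*.*.*.254/svn/PCSearch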
IV. Configure the related software
1. rsync: on the search machine, run vi /etc/rsyncd.conf and enter:
     uid = root
     gid = root
     max connections = 200
     timeout = 600
     use chroot = no
     read only = no
     pid file=/var/run/rsyncd.pid
     hosts allow=*.*.*.98

     [search]
     path=/home/yyjd
     comment = PCsearch project sync model
   Save and exit, then run:
     rsync --daemon
   On IndexServer, test whether rsync gets through:
     rsync -vn <search-machine-ip>::search/data/indexes .
   If it does not, test whether rsync over ssh works. First append the contents of /root/.ssh/id_rsa.pub on IndexServer (if the file does not exist, run ssh-keygen) to the end of /root/.ssh/authorized_keys on the search machine (take care not to introduce extra line breaks), then test:
     rsync -vn root@<search-machine-ip>:/home/yyjd/data/indexes .
   Once the test succeeds, append the address that worked to the end of /home/yyjd/code/PCSearcher/trunk/shell/trans-dest.conf on IndexServer.
2. resin: add the data/webapps/search directory to resin's app directory list, or create a symbolic link to data/webapps/search under resin's webapps directory.
V. Build the code
     cd /home/yyjd/code/PCSearcher/trunk && svn up && ant && ant indexserver
VI. Start the processes
Change to /home/yyjd/code/PCSearcher/trunk/shell and refer to server-start.sh.
1. IndexServer:
   a. Start resin (for testing only; optional)
   b. Start IndexServer:
        ./indexUpdater.sh restart
   c. Start the trans index transfer scripts:
        (nohup ./trans-indexsnap.sh songs >> /home/yyjd/data/logs/trans-songs.log 2>&1 &)
        (nohup ./trans-indexsnap.sh albums >> /home/yyjd/data/logs/trans-albums.log 2>&1 &)
        (nohup ./trans-indexsnap.sh keywords >> /home/yyjd/data/logs/trans-keywords.log 2>&1 &)
   d. Start the clean index cleanup scripts (required if resin is enabled):
        (nohup ./clean-indexsnap.sh songs >> /home/yyjd/data/logs/clean-songs.log 2>&1 &)
        (nohup ./clean-indexsnap.sh albums >> /home/yyjd/data/logs/clean-albums.log 2>&1 &)
        (nohup ./clean-indexsnap.sh keywords >> /home/yyjd/data/logs/clean-keywords.log 2>&1 &)
   e. Check that the logs under /home/yyjd/data/logs all look normal
2. SearchServer:
   a. Start resin
   b. Start IndexServer (optional; it can act as a backup for IndexServer's job of receiving raw XML files):
        ./indexUpdater.sh restart
   c. Start the clean index cleanup scripts:
        (nohup ./clean-indexsnap.sh songs >> /home/yyjd/data/logs/clean-songs.log 2>&1 &)
        (nohup ./clean-indexsnap.sh albums >> /home/yyjd/data/logs/clean-albums.log 2>&1 &)
        (nohup ./clean-indexsnap.sh keywords >> /home/yyjd/data/logs/clean-keywords.log 2>&1 &)
   d. Check that the logs under /home/yyjd/data/logs all look normal
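VII. Day-to-day administration
1. Shell scripts: code/PCSearcher/trunk/shell
     build.sh : used by Hudson to build and restart the IndexServer process; not recommended for manual use
     clean-indexsnap.sh : cleans up unused indexes; used on SearchServer
     indexUpdater.sh : index update daemon; accepts {start|stop|restart}
     initKeyword.sh : initializes the keyword index; not used in normal operation
     processManager.sh : low-level process start/stop management; called by the other scripts
     rebuild.sh : simulates sending the raw XML data that rebuilds the song and album indexes; used when rebuilding indexes
     server-start.sh : records the start commands that may be needed on a newly deployed machine; running it directly is not recommended
     trans-dest.conf : destination address configuration for the transfer script
     trans-indexsnap.sh : index transfer script; used on IndexServer
2. Logs: data/logs
     classpath.log : the classpath at startup of the index update daemon; for debugging
     clean-*.log : index cleanup logs
     trans-*.log : index transfer logs
     search.log : search log
     indexserver.log : index update log
   The resin logs also need to be watched.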
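VIII. Common errors and fixes
1. svn does not support https
2. ant build fails
3. indexUpdater.sh fails to start the index update daemon
4. rsync transfer fails
5. Search fails with a blank page or a 404
6. Search fails with an Exception: confirm that every subdirectory under data/indexes/ contains index files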

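The acme of music search (continued)

A summary of, and reflections on, phase one of the 12530 PC client music search project.

The PDF on SlideShare:

Text content of the slides: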

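  1. The acme of music search. Tang Fulin, tangfulin@gmail.com, http://blog.fulin.org
  2. Contents
     • Project overview
     • Requirements
     • Search implementation
     • Query examples
     • Continuous improvement
  3. Project overview (1/3)
     • China Mobile
     • 12530
     • Migu (咪咕)
     • Miniportal
     • Search
     • Outsourced to: edadao
  4. Project overview (2/3)
     • Time: September 12 to October 22, 2009
     • Place: the mobile music base, Xipu, Pixian, Chengdu
     • People:
       - Requirements: wangquanli@12530, zhengchangsong@12530
       - Developers: mike, tangfulin, xww, wanghui
       - Special contribution: dave
  5. Project overview (3/3)
     • Deployment:
       - Location: mobile music base, west-district hub machine room
       - Machines: indexing 227.98; search 227.221
       - Dual Core AMD Opteron(tm) Processor 8218 2.6G * 8
       - 8G mem
       - Red Hat Enterprise Linux Server release 5.3 (Tikanga)
       - Linux 2.6.18-128.el5PAE #1 SMP i686 athlon GNU/Linux
     • Index size: 3 index directories, 1.2G in total
     • Traffic:
     • Machine load:
  6. Requirements
     • Search fields: singer, song, album, lyrics
     • Search modes: exact match, prefix match, tokenized match, fuzzy match, full-pinyin match, pinyin-initials match, pinyin homophone match, pinyin error-correction match
     • Keyword suggestions: drop-down suggestions under the search box; correction suggestions
  7. Requirement: exact match
     • Rule: exact match, or exact match after filtering out all special characters
     • Single field:
       - Singer: 阿唬, 80前后
       - Song: 爱情BT大讲堂
       - Album: Alive!
     • Multiple fields combined:
       - Singer + song: 刘德华 今天
       - Singer + album: 许茹芸 爱.旅行.一公里
  8. Requirement: prefix match
     • Rule: prefix match after filtering out all special characters
     • Single field:
       - Singer: 刘德, 张学
       - Song:
       - Album:
  9. Requirement: tokenized match
     • Rule: tokenized match after filtering out all special characters
     • Single field:
       - Singer: 德华, 杰伦, 学友
       - Song:
       - Album:
       - Lyrics:
     • Multiple fields combined: Must first, then Should
       - Singer + song + album:
       - Singer + album:
  10. Requirement: fuzzy match
     • Rule: word match within a given fuzziness (note: very slow)
     • Single field:
       - Singer: 刘大华
       - Song: beautful, califonia
       - Album:
  11. Requirement: full-pinyin match
     • Rule: match on the pinyin the user types
     • Single field:
       - Singer: liudehua, zhangxueyou
       - Song:
       - Album:
  12. Requirement: pinyin-initials match
     • Rule: match on the pinyin initials the user types
     • Single field:
       - Singer: ldh, zxy
       - Song:
       - Album:
  13. Requirement: pinyin homophone match
     • Rule: match wrong homophone characters produced by a pinyin input method
     • Single field:
       - Singer: 柳的话, 两用器
       - Song:
       - Album:
  14. Requirement: pinyin error-correction match
     • Rule: wrong characters produced by a pinyin input method, or wrong pinyin typed directly
       - n-l, h-f, z-zh, c-ch, s-sh, an-ang, en-eng, in-ing
     • Single field:
       - Singer: 牛德华, niudehua
       - Song:
       - Album:
  15. Search implementation
     • Indexing strategy
       - Redundant fields
       - For Chinese text, the pinyin and the initials are also built into the index
     • Query strategy
       - Dropped the approach of running multiple queries
       - Adopted the approach of assembling multiple Query objects into one BooleanQuery with different weights and running a single query
  16. Search implementation: indexing strategy (1/2)
     • Singer: singer_name
       - singer_name_save: stored field, trimmed and otherwise untouched
       - singer_name_filtered: filtered field; all special characters removed, lowercased
       - singer_name_analyzed: tokenized field
       - singer_name_notanalyzed: untokenized field, used for prefix matching
       - singer_name_full: full-pinyin field
       - singer_name_first: pinyin-initials field
     • Song: song_name
     • Album: album_name
  17. Search implementation: indexing strategy (2/2)
     • Keyword: keyword
       - Every singer name, song name, album name, singer + song, and singer + album is treated as a keyword
       - Kept in a separate index
       - Used for drop-down suggestions and for correction suggestions when a search returns nothing
       - Also supports full pinyin, pinyin initials, pinyin correction, and so on
  18. Search implementation: search strategy (1/6)
     • Strategy list:
       1. Exact match: singer, song, album; untokenized field; strip leading and trailing spaces; exact match
       2. Filtered exact match: singer, song, album; filtered field; remove all special characters; lowercase English; exact match
       3. Full-pinyin match: singer, song, album; full-pinyin field; remove all non-English characters; lowercase English; exact match
       4. Homophone-correction match: singer, song, album; full-pinyin field; used only for search terms that contain Chinese; convert Chinese to pinyin; lowercase English; remove all special characters; exact match
       5. Pinyin-initials match: singer; pinyin-initials field; convert Chinese to pinyin initials; lowercase English; remove all special characters; exact match
       6. Prefix match: singer, song, album; untokenized field; strip leading and trailing spaces; lowercase English; prefix match
       7. Tokenized Must match: singer, song, album, (lyrics); tokenized field; tokenize; tokens joined with Must; tokenized match
  19. Search implementation: search strategy (2/6)
     • Strategy list (continued):
       8. Tokenized Should match: singer, song, album, (lyrics); tokenized field; tokenize; tokens joined with Should; tokenized match
       9. Combined tokenized (must) match: singer + song + album tokenized field; tokenize (currently joined with must); tokenized match
       10. Combined tokenized (should) match: singer + song + album tokenized field; tokenize (currently joined with Should); tokenized match
       11. Pinyin error-correction query (ignoring nasal finals and the like): singer, song, album; tokenized field; strip leading and trailing spaces; lowercase English
       12. Chinese fuzzy match, fuzziness 0.65 for Chinese: singer, song, album; tokenized field; strip leading and trailing spaces; lowercase English
       13. English fuzzy match, fuzziness 0.85 for English: singer, song, album; tokenized field; strip leading and trailing spaces; lowercase English
  20. Search implementation: search strategy (3/6)
     • Precision selection
       - Exact results only: exact match, filtered exact match, prefix match; full pinyin, initials; all tokens hit
       - Fuzzy results only: partial token hits; homophone correction, pinyin correction; fuzzy match; exact-match results removed
       - All results
  21. Search implementation: search strategy (4/6)
     • Field selection
       - Singer only
       - Song only
       - Album only
       - Lyrics only
       - Keyword field
       - All fields (for now excluding keywords and lyrics)
  22. Search implementation: search strategy (5/6)
     • Setting weights
       - Put all strategies into one ordered list
       - Adjacent strategies in the list differ in weight by a constant factor (currently 10). Too large a factor may overflow the Lucene score; too small a factor may let the result sets hit by different strategies overlap in the ranking
       - Reorder the strategies in the list to change the order in which the different kinds of hits appear in the results
       - Second-level weights: when searching all fields, singer > song > album, so field weights are also set inside each strategy
       - Weights for Chinese token hits: Lucene's default scoring does not take the length of the matched token into account. To show results that hit longer tokens first, each token in the tokenized Query gets a weight based on its length
  23. Search implementation: search strategy (6/6)
     • Index partitioning
       - Song index
       - Album index
       - Keyword index
     • Ranking strategy
       - Editor pinning
       - Lucene score
       - Business volume (clicks, subscriptions, plays, and so on)
       - Keyword frequency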
  24. Search example
     • Song index, all fields, exact search
     • Search "刘德华": 329 results
     • QUERY: (((singer_name_notanslysis:刘德华^9.0 song_name_notanslysis:刘德华^4.0 album_name_notanslysis:刘德华))^10000.0) (((singer_name_filtered:刘德华^9.0 song_name_filtered:刘德华^4.0 album_name_filtered:刘德华))^1000.0) (((singer_name_filtered:刘德华*^9.0 song_name_filtered:刘德华*^4.0 album_name_filtered:刘德华*))^100.0) ((((((+singer_name_anslysis:刘德华 +singer_name_anslysis:德华)^9.0) ((+song_name_anslysis:刘德华 +song_name_anslysis:德华)^4.0) (+album_name_anslysis:刘德华 +album_name_anslysis:德华))))^10.0) ((((+singer_song_album:刘德华 +singer_song_album:德华))))
  25. Search example
     • Song index, all fields, exact search
     • Search "ldh": 401 results (hits 刘德华, 刘大浩, and others)
     • QUERY: (((singer_name_notanslysis:ldh^9.0 song_name_notanslysis:ldh^4.0 album_name_notanslysis:ldh))^1000.00006) (((singer_name_filtered:ldh^9.0 song_name_filtered:ldh^4.0 album_name_filtered:ldh))^100.00001) (((singer_name_full:ldh^9.0 song_name_full:ldh^4.0 album_name_full:ldh))^10.000001) (((singer_name_first:ldh))^1.0000001) (((singer_name_filtered:ldh*^9.0 song_name_filtered:ldh*^4.0 album_name_filtered:ldh*))^0.10000001) ((((((+singer_name_anslysis:ldh)^9.0) ((+song_name_anslysis:ldh)^4.0) (+album_name_anslysis:ldh))))^0.010000001) (((((+singer_song_album:ldh))))^0.0010)
  26. Search example
     • Song index, all fields, exact search
     • Search "liudehua": 324 results (the song title 《我不是刘德华》 cannot be hit)
     • QUERY: (((singer_name_notanslysis:liudehua^9.0 song_name_notanslysis:liudehua^4.0 album_name_notanslysis:liudehua))^1000.00006) (((singer_name_filtered:liudehua^9.0 song_name_filtered:liudehua^4.0 album_name_filtered:liudehua))^100.00001) (((singer_name_full:liudehua^9.0 song_name_full:liudehua^4.0 album_name_full:liudehua))^10.000001) (((singer_name_filtered:liudehua*^9.0 song_name_filtered:liudehua*^4.0 album_name_filtered:liudehua*))^1.0000001) ((((((+singer_name_anslysis:liudehua)^9.0) ((+song_name_anslysis:liudehua)^4.0) (+album_name_anslysis:liudehua))))^0.10000001) (((((+singer_song_album:liudehua))))^0.010000001) (((singer_name_first:liudehua))^0.0010)
  27. Search example
     • Song index, all fields, fuzzy search
     • Search "刘德华": 44 results (hits 爱德华, 杨德华, and others)
     • QUERY: ((((singer_name_notanslysis:刘德华^9.0 song_name_notanslysis:刘德华^4.0 album_name_notanslysis:刘德华))^10000.0) (((singer_name_filtered:刘德华^9.0 song_name_filtered:刘德华^4.0 album_name_filtered:刘德华))^1000.0) (((singer_name_filtered:刘德华*^9.0 song_name_filtered:刘德华*^4.0 album_name_filtered:刘德华*))^100.0) ((((((+singer_name_anslysis:刘德华 +singer_name_anslysis:德华)^9.0) ((+song_name_anslysis:刘德华 +song_name_anslysis:德华)^4.0) (+album_name_anslysis:刘德华 +album_name_anslysis:德华))))^10.0) ((((+singer_song_album:刘德华 +singer_song_album:德华))))) +(((((singer_song_album:刘德华^9.0 singer_song_album:德华^4.0)))^10000.0) (((((singer_name_anslysis:刘德华^9.0 singer_name_anslysis:德华^4.0)^9.0) ((song_name_anslysis:刘德华^9.0 song_name_anslysis:德华^4.0)^4.0) (album_name_anslysis:刘德华^9.0 album_name_anslysis:德华^4.0)))^1000.0) (((((singer_name_full:liudehua)^9.0) ((song_name_full:liudehua)^4.0) (album_name_full:liudehua)))^100.0) ((())^10.0) ((singer_name_anslysis:刘德华~0.65^9.0 song_name_anslysis:刘德华~0.65^4.0 album_name_anslysis:刘德华~0.65)))
  28. Search example
     • Song index, all fields, fuzzy search
     • Search "牛德华": 840 results (hits 刘德华, 爱德华, 杨德华, and others)
     • QUERY: ((((singer_name_notanslysis:牛德华^9.0 song_name_notanslysis:牛德华^4.0 album_name_notanslysis:牛德华))^10000.0) (((singer_name_filtered:牛德华^9.0 song_name_filtered:牛德华^4.0 album_name_filtered:牛德华))^1000.0) (((singer_name_filtered:牛德华*^9.0 song_name_filtered:牛德华*^4.0 album_name_filtered:牛德华*))^100.0) ((((((+singer_name_anslysis:牛 +singer_name_anslysis:德华)^9.0) ((+song_name_anslysis:牛 +song_name_anslysis:德华)^4.0) (+album_name_anslysis:牛 +album_name_anslysis:德华))))^10.0) ((((+singer_song_album:牛 +singer_song_album:德华))))) +(((((singer_song_album:牛 singer_song_album:德华^4.0)))^10000.0) (((((singer_name_anslysis:牛 singer_name_anslysis:德华^4.0)^9.0) ((song_name_anslysis:牛 song_name_anslysis:德华^4.0)^4.0) (album_name_anslysis:牛 album_name_anslysis:德华^4.0)))^1000.0) (((((singer_name_full:niudehua)^9.0) ((song_name_full:niudehua)^4.0) (album_name_full:niudehua)))^100.0) ((())^10.0) ((singer_name_anslysis:牛德华~0.65^9.0 song_name_anslysis:牛德华~0.65^4.0 album_name_anslysis:牛德华~0.65)))
  29. Search example
     • Song index, all fields, fuzzy search
     • Search "niudehua": 324 results
     • QUERY: -((((singer_name_notanslysis:niudehua^9.0 song_name_notanslysis:niudehua^4.0 album_name_notanslysis:niudehua))^1000.00006) (((singer_name_filtered:niudehua^9.0 song_name_filtered:niudehua^4.0 album_name_filtered:niudehua))^100.00001) (((singer_name_full:niudehua^9.0 song_name_full:niudehua^4.0 album_name_full:niudehua))^10.000001) (((singer_name_filtered:niudehua*^9.0 song_name_filtered:niudehua*^4.0 album_name_filtered:niudehua*))^1.0000001) ((((((+singer_name_anslysis:niudehua)^9.0) ((+song_name_anslysis:niudehua)^4.0) (+album_name_anslysis:niudehua))))^0.10000001) (((((+singer_song_album:niudehua))))^0.010000001) (((singer_name_first:niudehua))^0.0010)) +(((((singer_song_album:niudehua^64.0)))^1000.0) (((((singer_name_anslysis:niudehua^64.0)^9.0) ((song_name_anslysis:niudehua^64.0)^4.0) (album_name_anslysis:niudehua^64.0)))^100.0) (((((singer_name_full:niudefua singer_name_full:liudehua)^9.0) ((song_name_full:niudefua song_name_full:liudehua)^4.0) (album_name_full:niudefua album_name_full:liudehua)))^10.0) ((singer_name_anslysis:niudehua~0.85^9.0 song_name_anslysis:niudehua~0.85^4.0 album_name_anslysis:niudehua~0.85)))
  30. Search example
     • Song index, singer field, exact search
     • Search "杰伦": 270 results
     • QUERY: (((singer_name_notanslysis:杰伦))^1000.0) (((singer_name_filtered:杰伦))^100.0) (((singer_name_filtered:杰伦*))^10.0) ((((+singer_name_anslysis:杰伦))))
  31. Search example
     • Song index, song field, fuzzy search
     • Search "li": 117 results (hits 你)
     • QUERY: ((((song_name_notanslysis:li))^10000.0) (((song_name_filtered:li))^1000.0) (((song_name_full:li))^100.0) (((song_name_filtered:li*))^10.0) ((((+song_name_anslysis:li))))) +(((((song_name_anslysis:li^4.0)))^100.0) ((((song_name_full:ni)))^10.0) ((song_name_anslysis:li~0.85)))
  32. Search example
     • Keyword index, searching songs
     • Search "liu", results: 流浪 流星 留恋 六月雪 浏阳河 流浪狗 流浪者之歌 留不住你的温柔
     • QUERY: +keyword_type:2 keyword_word_notanslysis:liu +(((keyword_word_filtered:liu*)^100.0) ((keyword_word_full:liu*)^100.0) ((keyword_word_first:liu*)^100.0))
  33. Search example
     • Keyword index, searching singers
     • Search "liu", results: 刘德华 刘冠群 刘亦敏 刘韵 刘若英 刘基俊 刘益中 刘芳 刘庆
     • QUERY: +keyword_type:1 keyword_word_notanslysis:liu +(((keyword_word_filtered:liu*)^100.0) ((keyword_word_full:liu*)^100.0) ((keyword_word_first:liu*)^100.0))
  34. Search example
     • Keyword index, searching all fields
     • Search "zhoujie", results (including singer + song hits): 周杰伦; 周杰磊; 周杰伦我求求你了; 周杰伦传递祝福; 周杰伦春节祝福; 周杰伦情人节祝福; 周杰伦 蒲公英的约定; 周杰伦 最长的电影; 周杰伦 阳光宅男; 周杰伦 甜甜的
     • QUERY: +(keyword_type:1 keyword_type:2 keyword_type:3) keyword_word_notanslysis:zhoujie +(((keyword_word_filtered:zhoujie*)^100.0) ((keyword_word_full:zhoujie*)^100.0) ((keyword_word_first:zhoujie*)^100.0))
  35. Search example
     • Keyword index, searching all fields
     • Search "柳的话", results (pinyin homophone hits): 刘德华; 留得华; 刘德华 冰雨; 刘德华 中国人; 刘德华 幸福这么远那么甜; 刘德华 百分百好戏; 刘德华 笑着哭; 刘德华 谢谢你的爱; 刘德华 爱你一万年; 刘德华 情义俩心坚
     • QUERY: +(keyword_type:1 keyword_type:2 keyword_type:3) keyword_word_notanslysis:柳的话 +(((keyword_word_filtered:柳的话*)^100.0) (((keyword_word_full:liudehua*)^100.0) ((keyword_word_full:liudihua*)^100.0)))
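  36. Continuous improvement
     • Performance tuning: resin, memory, cache
     • Chinese-to-pinyin conversion: polyphonic characters, special symbols
     • Pinyin correction: another approach, normalization vs. permutations
     • Keywords: Trie, ordering by frequency, adding lyrics data
     • The business wants ranking based on frequently updated business volume
     • Tag search: a new requirement
     • Searching keywords while preserving their order
     • Attempts at a custom scoring algorithm
     • Upgrading Lucene to 2.9.1; bug 1974; explain shows 0 or NaN
     • Best-fragment extraction for display; truncation of html entities
     • Highlighting: home-grown per-strategy highlighting vs. Lucene's built-in highlighting, pros, cons, and trade-offs
  37. Performance tuning
     • Goal: on a single machine with an index on the order of a million documents, under 1000 concurrent requests, 99% of responses within 0.5 seconds
     • Advantages:
       - Index updates are infrequent and can be ignored
       - The servers are decent: 8 CPUs, 8G of memory
     • Disadvantages:
       - High concurrency, and the 0.5-second response requirement is very strict
       - The phase-one code has many unreasonable parts
       - The project schedule is tight
       - Using resin as the container brings too many uncontrollable factors
  38. Chinese-to-pinyin conversion
     • The Pinyin4j dictionary
     • A polyphonic-character table compiled by ourselves
     • Currently every combination of readings of polyphonic characters is built into the index
       - 莫文蔚: mowenwei, mowenyu
       - Advantage: the entry is guaranteed to be found
       - Disadvantage: typing mowenyu finds 莫文蔚, and if a song title contains several polyphonic characters the number of combinations becomes considerable
     • Pinyin first letters, not pinyin initial consonants
       - 张学友: zxy, not zhxy
  39. Pinyin correction
     • Rules:
       - n-l, h-f, z-zh, c-ch, s-sh, an-ang, en-eng, in-ing (on-ong)
     • Current implementation:
       - Each occurrence in the user's search term is replaced in turn with its correction, and the variants are joined into one Should Query
       - Example: liudehua: niudehua, liudefua
     • Another approach: normalization
       - The replacements in the rules are allowed in one direction only: n->l, z->zh, and so on
       - Add a normalized-pinyin field to the index; for 曾经最美, for example, the field stores chengjingzhuimei
       - The user's keyword goes through the same normalization and is then queried against that field
  40. Keywords
     • Keywords are currently implemented with Lucene prefix queries on the index
     • They also include pinyin, initials, correction, and other Queries
     • There used to be a fuzzy Query as well, but it hurt query speed too much and was removed for now
     • Lyrics data is currently not built into the keyword index. With lyrics included the index would be too large and would have to be split
     • Considering a Trie:
       - Multiple trees: pinyin, initials, and Chinese each need their own tree
       - The fuzzy-query problem
  41. Ranking vs. updating
     • The business wants a song's business volume (plays, downloads, and so on) to be one basis for ranking
     • That means frequent index updates, and updating a single numeric field requires deleting and re-adding the whole document, which is not worth it
     • The plan is to override Lucene's Collection class and load the ranking field values ourselves instead of reading them from the index
     • Problem: 2.4 and 2.9 implement this area very differently, so there is no seamless switch
  42. Tag search
     • Tags with fixed dimensions, filled in by editors, not user-generated content
     • Examples: Olympics, free, ringtone, happy, sad, and so on
     • The key is product design, not technical implementation
     • Reference: google 泡泡挑歌
  43. Preserving the order of search keywords
     • A requirement whose implementation was postponed
     • Only hit results in which the keywords appear in the same order as the user typed them
     • Examples:
       - "谢谢 爱" hits "谢谢你的爱", but "爱 谢谢" does not
       - "眼睛 背叛 心" hits "你的眼睛背叛了你的心", but "心 背叛 眼睛" does not
     • Implementation:
       - Token hits cannot keep order information
       - Fuzzy queries are too inefficient
       - ???
  44. Scoring algorithm
     • Requirements:
       - Product staff do not understand Lucene's scoring algorithm and ask for ranking purely by one criterion, such as the number of tokens hit
       - A side effect of combining multiple Queries: the final score is the sum of the scores of the individual Queries, so the results hit by different Queries overlap in the ranking (idea: replace the sum with a maximum?)
       - An idea: for the same kind of token hit, rank results that hit longer tokens first
     • Lesson:
       - Do not take on a job you lack the tools for. Do not change Lucene's scoring algorithm lightly
  45. Choosing a Lucene version
     • First choice: 2.9.0
     • Bug 1974: https://issues.apache.org/jira/browse/LUCENE-1974
     • Switched to 2.4.1
     • For performance and for the long term, we still hope to switch back to 2.9.1
     • The Explain call returns a score of 0 or NaN
  46. Snippet truncation
     • Miniportal space is limited; song, album, and even singer names may need to be truncated
     • Truncation has to take highlighting into account
     • Truncation has to take html entities into account
  47. Highlighting
     • Strengths and weaknesses of Lucene highlighting
       - Strengths: standard, upgradable
       - Weaknesses: does not meet the requirements (prefix highlighting, pinyin highlighting, and so on)
     • Home-grown highlighting
       - Highlight according to each kind of Query, then merge at the end
  48. Further discussion: http://blog.fulin.org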
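
Slide 43 leaves the order-preserving requirement open ("???"). One possible direction, offered only as a sketch and not something the slides describe, is to post-filter candidate results by checking that the keywords occur in the stored title in the order typed. The function name `matches_in_order` is hypothetical.

```python
def matches_in_order(title, keywords):
    """Return True if every keyword occurs in title, in the given order.

    Greedy left-to-right scan: each keyword must start at or after the
    end of the previous keyword's match.
    """
    position = 0
    for word in keywords:
        found = title.find(word, position)
        if found < 0:
            return False
        position = found + len(word)
    return True


if __name__ == "__main__":
    print(matches_in_order("谢谢你的爱", ["谢谢", "爱"]))
    print(matches_in_order("谢谢你的爱", ["爱", "谢谢"]))
```

Such a filter would run on the (small) set of hits returned by the tokenized query, so its cost depends on the page size rather than on the index size.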

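The acme of music search

Search for the 12530 PC client, Migu (咪咕; there is a very inconspicuous download link at the bottom of the page), was originally scheduled to go into internal testing today and to launch officially on the 20th together with the resource-library back end. In fact, it already replaced the old interface on the production servers last night. Precisely because of that quiet launch, we had already left work and reached our front doors when a phone call summoned us back to the office to fix a newly discovered bug.

In this first phase, music search does nothing special for lyrics; it only optimizes for singer, song, and album names. With such a small data set (fewer than one million songs in total), the effort had to go into the query side. We currently have ten layers of Query: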

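1. Exact match: singer, song, album; untokenized field; strip leading and trailing spaces; exact match
2. Filtered exact match: singer, song, album; filtered field; remove all special characters; lowercase English; exact match
3. Full-pinyin match: singer, song, album; full-pinyin field; remove all non-English characters; lowercase English; exact match
4. Homophone-correction match: singer, song, album; full-pinyin field; used only for search terms that contain Chinese; convert Chinese to pinyin; lowercase English; remove all special characters; exact match
5. Pinyin-initials match: singer; pinyin-initials field; convert Chinese to pinyin initials; lowercase English; remove all special characters; exact match
6. Prefix match: singer, song, album; untokenized field; strip leading and trailing spaces; lowercase English; prefix match
7. Tokenized Must match: singer, song, album, (lyrics); tokenized field; tokenize; tokens joined with Must; tokenized match
8. Tokenized Should match: singer, song, album, (lyrics); tokenized field; tokenize; tokens joined with Should; tokenized match
9. Combined tokenized match: singer + song + album tokenized field; tokenize (currently joined with Should); tokenized match
10. Fuzzy match: singer, song, album; tokenized field; strip leading and trailing spaces; lowercase English; fuzzy match, with fuzziness 0.65 when the term contains Chinese and 0.85 when it is all English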

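Fuzzy matching is further split into two levels:

a. Pinyin correction
b. Fuzzy query, covering Chinese fuzzy and English fuzzy (with different fuzziness values)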
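
The "filtered" step used by layers 2 to 5 above (remove all special characters, lowercase English) can be sketched as follows. This is an illustration under an assumption of mine: "special character" is taken to mean anything that Unicode does not classify as a letter or a digit. The function name `filter_text` is hypothetical, not the project's actual code.

```python
import unicodedata


def filter_text(text):
    """Keep only letters and digits (any script), then lowercase.

    Unicode general categories starting with "L" are letters (CJK
    characters are "Lo") and those starting with "N" are numbers.
    """
    kept = [ch for ch in text if unicodedata.category(ch)[0] in ("L", "N")]
    return "".join(kept).lower()


if __name__ == "__main__":
    print(filter_text("Alive!"))
    print(filter_text("爱.旅行.一公里"))
```

Applying the same function at index time and at query time is what lets a title such as "爱.旅行.一公里" match a query typed without the dots.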

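Pinyin fuzzy matching is currently implemented by generating combinations:

1. At indexing time, the full-pinyin field holds the exact pinyin of the field, including every combination of readings for polyphonic characters
2. At search time, the user's keyword is converted to pinyin and searched in the full-pinyin field
3. For fuzzy search, the pinyin of the user's keyword is rewritten according to the fuzzy rules: n-l swap; zh-z, ch-c, sh-s swaps; an-ang, en-eng, in-ing, on-ong swaps. Only one swap is applied at a time (currently only pinyin fuzzy queries with fuzziness 1 are supported). If there are several places where a swap is possible, the result is an array of variants, each of which is then looked up in the full-pinyin field with an exact match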
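
The single-swap generation in step 3 can be sketched like this. The sketch assumes the pinyin has already been split into syllables (segmentation is not shown and is my assumption), and the names `syllable_variants` and `query_variants` are hypothetical.

```python
# Swap tables taken from the rules listed above; each pair works both ways.
INITIAL_SWAPS = {"n": "l", "l": "n", "zh": "z", "z": "zh",
                 "ch": "c", "c": "ch", "sh": "s", "s": "sh"}
FINAL_SWAPS = {"an": "ang", "ang": "an", "en": "eng", "eng": "en",
               "in": "ing", "ing": "in", "on": "ong", "ong": "on"}


def syllable_variants(syllable):
    """Return the variants of one syllable that differ by exactly one swap."""
    variants = []
    # Try two-letter initials (zh, ch, sh) before one-letter ones.
    for length in (2, 1):
        head = syllable[:length]
        if head in INITIAL_SWAPS and len(syllable) > length:
            variants.append(INITIAL_SWAPS[head] + syllable[length:])
            break
    # Try three-letter finals (ang, eng, ...) before two-letter ones.
    for length in (3, 2):
        if len(syllable) >= length and syllable[-length:] in FINAL_SWAPS:
            tail = syllable[-length:]
            variants.append(syllable[:-length] + FINAL_SWAPS[tail])
            break
    return variants


def query_variants(syllables):
    """Return every full-pinyin string reachable with a single swap."""
    syllables = list(syllables)
    results = []
    for index, syllable in enumerate(syllables):
        for alternative in syllable_variants(syllable):
            changed = syllables[:index] + [alternative] + syllables[index + 1:]
            results.append("".join(changed))
    return results


if __name__ == "__main__":
    print(query_variants(["liu", "de", "hua"]))
```

Each returned string would then become one exact-match clause on the full-pinyin field, joined with Should.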

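There is another approach.

First define a so-called pinyin normalization process:
n->l, zh->z, ch->c, sh->s, an->ang, en->eng, in->ing, on->ong. These are not swaps but one-way replacements.
The string obtained after replacing every replaceable position in a pinyin string is called the normalized string.
1. At indexing time, add one normalized-string field each for song name, singer name, and album name, tokenized on "," (for polyphonic characters), storing the normalized pinyin string of the field
2. At search time, convert the user's keyword to pinyin and search in the full-pinyin field
3. For fuzzy search, convert the pinyin of the user's keyword to its normalized string and search in the normalized-string field

Advantage: no need for such complicated combination logic
Disadvantage: the fuzziness cannot be controlled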
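
A minimal sketch of the normalization approach, again assuming the input is already split into syllables. The replacement directions follow the list above; the function names are hypothetical.

```python
INITIAL_CANON = {"zh": "z", "ch": "c", "sh": "s", "n": "l"}
FINALS_TO_EXTEND = ("an", "en", "in", "on")  # each becomes ang, eng, ing, ong


def normalize_syllable(syllable):
    """Apply the one-way replacements to a single pinyin syllable."""
    for head in ("zh", "ch", "sh", "n"):
        if syllable.startswith(head) and len(syllable) > len(head):
            syllable = INITIAL_CANON[head] + syllable[len(head):]
            break
    if syllable.endswith(FINALS_TO_EXTEND):
        syllable += "g"
    return syllable


def normalize(syllables):
    """Return the normalized string for a sequence of syllables."""
    return "".join(normalize_syllable(s) for s in syllables)


if __name__ == "__main__":
    # Both spellings collapse to the same normalized string.
    print(normalize(["niu", "de", "hua"]), normalize(["liu", "de", "hua"]))
```

Because both the indexed value and the query go through the same function, any number of confusable positions collapse to one string, which is why this variant needs no combination logic and also why it cannot limit the fuzziness.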

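The biggest problem with assembling one huge Query and sending it to Lucene is that the ranking is hard to control. We kept reading the scoring details from Lucene Explain, fine-tuning the boost weights between Queries, reading the scoring details again, and fine-tuning again. The tokenized-hit Query was the worst: Lucene's default scoring rules for token hits never felt quite right. I wrote a Similarity subclass to compute the scores, but I am not a specialist and did not think it through thoroughly enough; solving one problem brought more problems as side effects. In the end I had to give up on that direction.

An unexpected gain from assembling one big Query was finding a bug in Lucene 2.9.0: LUCENE-1974. After I reported it to the official JIRA it was quickly confirmed and fixed, and the TestCase I submitted was accepted into Lucene's test suite. Unfortunately, until 2.9.1 came out we still had to switch the project back to 2.4.1 to avoid this bug.

The IK analyzer we use now behaves somewhat oddly, and there is nothing to configure. For 天龙八部, maximum-granularity segmentation produces 天 and 八, which we do not want, while longest-match segmentation fails to produce 天龙. Very frustrating.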

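Because the Query is so complex, Lucene's built-in highlighting does not give satisfying results, so the highlighting is also entirely our own. Following the pattern of the Query, we defined a series of rules, such as full hit, pinyin hit, and token hit, recorded the segments matched by each rule, and merged them once at the end.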
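
The final merge can be sketched as a standard interval merge. The sketch assumes each rule reports its matches as half-open `(start, end)` character offsets into the displayed text; that representation, the tag names, and the function names are my assumptions.

```python
def merge_spans(spans):
    """Merge overlapping or touching (start, end) spans, end exclusive."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            previous_start, previous_end = merged[-1]
            merged[-1] = (previous_start, max(previous_end, end))
        else:
            merged.append((start, end))
    return merged


def highlight(text, spans, open_tag="<em>", close_tag="</em>"):
    """Wrap every merged span of text in the given tags."""
    pieces = []
    position = 0
    for start, end in merge_spans(spans):
        pieces.append(text[position:start])
        pieces.append(open_tag + text[start:end] + close_tag)
        position = end
    pieces.append(text[position:])
    return "".join(pieces)


if __name__ == "__main__":
    # A prefix rule matched "刘德" and a token rule matched "德华".
    print(highlight("刘德华 冰雨", [(0, 2), (1, 3)]))
```

Merging before wrapping avoids nested or doubled tags when two rules hit overlapping parts of the same name.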

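Keyword suggestion means dropping down a list of suggestions while the user types in the search box. The current approach is a separate keyword index that is matched by prefix as the user types. A Trie for Chinese is considerably more complex than one for English, so we did not choose it at the start. But the keyword index has turned out to be too large and Lucene updates too slow, and now we regret the initial choice. Keywords need to carry frequency and search-volume information, so there will certainly be many updates in the future. Tomorrow we will keep looking for a way to solve this.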
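
For comparison, a Trie-based suggester of the kind weighed above might look roughly like this. It is a toy in-memory sketch, not the project's implementation: the class names, the `count` field used for frequency, and the tie-breaking rule are all assumptions.

```python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.count = 0  # a value above zero marks the end of a keyword


class KeywordTrie:
    def __init__(self):
        self.root = TrieNode()

    def add(self, word, count=1):
        """Insert a keyword, or raise its frequency if already present."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.count += count

    def suggest(self, prefix, limit=10):
        """Return up to `limit` keywords with this prefix, most frequent first."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        found = []
        stack = [(node, prefix)]
        while stack:
            current, word = stack.pop()
            if current.count > 0:
                found.append((word, current.count))
            for ch, child in current.children.items():
                stack.append((child, word + ch))
        found.sort(key=lambda item: (-item[1], item[0]))
        return [word for word, _ in found[:limit]]


if __name__ == "__main__":
    trie = KeywordTrie()
    trie.add("刘德华", 50)
    trie.add("刘若英", 30)
    trie.add("流浪", 5)
    print(trie.suggest("刘"))
```

Updating a frequency is a single `add` call here, which is the property the Lucene-based keyword index lacks. Pinyin and initials lookups would need their own tries keyed on the converted strings.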

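After finishing this project, probably no search feature will scare me any more.

Original work. When reposting, please credit the source: Tang Fulin (唐福林), 博客雨, http://blog.fulin.org/2009/10/acme_of_music_search.html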


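About music search

Music search is a kind of vertical search, but it has some distinctive requirements of its own.

First, almost every music search implements keyword suggestion as the user types. Yet the technical articles you can find online mostly explain how to build the front-end presentation layer in JS, and the few that cover the back-end implementation are too simplistic. The standard approach is a Trie, but it is rather obscure and not very intuitive. We plan to implement it directly with Lucene prefix queries and to write a fairly detailed explanation after the project goes live.

Second, many music searches offer pinyin lookup. If the user types "liudehua", the keyword suggestions show "刘德华"; and even if the user ignores the suggestions and submits directly, the server can still find entries for "刘德华". The user can even type the pinyin initials "ldh" and still match "刘德华". This is mainly a response to the characteristics of music-search users (young? lazy? novice Internet users?) and to the fact that some artists' names really are hard to write. Technically it is simple: at indexing time, convert song names, singer names, and so on to pinyin and index that as well. The one thing that needs attention is the handling of polyphonic characters.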
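
One way to handle polyphonic characters at indexing time is to emit every combination of readings. The sketch below uses a tiny hand-written readings table as a stand-in for a real dictionary; the table contents, the fallback for unknown characters, and the function name are assumptions for illustration only.

```python
from itertools import product

# Hypothetical miniature readings table; a real one would come from a
# pinyin dictionary plus a curated list of polyphonic characters.
READINGS = {"莫": ["mo"], "文": ["wen"], "蔚": ["wei", "yu"]}


def pinyin_combinations(name, readings=READINGS):
    """Return (full_pinyin_strings, first_letter_strings) for a name.

    Characters missing from the table are passed through lowercased.
    """
    per_char = [readings.get(ch, [ch.lower()]) for ch in name]
    combos = list(product(*per_char))
    full = ["".join(combo) for combo in combos]
    firsts = sorted({"".join(s[0] for s in combo) for combo in combos})
    return full, firsts


if __name__ == "__main__":
    print(pinyin_combinations("莫文蔚"))
```

The number of strings is the product of the reading counts per character, so a title with several polyphonic characters grows quickly.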

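Third, some search engines, such as qq music, offer homophone correction: when the user types "周洁论", they hit the results for "周杰伦". With the pinyin index from the previous step, this is easy to implement too. Going one step further, considering the accent differences between northern and southern China, many people do not distinguish en from eng, zh from z, or n from l; a few simple replacements during search, and pinyin fuzzy correction follows naturally.

Finally, fuzzy search on Chinese characters. A standard example of ours: if the user types "刘大华", can it hit "刘德华"? Technically it certainly can, since lucene itself provides such a query; the question is whether, as product design, it amounts to thinking on the user's behalf. That is for the product people to weigh carefully.

So much for features; now for ranking.

The most basic ranking is of course by document relevance, that is, by lucene's score. But sometimes songs recommended by editors must come first, which is fairly easy to implement. Songs with high click-through rates should also rank near the top, and that is more troublesome, because it involves frequent field updates and fine-tuning of boost values.

The most troublesome part is the pile of special handling described above. When the user types a term, exact matches obviously come first; if there is no exact Chinese match, full-pinyin matches will do; tokenized or partial matches come next; after those come prefix search, homophone correction, and fuzzy search matches. The initial idea was always multiple searches, but with multiple searches, first, the so-called exact match cannot be controlled; second, packing the results of several searches together for ranking is a hassle; and third, the logic of multiple searches is itself very complex. Today, however, I learned a trick that, performance cost aside, is a killer move of dragon-slaying-sabre calibre: pack multiple Query objects together and search once! Ranking is then handled with Query.setBoost. As for exact matching, add a redundant, untokenized field.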
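
The "one packed query with tiered boosts" idea can be illustrated by building the query as a Lucene query-syntax string. This is only a sketch: the project assembles Query objects in Java, the field suffixes and weights below are illustrative, and the term is inserted without escaping query-syntax special characters.

```python
# Strategies in priority order: (field suffix, term template).
STRATEGIES = [
    ("notanalyzed", "{term}"),   # exact match on the untokenized field
    ("filtered", "{term}"),      # exact match on the filtered field
    ("filtered", "{term}*"),     # prefix match
]
# Second-level weights inside each strategy: singer > song > album.
FIELDS = [("singer_name", 9.0), ("song_name", 4.0), ("album_name", 1.0)]


def build_query(term, strategies=STRATEGIES, fields=FIELDS, tier_factor=10):
    """Return one query string whose clauses are boosted by strategy tier."""
    clauses = []
    boost = float(tier_factor ** (len(strategies) - 1))
    for suffix, template in strategies:
        text = template.format(term=term)
        parts = " ".join(
            f"{field}_{suffix}:{text}^{weight}" for field, weight in fields
        )
        clauses.append(f"(({parts})^{boost})")
        boost /= tier_factor
    return " ".join(clauses)


if __name__ == "__main__":
    print(build_query("ldh"))
```

Reordering `STRATEGIES` changes which kind of hit ranks first, and `tier_factor` controls how far apart the tiers sit.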

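I set up Hudson, wrote a build.xml that looks fairly complicated, and now watch it build, test, and release automatically every day, which does give a certain sense of achievement.

I have started writing test cases. While writing them I keep wondering: how should a search engine project test functional correctness, and how should the quality of search results be evaluated?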


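Repost: Microsoft's product lines

Most Popular: Microsoft's most important products and strategies

·Bing: the Bing search engine, Microsoft's most important strategy for the next few years

·Bing cashback: the cash-back program for users of the Bing search engine

·Internet Explorer: the IE browser. The latest version is 8.0

·Microsoft Advertising: Microsoft's advertising network

·Office: office software. Versions: 95 to 2007, including Word, Excel, PowerPoint, Access, Outlook, FrontPage, InfoPath, OneNote, Project, Publisher, SharePoint, Visio, Communicator. One of Microsoft's two most profitable software families.

·Windows: operating system. Versions: Vista, XP, 2003, NT, 2000, me, 98, 95, and the latest Windows 7, in personal, server, and other editions. One of Microsoft's two most profitable software families.

·Windows Live: Microsoft's collection of online Internet software, including Hotmail, Live Mail, MSN Messenger, Toolbar, and Spaces (blogs). It was the umbrella for Microsoft's search strategy before Bing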

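Business Software

·Microsoft Amalga: healthcare enterprise system. Launched in early 2008; in China, the hospital 青医附院 made a high-profile purchase of it

·Microsoft Dynamics Products: Microsoft business solutions (CRM, ERP, and so on), with considerable market share worldwide

·Microsoft Forefront: server security products, mainly Microsoft Forefront Client Security (FCS), Forefront Security for Exchange Server, and Forefront Security for SharePoint

·Microsoft Office Live: the online version of Office, one of the key strategies for challenging Google Docs

·Microsoft Online Services: enterprise communication and collaboration software. Not yet available in China

·Virtual Earth: similar to Google Earth; now placed under the newly consolidated Bing

·Windows Essential Business Server: EBS, formerly called Windows Centro. A 64-bit Windows integrated solution for small and medium-sized businesses that provides a unified management platform for key IT tasks

·Windows Small Business Server: SBS, formerly called Windows Cougar. A 64-bit Windows integrated solution for small businesses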

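Design & User Experience

·Microsoft Expression: interface design tools for the Windows platform. Includes the web design tool Expression Web, the replacement for FrontPage; the professional graphic design tool Expression Design, similar to Adobe Photoshop; the asset management tool Expression Media; and the visual interaction tool Expression Blend. Interfaces really do matter: from DOS to Windows 7, the interface has changed the whole history of IT.

·Microsoft Silverlight: a cross-browser, cross-platform technology for designing, developing, and publishing rich interactive web applications (RIA, Rich Internet Application) with multimedia experiences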

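Developer Tools

·.NET Framework: development component platform for Windows. Latest version 4 Beta; mature versions 1.1, 2.0, 3.0, 3.5

·ASP.NET: web application development

·MSDN Subscriptions: paid MSDN subscriptions

·Robotics Developer Studio: RDS, a robotics software development platform

·Visual Basic: VB, a mainstream software development tool

·Visual C++: VC, a software development tool

·Visual C#: a software development tool

·Visual Studio: the mainstream integrated suite of development tools. Contains several tools and supports multiple languages; the latest version is VS 2010

·XNA: a 3D game development environment based on DirectX. XNA is Microsoft's next-generation software development platform, aimed at helping developers build better games faster.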

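Entertainment

·DirectX: an application programming interface (API) that lets Windows-based games and multimedia programs run more efficiently and enhances 3D graphics and sound. It also gives designers a common hardware driver standard, so game developers do not have to write a different driver for each brand of hardware, and it reduces the complexity of installing and configuring hardware for users. Its APIs cover four areas: display, sound, input, and networking.

·Microsoft Mediaroom: IPTV platform, launched in mid-2007

·MSN: (Microsoft Service Network). Its most commonly used part is Messenger, and "MSN" is usually used to mean Messenger. The latest version is 9.0

·MSN Games: online games, similar to QQ Game

·MSNBC: a cable television company formed as a joint venture between NBC (the National Broadcasting Company) and Microsoft, which launched its 24-hour news channel in the United States together with its interactive online news service

·PC Gaming: single-player PC games, of which there are a great many

·Windows Media Center: multimedia entertainment center, further integrated in the Vista operating system

·Xbox Home: an upgrade of Xbox Live; home game console

·Xbox Live: multi-user online gaming platform

·Zune: portable media player, a product similar to Apple's iPod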

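Home & Educational Software

·AutoCollage: a small photo collage tool

·Encarta: digital multimedia encyclopedia. Microsoft is rumored to be shutting Encarta down on October 31 this year; it does not make money

·Healthvault: a free online service that lets users collect and manage health and medical information about themselves and their families and benefit from it

·Money: personal and household finance software for managing income and expenses, filing taxes, and so on

·MSN Direct: real-time information service covering traffic, weather, news, stocks, and more, which can be sent to wireless devices such as a GPS. Get up-to-date traffic, current gas prices, weather reports and Doppler maps, news, stocks, local events, movie information, flight status, and Send to GPS all sent wirelessly to your navigation device.

·MSN Internet Access: MSN's Internet connection tool

·Office Home & Student: office software for students and home users

·Songsmith: automatic accompaniment software. It composes and generates accompaniment and background music based on the singer's voice and the chosen musical style

·Streets & Trips: Microsoft's electronic map software, a route-planning or map-style GPS navigation package

·Windows Home Server: helps families bring their digital material together in one place so they can share and access important files, photos, videos, and music easily and securely.

·Windows Live OneCare: the antivirus software under Windows Live and Microsoft's first antivirus product in the security field. It will probably move under Bing next, and may be discontinued and replaced by Morro. In addition, Microsoft Security Essentials antivirus software was released this month (June 2009)

·Works: a general-purpose home software suite aimed at low-end home users. It provides basic productivity tools, such as entry-level word processing, database, and spreadsheet functions, making everyday life simple and ...

·WorldWide Telescope: lets users browse the night sky on the desktop and zoom in on different regions. Its data comes from the Hubble telescope and nearly ten astronomical telescopes around the world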

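Servers

·BizTalk Server: a server product Microsoft designed for medium and large enterprises. Its features include business process automation, business process management, enterprise application integration, and business-to-business integration.

·Exchange Server: email, messaging, and collaboration server

·Server Trials: server evaluation versions

·SQL Server: database server. Versions: 7, 2000, 2005; the latest is 2008

·TechNet Subscriptions: subscriptions to technical information on all Microsoft enterprise products, plus disc ordering

·Windows Server: server operating system; the latest version is Windows Server 2008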

*This is by Jasper Zhu 2009.06

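Mobile Devices & Software

·Microsoft Tag: Microsoft's 2D barcode. 2D barcodes have become increasingly common in recent years and can carry richer information than one-dimensional barcodes.

·Mobile Software Catalog: covers a large range of software for mobile devices

·Ultra-Mobile PC: ultra-mobile portable PC

·Windows Mobile: the "mobile edition of Windows" for handheld devices. Devices running Windows Mobile are mainly phones, PDAs, and portable music players. There are three Windows Mobile editions: Windows Mobile Standard, Windows Mobile Professional, and Windows Mobile Classic

·Windows Mobile Devices: Windows mobile devices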

Macintosh: software for Apple's operating system

· All Macintosh Products

·Mac Expression

·Mac Mouse & Keyboard Products

·Mac Office

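Hardware

·All PC Hardware: produces almost every kind of PC component and consumable

·Digital Communications: digital communication devices

·Media Center Peripherals: peripherals for the media center

·Microsoft Surface: Microsoft's first tabletop computer. It has no mouse or keyboard; users interact with it through gestures, touch, and other physical objects, changing the way people interact with information

·Mouse & Keyboard Products: mouse and keyboard products

·MSN TV: TV and set-top box

·PC Gaming Hardware: hardware for PC games, such as gamepads

·Xbox Gaming: Xbox consoles and controllers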

Reposted from: http://hi.baidu.com/zhuxp/


Repost: the complete list of JDK 6 options

The most complete list of -XX options for Java 6 JVM

  • product flags are always settable / visible
  • develop flags are settable / visible only during development and are constant in the PRODUCT version
  • notproduct flags are settable / visible only during development and are not declared in the PRODUCT version
  • diagnostic options not meant for VM tuning or for product modes. They are to be used for VM quality assurance or field diagnosis of VM bugs. They are hidden so that users will not be encouraged to try them as if they were VM ordinary execution options. However, they are available in the product version of the VM. Under instruction from support engineers, VM customers can turn them on to collect diagnostic information about VM problems. To use a VM diagnostic option, you must first specify +UnlockDiagnosticVMOptions. (This master switch also affects the behavior of -Xprintflags.)
  • manageable flags are writeable external product flags. They are dynamically writeable through the JDK management interface (com.sun.management.HotSpotDiagnosticMXBean API) and also through JConsole. These flags are external exported interface (see CCC). The list of manageable flags can be queried programmatically through the management interface.
  • product_rw flags are writeable internal product flags. They are like "manageable" flags but for internal/private use. The list of product_rw flags are internal/private flags which may be changed/removed in a future release. It can be set through the management interface to get/set value when the name of flag is supplied.
  • product_pd
  • develop_pd
Name Description Default Type
product
UseMembar (Unstable) Issues membars on thread state transitions false bool
PrintCommandLineFlags Prints flags that appeared on the command line false bool
JavaMonitorsInStackTrace Print info. about Java monitor locks when the stacks are dumped true bool
LargePageSizeInBytes Large page size (0 to let VM choose the page size 0 uintx
LargePageHeapSizeThreshold Use large pages if max heap is at least this big 128*M uintx
ForceTimeHighResolution Using high time resolution(For Win32 only) false bool
PrintVMQWaitTime Prints out the waiting time in VM operation queue false bool
PrintJNIResolving Used to implement -v:jni false bool
UseInlineCaches Use Inline Caches for virtual calls true bool
UseCompilerSafepoints Stop at safepoints in compiled code true bool
UseSplitVerifier use split verifier with StackMapTable attributes true bool
FailOverToOldVerifier fail over to old verifier when split verifier fails true bool
SuspendRetryCount Maximum retry count for an external suspend request 50 intx
SuspendRetryDelay Milliseconds to delay per retry (* current_retry_count) 5 intx
UseSuspendResumeThreadLists Enable SuspendThreadList and ResumeThreadList true bool
MaxFDLimit Bump the number of file descriptors to max in solaris. true bool
BytecodeVerificationRemote Enables the Java bytecode verifier for remote classes true bool
BytecodeVerificationLocal Enables the Java bytecode verifier for local classes false bool
PrintGCApplicationConcurrentTime Print the time the application has been running false bool
PrintGCApplicationStoppedTime Print the time the application has been stopped false bool
ShowMessageBoxOnError Keep process alive on VM fatal error false bool
SuppressFatalErrorMessage Do NO Fatal Error report [Avoid deadlock] false bool
OnError Run user-defined commands on fatal error; see VMError.cpp for examples "" ccstr
OnOutOfMemoryError Run user-defined commands on first java.lang.OutOfMemoryError "" ccstr
PrintCompilation Print compilations false bool
StackTraceInThrowable Collect backtrace in throwable when exception happens true bool
OmitStackTraceInFastThrow Omit backtraces for some 'hot' exceptions in optimized code true bool
ProfilerPrintByteCodeStatistics Prints byte code statistics when dumping profiler output false bool
ProfilerRecordPC Collects tick for each 16 byte interval of compiled code false bool
ProfileVM Profiles ticks that fall within VM (either in the VM Thread or VM code called through stubs) false bool
ProfileIntervals Prints profiles for each interval (see ProfileIntervalsTicks) false bool
RegisterFinalizersAtInit Register finalizable objects at end of Object.<init> or after allocation. true bool
ClassUnloading Do unloading of classes true bool
ConvertYieldToSleep Converts yield to a sleep of MinSleepInterval to simulate Win32 behavior (SOLARIS only) false bool
UseBoundThreads Bind user level threads to kernel threads (for SOLARIS only) true bool
UseLWPSynchronization Use LWP-based instead of libthread-based synchronization (SPARC only) true bool
SyncKnobs (Unstable) Various monitor synchronization tunables "" ccstr
EmitSync (Unsafe,Unstable) Controls emission of inline sync fast-path code 0 intx
AlwaysInflate (Unstable) Force inflation 0 intx
Atomics (Unsafe,Unstable) Diagnostic - Controls emission of atomics 0 intx
EmitLFence (Unsafe,Unstable) Experimental 0 intx
AppendRatio (Unstable) Monitor queue fairness 11 intx
SyncFlags (Unsafe,Unstable) Experimental Sync flags 0 intx
SyncVerbose (Unstable) 0 intx
ClearFPUAtPark (Unsafe,Unstable) 0 intx
hashCode (Unstable) select hashCode generation algorithm 0 intx
WorkAroundNPTLTimedWaitHang (Unstable, Linux-specific) avoid NPTL-FUTEX hang pthread_cond_timedwait 1 intx
FilterSpuriousWakeups Prevent spurious or premature wakeups from object.wait (Solaris only) true bool
AdjustConcurrency call thr_setconcurrency at thread create time to avoid LWP starvation on MP systems (For Solaris Only) false bool
ReduceSignalUsage Reduce the use of OS signals in Java and/or the VM false bool
AllowUserSignalHandlers Do not complain if the application installs signal handlers (Solaris & Linux only) false bool
UseSignalChaining Use signal-chaining to invoke signal handlers installed by the application (Solaris & Linux only) true bool
UseAltSigs Use alternate signals instead of SIGUSR1 & SIGUSR2 for VM internal signals. (Solaris only) false bool
UseSpinning Use spinning in monitor inflation and before entry false bool
PreSpinYield Yield before inner spinning loop false bool
PostSpinYield Yield after inner spinning loop true bool
AllowJNIEnvProxy Allow JNIEnv proxies for jdbx false bool
JNIDetachReleasesMonitors JNI DetachCurrentThread releases monitors owned by thread true bool
RestoreMXCSROnJNICalls Restore MXCSR when returning from JNI calls false bool
CheckJNICalls Verify all arguments to JNI calls false bool
UseFastJNIAccessors Use optimized versions of Get<Primitive>Field true bool
EagerXrunInit Eagerly initialize -Xrun libraries; allows startup profiling, but not all -Xrun libraries may support the state of the VM at this time false bool
PreserveAllAnnotations Preserve RuntimeInvisibleAnnotations as well as RuntimeVisibleAnnotations false bool
LazyBootClassLoader Enable/disable lazy opening of boot class path entries true bool
UseBiasedLocking Enable biased locking in JVM true bool
BiasedLockingStartupDelay Number of milliseconds to wait before enabling biased locking 4000 intx
BiasedLockingBulkRebiasThreshold Threshold of number of revocations per type to try to rebias all objects in the heap of that type 20 intx
BiasedLockingBulkRevokeThreshold Threshold of number of revocations per type to permanently revoke biases of all objects in the heap of that type 40 intx
BiasedLockingDecayTime Decay time (in milliseconds) to re-enable bulk rebiasing of a type after previous bulk rebias 25000 intx
TraceJVMTI Trace flags for JVMTI functions and events "" ccstr
StressLdcRewrite Force ldc -> ldc_w rewrite during RedefineClasses false bool
TraceRedefineClasses Trace level for JVMTI RedefineClasses 0 intx
VerifyMergedCPBytecodes Verify bytecodes after RedefineClasses constant pool merging true bool
HPILibPath Specify alternate path to HPI library "" ccstr
TraceClassResolution Trace all constant pool resolutions (for debugging) false bool
TraceBiasedLocking Trace biased locking in JVM false bool
TraceMonitorInflation Trace monitor inflation in JVM false bool
Use486InstrsOnly Use 80486 Compliant instruction subset false bool
UseSerialGC Tells whether the VM should use serial garbage collector false bool
UseParallelGC Use the Parallel Scavenge garbage collector false bool
UseParallelOldGC Use the Parallel Old garbage collector false bool
UseParallelOldGCCompacting In the Parallel Old garbage collector use parallel compaction true bool
UseParallelDensePrefixUpdate In the Parallel Old garbage collector use parallel dense prefix update true bool
HeapMaximumCompactionInterval How often should we maximally compact the heap (not allowing any dead space) 20 uintx
HeapFirstMaximumCompactionCount The collection count for the first maximum compaction 3 uintx
UseMaximumCompactionOnSystemGC In the Parallel Old garbage collector maximum compaction for a system GC true bool
ParallelOldDeadWoodLimiterMean The mean used by the par compact dead wood limiter (a number between 0-100). 50 uintx
ParallelOldDeadWoodLimiterStdDev The standard deviation used by the par compact dead wood limiter (a number between 0-100). 80 uintx
UseParallelOldGCDensePrefix Use a dense prefix with the Parallel Old garbage collector true bool
ParallelGCThreads Number of parallel threads parallel gc will use 0 uintx
ParallelCMSThreads Max number of threads CMS will use for concurrent work 0 uintx
YoungPLABSize Size of young gen promotion labs (in HeapWords) 4096 uintx
OldPLABSize Size of old gen promotion labs (in HeapWords) 1024 uintx
GCTaskTimeStampEntries Number of time stamp entries per gc worker thread 200 uintx
AlwaysTenure Always tenure objects in eden. (ParallelGC only) false bool
NeverTenure Never tenure objects in eden; may tenure on overflow (ParallelGC only) false bool
ScavengeBeforeFullGC Scavenge youngest generation before each full GC, used with UseParallelGC true bool
UseConcMarkSweepGC Use Concurrent Mark-Sweep GC in the old generation false bool
ExplicitGCInvokesConcurrent A System.gc() request invokes a concurrent collection (effective only when UseConcMarkSweepGC) false bool
UseCMSBestFit Use CMS best fit allocation strategy true bool
UseCMSCollectionPassing Use passing of collection from background to foreground true bool
UseParNewGC Use parallel threads in the new generation. false bool
ParallelGCVerbose Verbose output for parallel GC. false bool
ParallelGCBufferWastePct wasted fraction of parallel allocation buffer. 10 intx
ParallelGCRetainPLAB Retain parallel allocation buffers across scavenges. true bool
TargetPLABWastePct target wasted space in last buffer as pct of overall allocation 10 intx
PLABWeight Percentage (0-100) used to weight the current sample when computing exponentially decaying average for ResizePLAB. 75 uintx
ResizePLAB Dynamically resize (survivor space) promotion labs true bool
PrintPLAB Print (survivor space) promotion labs sizing decisions false bool
ParGCArrayScanChunk Scan a subset and push remainder, if array is bigger than this 50 intx
ParGCDesiredObjsFromOverflowList The desired number of objects to claim from the overflow list 20 intx
CMSParPromoteBlocksToClaim Number of blocks to attempt to claim when refilling CMS LAB for parallel GC. 50 uintx
AlwaysPreTouch It forces all freshly committed pages to be pre-touched. false bool
CMSUseOldDefaults A flag temporarily introduced to allow reverting to some older default settings; older as of 6.0 false bool
CMSYoungGenPerWorker The amount of young gen chosen by default per GC worker thread available 16*M intx
CMSIncrementalMode Whether CMS GC should operate in "incremental" mode false bool
CMSIncrementalDutyCycle CMS incremental mode duty cycle (a percentage, 0-100). If CMSIncrementalPacing is enabled, then this is just the initial value 10 uintx
CMSIncrementalPacing Whether the CMS incremental mode duty cycle should be automatically adjusted true bool
CMSIncrementalDutyCycleMin Lower bound on the duty cycle when CMSIncrementalPacing is enabled (a percentage, 0-100). 0 uintx
CMSIncrementalSafetyFactor Percentage (0-100) used to add conservatism when computing the duty cycle. 10 uintx
CMSIncrementalOffset Percentage (0-100) by which the CMS incremental mode duty cycle is shifted to the right within the period between young GCs 0 uintx
CMSExpAvgFactor Percentage (0-100) used to weight the current sample when computing exponential averages for CMS statistics. 25 uintx
CMS_FLSWeight Percentage (0-100) used to weight the current sample when computing exponentially decaying averages for CMS FLS statistics. 50 uintx
CMS_FLSPadding The multiple of deviation from mean to use for buffering against volatility in free list demand. 2 uintx
FLSCoalescePolicy CMS: Aggression level for coalescing, increasing from 0 to 4 2 uintx
CMS_SweepWeight Percentage (0-100) used to weight the current sample when computing exponentially decaying average for inter-sweep duration. 50 uintx
CMS_SweepPadding The multiple of deviation from mean to use for buffering against volatility in inter-sweep duration. 2 uintx
CMS_SweepTimerThresholdMillis Skip block flux-rate sampling for an epoch unless inter-sweep duration exceeds this threshold in milliseconds 10 uintx
CMSClassUnloadingEnabled Whether class unloading enabled when using CMS GC false bool
CMSCompactWhenClearAllSoftRefs Compact when asked to collect CMS gen with clear_all_soft_refs true bool
UseCMSCompactAtFullCollection Use mark sweep compact at full collections true bool
CMSFullGCsBeforeCompaction Number of CMS full collection done before compaction if > 0 0 uintx
CMSIndexedFreeListReplenish Replenish an indexed free list with this number of chunks 4 uintx
CMSLoopWarn Warn in case of excessive CMS looping false bool
CMSMarkStackSize Size of CMS marking stack 32*K uintx
CMSMarkStackSizeMax Max size of CMS marking stack 4*M uintx
CMSMaxAbortablePrecleanLoops (Temporary, subject to experimentation) Maximum number of abortable preclean iterations, if > 0 0 uintx
CMSMaxAbortablePrecleanTime (Temporary, subject to experimentation) Maximum time in abortable preclean in ms 5000 intx
CMSAbortablePrecleanMinWorkPerIteration (Temporary, subject to experimentation) Nominal minimum work per abortable preclean iteration 100 uintx
CMSAbortablePrecleanWaitMillis (Temporary, subject to experimentation) Time that we sleep between iterations when not given enough work per iteration 100 intx
CMSRescanMultiple Size (in cards) of CMS parallel rescan task 32 uintx
CMSConcMarkMultiple Size (in cards) of CMS concurrent MT marking task 32 uintx
CMSRevisitStackSize Size of CMS KlassKlass revisit stack 1*M uintx
CMSAbortSemantics Whether abort-on-overflow semantics is implemented false bool
CMSParallelRemarkEnabled Whether parallel remark enabled (only if ParNewGC) true bool
CMSParallelSurvivorRemarkEnabled Whether parallel remark of survivor space enabled (effective only if CMSParallelRemarkEnabled) true bool
CMSPLABRecordAlways Whether to always record survivor space PLAB boundaries (effective only if CMSParallelSurvivorRemarkEnabled) true bool
CMSConcurrentMTEnabled Whether multi-threaded concurrent work enabled (if ParNewGC) true bool
CMSPermGenPrecleaningEnabled Whether concurrent precleaning enabled in perm gen (effective only when CMSPrecleaningEnabled is true) true bool
CMSPermGenSweepingEnabled Whether sweeping of perm gen is enabled false bool
CMSPrecleaningEnabled Whether concurrent precleaning enabled true bool
CMSPrecleanIter Maximum number of precleaning iteration passes 3 uintx
CMSPrecleanNumerator CMSPrecleanNumerator:CMSPrecleanDenominator yields convergence ratio 2 uintx
CMSPrecleanDenominator CMSPrecleanNumerator:CMSPrecleanDenominator yields convergence ratio 3 uintx
CMSPrecleanRefLists1 Preclean ref lists during (initial) preclean phase true bool
CMSPrecleanRefLists2 Preclean ref lists during abortable preclean phase false bool
CMSPrecleanSurvivors1 Preclean survivors during (initial) preclean phase false bool
CMSPrecleanSurvivors2 Preclean survivors during abortable preclean phase true bool
CMSPrecleanThreshold Don't re-iterate if #dirty cards less than this 1000 uintx
CMSCleanOnEnter Clean-on-enter optimization for reducing number of dirty cards true bool
CMSRemarkVerifyVariant Choose variant (1,2) of verification following remark 1 uintx
CMSScheduleRemarkEdenSizeThreshold If Eden used is below this value, don't try to schedule remark 2*M uintx
CMSScheduleRemarkEdenPenetration The Eden occupancy % at which to try and schedule remark pause 50 uintx
CMSScheduleRemarkSamplingRatio Start sampling Eden top at least before yg occupancy reaches 1/<ratio> of the size at which we plan to schedule remark 5 uintx
CMSSamplingGrain The minimum distance between eden samples for CMS (see above) 16*K uintx
CMSScavengeBeforeRemark Attempt scavenge before the CMS remark step false bool
CMSWorkQueueDrainThreshold Don't drain below this size per parallel worker/thief 10 uintx
CMSWaitDuration Time in milliseconds that CMS thread waits for young GC 2000 intx
CMSYield Yield between steps of concurrent mark & sweep true bool
CMSBitMapYieldQuantum Bitmap operations should process at most this many bits between yields 10*M uintx
BlockOffsetArrayUseUnallocatedBlock Maintain _unallocated_block in BlockOffsetArray (currently applicable only to CMS collector) trueInDebug bool
RefDiscoveryPolicy Whether reference-based(0) or referent-based(1) 0 intx
ParallelRefProcEnabled Enable parallel reference processing whenever possible false bool
CMSTriggerRatio Percentage of MinHeapFreeRatio in CMS generation that is allocated before a CMS collection cycle commences 80 intx
CMSBootstrapOccupancy Percentage CMS generation occupancy at which to initiate CMS collection for bootstrapping collection stats 50 intx
CMSInitiatingOccupancyFraction Percentage CMS generation occupancy to start a CMS collection cycle (A negative value means that CMSTriggerRatio is used) -1 intx
UseCMSInitiatingOccupancyOnly Only use occupancy as a criterion for starting a CMS collection false bool
HandlePromotionFailure The youngest generation collection does not require a guarantee of full promotion of all live objects. true bool
PreserveMarkStackSize Size for stack used in promotion failure handling 40 uintx
ZeroTLAB Zero out the newly created TLAB false bool
PrintTLAB Print various TLAB related information false bool
TLABStats Print various TLAB related information true bool
AlwaysActAsServerClassMachine Always act like a server-class machine false bool
DefaultMaxRAM Maximum real memory size for setting server class heap size G uintx
DefaultMaxRAMFraction Fraction (1/n) of real memory used for server class max heap 4 uintx
DefaultInitialRAMFraction Fraction (1/n) of real memory used for server class initial heap 64 uintx
UseAutoGCSelectPolicy Use automatic collection selection policy false bool
AutoGCSelectPauseMillis Automatic GC selection pause threshold in ms 5000 uintx
UseAdaptiveSizePolicy Use adaptive generation sizing policies true bool
UsePSAdaptiveSurvivorSizePolicy Use adaptive survivor sizing policies true bool
UseAdaptiveGenerationSizePolicyAtMinorCollection Use adaptive young-old sizing policies at minor collections true bool
UseAdaptiveGenerationSizePolicyAtMajorCollection Use adaptive young-old sizing policies at major collections true bool
UseAdaptiveSizePolicyWithSystemGC Use statistics from System.GC for adaptive size policy false bool
UseAdaptiveGCBoundary Allow young-old boundary to move false bool
AdaptiveSizeThroughPutPolicy Policy for changing generation size for throughput goals 0 uintx
AdaptiveSizePausePolicy Policy for changing generation size for pause goals 0 uintx
AdaptiveSizePolicyInitializingSteps Number of steps where heuristics is used before data is used 20 uintx
AdaptiveSizePolicyOutputInterval Collection interval for printing information, zero => never 0 uintx
UseAdaptiveSizePolicyFootprintGoal Use adaptive minimum footprint as a goal true bool
AdaptiveSizePolicyWeight Weight given to exponential resizing, between 0 and 100 10 uintx
AdaptiveTimeWeight Weight given to time in adaptive policy, between 0 and 100 25 uintx
PausePadding How much buffer to keep for pause time 1 uintx
PromotedPadding How much buffer to keep for promotion failure 3 uintx
SurvivorPadding How much buffer to keep for survivor overflow 3 uintx
AdaptivePermSizeWeight Weight for perm gen exponential resizing, between 0 and 100 20 uintx
PermGenPadding How much buffer to keep for perm gen sizing 3 uintx
ThresholdTolerance Allowed collection cost difference between generations 10 uintx
AdaptiveSizePolicyCollectionCostMargin If collection costs are within margin, reduce both by full delta 50 uintx
YoungGenerationSizeIncrement Adaptive size percentage change in young generation 20 uintx
YoungGenerationSizeSupplement Supplement to YoungGenerationSizeIncrement used at startup 80 uintx
YoungGenerationSizeSupplementDecay Decay factor to YoungGenerationSizeSupplement 8 uintx
TenuredGenerationSizeIncrement Adaptive size percentage change in tenured generation 20 uintx
TenuredGenerationSizeSupplement Supplement to TenuredGenerationSizeIncrement used at startup 80 uintx
TenuredGenerationSizeSupplementDecay Decay factor to TenuredGenerationSizeIncrement 2 uintx
MaxGCPauseMillis Adaptive size policy maximum GC pause time goal in msec max_uintx uintx
MaxGCMinorPauseMillis Adaptive size policy maximum GC minor pause time goal in msec max_uintx uintx
GCTimeRatio Adaptive size policy application time to GC time ratio 99 uintx
AdaptiveSizeDecrementScaleFactor Adaptive size scale down factor for shrinking 4 uintx
UseAdaptiveSizeDecayMajorGCCost Adaptive size decays the major cost for long major intervals true bool
AdaptiveSizeMajorGCDecayTimeScale Time scale over which major costs decay 10 uintx
MinSurvivorRatio Minimum ratio of young generation/survivor space size 3 uintx
InitialSurvivorRatio Initial ratio of eden/survivor space size 8 uintx
BaseFootPrintEstimate Estimate of footprint other than Java Heap 256*M uintx
UseGCOverheadLimit Use policy to limit the proportion of time spent in GC before an OutOfMemoryError is thrown true bool
GCTimeLimit Limit of proportion of time spent in GC before an OutOfMemoryError is thrown (used with GCHeapFreeLimit) 98 uintx
GCHeapFreeLimit Minimum percentage of free space after a full GC before an OutOfMemoryError is thrown (used with GCTimeLimit) 2 uintx
PrintAdaptiveSizePolicy Print information about AdaptiveSizePolicy false bool
DisableExplicitGC Tells whether calling System.gc() does a full GC false bool
CollectGen0First Collect youngest generation before each full GC false bool
BindGCTaskThreadsToCPUs Bind GCTaskThreads to CPUs if possible false bool
UseGCTaskAffinity Use worker affinity when asking for GCTasks false bool
ProcessDistributionStride Stride through processors when distributing processes 4 uintx
CMSCoordinatorYieldSleepCount number of times the coordinator GC thread will sleep while yielding before giving up and resuming GC 10 uintx
CMSYieldSleepCount number of times a GC thread (minus the coordinator) will sleep while yielding before giving up and resuming GC 0 uintx
PrintGCTaskTimeStamps Print timestamps for individual gc worker thread tasks false bool
TraceClassLoadingPreorder Trace all classes loaded in order referenced (not loaded) false bool
TraceGen0Time Trace accumulated time for Gen 0 collection false bool
TraceGen1Time Trace accumulated time for Gen 1 collection false bool
PrintTenuringDistribution Print tenuring age information false bool
PrintHeapAtSIGBREAK Print heap layout in response to SIGBREAK true bool
TraceParallelOldGCTasks Trace multithreaded GC activity false bool
PrintParallelOldGCPhaseTimes Print the time taken by each parallel old gc phase. PrintGCDetails must also be enabled. false bool
CITime collect timing information for compilation false bool
Inline enable inlining true bool
ClipInlining clip inlining if aggregate method exceeds DesiredMethodLimit true bool
UseTypeProfile Check interpreter profile for historically monomorphic calls true bool
TypeProfileMinimumRatio Minimum ratio of profiled majority type to all minority types 9 intx
Tier1UpdateMethodData Update methodDataOops in Tier1-generated code false bool
PrintVMOptions print VM flag settings trueInDebug bool
ErrorFile If an error occurs, save the error data to this file [default: ./hs_err_pid%p.log] (%p replaced with pid) "" ccstr
DisplayVMOutputToStderr If DisplayVMOutput is true, display all VM output to stderr false bool
DisplayVMOutputToStdout If DisplayVMOutput is true, display all VM output to stdout false bool
UseHeavyMonitors use heavyweight instead of lightweight Java monitors false bool
RangeCheckElimination Split loop iterations to eliminate range checks true bool
SplitIfBlocks Clone compares and control flow through merge points to fold some branches true bool
AggressiveOpts Enable aggressive optimizations - see arguments.cpp false bool
PrintInterpreter Prints the generated interpreter code false bool
UseInterpreter Use interpreter for non-compiled methods true bool
UseNiagaraInstrs Use Niagara-efficient instruction subset false bool
UseLoopCounter Increment invocation counter on backward branch true bool
UseFastEmptyMethods Use fast method entry code for empty methods true bool
UseFastAccessorMethods Use fast method entry code for accessor methods true bool
EnableJVMPIInstructionStartEvent Enable JVMPI_EVENT_INSTRUCTION_START events - slows down interpretation false bool
JVMPICheckGCCompatibility If JVMPI is used, make sure that we are using a JVMPI-compatible garbage collector true bool
ProfileMaturityPercentage number of method invocations/branches (expressed as % of CompileThreshold) before using the method's profile 20 intx
UseCompiler use compilation true bool
UseCounterDecay adjust recompilation counters true bool
AlwaysCompileLoopMethods when using recompilation, never interpret methods containing loops false bool
DontCompileHugeMethods don't compile methods > HugeMethodLimit true bool
EstimateArgEscape Analyze bytecodes to estimate escape state of arguments true bool
BCEATraceLevel How much tracing to do of bytecode escape analysis estimates 0 intx
MaxBCEAEstimateLevel Maximum number of nested calls that are analyzed by BC EA. 5 intx
MaxBCEAEstimateSize Maximum bytecode size of a method to be analyzed by BC EA. 150 intx
SelfDestructTimer Will cause VM to terminate after a given time (in minutes) (0 means off) 0 intx
MaxJavaStackTraceDepth Max. no. of lines in the stack trace for Java exceptions (0 means all) 1024 intx
NmethodSweepFraction Number of invocations of sweeper to cover all nmethods 4 intx
MaxInlineSize maximum bytecode size of a method to be inlined 35 intx
ProfileIntervalsTicks # of ticks between printing of interval profile (+ProfileIntervals) 100 intx
EventLogLength maximum number of events in event log 2000 intx
PerMethodRecompilationCutoff After recompiling N times, stay in the interpreter (-1=>'Inf') 400 intx
PerBytecodeRecompilationCutoff Per-BCI limit on repeated recompilation (-1=>'Inf') 100 intx
PerMethodTrapLimit Limit on traps (of one kind) in a method (includes inlines) 100 intx
PerBytecodeTrapLimit Limit on traps (of one kind) at a particular BCI 4 intx
AliasLevel 0 for no aliasing, 1 for oop/field/static/array split, 2 for best 2 intx
ReadSpinIterations Number of read attempts before a yield (spin inner loop) 100 intx
PreBlockSpin Number of times to spin in an inflated lock before going to an OS lock 10 intx
MaxHeapSize Default maximum size for object heap (in bytes) ScaleForWordSize(64*M) uintx
MaxNewSize Maximum size of new generation (in bytes) max_uintx uintx
PretenureSizeThreshold Max size in bytes of objects allocated in DefNew generation 0 uintx
MinTLABSize Minimum allowed TLAB size (in bytes) 2*K uintx
TLABAllocationWeight Allocation averaging weight 35 uintx
TLABWasteTargetPercent Percentage of Eden that can be wasted 1 uintx
TLABRefillWasteFraction Max TLAB waste at a refill (internal fragmentation) 64 uintx
TLABWasteIncrement Increment allowed waste at slow allocation 4 uintx
MaxLiveObjectEvacuationRatio Max percent of eden objects that will be live at scavenge 100 uintx
OldSize Default size of tenured generation (in bytes) ScaleForWordSize(4096*K) uintx
MinHeapFreeRatio Min percentage of heap free after GC to avoid expansion 40 uintx
MaxHeapFreeRatio Max percentage of heap free after GC to avoid shrinking 70 uintx
SoftRefLRUPolicyMSPerMB Number of milliseconds per MB of free space in the heap 1000 intx
MinHeapDeltaBytes Min change in heap space due to GC (in bytes) ScaleForWordSize(128*K) uintx
MinPermHeapExpansion Min expansion of permanent heap (in bytes) ScaleForWordSize(256*K) uintx
MaxPermHeapExpansion Max expansion of permanent heap without full GC (in bytes) ScaleForWordSize(4*M) uintx
QueuedAllocationWarningCount Number of times an allocation that queues behind a GC will retry before printing a warning 0 intx
MaxTenuringThreshold Maximum value for tenuring threshold 15 intx
InitialTenuringThreshold Initial value for tenuring threshold 7 intx
TargetSurvivorRatio Desired percentage of survivor space used after scavenge 50 intx
MarkSweepDeadRatio Percentage (0-100) of the old gen allowed as dead wood. Serial mark sweep treats this as both the min and max value. CMS uses this value only if it falls back to mark sweep. Par compact uses a variable scale based on the density of the generation and treats this as the max value when the heap is either completely full or completely empty. Par compact also has a smaller default value; see arguments.cpp. 5 intx
PermMarkSweepDeadRatio Percentage (0-100) of the perm gen allowed as dead wood. See MarkSweepDeadRatio for collector-specific comments. 20 intx
MarkSweepAlwaysCompactCount How often should we fully compact the heap (ignoring the dead space parameters) 4 intx
PrintCMSStatistics Statistics for CMS 0 intx
PrintCMSInitiationStatistics Statistics for initiating a CMS collection false bool
PrintFLSStatistics Statistics for CMS' FreeListSpace 0 intx
PrintFLSCensus Census for CMS' FreeListSpace 0 intx
DeferThrSuspendLoopCount (Unstable) Number of times to iterate in safepoint loop before blocking VM threads 4000 intx
DeferPollingPageLoopCount (Unsafe,Unstable) Number of iterations in safepoint loop before changing safepoint polling page to RO -1 intx
SafepointSpinBeforeYield (Unstable) 2000 intx
UseDepthFirstScavengeOrder true: the scavenge order will be depth-first, false: the scavenge order will be breadth-first true bool
GCDrainStackTargetSize how many entries we'll try to leave on the stack during parallel GC 64 uintx
ThreadSafetyMargin Thread safety margin is used on fixed-stack LinuxThreads (on Linux/x86 only) to prevent heap-stack collision. Set to 0 to disable this feature 50*M uintx
CodeCacheMinimumFreeSpace When less than X space left, we stop compiling. 500*K uintx
CompileOnly List of methods (pkg/class.name) to restrict compilation to "" ccstr
CompileCommandFile Read compiler commands from this file [.hotspot_compiler] "" ccstr
CompileCommand Prepend to .hotspot_compiler; e.g. log,java/lang/String. "" ccstr
CICompilerCountPerCPU 1 compiler thread for log(N CPUs) false bool
UseThreadPriorities Use native thread priorities true bool
ThreadPriorityPolicy 0 : Normal. VM chooses priorities that are appropriate for normal applications. On Solaris NORM_PRIORITY and above are mapped to normal native priority. Java priorities below NORM_PRIORITY map to lower native priority values. On Windows applications are allowed to use higher native priorities. However, with ThreadPriorityPolicy=0, VM will not use the highest possible native priority, THREAD_PRIORITY_TIME_CRITICAL, as it may interfere with system threads. On Linux thread priorities are ignored because the OS does not support static priority in SCHED_OTHER scheduling class which is the only choice for non-root, non-realtime applications. 1 : Aggressive. Java thread priorities map over to the entire range of native thread priorities. Higher Java thread priorities map to higher native thread priorities. This policy should be used with care, as sometimes it can cause performance degradation in the application and/or the entire system. On Linux this policy requires root privilege. 0 intx
ThreadPriorityVerbose print priority changes false bool
DefaultThreadPriority what native priority threads run at if not specified elsewhere (-1 means no change) -1 intx
CompilerThreadPriority what priority should compiler threads run at (-1 means no change) -1 intx
VMThreadPriority what priority should VM threads run at (-1 means no change) -1 intx
CompilerThreadHintNoPreempt (Solaris only) Give compiler threads an extra quanta true bool
VMThreadHintNoPreempt (Solaris only) Give VM thread an extra quanta false bool
JavaPriority1_To_OSPriority Map Java priorities to OS priorities -1 intx
JavaPriority2_To_OSPriority Map Java priorities to OS priorities -1 intx
JavaPriority3_To_OSPriority Map Java priorities to OS priorities -1 intx
JavaPriority4_To_OSPriority Map Java priorities to OS priorities -1 intx
JavaPriority5_To_OSPriority Map Java priorities to OS priorities -1 intx
JavaPriority6_To_OSPriority Map Java priorities to OS priorities -1 intx
JavaPriority7_To_OSPriority Map Java priorities to OS priorities -1 intx
JavaPriority8_To_OSPriority Map Java priorities to OS priorities -1 intx
JavaPriority9_To_OSPriority Map Java priorities to OS priorities -1 intx
JavaPriority10_To_OSPriority Map Java priorities to OS priorities -1 intx
StarvationMonitorInterval Pause between each check in ms 200 intx
Tier1BytecodeLimit Must have at least this many bytecodes before tier1 invocation counters are used 10 intx
StressTieredRuntime Alternate client and server compiler on compile requests false bool
InterpreterProfilePercentage number of method invocations/branches (expressed as % of CompileThreshold) before profiling in the interpreter 33 intx
MaxDirectMemorySize Maximum total size of NIO direct-buffer allocations -1 intx
UseUnsupportedDeprecatedJVMPI Flag to temporarily re-enable the, soon to be removed, experimental interface JVMPI. false bool
UsePerfData Flag to disable jvmstat instrumentation for performance testing and problem isolation purposes. true bool
PerfDataSaveToFile Save PerfData memory to hsperfdata_<pid> file on exit false bool
PerfDataSamplingInterval Data sampling interval in milliseconds 50 intx
PerfDisableSharedMem Store performance data in standard memory false bool
PerfDataMemorySize Size of performance data memory region. Will be rounded up to a multiple of the native os page size. 32*K intx
PerfMaxStringConstLength Maximum PerfStringConstant string length before truncation 1024 intx
PerfAllowAtExitRegistration Allow registration of atexit() methods false bool
PerfBypassFileSystemCheck Bypass Win32 file system criteria checks (Windows Only) false bool
UnguardOnExecutionViolation Unguard page and retry on no-execute fault (Win32 only) 0=off, 1=conservative, 2=aggressive 0 intx
ManagementServer Create JMX Management Server false bool
DisableAttachMechanism Disable mechanism that allows tools to attach to this VM false bool
StartAttachListener Always start Attach Listener at VM startup false bool
UseSharedSpaces Use shared spaces in the permanent generation true bool
RequireSharedSpaces Require shared spaces in the permanent generation false bool
ForceSharedSpaces Require shared spaces in the permanent generation false bool
DumpSharedSpaces Special mode: JVM reads a class list, loads classes, builds shared spaces, and dumps the shared spaces to a file to be used in future JVM runs. false bool
PrintSharedSpaces Print usage of shared spaces false bool
SharedDummyBlockSize Size of dummy block used to shift heap addresses (in bytes) 512*M uintx
SharedReadWriteSize Size of read-write space in permanent generation (in bytes) 12*M uintx
SharedReadOnlySize Size of read-only space in permanent generation (in bytes) 8*M uintx
SharedMiscDataSize Size of the shared data area adjacent to the heap (in bytes) 4*M uintx
SharedMiscCodeSize Size of the shared code area adjacent to the heap (in bytes) 4*M uintx
TaggedStackInterpreter Insert tags in interpreter execution stack for oopmap generation false bool
ExtendedDTraceProbes Enable performance-impacting dtrace probes false bool
DTraceMethodProbes Enable dtrace probes for method-entry and method-exit false bool
DTraceAllocProbes Enable dtrace probes for object allocation false bool
DTraceMonitorProbes Enable dtrace probes for monitor events false bool
RelaxAccessControlCheck Relax the access control checks in the verifier false bool
UseVMInterruptibleIO (Unstable, Solaris-specific) Thread interrupt before or with EINTR for I/O operations results in OS_INTRPT true bool
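Several defaults above are written as products such as 32*K, 12*M or 512*M. The sketch below expands that notation into bytes, assuming K, M and G are the usual 1024-based multiples. The class name FlagSizes and the method toBytes are made up for this illustration and are not part of the JVM.

```java
/** Hypothetical helper: expands size defaults such as "32*K" into bytes. */
public class FlagSizes {

    /** Returns the multiplier for a unit letter (assumed 1024-based). */
    private static long unit(String letter) {
        switch (letter) {
            case "K": return 1024L;
            case "M": return 1024L * 1024;
            case "G": return 1024L * 1024 * 1024;
            default: throw new IllegalArgumentException("Unknown unit: " + letter);
        }
    }

    /** Accepts "N" or "N*U", where U is K, M or G, and returns the byte count. */
    public static long toBytes(String text) {
        String[] parts = text.trim().split("\\*");
        if (parts.length > 2) {
            throw new IllegalArgumentException("Unexpected format: " + text);
        }
        long value = Long.parseLong(parts[0].trim());
        if (parts.length == 1) {
            return value; // plain number, already in bytes
        }
        // multiplyExact fails loudly instead of overflowing silently
        return Math.multiplyExact(value, unit(parts[1].trim()));
    }

    public static void main(String[] args) {
        System.out.println(toBytes("32*K"));   // PerfDataMemorySize default
        System.out.println(toBytes("512*M"));  // SharedDummyBlockSize default
    }
}
```

With these assumptions 32*K comes out as 32768 bytes and 512*M as 536870912 bytes.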
manageable
HeapDumpOnOutOfMemoryError Dump heap to file when java.lang.OutOfMemoryError is thrown false bool
HeapDumpPath When HeapDumpOnOutOfMemoryError is on, the path (filename or directory) of the dump file (defaults to java_pid.hprof in the working directory) "" ccstr
PrintGC Print message at garbage collect false bool
PrintGCDetails Print more details at garbage collect false bool
PrintGCTimeStamps Print timestamps at garbage collect false bool
PrintClassHistogram Print a histogram of class instances false bool
PrintConcurrentLocks Print java.util.concurrent locks in thread dump false bool
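The manageable flags are the ones that can be changed while the VM is running. One way to do that from Java code is the HotSpotDiagnosticMXBean in com.sun.management. The following is only a minimal sketch: it assumes a HotSpot JVM on Java 7 or later, and the class name ManageableFlags and its two methods are invented for the example. Which flags are actually writeable depends on the JVM version.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical wrapper around HotSpotDiagnosticMXBean for manageable flags. */
public class ManageableFlags {

    private static HotSpotDiagnosticMXBean bean() {
        // Only available on HotSpot-based JVMs (assumption of this sketch).
        return ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
    }

    /** Returns the current value of a VM flag as a string. */
    public static String read(String name) {
        return bean().getVMOption(name).getValue();
    }

    /** Changes a flag at runtime; throws IllegalArgumentException if it is not writeable. */
    public static void write(String name, String value) {
        bean().setVMOption(name, value);
    }

    public static void main(String[] args) {
        String flag = "HeapDumpOnOutOfMemoryError";
        System.out.println(flag + " before: " + read(flag));
        write(flag, "true");
        System.out.println(flag + " after: " + read(flag));
    }
}
```

Trying the same call with a flag from another category (for example a develop flag) should fail with an exception, because only manageable flags are writeable through this interface.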
product_rw
TraceClassLoading Trace all classes loaded false bool
TraceClassUnloading Trace unloading of classes false bool
TraceLoaderConstraints Trace loader constraints false bool
PrintHeapAtGC Print heap layout before and after each GC false bool
develop
TraceItables Trace initialization and use of itables false bool
TracePcPatching Trace usage of frame::patch_pc false bool
TraceJumps Trace assembly jumps in thread ring buffer false bool
TraceRelocator Trace the bytecode relocator false bool
TraceLongCompiles Print out every time compilation is longer than a given threshold false bool
SafepointALot Generates a lot of safepoints. Works with GuaranteedSafepointInterval false bool
BailoutToInterpreterForThrows Compiled methods which throw/catch exceptions will be deoptimized and interpreted. false bool
NoYieldsInMicrolock Disable yields in microlock false bool
TraceOopMapGeneration Shows oopmap generation false bool
MethodFlushing Reclamation of zombie and not-entrant methods true bool
VerifyStack Verify stack of each thread when it is entering a runtime call false bool
TraceDerivedPointers Trace traversal of derived pointers on stack false bool
InlineArrayCopy inline arraycopy native that is known to be part of base library DLL true bool
InlineObjectHash inline Object::hashCode() native that is known to be part of base library DLL true bool
InlineNatives inline natives that are known to be part of base library DLL true bool
InlineMathNatives inline SinD, CosD, etc. true bool
InlineClassNatives inline Class.isInstance, etc true bool
InlineAtomicLong inline sun.misc.AtomicLong true bool
InlineThreadNatives inline Thread.currentThread, etc true bool
InlineReflectionGetCallerClass inline sun.reflect.Reflection.getCallerClass(), known to be part of base library DLL true bool
InlineUnsafeOps inline memory ops (native methods) from sun.misc.Unsafe true bool
ConvertCmpD2CmpF Convert cmpD to cmpF when one input is constant in float range true bool
ConvertFloat2IntClipping Convert float2int clipping idiom to integer clipping true bool
SpecialStringCompareTo special version of string compareTo true bool
SpecialStringIndexOf special version of string indexOf true bool
TraceCallFixup traces all call fixups false bool
DeoptimizeALot deoptimize at every exit from the runtime system false bool
DeoptimizeOnlyAt a comma separated list of bcis to deoptimize at "" ccstr
Debugging set when executing debug methods in debug.cpp (to prevent triggering assertions) false bool
TraceHandleAllocation Prints out warnings when suspiciously many handles are allocated false bool
ShowSafepointMsgs Show messages about safepoint synchronization false bool
SafepointTimeout Time out and warn or fail after SafepointTimeoutDelay milliseconds if failed to reach safepoint false bool
DieOnSafepointTimeout Die upon failure to reach safepoint (see SafepointTimeout) false bool
ForceFloatExceptions Force exceptions on FP stack under/overflow trueInDebug bool
SoftMatchFailure If the DFA fails to match a node, print a message and bail out trueInProduct bool
VerifyStackAtCalls Verify that the stack pointer is unchanged after calls false bool
TraceJavaAssertions Trace java language assertions false bool
ZapDeadCompiledLocals Zap dead locals in compiler frames false bool
UseMallocOnly use only malloc/free for allocation (no resource area/arena) false bool
PrintMalloc print all malloc/free calls false bool
ZapResourceArea Zap freed resource/arena space with 0xABABABAB trueInDebug bool
ZapJNIHandleArea Zap freed JNI handle space with 0xFEFEFEFE trueInDebug bool
ZapUnusedHeapArea Zap unused heap space with 0xBAADBABE trueInDebug bool
PrintVMMessages Print vm messages on console true bool
Verbose Prints additional debugging information from other modes false bool
PrintMiscellaneous Prints uncategorized debugging information (requires +Verbose) false bool
WizardMode Prints much more debugging information false bool
SegmentedHeapDumpThreshold Generate a segmented heap dump (JAVA PROFILE 1.0.2 format) when the heap usage is larger than this 2*G uintx
HeapDumpSegmentSize Approximate segment size when generating a segmented heap dump 1*G uintx
BreakAtWarning Execute breakpoint upon encountering VM warning false bool
TraceVMOperation Trace vm operations false bool
UseFakeTimers Tells whether the VM should use system time or a fake timer false bool
PrintAssembly Print assembly code false bool
PrintNMethods Print assembly code for nmethods when generated false bool
PrintNativeNMethods Print assembly code for native nmethods when generated false bool
PrintDebugInfo Print debug information for all nmethods when generated false bool
PrintRelocations Print relocation information for all nmethods when generated false bool
PrintDependencies Print dependency information for all nmethods when generated false bool
PrintExceptionHandlers Print exception handler tables for all nmethods when generated false bool
InterceptOSException Starts debugger when an implicit OS (e.g., NULL) exception happens false bool
PrintCodeCache2 Print detailed info on the compiled_code cache when exiting false bool
PrintStubCode Print generated stub code false bool
PrintJVMWarnings Prints warnings for unimplemented JVM functions false bool
InitializeJavaLangSystem Initialize java.lang.System - turn off for individual method debugging true bool
InitializeJavaLangString Initialize java.lang.String - turn off for individual method debugging true bool
InitializeJavaLangExceptionsErrors Initialize various error and exception classes - turn off for individual method debugging true bool
RegisterReferences Tells whether the VM should register soft/weak/final/phantom references true bool
IgnoreRewrites Suppress rewrites of bytecodes in the oopmap generator. This is unsafe! false bool
PrintCodeCacheExtension Print extension of code cache false bool
UsePrivilegedStack Enable the security JVM functions true bool
IEEEPrecision Enables IEEE precision (for INTEL only) true bool
ProtectionDomainVerification Verifies protection domain before resolution in system dictionary true bool
DisableStartThread Disable starting of additional Java threads (for debugging only) false bool
MemProfiling Write memory usage profiling to log file false bool
UseDetachedThreads Use detached threads that are recycled upon termination (for SOLARIS only) true bool
UsePthreads Use pthread-based instead of libthread-based synchronization (SPARC only) false bool
UpdateHotSpotCompilerFileOnError Should the system attempt to update the compiler file when an error occurs? true bool
LoadLineNumberTables Tells whether the class file parser loads line number tables true bool
LoadLocalVariableTables Tells whether the class file parser loads local variable tables true bool
LoadLocalVariableTypeTables Tells whether the class file parser loads local variable type tables true bool
PreallocatedOutOfMemoryErrorCount Number of OutOfMemoryErrors preallocated with backtrace 4 uintx
PrintBiasedLockingStatistics Print statistics of biased locking in JVM false bool
TraceJVMPI Trace JVMPI false bool
TraceJNICalls Trace JNI calls false bool
TraceJNIHandleAllocation Trace allocation/deallocation of JNI handle blocks false bool
TraceThreadEvents Trace all thread events false bool
TraceBytecodes Trace bytecode execution false bool
TraceClassInitialization Trace class initialization false bool
TraceExceptions Trace exceptions false bool
TraceICs Trace inline cache changes false bool
TraceInlineCacheClearing Trace clearing of inline caches in nmethods false bool
TraceDependencies Trace dependencies false bool
VerifyDependencies Exercise and verify the compilation dependency mechanism trueInDebug bool
TraceNewOopMapGeneration Trace OopMapGeneration false bool
TraceNewOopMapGenerationDetailed Trace OopMapGeneration: print detailed cell states false bool
TimeOopMap Time calls to GenerateOopMap::compute_map() in sum false bool
TimeOopMap2 Time calls to GenerateOopMap::compute_map() individually false bool
TraceMonitorMismatch Trace monitor matching failures during OopMapGeneration false bool
TraceOopMapRewrites Trace rewriting of method oops during oop map generation false bool
TraceSafepoint Trace safepoint operations false bool
TraceICBuffer Trace usage of IC buffer false bool
TraceCompiledIC Trace changes of compiled IC false bool
TraceStartupTime Trace setup time false bool
TraceHPI Trace Host Porting Interface (HPI) false bool
TraceProtectionDomainVerification Trace protection domain verification false bool
TraceClearedExceptions Prints when an exception is forcibly cleared false bool
UseParallelOldGCChunkPointerCalc In the Parallel Old garbage collector use chunks to calculate new object locations true bool
VerifyParallelOldWithMarkSweep Use the MarkSweep code to verify phases of Parallel Old false bool
VerifyParallelOldWithMarkSweepInterval Interval at which the MarkSweep code is used to verify phases of Parallel Old 1 uintx
ParallelOldMTUnsafeMarkBitMap Use the Parallel Old MT unsafe in marking the bitmap false bool
ParallelOldMTUnsafeUpdateLiveData Use the Parallel Old MT unsafe in update of live size false bool
TraceChunkTasksQueuing Trace the queuing of the chunk tasks false bool
ScavengeWithObjectsInToSpace Allow scavenges to occur when to_space contains objects. false bool
UseCMSAdaptiveFreeLists Use Adaptive Free Lists in the CMS generation true bool
UseAsyncConcMarkSweepGC Use Asynchronous Concurrent Mark-Sweep GC in the old generation true bool
RotateCMSCollectionTypes Rotate the CMS collections among concurrent and STW false bool
CMSTraceIncrementalMode Trace CMS incremental mode false bool
CMSTraceIncrementalPacing Trace CMS incremental mode pacing computation false bool
CMSTraceThreadState Trace the CMS thread state (enable the trace_state() method) false bool
CMSDictionaryChoice Use BinaryTreeDictionary as default in the CMS generation 0 intx
CMSOverflowEarlyRestoration Whether preserved marks should be restored early false bool
CMSTraceSweeper Trace some actions of the CMS sweeper false bool
FLSVerifyDictionary Do lots of (expensive) FLS dictionary verification false bool
VerifyBlockOffsetArray Do (expensive!) block offset array verification false bool
TraceCMSState Trace the state of the CMS collection false bool
CMSTestInFreeList Check if the coalesced range is already in the free lists as claimed. false bool
CMSIgnoreResurrection Ignore object resurrection during the verification. true bool
FullGCALot Force full gc at every Nth exit from the runtime system (N=FullGCALotInterval) false bool
PromotionFailureALotCount Number of promotion failures occurring at ParGCAllocBuffer refill attempts (ParNew) or promotion attempts (other young collectors) 1000 uintx
PromotionFailureALotInterval Total collections between promotion failures alot 5 uintx
WorkStealingSleepMillis Sleep time when sleep is used for yields 1 intx
WorkStealingYieldsBeforeSleep Number of yields before a sleep is done during workstealing 1000 uintx
TraceAdaptiveGCBoundary Trace young-old boundary moves false bool
PSAdaptiveSizePolicyResizeVirtualSpaceAlot Resize the virtual spaces of the young or old generations -1 intx
PSAdjustTenuredGenForMinorPause Adjust tenured generation to achieve a minor pause goal false bool
PSAdjustYoungGenForMajorPause Adjust young generation to achieve a major pause goal false bool
AdaptiveSizePolicyReadyThreshold Number of collections before the adaptive sizing is started 5 uintx
AdaptiveSizePolicyGCTimeLimitThreshold Number of consecutive collections before gc time limit fires 5 uintx
UsePrefetchQueue Use the prefetch queue during PS promotion true bool
ConcGCYieldTimeout If non-zero, assert that GC threads yield within this # of ms. 0 intx
TraceReferenceGC Trace handling of soft/weak/final/phantom references false bool
TraceFinalizerRegistration Trace registration of final references false bool
TraceWorkGang Trace activities of work gangs false bool
TraceBlockOffsetTable Print BlockOffsetTable maps false bool
TraceCardTableModRefBS Print CardTableModRefBS maps false bool
TraceGCTaskManager Trace actions of the GC task manager false bool
TraceGCTaskQueue Trace actions of the GC task queues false bool
TraceGCTaskThread Trace actions of the GC task threads false bool
TraceParallelOldGCMarkingPhase Trace parallel old gc marking phase false bool
TraceParallelOldGCSummaryPhase Trace parallel old gc summary phase false bool
TraceParallelOldGCCompactionPhase Trace parallel old gc compaction phase false bool
TraceParallelOldGCDensePrefix Trace parallel old gc dense prefix computation false bool
IgnoreLibthreadGPFault Suppress workaround for libthread GP fault false bool
CIPrintCompilerName when CIPrint is active, print the name of the active compiler false bool
CIPrintCompileQueue display the contents of the compile queue whenever a compilation is enqueued false bool
CIPrintRequests display every request for compilation false bool
CITimeEach display timing information after each successful compilation false bool
CICountOSR use a separate counter when assigning ids to osr compilations true bool
CICompileNatives compile native methods if supported by the compiler true bool
CIPrintMethodCodes print method bytecodes of the compiled code false bool
CIPrintTypeFlow print the results of ciTypeFlow analysis false bool
CITraceTypeFlow detailed per-bytecode tracing of ciTypeFlow analysis false bool
CICloneLoopTestLimit size limit for blocks heuristically cloned in ciTypeFlow 100 intx
UseStackBanging use stack banging for stack overflow checks (required for proper StackOverflow handling; disable only to measure cost of stackbanging) true bool
Use24BitFPMode Set 24-bit FPU mode on a per-compile basis true bool
Use24BitFP use FP instructions that produce 24-bit precise results true bool
UseStrictFP use strict fp if modifier strictfp is set true bool
GenerateSynchronizationCode generate locking/unlocking code for synchronized methods and monitors true bool
GenerateCompilerNullChecks Generate explicit null checks for loads/stores/calls true bool
GenerateRangeChecks Generate range checks for array accesses true bool
PrintSafepointStatistics print statistics about safepoint synchronization false bool
InlineAccessors inline accessor methods (get/set) true bool
UseCHA enable CHA true bool
PrintInlining prints inlining optimizations false bool
EagerInitialization Eagerly initialize classes if possible false bool
TraceMethodReplacement Print when methods are replaced due to recompilation false bool
PrintMethodFlushing print the nmethods being flushed false bool
UseRelocIndex use an index to speed random access to relocations false bool
StressCodeBuffers Exercise code buffer expansion and other rare state changes false bool
DebugVtables add debugging code to vtable dispatch false bool
PrintVtables print vtables when printing klass false bool
TraceCreateZombies trace creation of zombie nmethods false bool
MonomorphicArrayCheck Uncommon-trap array store checks that require full type check true bool
DelayCompilationDuringStartup Delay invoking the compiler until main application class is loaded true bool
CompileTheWorld Compile all methods in all classes in bootstrap class path (stress test) false bool
CompileTheWorldPreloadClasses Preload all classes used by a class before start loading true bool
TraceIterativeGVN Print progress during Iterative Global Value Numbering false bool
FillDelaySlots Fill delay slots (on SPARC only) true bool
VerifyIterativeGVN Verify Def-Use modifications during sparse Iterative Global Value Numbering false bool
TimeLivenessAnalysis Time computation of bytecode liveness analysis false bool
TraceLivenessGen Trace the generation of liveness analysis information false bool
PrintDominators Print out dominator trees for GVN false bool
UseLoopSafepoints Generate Safepoint nodes in every loop true bool
DeutschShiffmanExceptions Fast check to find exception handler for precisely typed exceptions true bool
FastAllocateSizeLimit Inline allocations larger than this in doublewords must go slow 100000 intx
UseVTune enable support for Intel's VTune profiler false bool
CountCompiledCalls counts method invocations false bool
CountJNICalls counts jni method invocations false bool
ClearInterpreterLocals Always clear local variables of interpreter activations upon entry false bool
UseFastSignatureHandlers Use fast signature handlers for native calls true bool
UseV8InstrsOnly Use SPARC-V8 Compliant instruction subset false bool
UseCASForSwap Do not use swap instructions, but only CAS (in a loop) on SPARC false bool
PoisonOSREntry Detect abnormal calls to OSR code true bool
CountBytecodes Count number of bytecodes executed false bool
PrintBytecodeHistogram Print histogram of the executed bytecodes false bool
PrintBytecodePairHistogram Print histogram of the executed bytecode pairs false bool
PrintSignatureHandlers Print code generated for native method signature handlers false bool
VerifyOops Do plausibility checks for oops false bool
CheckUnhandledOops Check for unhandled oops in VM code false bool
VerifyJNIFields Verify jfieldIDs for instance fields trueInDebug bool
VerifyFPU Verify FPU state (check for NaN's, etc.) false bool
VerifyThread Watch the thread register for corruption (SPARC only) false bool
VerifyActivationFrameSize Verify that activation frame didn't become smaller than its minimal size false bool
TraceFrequencyInlining Trace frequency based inlining false bool
PrintMethodData Print the results of +ProfileInterpreter at end of run false bool
VerifyDataPointer Verify the method data pointer during interpreter profiling trueInDebug bool
TraceCompilationPolicy Trace compilation policy false bool
TimeCompilationPolicy Time the compilation policy false bool
CounterHalfLifeTime half-life time of invocation counters (in secs) 30 intx
CounterDecayMinIntervalLength Min. ms. between invocation of CounterDecay 500 intx
TraceDeoptimization Trace deoptimization false bool
DebugDeoptimization Tracing various information while debugging deoptimization false bool
GuaranteedSafepointInterval Guarantee a safepoint (at least) every so many milliseconds (0 means none) 1000 intx
SafepointTimeoutDelay Delay in milliseconds for option SafepointTimeout 10000 intx
MallocCatchPtr Hit breakpoint when mallocing/freeing this pointer -1 intx
TotalHandleAllocationLimit Threshold for total handle allocation when +TraceHandleAllocation is used 1024 uintx
StackPrintLimit number of stack frames to print in VM-level stack dump 100 intx
MaxInlineLevel maximum number of nested calls that are inlined 9 intx
MaxRecursiveInlineLevel maximum number of nested recursive calls that are inlined 1 intx
InlineSmallCode Only inline already compiled methods if their code size is less than this 1000 intx
MaxTrivialSize maximum bytecode size of a trivial method to be inlined 6 intx
MinInliningThreshold min. invocation count a method needs to have to be inlined 250 intx
AlignEntryCode aligns entry code to specified value (in bytes) 4 intx
MethodHistogramCutoff cutoff value for method invoc. histogram (+CountCalls) 100 intx
ProfilerNumberOfInterpretedMethods # of interpreted methods to show in profile 25 intx
ProfilerNumberOfCompiledMethods # of compiled methods to show in profile 25 intx
ProfilerNumberOfStubMethods # of stub methods to show in profile 25 intx
ProfilerNumberOfRuntimeStubNodes # of runtime stub nodes to show in profile 25 intx
DontYieldALotInterval Interval between which yields will be dropped (milliseconds) 10 intx
MinSleepInterval Minimum sleep() interval (milliseconds) when ConvertSleepToYield is off (used for SOLARIS) 1 intx
ProfilerPCTickThreshold Number of ticks in a PC bucket to be a hotspot 15 intx
StressNonEntrant Mark nmethods non-entrant at registration false bool
TypeProfileWidth number of receiver types to record in call profile 2 intx
BciProfileWidth number of return bci's to record in ret profile 2 intx
FreqCountInvocations Scaling factor for branch frequencies (deprecated) 1 intx
InlineFrequencyRatio Ratio of call site execution to caller method invocation 20 intx
InlineThrowCount Force inlining of interpreted methods that throw this often 50 intx
InlineThrowMaxSize Force inlining of throwing methods smaller than this 200 intx
VerifyAliases perform extra checks on the results of alias analysis false bool
ProfilerNodeSize Size in K to allocate for the Profile Nodes of each thread 1024 intx
V8AtomicOperationUnderLockSpinCount Number of times to spin wait on a v8 atomic operation lock 50 intx
ExitAfterGCNum If non-zero, exit after this GC. 0 uintx
GCExpandToAllocateDelayMillis Delay in ms between expansion and allocation 0 uintx
CodeCacheSegmentSize Code cache segment size (in bytes) - smallest unit of allocation 64 uintx
BinarySwitchThreshold Minimal number of lookupswitch entries for rewriting to binary switch 5 intx
StopInterpreterAt Stops interpreter execution at specified bytecode number 0 intx
TraceBytecodesAt Traces bytecodes starting with specified bytecode number 0 intx
CIStart the id of the first compilation to permit 0 intx
CIStop the id of the last compilation to permit -1 intx
CIStartOSR the id of the first osr compilation to permit (CICountOSR must be on) 0 intx
CIStopOSR the id of the last osr compilation to permit (CICountOSR must be on) -1 intx
CIBreakAtOSR id of osr compilation to break at -1 intx
CIBreakAt id of compilation to break at -1 intx
CIFireOOMAt Fire OutOfMemoryErrors throughout CI for testing the compiler (non-negative value throws OOM after this many CI accesses in each compile) -1 intx
CIFireOOMAtDelay Wait for this many CI accesses to occur in all compiles before beginning to throw OutOfMemoryErrors in each compile -1 intx
NewCodeParameter Testing Only: Create a dedicated integer parameter before putback 0 intx
MinOopMapAllocation Minimum number of OopMap entries in an OopMapSet 8 intx
LongCompileThreshold Used with +TraceLongCompiles 50 intx
MaxRecompilationSearchLength max. # frames to inspect searching for recompilee 10 intx
MaxInterpretedSearchLength max. # interp. frames to skip when searching for recompilee 3 intx
DesiredMethodLimit desired max. method size (in bytecodes) after inlining 8000 intx
HugeMethodLimit don't compile methods larger than this if +DontCompileHugeMethods 8000 intx
UseNewReflection Temporary flag for transition to reflection based on dynamic bytecode generation in 1.4; can no longer be turned off in 1.4 JDK, and is unneeded in 1.3 JDK, but marks most places VM changes were needed true bool
VerifyReflectionBytecodes Force verification of 1.4 reflection bytecodes. Does not work in situations like that described in 4486457 or for constructors generated for serialization, so can not be enabled in product. false bool
FastSuperclassLimit Depth of hardwired instanceof accelerator array 8 intx
PerfTraceDataCreation Trace creation of Performance Data Entries false bool
PerfTraceMemOps Trace PerfMemory create/attach/detach calls false bool
SharedOptimizeColdStartPolicy Reordering policy for SharedOptimizeColdStart 0=favor classload-time locality, 1=balanced, 2=favor runtime locality 2 intx
product_pd
UseLargePages Use large page memory bool
UseSSE 0=fpu stack, 1=SSE for floats, 2=SSE/SSE2 for all (x86/amd only) intx
BackgroundCompilation A thread requesting compilation is not blocked during compilation bool
UseVectoredExceptions Temp Flag - Use Vectored Exceptions rather than SEH (Windows Only) bool
DontYieldALot Throw away obvious excess yield calls (for SOLARIS only) bool
ConvertSleepToYield Converts sleep(0) to thread yield (may be off for SOLARIS to improve GUI) bool
UseTLAB Use thread-local object allocation bool
ResizeTLAB Dynamically resize tlab size for threads bool
NeverActAsServerClassMachine Never act like a server-class machine bool
PrefetchCopyIntervalInBytes How far ahead to prefetch destination area (<= 0 means off) intx
PrefetchScanIntervalInBytes How far ahead to prefetch scan area (<= 0 means off) intx
PrefetchFieldsAhead How many fields ahead to prefetch in oop scan (<= 0 means off) intx
CompilationPolicyChoice which compilation policy intx
RewriteBytecodes Allow rewriting of bytecodes (bytecodes are not immutable) bool
RewriteFrequentPairs Rewrite frequently used bytecode pairs into a single bytecode bool
UseOnStackReplacement Use on stack replacement, calls runtime if invoc. counter overflows in loop bool
PreferInterpreterNativeStubs Always use interpreter stubs for native methods invoked via interpreter bool
AllocatePrefetchStyle 0=no prefetch, 1=dead load, 2=prefetch instruction intx
AllocatePrefetchDistance Distance to prefetch ahead of allocation pointer intx
FreqInlineSize maximum bytecode size of a frequent method to be inlined intx
PreInflateSpin Number of times to spin wait before inflation intx
NewSize Default size of new generation (in bytes) uintx
TLABSize Default (or starting) size of TLAB (in bytes) uintx
SurvivorRatio Ratio of eden/survivor space size intx
NewRatio Ratio of new/old generation sizes intx
NewSizeThreadIncrease Additional size added to desired new generation size per non-daemon thread (in bytes) uintx
PermSize Default size of permanent generation (in bytes) uintx
MaxPermSize Maximum size of permanent generation (in bytes) uintx
StackYellowPages Number of yellow zone (recoverable overflows) pages intx
StackRedPages Number of red zone (unrecoverable overflows) pages intx
StackShadowPages Number of shadow zone (for overflow checking) pages; this should exceed the depth of the VM and native call stack intx
ThreadStackSize Thread Stack Size (in Kbytes) intx
VMThreadStackSize Non-Java Thread Stack Size (in Kbytes) intx
CompilerThreadStackSize Compiler Thread Stack Size (in Kbytes) intx
InitialCodeCacheSize Initial code cache size (in bytes) uintx
ReservedCodeCacheSize Reserved code cache size (in bytes) - maximum code cache size uintx
CodeCacheExpansionSize Code cache expansion size (in bytes) uintx
CompileThreshold number of method invocations/branches before (re-)compiling intx
Tier2CompileThreshold threshold at which a tier 2 compilation is invoked intx
Tier2BackEdgeThreshold Back edge threshold at which a tier 2 compilation is invoked intx
TieredCompilation Enable two-tier compilation bool
OnStackReplacePercentage number of method invocations/branches (expressed as % of CompileThreshold) before (re-)compiling OSR code intx
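The product_pd flags have no default listed because the value depends on the platform, so they are often set explicitly on the command line. Boolean flags are switched with -XX:+Name or -XX:-Name, and all other types use -XX:Name=value. A small sketch of that syntax follows; the class XXOption is invented for this example, and the value 256m is only an illustration, not a recommendation.

```java
/** Hypothetical helper that formats -XX command-line options. */
public class XXOption {

    /** Boolean flags: "+" turns the flag on, "-" turns it off. */
    public static String bool(String name, boolean enabled) {
        return "-XX:" + (enabled ? "+" : "-") + name;
    }

    /** intx, uintx and ccstr flags: name=value. */
    public static String value(String name, String value) {
        return "-XX:" + name + "=" + value;
    }

    public static void main(String[] args) {
        // Prints the options one would append to the java command line.
        System.out.println(bool("UseTLAB", true));
        System.out.println(bool("UseLargePages", false));
        System.out.println(value("MaxPermSize", "256m")); // illustrative value
    }
}
```

Note that flags in the develop, develop_pd and notproduct categories are generally only settable in debug builds of the VM, so this syntax is mainly useful for the product categories.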
develop_pd
ShareVtableStubs Share vtable stubs (smaller code but worse branch prediction) bool
CICompileOSR compile on stack replacement methods if supported by the compiler bool
ImplicitNullChecks generate code for implicit null checks bool
UncommonNullCast Uncommon-trap NULLs passed to check cast bool
InlineIntrinsics Inline intrinsics that can be statically resolved bool
ProfileInterpreter Profile at the bytecode level during interpretation bool
ProfileTraps Profile deoptimization traps at the bytecode level bool
InlineFrequencyCount Count of call site execution necessary to trigger frequent inlining intx
JVMInvokeMethodSlack Stack space (bytes) required for JVM_InvokeMethod to complete uintx
CodeEntryAlignment Code entry alignment for generated code (in bytes) intx
CodeCacheMinBlockLength Minimum number of segments in a code cache block. uintx
notproduct
StressDerivedPointers Force scavenge when a derived pointer is detected on stack after runtime call false bool
TraceCodeBlobStacks Trace stack-walk of codeblobs false bool
PrintRewrites Print methods that are being rewritten false bool
DeoptimizeRandom deoptimize random frames on random exit from the runtime system false bool
ZombieALot creates zombies (non-entrant) at exit from the runtime system false bool
WalkStackALot trace stack (no print) at every exit from the runtime system false bool
StrictSafepointChecks Enable strict checks that safepoints cannot happen for threads that used No_Safepoint_Verifier trueInDebug bool
VerifyLastFrame Verify oops on last frame on entry to VM false bool
LogEvents Enable Event log trueInDebug bool
CheckAssertionStatusDirectives temporary - see javaClasses.cpp false bool
PrintMallocFree Trace calls to C heap malloc/free allocation false bool
PrintOopAddress Always print the location of the oop false bool
VerifyCodeCacheOften Verify compiled-code cache often false bool
ZapDeadLocalsOld Zap dead locals (old version, zaps all frames when entering the VM) false bool
CheckOopishValues Warn if value contains oop (requires ZapDeadLocals) false bool
ZapVMHandleArea Zap freed VM handle space with 0xBCBCBCBC trueInDebug bool
PrintCompilation2 Print additional statistics per compilation false bool
PrintAdapterHandlers Print code generated for i2c/c2i adapters false bool
PrintCodeCache Print the compiled_code cache when exiting false bool
ProfilerCheckIntervals Collect and print info on spacing of profiler ticks false bool
WarnOnStalledSpinLock Prints warnings for stalled SpinLocks 0 uintx
PrintSystemDictionaryAtExit Prints the system dictionary at exit false bool
ValidateMarkSweep Do extra validation during MarkSweep collection false bool
RecordMarkSweepCompaction Enable GC-to-GC recording and querying of compaction during MarkSweep false bool
TraceRuntimeCalls Trace run-time calls false bool
TraceJVMCalls Trace JVM calls false bool
TraceInvocationCounterOverflow Trace method invocation counter overflow false bool
TraceZapDeadLocals Trace zapping dead locals false bool
CMSMarkStackOverflowALot Whether we should simulate frequent marking stack / work queue overflow false bool
CMSMarkStackOverflowInterval A per-thread 'interval' counter that determines how frequently we simulate overflow; a smaller number increases frequency 1000 intx
CMSVerifyReturnedBytes Check that all the garbage collected was returned to the free lists. false bool
ScavengeALot Force scavenge at every Nth exit from the runtime system (N=ScavengeALotInterval) false bool
GCALotAtAllSafepoints Enforce ScavengeALot/GCALot at all potential safepoints false bool
PromotionFailureALot Use promotion failure handling on every youngest generation collection false bool
CheckMemoryInitialization Checks memory initialization false bool
TraceMarkSweep Trace mark sweep false bool
PrintReferenceGC Print times spent handling reference objects during GC (enabled only when PrintGCDetails) false bool
TraceScavenge Trace scavenge false bool
TimeCompiler time the compiler false bool
TimeCompiler2 detailed time the compiler (requires +TimeCompiler) false bool
LogMultipleMutexLocking log locking and unlocking of mutexes (only if multiple locks are held) false bool
PrintSymbolTableSizeHistogram print histogram of the symbol table false bool
ExitVMOnVerifyError standard exit from VM if bytecode verify error (only in debug mode) false bool
AbortVMOnException Call fatal if this exception is thrown. Example: java -XX:AbortVMOnException=java.lang.NullPointerException Foo "" ccstr
PrintVtableStats print vtables stats at end of run false bool
IgnoreLockingAssertions disable locking assertions (for speed) false bool
VerifyLoopOptimizations verify major loop optimizations false bool
CompileTheWorldIgnoreInitErrors Compile all methods although class initializer failed false bool
TracePhaseCCP Print progress during Conditional Constant Propagation false bool
TraceLivenessQuery Trace queries of liveness analysis information false bool
CollectIndexSetStatistics Collect information about IndexSets false bool
TraceCISCSpill Trace allocators use of cisc spillable instructions false bool
TraceSpilling Trace spilling false bool
CountVMLocks counts VM internal lock attempts and contention false bool
CountRuntimeCalls counts VM runtime calls false bool
CountJVMCalls counts jvm method invocations false bool
CountRemovableExceptions count exceptions that could be replaced by branches due to inlining false bool
ICMissHistogram produce histogram of IC misses false bool
PrintClassStatistics prints class statistics at end of run false bool
PrintMethodStatistics prints method statistics at end of run false bool
TraceOnStackReplacement Trace on stack replacement false bool
VerifyJNIEnvThread Verify JNIEnv.thread == Thread::current() when entering VM from JNI false bool
TraceTypeProfile Trace type profile false bool
MemProfilingInterval Time between each invocation of the MemProfiler 500 intx
AssertRepeat number of times to evaluate expression in assert (to estimate overhead); only works with -DUSE_REPEATED_ASSERTS 1 intx
SuppressErrorAt List of assertions (file:line) to muzzle "" ccstr
HandleAllocationLimit Threshold for HandleMark allocation when +TraceHandleAllocation is used 1024 uintx
MaxElementPrintSize maximum number of elements to print 256 intx
MaxSubklassPrintSize maximum number of subklasses to print when printing klass 4 intx
ScavengeALotInterval Interval between which scavenge will occur with +ScavengeALot 1 intx
FullGCALotInterval Interval between which full gc will occur with +FullGCALot 1 intx
FullGCALotStart For which invocation to start FullGCAlot 0 intx
FullGCALotDummies Dummy object allocated with +FullGCALot, forcing all objects to move 32*K intx
DeoptimizeALotInterval Number of exits until DeoptimizeALot kicks in 5 intx
ZombieALotInterval Number of exits until ZombieALot kicks in 5 intx
ExitOnFullCodeCache Exit the VM if we fill the code cache. false bool
CompileTheWorldStartAt First class to consider when using +CompileTheWorld 1 intx
CompileTheWorldStopAt Last class to consider when using +CompileTheWorld max_jint intx
diagnostic
UnlockDiagnosticVMOptions Enable processing of flags relating to field diagnostics trueInDebug bool
LogCompilation Log compilation activity in detail to hotspot.log or LogFile false bool
UnsyncloadClass Unstable: VM calls loadClass unsynchronized. Custom classloader must call VM synchronized for findClass & defineClass false bool
FLSVerifyAllHeapReferences Verify that all refs across the FLS boundary are to valid objects false bool
FLSVerifyLists Do lots of (expensive) FreeListSpace verification false bool
FLSVerifyIndexTable Do lots of (expensive) FLS index table verification false bool
VerifyBeforeExit Verify system before exiting trueInDebug bool
VerifyBeforeGC Verify memory system before GC false bool
VerifyAfterGC Verify memory system after GC false bool
VerifyDuringGC Verify memory system during GC (between phases) false bool
VerifyRememberedSets Verify GC remembered sets false bool
VerifyObjectStartArray Verify GC object start array if verify before/after true bool
BindCMSThreadToCPU Bind CMS Thread to CPU if possible false bool
CPUForCMSThread When BindCMSThreadToCPU is true, the CPU to bind CMS thread to 0 uintx
TraceJVMTIObjectTagging Trace JVMTI object tagging calls false bool
VerifyBeforeIteration Verify memory system before JVMTI iteration false bool
DebugNonSafepoints Generate extra debugging info for non-safepoints in nmethods trueInDebug bool
SerializeVMOutput Use a mutex to serialize output to tty and hotspot.log true bool
DisplayVMOutput Display all VM output on the tty, independently of LogVMOutput true bool
LogVMOutput Save VM output to hotspot.log, or to LogFile trueInDebug bool
LogFile If LogVMOutput is on, save VM output to this file [hotspot.log] "" ccstr
MallocVerifyInterval if non-zero, verify C heap after every N calls to malloc/realloc/free 0 intx
MallocVerifyStart if non-zero, start verifying C heap after Nth call to malloc/realloc/free 0 intx
VerifyGCStartAt GC invoke count where +VerifyBefore/AfterGC kicks in 0 uintx
VerifyGCLevel Generation level at which to start +VerifyBefore/AfterGC 0 intx
UseNewCode Testing Only: Use the new version while testing false bool
UseNewCode2 Testing Only: Use the new version while testing false bool
UseNewCode3 Testing Only: Use the new version while testing false bool
SharedOptimizeColdStart At dump time, order shared objects to achieve better cold startup time. true bool
SharedSkipVerify Skip assert() and verify() which page-in unwanted shared objects. false bool
PauseAtStartup Causes the VM to pause at startup time and wait for the pause file to be removed (default: ./vm.paused.) false bool
PauseAtStartupFile The file to create and for whose removal to await when pausing at startup. (default: ./vm.paused.) "" ccstr
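
Many of the flags in this table are not exposed by every VM build, and the diagnostic ones are only processed once -XX:+UnlockDiagnosticVMOptions is on. A minimal sketch for checking from inside a running program whether a given flag is visible, assuming a HotSpot-based JVM that provides `com.sun.management.HotSpotDiagnosticMXBean`; the class name `VmFlagLookup` and the flag names in `main` are only illustrations:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import java.util.Optional;

final class VmFlagLookup {
    /** Returns the current value of a -XX flag, or empty if this VM does not expose it. */
    static Optional<String> flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return Optional.empty(); // not a HotSpot-style VM
            }
            return Optional.of(bean.getVMOption(name).getValue());
        } catch (IllegalArgumentException e) {
            // Thrown when the flag (or the MXBean itself) is unknown to this build.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String[] names = {"UnlockDiagnosticVMOptions", "LogCompilation", "ScavengeALot"};
        for (String name : names) {
            System.out.println(name + " = "
                    + flagValue(name).orElse("(not available in this build)"));
        }
    }
}
```

Which of the three names print a value depends on the build and on the command line, so no output is shown here.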

A comparison of the main Chinese analyzers currently available for Lucene

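1. Basic introduction:

paoding: Paoding Analysis ("庖丁解牛"), a Chinese word segmenter for Lucene
imdict: the intelligent Chinese word segmenter used by the imdict smart dictionary
mmseg4j: a Chinese word segmenter that implements Chih-Hao Tsai's MMSeg algorithm
ik: uses its own "forward iterative finest-granularity segmentation algorithm", with a multi-sub-processor analysis mode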

2. Developers and development activity:

paoding: qieqie.wang; last code commit on Google Code: 2008-06-12, svn revision 132
imdict: XiaoPingGao; accepted into Lucene contrib; last commit under contrib/analyzers/smartcn/ in Lucene trunk: 2009-07-24
mmseg4j: chenlb2008; last commit on Google Code: 2009-08-03 (yesterday), revision 57, log message: "create branch mmseg4j-1.7"
ik: linliangyi2005; last commit on Google Code: 2009-07-31, revision 41

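3. User-defined dictionaries:

paoding: supports any number of user-defined dictionaries in plain text, one word per line. A background thread detects dictionary updates, automatically compiles each updated dictionary into a binary version, and loads it.
imdict: no support for user-defined dictionaries for now, although the original ICTCLAS supports them. User-defined stop words are supported.
mmseg4j: ships with the sogou dictionary and supports user-defined dictionaries named wordsxxx.dic, in UTF-8 text, one word per line. No automatic update detection. The dictionary path is given with -Dmmseg.dic.path.
ik: supports loading user dictionaries through the API and naming dictionary files in the configuration; files are UTF-8 without BOM, with entries separated by \r\n. No automatic update detection.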
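
The dictionary formats above are close enough (UTF-8 text, one word per line) that a single loader could read all of them. A minimal sketch under that assumption; `UserDictionaryLoader` is a hypothetical helper, not part of any of the four projects. It tolerates a leading BOM and both \n and \r\n line endings:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashSet;
import java.util.Set;

final class UserDictionaryLoader {
    /** Reads one word per line, skipping blank lines and a leading BOM. */
    static Set<String> load(Reader source) {
        Set<String> words = new LinkedHashSet<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            boolean first = true;
            while ((line = reader.readLine()) != null) { // handles \n and \r\n
                if (first && line.startsWith("\uFEFF")) {
                    line = line.substring(1); // drop a UTF-8 byte order mark
                }
                first = false;
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    /** Convenience overload for a UTF-8 dictionary file on disk. */
    static Set<String> loadFile(Path file) {
        try {
            return load(Files.newBufferedReader(file, StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```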

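4. Speed (based on each project's official description, not my own tests)

paoding: on a PIII PC with 1 GB of memory, accurately segments 1 million Chinese characters per second
imdict: 483.64 (bytes/s), 259517 (Chinese characters/s)
mmseg4j: about 1200kb/s in complex mode, about 1900kb/s in simple mode
ik: high-speed processing of 500,000 characters per second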

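5. Algorithms and code complexity

paoding: the svn src directory totals 1.3M, with 6 properties files and 48 java files, 6895 lines. It uses different Knife classes to cut different types of streams; not very complex.
imdict: dictionary 6.7M (this dictionary is required), src directory 152k, 20 java files, 2399 lines. It uses ICTCLAS's HHMM hidden Markov model: "trained on a large corpus to collect the word frequencies and transition probabilities of Chinese words, and then uses these statistics to compute the most likely (likelihood) segmentation of a whole Chinese sentence"
mmseg4j: the svn src directory totals 132k, with 23 java files, 2089 lines. The MMSeg algorithm; somewhat complex.
ik: the svn src directory totals 6.6M (the dictionary files are in there too), with 22 java files, 4217 lines. Multi-sub-processor analysis, similar to paoding; I have not yet figured out its ambiguity resolution algorithm.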
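
For a feel of what dictionary-based segmentation does, here is a minimal sketch of forward maximum matching, the idea behind the simple mode of the MMSeg algorithm. It is not mmseg4j's code: the class `ForwardMaxMatch`, its parameters, and the tiny dictionary are assumptions made for illustration, and the complex mode's disambiguation rules are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ForwardMaxMatch {
    /**
     * At each position, takes the longest dictionary word of at most
     * maxWordLen chars; falls back to a single char when nothing matches.
     * Works on UTF-16 chars, which is sufficient for BMP Chinese text.
     */
    static List<String> segment(String text, Set<String> dict, int maxWordLen) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(text.length(), start + maxWordLen);
            // Shrink the candidate until it is a dictionary word or one char long.
            while (end > start + 1 && !dict.contains(text.substring(start, end))) {
                end--;
            }
            tokens.add(text.substring(start, end));
            start = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Dictionary: yanjiu, yanjiusheng, shengming, qiyuan (written as escapes).
        Set<String> dict = new HashSet<>(Arrays.asList(
                "\u7814\u7a76", "\u7814\u7a76\u751f", "\u751f\u547d", "\u8d77\u6e90"));
        System.out.println(segment("\u7814\u7a76\u751f\u547d\u8d77\u6e90", dict, 3));
    }
}
```

With this dictionary, 研究生命起源 comes out as 研究生 / 命 / 起源 rather than 研究 / 生命 / 起源. Greedy matching has no way to back out of the first long match, which is the kind of ambiguity the more elaborate algorithms try to resolve.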

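6. Documentation

paoding: almost none. The code has some comments, but because the implementation is fairly complex, reading it is still somewhat difficult.
imdict: almost none. ICTCLAS has no detailed documentation either, and the HHMM hidden Markov model is heavily mathematical and not easy to understand.
mmseg4j: the MMSeg algorithm description is in English, but the principle is fairly simple, and the implementation is fairly clear.
ik: has a PDF user manual with usage examples and configuration instructions.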

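7. Other notes

paoding: introduces a metaphor and is fairly well designed. Search 1.0 used this one. Its main advantage is native support for dictionary update detection. Its main drawback is that the author no longer updates or even maintains it.
imdict: accepted into Lucene trunk. The original ictclas performs well in various evaluations and has a solid theoretical foundation; it is not a one-person knock-off. Its drawback is that it does not support user dictionaries for now.
mmseg4j: implements max-word segmentation on top of the complex mode, but it is not yet mature and still needs a lot of improvement.
ik: provides IKQueryParser, a query parser optimized for Lucene full-text search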

8. Conclusion

Personally, I think the choice is between mmseg4j and paoding. For a comparison of their segmentation results, see:

http://blog.chenlb.com/2009/04/mmseg4j-max-word-segment-compare-with-paoding-in-effect.html

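Alternatively, add a wrapper of your own: implement paoding's dictionary update detection as a standalone module, and you can then switch seamlessly between all dictionary-based segmentation algorithms.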
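
Such a standalone update-detection module might look roughly like the sketch below. It only polls a modification stamp and fires a callback. The name `DictionaryChangeDetector` and its methods are hypothetical and are not taken from paoding, and the reload itself (whatever the chosen segmenter needs) is left to the callback.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

final class DictionaryChangeDetector {
    private final LongSupplier stamp;   // e.g. the file's last-modified time
    private final Runnable onChange;    // e.g. reload the dictionary
    private long lastSeen;

    DictionaryChangeDetector(LongSupplier stamp, Runnable onChange) {
        this.stamp = stamp;
        this.onChange = onChange;
        this.lastSeen = stamp.getAsLong();
    }

    /** Watches the last-modified time of a dictionary file. */
    static DictionaryChangeDetector forFile(Path file, Runnable onChange) {
        return new DictionaryChangeDetector(() -> {
            try {
                return Files.getLastModifiedTime(file).toMillis();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, onChange);
    }

    /** Runs the callback and returns true if the stamp changed since the last check. */
    synchronized boolean checkOnce() {
        long now = stamp.getAsLong();
        if (now == lastSeen) {
            return false;
        }
        lastSeen = now;
        onChange.run();
        return true;
    }

    /** Polls on a daemon thread; an exception from a check stops further polling. */
    ScheduledExecutorService startPolling(long periodSeconds) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "dictionary-watcher");
            t.setDaemon(true);
            return t;
        });
        pool.scheduleWithFixedDelay(this::checkOnce, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return pool;
    }
}
```

Because the detector knows nothing about the segmenter, the same instance could drive a reload for any dictionary-based analyzer.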

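P.S. Using a different analyzer for each field is worth considering. A tag field, for example, should use the simplest possible analyzer: splitting on whitespace is enough.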
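
In Lucene itself, per-field analysis is what PerFieldAnalyzerWrapper is for. To keep the idea independent of any Lucene version, the sketch below shows it in plain Java: a map from field name to tokenizer function, with whitespace splitting for the tag field. The class `PerFieldTokenizers` and the field names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class PerFieldTokenizers {
    private final Map<String, Function<String, List<String>>> byField = new HashMap<>();
    private final Function<String, List<String>> fallback;

    /** The fallback is used for every field without an explicit tokenizer. */
    PerFieldTokenizers(Function<String, List<String>> fallback) {
        this.fallback = fallback;
    }

    PerFieldTokenizers register(String field, Function<String, List<String>> tokenizer) {
        byField.put(field, tokenizer);
        return this;
    }

    List<String> tokenize(String field, String text) {
        return byField.getOrDefault(field, fallback).apply(text);
    }

    /** The simplest tokenizer: split on runs of whitespace. */
    static List<String> whitespace(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    public static void main(String[] args) {
        // Stand-in fallback that keeps the whole text as one token; a real
        // setup would plug a Chinese segmenter in here.
        PerFieldTokenizers tokenizers =
                new PerFieldTokenizers(Collections::singletonList)
                        .register("tag", PerFieldTokenizers::whitespace);
        System.out.println(tokenizers.tokenize("tag", " lucene  search java "));
    }
}
```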
