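Thursday, July 31, 2008

Our pride

Software: SourceForge announces the 2008 Community Choice Awards

Posted by matrix on Friday, August 1, 2008, 08:35
From the most-likely-to-be-blocked-by-the-GFW department
SourceForge has announced the 2008 Community Choice Awards.
Best project, best enterprise project, and best education project: all OpenOffice.org
Most likely to be the next $1 billion acquisition: phpMyAdmin
Best multimedia project: VLC
Best game project: XBMC
Most likely to change the world: Linux
Best new project: Magento
Most likely to be sued for patent infringement: Wine Is Not an Emulator
Most likely to get its users sued by an outdated industry association defending a dead business model: eMule
Best system administration tool: phpMyAdmin
Best developer tool: Notepad++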

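Wednesday, July 30, 2008

Amazon is awesome

For anyone who truly loves computers, nothing is more delightful than finding so many original English-language books on Amazon :)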

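Tuesday, July 29, 2008

My iPod crashed!!

I was listening to the Red Hot Chili Peppers when, on switching to the next song, the system hung and stayed that way, with the screen lit the whole time.
Back home I upgraded the iPod software to version 1.3, and then the iPod kept rebooting until I pulled out the damned USB cable. Damn, it behaves just like my company's routers (hang, upgrade, reboot endlessly).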

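Monday, July 28, 2008

"The Sloppy King" and Zheng Yuanjie

I happened to see Zheng Yuanjie on TV, which reminded me of "Shuke and Beita", "Pipilu and Luxixi", and also "The Sloppy King". It felt like being back in the school holidays of my primary and middle school years.
I searched VeryCD and the files really are there. I should download them all in the next couple of days!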

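Saturday, July 26, 2008

Line 10

Today, on the way back from seeing my mom off at Beijing Railway Station, I had nothing else to do, so I decided to ride Line 10 and try out the new subway.
I transferred at Guomao. The passage leading to Line 10 is beautiful and new, though as usual Beijing's dumb old ladies are posted inside it, watching in case people cut through the wrong way. On the train there weren't many people. The carriages are sensibly designed, including the wheelchair spaces for disabled passengers, which are also comfortable for able-bodied people to lean against. The distance between stations is short. When the train stops, its first attempt often fails to line up with the platform screen doors. And then there is the pre-opening boast that you could make phone calls along the whole line: I had no signal for the entire ride. Damn. I don't know whether it's the carrier's fault or someone else's, but they just can't get things right the first time; something always has to be left to regret.
From now on I can take the subway to Zhongguancun.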

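Friday, July 25, 2008

A great man

Science: Randy Pausch, Carnegie Mellon's "Last Lecture" professor, has died

Posted by matrix on Saturday, July 26, 2008, 10:12
From the goodbye department
Randy Pausch, a professor at Carnegie Mellon University and author of "The Last Lecture", has finally died of cancer (no, he did not give in; the bell simply rang for the end of class). He died at home on July 25, 2008, with his wife and three children at his side, aged 47. In September 2007, Randy Pausch, who was suffering from pancreatic cancer, gave a hugely successful "Last Lecture". The video of the lecture (Google Video) has been downloaded more than 10 million times, and Mr. Zhu Xueheng (朱学恒), founder of the Fantasy Foundation, translated it into Chinese. In the lecture he told the audience never to give up on their dreams. On facing difficulties, he said: "But remember, the obstacles in your way are there for a reason! The wall is not there to stop us; the wall gives us a chance to show how badly we want to reach the goal. The wall is there to stop the people who don't want it badly enough; it exists to keep out those who don't love it enough."

Another top 10 -- I didn't expect relational databases to matter this much :P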

Top 10 Concepts That Every Software Engineer Should Know

Written by Alex Iskold / July 22, 2008 8:21 PM / 49 Comments

The future of software development is about good craftsmen. With infrastructure like Amazon Web Services and an abundance of basic libraries, it no longer takes a village to build a good piece of software.

These days, a couple of engineers who know what they are doing can deliver complete systems. In this post, we discuss the top 10 concepts software engineers should know to achieve that.

A successful software engineer knows and uses design patterns, actively refactors code, writes unit tests and religiously seeks simplicity. Beyond the basic methods, there are concepts that good software engineers know about. These transcend programming languages and projects - they are not design patterns, but rather broad areas that you need to be familiar with. The top 10 concepts are:

  1. Interfaces
  2. Conventions and Templates
  3. Layering
  4. Algorithmic Complexity
  5. Hashing
  6. Caching
  7. Concurrency
  8. Cloud Computing
  9. Security
  10. Relational Databases

10. Relational Databases

Relational Databases have recently been getting a bad name because they cannot scale well to support massive web services. Yet this was one of the most fundamental achievements in computing that has carried us for two decades and will remain for a long time. Relational databases are excellent for order management systems, corporate databases and P&L data.

At the core of the relational database is the concept of representing information in records. Each record is added to a table, which defines the type of information. The database offers a way to search the records using a query language, nowadays SQL. The database offers a way to correlate information from multiple tables.

The technique of data normalization is about correct ways of partitioning the data among tables to minimize data redundancy and maximize the speed of retrieval.
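A minimal sketch of the ideas above, using Python's built-in sqlite3 module. The schema (authors and books) and all names in it are hypothetical and chosen only for illustration: each author is stored once, and a JOIN correlates the two tables.

```python
import sqlite3

# Hypothetical schema: authors and books live in separate tables so that
# each author's name is stored exactly once (a normalized layout).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE books (
        id INTEGER PRIMARY KEY,
        title TEXT NOT NULL,
        author_id INTEGER NOT NULL REFERENCES authors(id)
    );
""")
conn.execute("INSERT INTO authors (id, name) VALUES (1, 'Author A')")
conn.executemany(
    "INSERT INTO books (title, author_id) VALUES (?, ?)",
    [("First Book", 1), ("Second Book", 1)],
)


def titles_by(author_name):
    """Correlate the two tables with a JOIN and return the matching titles."""
    rows = conn.execute(
        "SELECT b.title FROM books AS b "
        "JOIN authors AS a ON a.id = b.author_id "
        "WHERE a.name = ? ORDER BY b.title",
        (author_name,),
    )
    return [title for (title,) in rows]


print(titles_by("Author A"))  # ['First Book', 'Second Book']
```

Renaming the author now means updating one row in authors rather than every book record, which is the redundancy that normalization is meant to remove.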

9. Security

With the rise of hacking and data sensitivity, security is paramount. Security is a broad topic that includes authentication, authorization, and information transmission.

Authentication is about verifying user identity. A typical website prompts for a password. The authentication typically happens over SSL (Secure Sockets Layer), a way to encrypt the information transmitted over HTTP. Authorization is about permissions and is important in corporate systems, particularly those that define workflows. The recently developed OAuth protocol helps web services let users open up access to their private information. This is how Flickr permits access to individual photos or data sets.

Another security area is network protection. This concerns operating systems, configuration and monitoring to thwart hackers. Not only the network is vulnerable; any piece of software is. Even the Firefox browser, marketed as the most secure, has to patch its code continuously. Writing secure code for your system requires understanding its specifics and potential problems.

8. Cloud Computing

In our recent post Reaching For The Sky Through Compute Clouds we talked about how commodity cloud computing is changing the way we deliver large-scale web applications. Massively parallel, cheap cloud computing reduces both costs and time to market.

Cloud computing grew out of parallel computing, a concept that many problems can be solved faster by running the computations in parallel.

After parallel algorithms came grid computing, which ran parallel computations on idle desktops. One of the first examples was the SETI@home project out of Berkeley, which used spare CPU cycles to crunch data coming from space. Grid computing is widely adopted by financial companies, which run massive risk calculations. The concept of under-utilized resources, together with the rise of the J2EE platform, gave rise to the precursor of cloud computing: application server virtualization. The idea was to run applications on demand and change what is available depending on the time of day and user activity.

Today's most vivid example of cloud computing is Amazon Web Services, a package available via API. Amazon's offering includes a compute service (EC2), a storage service for storing and serving large media files (S3), a simple database service (SimpleDB), and a queue service (SQS). These first blocks already empower an unprecedented way of doing large-scale computing, and surely the best is yet to come.

7. Concurrency

Concurrency is one topic engineers notoriously get wrong, and understandably so: although the brain does juggle many things at a time, schools emphasize linear thinking. Yet concurrency is important in any modern system.

Concurrency is about parallelism, but inside the application. Most modern languages have a built-in concept of concurrency; in Java, it's implemented using threads.

A classic concurrency example is the producer/consumer, where the producer generates data or tasks and places them in a queue for worker threads to consume and execute. The complexity in concurrent programming stems from the fact that threads often need to operate on common data. Each thread has its own sequence of execution, but accesses common data. One of the most sophisticated concurrency libraries was developed by Doug Lea and is now part of core Java.
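The producer/consumer pattern can be sketched in a few lines. This is an illustration in Python rather than Java, assuming a thread-safe queue from the standard library; run_pipeline, the squaring task and the None sentinel are made up for the example.

```python
import queue
import threading


def run_pipeline(items, num_workers=2):
    """Square each item using worker threads fed from a shared queue."""
    tasks = queue.Queue()            # thread-safe hand-off between the two sides
    results = []
    results_lock = threading.Lock()  # guards the shared results list

    def worker():
        while True:
            item = tasks.get()
            if item is None:         # sentinel: no more work for this worker
                break
            with results_lock:
                results.append(item * item)

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for item in items:               # the producer side
        tasks.put(item)
    for _ in workers:                # one sentinel per worker
        tasks.put(None)
    for w in workers:
        w.join()
    return sorted(results)


print(run_pipeline([1, 2, 3, 4]))  # [1, 4, 9, 16]
```

The lock around results is the important part: each worker runs its own sequence of execution, but they all append to the same list, which is exactly the shared-data situation the paragraph describes.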

6. Caching

No modern web system runs without a cache, which is an in-memory store that holds a subset of information typically stored in the database. The need for cache comes from the fact that generating results based on the database is costly. For example, if you have a website that lists books that were popular last week, you'd want to compute this information once and place it into cache. User requests fetch data from the cache instead of hitting the database and regenerating the same information.

Caching comes with a cost. Only some subsets of information can be stored in memory. The most common data pruning strategy is to evict items that are least recently used (LRU). The pruning needs to be efficient, so as not to slow down the application.
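A minimal sketch of LRU eviction, assuming a small fixed capacity. LRUCache is a hypothetical class written for this example (it is not how Memcached is implemented); it relies on Python's OrderedDict so that both lookup and eviction are cheap.

```python
from collections import OrderedDict


class LRUCache:
    """Keep at most `capacity` items, evicting the least recently used one."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)         # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")      # touching "a" makes "b" the oldest entry
cache.put("c", 3)   # over capacity, so "b" is evicted
```

After these calls the cache holds "a" and "c"; "b" was pruned because it was the entry used least recently.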

A lot of modern web applications, including Facebook, rely on a distributed caching system called Memcached, developed by Brad Fitzpatrick when working on LiveJournal. The idea was to create a caching system that utilises spare memory capacity on the network. Today, there are Memcached libraries for many popular languages, including Java and PHP.

5. Hashing

The idea behind hashing is fast access to data. If the data is stored sequentially, the time to find an item is proportional to the size of the list. With hashing, a hash function calculates a number for each element, and that number is used as an index into a table. Given a good hash function that spreads data uniformly across the table, the look-up time is constant. Perfect hashing is difficult, and to deal with that, hash table implementations support collision resolution.

Beyond the basic storage of data, hashes are also important in distributed systems. The so-called uniform hash is used to evenly allocate data among computers in a cloud database. A flavor of this technique is part of Google's indexing service; each URL is hashed to a particular computer. Memcached similarly uses a hash function.

Hash functions can be complex and sophisticated, but modern libraries have good defaults. The important thing is to understand how hashes work and how to tune them for maximum performance benefit.
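As a rough illustration of hashing keys to machines, here is a sketch in Python. The server names and the server_for helper are hypothetical, and this is not the actual scheme used by Google or Memcached; it only shows the basic "hash, then take the remainder" idea.

```python
import hashlib


def server_for(key, servers):
    """Pick a server for a key by hashing the key to a number.

    A stable digest is used instead of Python's built-in hash(), which is
    randomized per process for strings.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


servers = ["cache-a", "cache-b", "cache-c"]  # hypothetical machine names
print(server_for("http://del.icio.us/tag/software", servers))
```

The same key always lands on the same server, so any client can find the data without a central directory. One known limitation of the plain remainder approach is that changing the number of servers remaps most keys; consistent hashing is the usual refinement for that.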

4. Algorithmic Complexity

There are just a handful of things engineers must know about algorithmic complexity. First is big O notation. If something takes O(n), it's linear in the size of the data. O(n^2) is quadratic. Using this notation, you should know that a search through a list is O(n) and a binary search (through a sorted list) is O(log n). And sorting n items takes O(n log n) time.
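To make the difference between O(n) and O(log n) concrete, the sketch below counts how many elements each search examines. The function names are made up for this example, and the step counts are a simplification of real running time.

```python
def linear_search_steps(items, target):
    """Return how many elements are examined before target is found."""
    steps = 0
    for value in items:
        steps += 1
        if value == target:
            break
    return steps


def binary_search_steps(sorted_items, target):
    """Return how many midpoints are examined in a binary search."""
    steps = 0
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            lo = mid + 1   # target can only be in the upper half
        else:
            hi = mid - 1   # target can only be in the lower half
    return steps


data = list(range(1_000_000))
print(linear_search_steps(data, 999_999))  # 1000000
print(binary_search_steps(data, 999_999))  # at most 20
```

For a million sorted items, the linear scan looks at every element in the worst case, while the binary search needs no more than 20 probes, because each probe halves the remaining range.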

Your code should (almost) never have multiple nested loops (a loop inside a loop inside a loop). Most of the code written today should use Hashtables, simple lists and singly nested loops.

Due to the abundance of excellent libraries, we are not as focused on efficiency these days. That's fine, as tuning can happen later on, after you get the design right.

Elegant algorithms and performance are something you shouldn't ignore. Writing compact and readable code helps ensure your algorithms are clean and simple.

3. Layering

Layering is probably the simplest way to discuss software architecture. It first got serious attention when John Lakos published his book about Large-scale C++ systems. Lakos argued that software consists of layers. The book introduced the concept of layering. The method is this. For each software component, count the number of other components it relies on. That is the metric of how complex the component is.

Lakos contended that good software follows the shape of a pyramid; i.e., there's a progressive increase in the cumulative complexity of each component, but not in the immediate complexity. Put differently, a good software system consists of small, reusable building blocks, each carrying its own responsibility. In a good system, no cyclic dependencies between components are present and the whole system is a stack of layers of functionality, forming a pyramid.
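The dependency-counting idea can be sketched as follows. The component names and the deps map are hypothetical, and the functions are a simplified illustration rather than Lakos's exact definitions: one computes the cumulative (transitive) dependencies of a component, the other assigns each component a layer and refuses cyclic graphs.

```python
def transitive_deps(component, deps):
    """Return every component reachable from `component` (cumulative deps)."""
    seen = set()
    stack = list(deps.get(component, ()))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(deps.get(dep, ()))
    return seen


def level_of(component, deps, _path=()):
    """Return the layer of a component: 1 + the highest layer it depends on.

    Raises ValueError if a cyclic dependency is found.
    """
    if component in _path:
        chain = " -> ".join(_path + (component,))
        raise ValueError("cyclic dependency: " + chain)
    below = deps.get(component, ())
    if not below:
        return 0
    return 1 + max(level_of(d, deps, _path + (component,)) for d in below)


# Hypothetical system: each key lists the components it relies on directly.
deps = {
    "app": ["orders", "logging"],
    "orders": ["storage", "logging"],
    "storage": ["logging"],
    "logging": [],
}
print(len(deps["app"]))                    # 2 immediate dependencies
print(len(transitive_deps("app", deps)))   # 3 cumulative dependencies
print(level_of("app", deps))               # 3
```

Here "app" sits at the top of the pyramid: its immediate dependency count stays small even though its cumulative count covers the whole system.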

Lakos's work was a precursor to many developments in software engineering, most notably Refactoring. The idea behind refactoring is continuously sculpting the software to ensure it is structurally sound and flexible. Another major contribution was by Dr Robert Martin from Object Mentor, who wrote about dependencies and acyclic architectures.

Among tools that help engineers deal with system architecture are Structure 101 developed by Headway software, and SA4J developed by my former company, Information Laboratory, and now available from IBM.

2. Conventions and Templates

Naming conventions and basic templates are the most overlooked software patterns, yet probably the most powerful.

Naming conventions enable software automation. For example, the JavaBeans framework is based on a simple naming convention for getters and setters. And canonical URLs in del.icio.us, such as http://del.icio.us/tag/software, take the user to the page that has all items tagged software.

Many social software applications utilise naming conventions in a similar way. For example, if your user name is johnsmith then likely your avatar is johnsmith.jpg and your rss feed is johnsmith.xml.

Naming conventions are also used in testing; for example, JUnit automatically recognizes all the methods in a class that start with the prefix test.
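The prefix convention is easy to imitate. The sketch below is not JUnit; it is a toy runner in Python (whose own unittest module uses the same "test" prefix by default), with a hypothetical CalculatorChecks class, showing how a naming rule alone is enough to discover what to run.

```python
class CalculatorChecks:
    """Hypothetical test class: only the test* methods should be run."""

    def test_addition(self):
        assert 1 + 1 == 2

    def test_subtraction(self):
        assert 3 - 1 == 2

    def helper(self):
        raise RuntimeError("not a test; the runner should never call this")


def run_by_convention(suite, prefix="test"):
    """Call every method whose name starts with prefix; return the names run."""
    ran = []
    for name in sorted(dir(suite)):
        member = getattr(suite, name)
        if name.startswith(prefix) and callable(member):
            member()
            ran.append(name)
    return ran


print(run_by_convention(CalculatorChecks()))
# ['test_addition', 'test_subtraction']
```

No registration or configuration is needed: adding a new method named test_something is all it takes for the runner to pick it up.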

The templates are not C++ or Java language constructs. We're talking about template files that contain variables and then allow binding of objects, resolution, and rendering the result for the client.
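A small sketch of this kind of template, using Python's standard string.Template. The email wording, the variable names and render_support_email are invented for the example.

```python
from string import Template

# The template text contains variables ($name, $ticket_id) that are bound
# to concrete values only at render time.
SUPPORT_EMAIL = Template(
    "Hello $name,\n"
    "Your ticket #$ticket_id has been received.\n"
)


def render_support_email(name, ticket_id):
    """Bind the values to the template variables and return the result."""
    return SUPPORT_EMAIL.substitute(name=name, ticket_id=ticket_id)


print(render_support_email("johnsmith", 42))
```

substitute raises KeyError if a variable is left unbound, which catches a forgotten field before the message goes out.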

Cold Fusion was one of the first to popularize templates for web applications. Java followed with JSPs, and recently Apache developed a handy general-purpose templating engine for Java called Velocity. PHP can be used as its own templating engine because it supports the eval function (be careful with security). For XML programming it is standard to use the XSL language to do templates.

From generation of HTML pages to sending standardized support emails, templates are an essential helper in any modern software system.

1. Interfaces

The most important concept in software is the interface. Any good software is a model of a real (or imaginary) system. Understanding how to model the problem in terms of correct and simple interfaces is crucial. Lots of systems suffer from the extremes: clumped, lengthy code with little abstraction, or an overly designed system with unnecessary complexity and unused code.

Among the many books, Agile Programming by Dr Robert Martin stands out because of its focus on modeling correct interfaces.

In modeling, there are ways you can iterate towards the right solution. Firstly, never add methods that might be useful in the future. Be minimalist, get away with as little as possible. Secondly, don't be afraid to recognize today that what you did yesterday wasn't right. Be willing to change things. Thirdly, be patient and enjoy the process. Ultimately you will arrive at a system that feels right. Until then, keep iterating and don't settle.
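As one possible reading of the "be minimalist" advice, here is a sketch of a deliberately small interface in Python. The Cache interface, DictCache and popular_books are hypothetical names for this example; the point is only that callers depend on two methods and nothing else.

```python
from abc import ABC, abstractmethod


class Cache(ABC):
    """A deliberately small interface: only what callers need today."""

    @abstractmethod
    def get(self, key):
        """Return the stored value, or None if the key is absent."""

    @abstractmethod
    def put(self, key, value):
        """Store value under key."""


class DictCache(Cache):
    """Simplest possible implementation, backed by a dict."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def put(self, key, value):
        self._data[key] = value


def popular_books(cache, compute):
    """Serve the list from the cache, computing it only on a miss."""
    cached = cache.get("popular_books")
    if cached is None:
        cached = compute()
        cache.put("popular_books", cached)
    return cached
```

Because popular_books only knows about get and put, the dict-backed cache can later be swapped for another implementation without touching the caller.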

Conclusion

Modern software engineering is sophisticated and powerful, with decades of experience, millions of lines of supporting code and unprecedented access to cloud computing. Today, just a couple of smart people can create software that previously required the efforts of dozens of people. But a good craftsman still needs to know what tools to use, when and why.

In this post we discussed concepts that are indispensable for software engineers. And now tell us please what you would add to this list. Sh

Thursday, July 24, 2008

Drank too much

Another colleague has quit, someone I thought was unlikely ever to leave. Sigh. Cheers!!

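Wednesday, July 23, 2008

After the upgrade, Pidgin can show animated text

It doesn't feel as good to use as before, and it has bugs: for example, there is a delay when sending messages, and sometimes no candidate characters show up while I'm typing. Sigh, how did it come to this??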

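Monday, July 21, 2008

AMD

Today I saw this news item on Solidot:
"

Hardware: AMD loses $1.2 billion, CEO resigns

Posted by matrix on Sunday, July 20, 2008, 14:15
From the long-road-ahead department
AMD's second-quarter financial report shows a loss of $1.2 billion, and AMD's CEO has resigned. The vacancy will be filled by Dirk Meyer, previously the company's president and COO. Only two and a half years ago, AMD had taken the lead in processor performance and was challenging Intel's market position. But now AMD has once again fallen behind Intel's Core technology, and in the multi-core era that has only just begun it again trails in performance. The problems worsened with the release of the Barcelona processor, which led the CEO to apologize publicly last December.
"

I find this a real pity. Since the start of this year you could already tell: AMD's new-architecture CPUs kept failing to ship, or shipped with serious bugs. AMD must have been thrown into disarray by Intel, with even its internal development badly affected by outside pressure. In the end, though, AMD's own fundamentals just aren't strong enough. Whenever AMD manages to turn a steady, sustained profit, that will mean it can truly go head-to-head with Intel.

Looks like if I want to buy an AMD CPU with the new architecture, I'll have to wait until this time next year.

Either no updates at all, or 100 of them

Upgraded Ubuntu today; it took the whole morning to get through just 61 updates. Damn, the network is too slow. I'll go on tomorrow.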

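Sunday, July 20, 2008

Kung Fu Panda

Today I finally went to see this movie with Puppy. Lately the Beijing subway has been incredibly crowded, and there are more foreigners around too. We got to Xidan at 3 p.m. and exchanged our tickets for the 4 o'clock showing. Puppy was still wavering and wanted to see Red Cliff, but I insisted on the panda.

A really good, really enjoyable movie: typical American thinking plus Chinese characters. Very inspiring, very positive, in line with Americans' usual values and with their understanding of China. It is 10,000 times better than China's dumb blockbusters. Puppy loved the movie too.

I was deeply struck by the American animation design and computer technology. When will Chinese studios be able to produce animation effects like this?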

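Saturday, July 19, 2008

Memories of Tianjin

1 My crappy Sony laptop + Ubuntu 6.06 LTS runs lightning fast. It really is a pleasure to use!

2 Tianjin's streets, like Shanghai's, are named after places all over the country.

3 The service attitude in Tianjin's customer-facing businesses is awful. It's really not that I'm biased against Tianjin; the people there just come across as very unfriendly. From hotels to taxi drivers, they all give you the cold shoulder and talk like idiots, and the local accent is really annoying.

4 And then there are Tianjin's lousy steamed buns, Goubuli. I curse them to go out of business soon. What junk. I really don't know how they managed to run their ancestors' legacy into this state. It's the first time I've seen a century-old shop so eager for quick profit. The fast-food bun shop serves everything in disposable bowls and plates. My colleague asked them for a small dish for vinegar; not only did they refuse, they asked us what we wanted it for. My colleague was furious and got into a big argument with them.

5 Hotels: the night we arrived in Tianjin we changed hotels twice and only settled in at the third. It was at that third one that I finally got online with my invincible Ubuntu 6.06 LTS, and it was really fast.

6 Taxi drivers: they refuse fares very often, and they don't like printing a receipt for you either.

7 Shibajie Mahua (18th Street fried dough twists): every place that sells mahua calls itself the flagship store. None of the shops is larger than 10 m², they usually have two staff, and besides mahua they also sell cigarettes, liquor, and soft drinks. Very funny.

8 The railway station: this temporary station has been in use for more than a year, and it's the dirtiest, most run-down station I've seen in my life.

That's about it. I really never want to go to that lousy place again :(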

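Back from Tianjin

The damned recruiting trip is finally over, and we came away with nothing, which is a real blow to one's confidence. After all that contacting, only three people showed up, and two of them were rejected outright by the guys from the cadre department; I never even met them.
There was also a weird guy, born in '85 and already married, chubby, and unpleasant just to look at. I referred him to another group.
And one who does microcontroller work: rejected.
There really were some good candidates, and everyone was fighting over them.
I really didn't want to wait until 5 p.m. to head back, so after finishing lunch at 2 I went to the Tianjin station.
That lousy temporary station has been in use for a year already. I really don't know what the Tianjin city government is thinking; it looks like a refugee camp.
I missed the 2:15 train and bought a ticket for the 3:45, which then ran 42 minutes late. Damned frustrating.
It was 7 by the time I got home. Exhausting, but Doggie had dinner ready for me.
I never expected this business trip to be such a mess.
No matter. My style has always been to keep fighting after every defeat :)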

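Thursday, July 17, 2008

Tianjin sucks

I broke my Ubuntu again, so now all I can do is watch TV in the hotel. Sob, sob.
I couldn't stand it, so I went back to Beijing on the evening of the 17th, haha, and will rush back at noon on the 18th.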

Tuesday, July 15, 2008

Business trip to Tianjin

Sony notebook: P3 CPU, 256 MB SDRAM, 30 GB hard disk; Ubuntu 6.06 LTS
Getting online is blazing fast!!

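Sunday, July 13, 2008

The Korean War

Recently I've read some books about history (流血的仕途) and watched some films (Patton), and I've gradually developed an interest in re-examining the history I thought I knew.
Today I went to Wikipedia and read its article on the Korean War. People's readings of history can never fully agree, but I learned facts and perspectives there that I hadn't known before. I now need to seriously reflect on the history education I received.
It made me think of those fellow Chinese who condemn Japan for revising its textbooks: have they reflected on our own textbooks? Heh, the "Chinese People's Volunteer Army" and "Resist America, Aid Korea" now look a lot like a joke.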

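What did I do this weekend?

Yesterday was Saturday. At noon I showed my face at the office and then ran home. I slept until 4 in the afternoon and then went to Xidan with Doggie. We didn't get there until 6. We had meant to see Red Cliff, but the earliest showing left was at 9:20, so all we could do was shop for clothes. Doggie bought two pieces of clothing and I bought a pair of New Balance shoes. We didn't get home until 11:30 p.m., starving. I made egg fried rice for us, and we didn't go to bed until 2 a.m. Doggie had slept too much in the afternoon and couldn't fall asleep. I did fall asleep but had a nightmare: I dreamed I was back in Tongliao, and for some reason I was very scared.

Today, in the morning I went to top up the water in my uncle's solar water heater. In the afternoon Doggie was afraid she would fall asleep, so we went swimming over at Xisanqi. The pool there is small and crowded and the place is in terrible shape, but we managed to swim for a while. I've nearly forgotten how to swim, but I feel I'm about to get the hang of it.

Tomorrow is Monday again...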

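Saturday, July 12, 2008

My Ubuntu boot time

Today I carefully timed my Ubuntu's boot with a digital watch.
8.04 desktop (final release), AMD64 X2 4800+, 2 GB RAM:
from GRUB to the password prompt, 30 s in total.
The 8.04 server edition is even better: just 5 s.
For comparison, Windows Server 2003 takes 3 minutes.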

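Thursday, July 10, 2008

A good book

Learning the vi Editor, 6th ed.
I've been reading it at the office these past few days. I like it a lot; it's very well written.
These days I really like reading e-books: no need to hold them in your hands, and much less effort to read than paper books.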

Monday, July 7, 2008

Doggie is back

I hurried home after work, cooked for her, and poured her some juice, heh.
It's so good to have her back.

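Sunday, July 6, 2008

A record of what I did today

My mom came to Beijing again today. She had only just gone home in May, so she rested at home for barely a month or so.
I got up at 10:20 in the morning and went to Beijing Railway Station to meet her, reaching Jianguomen at 12 (today was the first time I got off at Jianguomen to walk to Beijing Railway Station; it only takes about 10 minutes, quite close really). On the way I gave Puppy a call; she's having a good time in Rizhao.
After I met my mom, she handed me the things she had brought, and I went home.
On the way I stopped at Shangdi to buy lilies for Doggie, and found that the 芥蓝 there costs only one yuan a stem; I had been ripped off by the flower seller at the light rail.
Got home at 2:30 and went on a big shopping trip to the supermarket: eggs, soy milk, biscuits, canned fish...
Back home I watched Patton. Right, now I'm about to make myself dinner.

It's scorching outside today and the sun is really strong; I feel like the skin on my neck is about to peel from sunburn.
Staying at home is a real pleasure. It would be even better if Puppy were here too, haha.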

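Friday, July 4, 2008

Puppy has gone to Rizhao

Puppy is going to a meeting in Rizhao, Shandong this weekend, so I'll be spending the weekend alone.
It rained heavily in Beijing tonight; Puppy got drenched on the way to the station and was angry with me.
I should have seen her off. I've decided to give her a little surprise on the day she comes back.

Falling for Lang Xianping

I watched 《财经郎闲评》, the show he hosted three or four years ago. Really satisfying; at last I'm hearing a different voice,
and I'm very excited :)
Watching him tear into those companies and entrepreneurs was really satisfying, and I learned a lot too.
This is what a real economist looks like.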

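Thursday, July 3, 2008

Ubuntu

After installing scim-chinese, I can finally type Chinese. I'm liking Ubuntu more and more :)
Over the past couple of days I've installed a jumble of software, but I uninstalled Wine; using it on Linux felt neither fish nor fowl.
I've also made up my mind to learn Vim properly.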