Stand still and watch the patterns, which by pure chance have been generated: Stains on the wall, or the ashes in a fireplace or the clouds in the sky, or the gravel on the beach, or other things. If you look at them carefully you might discover miraculous inventions. (Leonardo da Vinci)
 

Agile Values and Principles - The Way of Zen Agile

March 31st, 2010 | Development, Knowledge


5 Principles to improve your Code Quality

March 18th, 2009 | Development

Ever since I started writing software I have been convinced that it is not just a technical craft but also an act of high creativity - say art! And ever since I started working with other people to produce code in a team, I have wondered how different working styles, different personal and professional backgrounds, and the chaos of … uhmmm … artistic individualism can lead to high quality. Too often there is chaos in the code as well, even if individual modules seem almost perfect. I want to name 5 fundamental principles every programmer should follow to prepare the ground for high quality software, because on this ground there is enough space for everyone to use their own coding and working style as a valuable part of the whole.

1. Simplicity and Elegance!

Simplicity is not always elegant and elegance is not always simple. But in coding there are some strong links between the two. In my eyes simplicity means that the complexity of a system is adequate and comprehensible. Every part of a system should add something distinct and valuable to the whole. The parts are grouped by similarity or domain and don’t hide any nasty surprises. Achieving this is not easy. Simplicity is hard work: you have to understand what the code is for and how its parts can be abstracted and substituted. The coder has to recognize patterns within the code and across several projects to identify modules that can be moved into common libraries… and so on. Good use of interfaces supports simplicity - the coder reduces the complexity that one part of a system exposes to other parts or to other developers who use it.

And there is a state of simplicity that is somehow elegant or aesthetic. It takes time to really understand that.

2. Give it a proper Name!

When you really know the “thing”, be it a class, an object reference, a variable or a function, there shouldn’t be any trouble finding a proper name. Names help you think about the concepts and functionality you write or use while developing, and they should help you understand the code while reading it.

3. “Make it readable” equals “make it understandable”!

Your audience is not just the compiler and yourself. It’s other programmers, your team mates, possibly your client or your successor (when you decide to leave). And it’s part of your job to write code for their benefit. There are three simple words that summarize what to do: The three C’s - Consistent, Conventional, Concise!

  • Consistent
    Develop your own style and use it throughout your code and projects. If you are coding for a company, adopt the “house style” and use that instead of your own. It’s somewhat silly that there are so many fights about where to put braces, how to indent code, how to write comments or even how to order the members of a class. I say: Fight if you want to, but stick to the result once you have one.
  • Conventional
    Even if you have your own style, base it on one of the industry standards (e.g. K&R brace style, Extended brace style, Indented brace style …)
  • Concise
    Know what you are doing and why. If someone asks you why you write it that way, know your arguments - think about it! This matters whenever a new team member has to adopt your style, which is not easy if there are no good reasons for it.

4. Be defensive!

Be careful while coding! Don’t code in a hurry! Think about what you are doing: It’s engineering, and even though it’s just typing some code into a computer, it is still a kind of construction. And construction has to be safe; otherwise there is a high risk of a (system) collapse. Therefore a good software developer should always be doubtful about the correctness of a piece of code. A developer should never assume the context in which a function is called, never assume that a function will never produce an error, and never assume that nobody will try to call a function in ways it was not meant to be called. Some additional points to keep in mind when coding the defensive way:

  • use safe data structures
  • check every return value (even if it’s just a standard function you have already used thousands of times)
  • handle memory carefully
  • always initialize variables when they are declared (or use the constructor of a class to initialize fields)
  • check numeric limits
  • add preconditions and postconditions, and check for invariants within functions
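
A few of these points can be sketched in Java. This is only a minimal illustration, not a complete recipe: the class DefensiveExamples and both methods are made-up names, and the fallback value "unknown" is an arbitrary choice.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical helper class, only for illustration.
class DefensiveExamples {

    // Preconditions, numeric limits and a postcondition in one small function.
    static int totalPrice(int unitPriceInCents, int quantity) {
        // Preconditions: reject input the function was never meant to handle.
        if (unitPriceInCents < 0) {
            throw new IllegalArgumentException("unitPriceInCents must not be negative");
        }
        if (quantity < 0) {
            throw new IllegalArgumentException("quantity must not be negative");
        }
        // Numeric limits: multiplyExact throws ArithmeticException
        // instead of silently overflowing.
        int total = Math.multiplyExact(unitPriceInCents, quantity);
        // Postcondition (only checked when assertions are enabled with -ea).
        assert total >= 0 : "total must not be negative";
        return total;
    }

    // Check every return value: Map.get returns null for a missing key.
    static String cityForZip(Map<String, String> zipToCity, String zip) {
        Objects.requireNonNull(zipToCity, "zipToCity");
        Objects.requireNonNull(zip, "zip");
        String city = zipToCity.get(zip);
        if (city == null) {
            return "unknown"; // arbitrary fallback chosen for this sketch
        }
        return city;
    }
}
```

Whether a failed check should throw, return a fallback or log something depends on the project; the point is only that the decision is made explicitly.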

5. Self-documenting Code needs less documentation!

Edwin Schlossberg: “The skill of writing is to create a context in which other people can think.” When you code, always think about the others who will have to read and understand what you did later on. You only need to comment the fundamental parts (classes, functions) and can avoid extensive inline comments when you choose meaningful names, avoid redundancy, and have clear data flows, strict separation of concerns and very high cohesion.

Just think about a function call “int x = r12.s(3);” where you also could write “int innerSize = room12.getSize(Room.INNER_SIZE);” Got it?
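
To make the second variant concrete, here is one way the class behind that call could look. Only getSize and INNER_SIZE appear in the call above; OUTER_SIZE, the fields and the constructor are assumptions added for illustration.

```java
// Hypothetical Room class; only getSize and INNER_SIZE come from the call above.
class Room {
    // Named constants replace magic numbers such as the bare 3.
    static final int OUTER_SIZE = 1; // assumed for illustration
    static final int INNER_SIZE = 3; // the value hidden behind "s(3)"

    private final int outerSize;
    private final int innerSize;

    Room(int outerSize, int innerSize) {
        this.outerSize = outerSize;
        this.innerSize = innerSize;
    }

    // Returns the requested size; unknown size types are rejected loudly.
    int getSize(int sizeType) {
        switch (sizeType) {
            case OUTER_SIZE:
                return outerSize;
            case INNER_SIZE:
                return innerSize;
            default:
                throw new IllegalArgumentException("unknown size type: " + sizeType);
        }
    }
}
```

With this in place the readable call is simply "Room room12 = new Room(40, 36); int innerSize = room12.getSize(Room.INNER_SIZE);". An enum instead of int constants would arguably be even clearer, but the int version stays closest to the original call.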


2 Examples of how OpenSource could improve your overall dev-team performance

February 23rd, 2009 | Development, Open Source

Special, complex, but independent functional subsystems can be built in two ways:

  1. Develop them yourself
  2. Use OpenSource frameworks, APIs, and SDKs

Of course, this depends on the type of software you want to create, on the domain, on the company you’re working for, and on the basic conditions you’re facing. However, my experience usually is: A developer team is confronted with a project which is too large for it to accomplish. The common reasons: lack of manpower, time, and knowledge. In some cases the team will reject the project completely, in other cases the team will recommend a lightweight version which covers only a subset of the requested features, and in a few cases the team will just begin to develop a solution on its own … and flatline under those high requirements. In most cases, the easiest way to solve complex but well-known problems with limited resources is to use OpenSource frameworks or APIs. There are solutions for almost every task one could face in programming. Using these, the team can focus on the important things and put first things first. I want to show two examples of how an OpenSource product helped me and my team to develop a solution which we couldn’t have accomplished on our own with the given resources (manpower, time).

Example 1: Implementing a flexible, high performance Enterprise Database Search for an existing PHP/MySQL System.

The problem: The client had a very large MySQL database which was very slow and unstructured: a composite of about 200 tables with up to 18 million records, each table having 50 fields or more. The tables were not normalized, nor were the fields optimized with proper field types and lengths. To support search requests the client had added two hash values, which allow users to find information very fast as long as they are just looking for standard values. More sophisticated search requests, table analytics, and fuzzy searches were simply not usable: They needed about 20 to 30 minutes to finish and paralyzed the database server for other processes. This situation was not acceptable with regard to the future of the overall system. A complete redesign of the database was not possible as a first step because money and manpower were limited. We had to find a neat solution which was cheap but powerful, which could be integrated seamlessly into the PHP based system, and which could handle the complexity of the data itself.

The solution: Read queries should be separated from the master database. At the same time fuzzy searches (e.g. phonetic) should be enabled. Ideally those fuzzy read queries deliver results directly without disturbing the rest of the database. After several brainstorming sessions, weeks of thinking and planning, the rejection of other sound but unusable alternatives, and a lot of coffee we found the solution:

Apache Lucene Solr

Apache Solr. This is an enterprise search server based on Lucene. We achieved all objectives by assigning two developers for three weeks to integrate Solr into our systems. Solr provides access to a structured, highly configurable full-text index over the standard HTTP protocol. It was not our job to implement all those funky search algorithms and index structures but to design a proper index schema which matches both the complexity of our data and the flexibility of the search queries we wanted to have. We decided to use Solr as a tier right above the database which stores just the index and the id of each result record but no data. Updates are handled by triggers within the MySQL database. Whenever something is updated or deleted in one of the db tables, a trigger writes to a special update table. This table is read frequently by a batch job which transfers the db updates to the index.
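
The update flow described here (trigger writes a row, batch job drains the rows into the index) can be sketched roughly as follows. This is not our production code and it does not use the Solr API: the update table is a plain in-memory queue, the index is a map, and all names (IndexSyncSketch, PendingUpdate, runBatch) are invented for the sketch. A real job would presumably read the table via JDBC and send the changes to the search server.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical, in-memory model of the "update table -> batch job -> index" flow.
class IndexSyncSketch {

    enum Operation { UPSERT, DELETE }

    // One row of the update table that the database triggers would fill.
    static final class PendingUpdate {
        final long recordId;
        final Operation operation;
        final String searchableText; // may be null for DELETE

        PendingUpdate(long recordId, Operation operation, String searchableText) {
            this.recordId = recordId;
            this.operation = operation;
            this.searchableText = searchableText;
        }
    }

    // Stand-in for the search index: record id -> indexed text, no full records.
    private final Map<Long, String> index = new HashMap<>();

    // Drains the update table and applies every change to the index.
    // Returns the number of processed rows.
    int runBatch(Queue<PendingUpdate> updateTable) {
        int applied = 0;
        PendingUpdate next;
        while ((next = updateTable.poll()) != null) {
            if (next.operation == Operation.DELETE) {
                index.remove(next.recordId);
            } else {
                index.put(next.recordId, next.searchableText);
            }
            applied++;
        }
        return applied;
    }

    boolean isIndexed(long recordId) {
        return index.containsKey(recordId);
    }
}
```

Because the rows are applied in the order they were written, an update followed by a delete of the same record leaves the index without that record, which is what the database looks like at that point.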

In the existing system we replaced all read queries with a lightweight search API which encapsulates a two-step retrieval: 1. Search query against Solr 2. Database query with the result set (of ids). A search query which used to take around 20 minutes now needs no more than 0.1 seconds. We can create complex analyses and react flexibly to special search requests from our client which until now had been rejected “due to technical limitations”.
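
The two-step retrieval can be sketched like this. Our API was written for the PHP system; the Java below only illustrates the idea, and the interfaces IdSearch and RecordStore as well as the class TwoStepSearch are hypothetical names, not part of Solr or of our code base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Step 1 (hypothetical): the search server returns only matching record ids.
interface IdSearch {
    List<Long> findIds(String query);
}

// Step 2 (hypothetical): the database loads the full records for those ids.
interface RecordStore {
    Map<Long, String> loadByIds(List<Long> ids);
}

// Callers see a single search method; the two steps stay hidden behind it.
class TwoStepSearch {
    private final IdSearch idSearch;
    private final RecordStore recordStore;

    TwoStepSearch(IdSearch idSearch, RecordStore recordStore) {
        this.idSearch = idSearch;
        this.recordStore = recordStore;
    }

    List<String> search(String query) {
        List<Long> ids = idSearch.findIds(query);
        List<String> records = new ArrayList<>();
        if (ids.isEmpty()) {
            return records; // nothing found, skip the database round trip
        }
        Map<Long, String> rows = recordStore.loadByIds(ids);
        // Keep the order delivered by the index (usually the ranking).
        for (Long id : ids) {
            String row = rows.get(id);
            if (row != null) { // the index may briefly lag behind the database
                records.add(row);
            }
        }
        return records;
    }
}
```

The null check reflects an assumption about the setup: since the index is refreshed by a batch job, it can contain ids of records that were deleted a moment ago.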

One of the most important advantages of Solr is its independence. The system which was to use the new search was written in PHP, but there was no proper search solution for PHP. With Solr we could simply use the standard HTTPClient to send requests to the Solr server (which runs on Tomcat).
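
Since everything goes over HTTP, a search request is in the end just a URL. A minimal sketch of building one in Java: the "select" handler and the q, fl and rows parameters are standard Solr, while the base URL, the field name "lastname" and the class SolrQueryUrl are assumptions for this example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that builds the URL of a Solr search request.
class SolrQueryUrl {

    // baseUrl is an assumption, e.g. "http://localhost:8080/solr" on Tomcat.
    static String build(String baseUrl, String query, int rows) {
        if (rows < 0) {
            throw new IllegalArgumentException("rows must not be negative");
        }
        try {
            // fl=id asks only for the id field, matching the idea of
            // fetching the actual data from the database afterwards.
            return baseUrl + "/select?q=" + URLEncoder.encode(query, "UTF-8")
                    + "&fl=id&rows=" + rows;
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }
}
```

A fuzzy query in Lucene syntax such as "lastname:Meier~" then becomes http://localhost:8080/solr/select?q=lastname%3AMeier%7E&fl=id&rows=10, which any HTTP client in any language can send.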

Today we use the Solr index for several databases and in three different environments: integrated in PHP, directly in Java, and through XSLT as an HTML based web search form.

Example 2: Implementation of a text analysis system to extract and transport structured data out of unstructured text to a relational database.

The Problem: A client uses specific data to back up process-critical decisions. This data is embedded in texts and therefore not automatically processable. The manual effort to structure the data of interest is terribly expensive, but a text information retrieval system which could automate this task is simply too expensive, in terms of time and money, for a team of 10 developers to build. A simple, lightweight solution is almost impossible to imagine because of the high complexity of the data.

GATE - General Architecture for Text Engineering Applications

The solution: The University of Sheffield develops an OpenSource system which is a perfect fit for this problem: GATE. This is a framework to read and process textual data. In addition to the basic processing framework, GATE comes with a bunch of plugins covering capabilities from different domains. The most important plugins are bundled as ANNIE (which stands for “A Nearly-New Information Extraction System”). Basically GATE consists of Language Resources (LR) and Processing Resources (PR). The latter are orchestrated in pipelines and used to process language resources, e.g. documents or corpora. Processing in this context means that the contents are annotated throughout the process. Our task especially required two ANNIE processing resources: the JAPE Transducer and the Gazetteer. The Gazetteer uses lookup tables to apply annotations for named entities. For this we built a bunch of general and domain-specific tables: first names, last names, cities, zip codes, street names, legal forms, key words, top-level domains etc. The JAPE Transducer in turn uses these annotations to identify patterns of higher-level information. Patterns are described in the JAPE language, which is based on regular expressions but is applied to annotations and their features (properties).
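
The interplay of the two stages can be illustrated with a toy example in plain Java. This is deliberately not the GATE API and not JAPE syntax: the class AnnotationSketch, its methods and the type names are invented, and real GATE annotations carry offsets and features rather than one type per token. The sketch only shows the principle of first annotating tokens from lookup tables and then matching a pattern over the annotations instead of over the raw text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model of "lookup annotations first, patterns over annotations second".
class AnnotationSketch {

    // Stage 1, gazetteer-like: give every token a type from the lookup table,
    // or the generic type "Token" if the table does not know it.
    static List<String> lookupTypes(String[] tokens, Map<String, String> lookupTable) {
        List<String> types = new ArrayList<>();
        for (String token : tokens) {
            String type = lookupTable.get(token);
            types.add(type == null ? "Token" : type);
        }
        return types;
    }

    // Stage 2, transducer-like: find the pattern "FirstName LastName"
    // by looking at the annotation types, not at the words themselves.
    static List<String> findPersons(String[] tokens, List<String> types) {
        List<String> persons = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.length; i++) {
            if ("FirstName".equals(types.get(i)) && "LastName".equals(types.get(i + 1))) {
                persons.add(tokens[i] + " " + tokens[i + 1]);
                i++; // the last name is consumed by this match
            }
        }
        return persons;
    }
}
```

For the tokens of "Contract signed by Anna Meier in Berlin" and a table that knows Anna as FirstName and Meier as LastName, the second stage yields the single person "Anna Meier", no matter which concrete names are in the tables.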

The information identified by the JAPE Transducer is analyzed, structured, and normalized at the end of the process to prepare the transfer to the relational database. Our result: The system reads, processes and stores about 5000 documents in 20 minutes. By adding a computer-aided manual process for all documents with ambiguous information we reached a rate of almost 100 % and a quality that is much higher than the former manual reading of the documents.


Passion for Technology: What is it all about?

February 21st, 2009 | Communication, Development, Innovation

I think that passion is one of the vital qualities a software developer should possess. Passion for technology, passion for solutions, passion for progress. Mike Peters has made this point very precisely in his blog article “How to pick a GREAT Software Engineer”. He writes that passionate developers are characterized by reading DZone or TechCrunch, testing new software, or writing code in their spare time:

Love what you do and pass that love to everyone you deal with. Always be positive, energetic and make progress, no matter what. What do you do in your spare time? If you’re not writing code, installing a virtual machine, reading TechCrunch/Slashdot/DZone or testing out the latest version of Windows 7, you are not passionate about technology.

I completely agree with that. Of course there are things much more important than technology (family, friends, health etc.), but I think that passion in this context just means that technology is not just a job but also a hobby, a hobby which serendipitously became one’s job. And as a hobby it affects daily life, character and thinking.

Maybe I can put it this way: A triathlete, a football fan and a technologist (to use a name which expresses the passion better than the title software developer) are on beach holidays with their kids and spouses. The triathlete will go for a beach run as soon as his kids are playing in the water and his wife is relaxing in the sun. The football fan will want to know the result of his favorite team. He will buy newspapers, search the internet, call friends at home or ask other tourists until he knows how the match ended. And so it is with the technologist. He will happily read a book about an interesting field of technology at the beach (my beach reading last year in South Africa was The Big Switch by Nicholas Carr). Maybe he will even have his laptop with him to check his RSS subscriptions in the evening as soon as his family has fallen asleep; and some will start Eclipse to try a tutorial about a new framework or SDK.

Those three guys have one thing in common: While doing their stuff they feel an inner satisfaction which is entirely intrinsically motivated. Friends or partners have little understanding for it and will wonder how one can be that passionate about such an apparently unimportant thing. But this question is not really important to the triathlete, the football fan and the technologist. Especially for the technologist it’s just a neat side effect that he can make a living with his hobby.

To come back to the article I mentioned above: I think that it’s really important to be passionate about what you do when you work as a software developer or engineer. In my experience there are a lot of developers who see their occupation as nothing more than a 9 to 5 job. In a lot of cases this will be enough. A lot of projects will succeed and they will implement some beautiful systems. But at the end of the day there is no fun and little innovative potential. They will mark time without making progress for themselves or for the team they are working in.