A radical approach to application development
From the translator: In 2007, while searching for a web engine, I came across a very interesting and unusual dialect of Lisp. After reading several articles I was fascinated by its principles. Since my main job is far from web programming, I don't use it professionally, but from time to time I come back to it and tinker a little.
In all the time I have known this language, it has hardly ever been mentioned, and there is almost no information about it in Russian. Let's try to fill this gap. Although the original article dates back to 2006, the topic is still quite relevant.
Thanks to Hope Zakharova and the great site Of. for help with the translation.
1. Introduction
I work as a consultant and developer of free software. For twenty years my partners and I have worked on projects such as image processing, computer-aided design, simulation, and various financial and business applications.
We used Lisp in almost all of these projects. My daily job is to listen to customers' requests, analyze their business processes, and develop software according to their needs.
In business applications such as ERP or CRM, this is usually a process of constant change. At the beginning of a new project, neither the developer nor the customer knows exactly what is needed or what the final product should look like.
This becomes clear during an iterative process (some call it "extreme programming"). The customer evaluates each new version, and then the strategy for further development is discussed. Often, unforeseen requirements force large parts of the project to be rewritten. This does not mean that the project was poorly planned, because the process I am describing is the planning. In an ideal world, software development is only planning, and the time spent on actual coding should tend toward zero.
We need a programming language that lets us express directly what we want to do. We believe that code should be as simple as possible, so that any programmer can understand at any time what is going on in the program.
Over the years, the Pico Lisp system has evolved from a minimalist Lisp implementation into a specialized application server. Please note that we are not talking about a rapid prototyping tool. At each stage of development the result is a fully functional program, not a prototype, and it grows into the production (and possibly final) version. Rather, Pico Lisp can be called a power tool for the professional programmer who likes to keep control of the development environment and wants to express application logic and data structures concisely.
First we introduce Pico Lisp and explain why, at a low level, it differs significantly from other Lisps and development systems. Then we show its advantages at higher levels.
2. A radical approach
The (Common) Lisp community will not be enthusiastic about Pico Lisp, because it dismisses some beliefs and dogmas that have become traditional. Some of them are just myths, but they can make Lisp unnecessarily complex and slow. Practical experience with Pico Lisp shows that a small and fast Lisp is optimal for many kinds of efficient application development.
2.1. Myth 1: Lisp Needs a Compiler
In fact, this is the most important myth. If you have taken part in Lisp discussion groups, you know that the compiler plays a central role. It may seem almost synonymous with the execution environment. People care about what code the compiler generates and how efficient that code is. If you find that your Lisp program runs slowly, you conclude that you need a better compiler.
The idea of an interpreted Lisp is regarded as an old misconception. A modern Lisp is said to require a compiler, while the interpreter is just a useful add-on needed for interactive debugging. It is considered too slow and bloated to run programs that are ready for release.
We believe that the opposite is true. For one thing (and not only from a philosophical point of view), a compiled Lisp program is no longer Lisp at all. Compilation violates the fundamental rule of the "formal equivalence of code and data". The resulting code contains no S-expressions and cannot be processed by Lisp. The source language (Lisp) is translated into another language (machine code), with inevitable incompatibilities between different machines.
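The equivalence of code and data can be made concrete with a few lines of Common Lisp (chosen because the benchmarks in this article are written with defun). This is only a minimal sketch, and the variable names are illustrative. It builds a program as an ordinary list, inspects it with list functions, evaluates it, and then rewrites it as data before evaluating it again:

```lisp
;; A program is just a list: the symbol + followed by its arguments.
(defparameter *form* (list '+ 2 3))

;; It can be inspected with ordinary list functions ...
(first *form*)          ; the operator, the symbol +
(rest *form*)           ; the argument list (2 3)

;; ... evaluated as code ...
(defparameter *sum* (eval *form*))

;; ... and rewritten as data before being evaluated again.
(setf (first *form*) '*)
(defparameter *product* (eval *form*))
```

Once such a form has been compiled to machine code, this kind of inspection and rewriting is no longer available on the compiled result.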
In practice, a compiler complicates the system as a whole. Features such as multiple binding strategies, typed variables and macros were introduced to meet the needs of compilers. The system becomes bloated, because it also has to support the interpreter and therefore two different architectures.
But is it worth the effort? Of course, execution speed is higher, and building a compiler is interesting for educational purposes. But we claim that in daily practice a well-designed "interpreter" can often outperform a compiled system.
As you will see, we are not really talking about "interpretation". A Lisp system immediately converts its input into internal pointer structures called "S-expressions". True "interpretation" works on one-dimensional character codes, which slows execution considerably. Lisp, by contrast, "evaluates" S-expressions quickly by following these pointer structures. No searching is required, so nothing is really "interpreted". But we will stick to the familiar term.
A Lisp program, as an S-expression, forms a tree of executable nodes. The code in these nodes is usually written in optimized C or assembly, so the interpreter's task is simply to pass control from one node to the next. Since many of these built-in Lisp functions are very powerful and do a lot of processing, most of the execution time is spent inside the nodes. The tree itself acts as a kind of glue.
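The "tree of executable nodes" can be sketched with a deliberately tiny evaluator. This is not how Pico Lisp is implemented internally; mini-eval is a hypothetical name, and the sketch handles only numbers and calls to ordinary built-in functions. It shows the division of labor: the evaluator merely walks the tree and passes control along, while the built-in functions do the work.

```lisp
;; Hypothetical mini-evaluator: numbers evaluate to themselves, and a list
;; is a node whose first element names an ordinary built-in function.
;; Special operators and macros (if, let, ...) are not supported here.
(defun mini-eval (form)
  (cond ((numberp form) form)
        ((consp form)
         ;; Evaluate the child nodes, then hand control to the built-in
         ;; function, where the real work happens.
         (apply (symbol-function (first form))
                (mapcar #'mini-eval (rest form))))
        (t (error "Cannot evaluate ~S" form))))

(mini-eval '(+ 1 (* 2 3)))        ; evaluates to 7
(mini-eval '(max 4 (- 10 3) 5))   ; evaluates to 7
```

The evaluator itself is only a few lines of glue. The more work each built-in node does, the smaller the share of time spent in that glue.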
A Lisp compiler removes some of that glue and replaces some nodes that have primitive or flow-control functionality with direct machine code. But since most of the execution time is spent in built-in functions anyway, the improvement is not as dramatic as, for example, with a Java bytecode compiler, where each node (bytecode) has relatively primitive functionality.
Of course, compilation itself also takes time. An application server often executes Lisp source files on the fly in a single pass and discards the code as soon as it has been executed. In such cases, either the inherently slower interpreter of a compiler-based Lisp system or the extra time spent by the compiler will noticeably reduce overall performance.
The internal structures of Pico Lisp were designed from the start for ease of interpretation. Although it was written entirely in C and was not specially optimized for execution speed, insufficient performance has never been a problem. The first commercial system written in Pico Lisp was an image processing and retouching system that also created page layouts for printing. It was built in 1988 and ran on a Mac II with a 12 MHz CPU and 8 MB of RAM.
Of course, there was no Lisp compiler; only the low-level pixel manipulation and Bezier functions were written in C. Even then, on a machine hundreds of times slower than today's, nobody complained about performance.
Just out of interest, I installed CLisp and compared it with Pico Lisp using simple benchmarks. Of course, the results say nothing about the usefulness of either system as an application server, but they give a rough idea of their performance. First, I tried a simple recursive Fibonacci function:
(defun fibo (N)
   (if (< N 2)
      1
      (+ (fibo (- N 1)) (fibo (- N 2))) ) )
Calling this function with the argument 30, (fibo 30), gave the following results (measured on a Pentium I 266 MHz laptop):
Pico (interpreted)  | 12 seconds
CLisp (interpreted) | 37 seconds
CLisp (compiled)    |  7 seconds
The CLisp interpreter is about three times slower than Pico Lisp, and the CLisp compiler is almost twice as fast.
However, the Fibonacci function is not a good example of a typical Lisp program. It consists only of primitive flow-control and arithmetic functions, which a compiler can optimize easily and which could be written directly in C if they were time-critical (in that case the run would take only 0.2 seconds).
So I took the other extreme, a function that performs extensive list processing:
(defun tst ()
   (mapcar
      (lambda (X) (cons (car X) (reverse (delete (car X) (cdr X)))))
      '((a b c a b c) (b c d b c d) (c d e c d e) (d e f d e f)) ) )
Calling this function 1 million times gave:
Pico (interpreted)  |  31 seconds
CLisp (interpreted) | 196 seconds
CLisp (compiled)    |  80 seconds
Now the CLisp interpreter is more than 6 times slower, and to my surprise even the compiled code is 2.58 times slower than Pico Lisp.
Maybe CLisp has a slow compiler? And perhaps the code could be accelerated with some tricks. But these results still raise a lot of doubt about whether the overhead of compilation is justified. Fiddling with compiler optimizations is the last thing I want to do when I am concerned with application logic, and when the user would not notice the difference anyway.
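Measurements of this kind can be taken with a small harness like the following sketch. The helper name seconds-for-calls is hypothetical and is not the tool used for the figures above. It uses only standard Common Lisp timing facilities, and the workload shown is a stand-in, not the benchmark function itself.

```lisp
;; Hypothetical helper: call FN COUNT times and return the elapsed
;; wall-clock time in seconds as a floating-point number.
(defun seconds-for-calls (fn count)
  (let ((start (get-internal-real-time)))
    (dotimes (i count)
      (declare (ignorable i))
      (funcall fn))
    (/ (- (get-internal-real-time) start)
       (float internal-time-units-per-second))))

;; Example with a stand-in workload.
(seconds-for-calls (lambda () (reverse (list 1 2 3))) 100000)
```

The ratios quoted in the text follow directly from the measured times: 196 / 31 is roughly 6.3, and 80 / 31 is roughly 2.58.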
2.2. Myth 2: Lisp Needs Plenty of Data Types
The Fibonacci function from the example above could be sped up by declaring the variable N as an integer. But this again shows how strongly Lisp has been shaped by the requirements of compilers. A compiler can produce more efficient code when data types are fixed statically. Common Lisp supports many data types, including various integer, fixed-point and floating-point types, rational numbers, characters, strings, structures, hash tables and vector types in addition to lists.
Pico Lisp, by contrast, supports only three built-in data types: numbers, symbols and lists, and it gets along well with just these. A Lisp system runs faster with fewer data types, because fewer cases have to be checked at run time. This may lead to less efficient memory usage in places, but fewer types also save space, because fewer tag bits are needed.
The main reason for using only three data types is simplicity, and the advantage of simplicity outweighs any trade-off in speed and memory footprint.
In fact, at the lowest level Pico Lisp uses only a single data type, the cell, from which numbers, symbols and lists are built. A small number or a minimal symbol occupies just one cell and grows dynamically when needed. This memory model allows efficient garbage collection and completely avoids fragmentation (which would occur, for example, with vectors).
At the higher levels it is always possible to emulate other data types using these three primitive types. Thus we emulate trees with lists, and strings, classes and objects with symbols. As long as there are no performance problems, why complicate things?
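As an illustration of emulating a richer data type with lists alone, here is a minimal sketch of a binary search tree in Common Lisp. The representation (a three-element list per node) and the function names are assumptions made for this example, not taken from Pico Lisp's libraries.

```lisp
;; A tree node is a three-element list: (key left right).
;; NIL stands for the empty tree. All names here are illustrative.
(defun tree-insert (tree key)
  (cond ((null tree) (list key nil nil))
        ((< key (first tree))
         (list (first tree) (tree-insert (second tree) key) (third tree)))
        ((> key (first tree))
         (list (first tree) (second tree) (tree-insert (third tree) key)))
        (t tree)))                       ; key already present

(defun tree-member-p (tree key)
  (cond ((null tree) nil)
        ((< key (first tree)) (tree-member-p (second tree) key))
        ((> key (first tree)) (tree-member-p (third tree) key))
        (t t)))

;; Build a tree by inserting 5, 3, 8 and 1 in that order.
(defparameter *tree* (reduce #'tree-insert '(5 3 8 1) :initial-value nil))
```

No dedicated tree type is needed: the tree is printed, copied and traversed with the same functions as any other list.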
2.3. Myth 3: Dynamic binding is bad
Pico Lisp uses a straightforward implementation of dynamic shallow binding. On entry to a lambda expression or binding environment, the contents of the symbol's value cell are saved and the new value is stored. On return, the original value is restored. As a result, the current value of a symbol is determined dynamically by the history and state of execution, not by static inspection of the lexical environment.
For an interpreted system, this is probably the simplest and fastest strategy. Looking up a value requires no search (only an access to the value cell), and all symbols, local or global, are treated alike. A compiler, on the other hand, can produce more efficient code for lexical binding, so a compiled Lisp usually complicates matters by supporting several binding strategies.
Dynamic binding is a very powerful mechanism. The current value can be accessed from anywhere, and both the variable and its value are always a physically existing "real thing", not something that only "appears" to exist (as with lexical binding and, to some degree, with transient symbols in Pico Lisp (see below)).
Unfortunately, great power comes with great risk. The programmer must be familiar with the underlying principles in order to use their advantages and avoid the pitfalls. However, as long as we follow the conventions recommended by Pico Lisp, the risks are minimal.
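The save, set and restore cycle of shallow binding can be imitated by hand in Common Lisp. The helper call-with-shallow-binding below is hypothetical and only illustrates the mechanism; it is not Pico Lisp's implementation. Common Lisp's own special variables (declared with defvar and rebound with let) give the same observable behavior.

```lisp
;; Hypothetical helper that mimics shallow binding by hand: save the
;; symbol's value cell, store the new value, run THUNK, then restore.
(defun call-with-shallow-binding (symbol new-value thunk)
  (let ((saved (symbol-value symbol)))
    (setf (symbol-value symbol) new-value)
    (unwind-protect (funcall thunk)
      (setf (symbol-value symbol) saved))))

(defvar *level* 0)

(defun current-level ()
  ;; Looking up the value is a plain read of the value cell, no search.
  *level*)

;; While the thunk runs, the value cell holds 1 ...
(defparameter *inside*
  (call-with-shallow-binding '*level* 1 #'current-level))

;; ... and afterwards the original value 0 is back.
(defparameter *outside* (current-level))
```

Note that current-level never receives the value as an argument. It sees whatever the value cell holds at the moment of the call, which is exactly the "history and state of execution" described above.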
There are two kinds of situations in which the results of a computation under dynamic binding can get out of the programmer's control:
- a symbol is bound to itself, and we try to modify the symbol's value;
- the funarg (functional argument) problem, where a symbol's value is changed dynamically by code that is not visible in the current source environment.
Such situations can be avoided by using transient symbols.
Transient symbols are symbols in Pico Lisp that look like strings (and are often used as strings) and that are interned only temporarily, while a single source file (or part of one) is being executed. They therefore have a lexical scope comparable to static identifiers in C, but their behavior is completely dynamic, because they are normal symbols in all other respects.
So the rule is simple: whenever a function has to modify the value of a passed-in variable, or evaluate a passed-in Lisp expression (directly or indirectly), the parameters of that function should be written as transient symbols. Practical experience shows that such cases are rare at the higher levels of application development and occur mainly in support libraries and system tools.
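The funarg problem is easy to reproduce in Common Lisp with a special (dynamically bound) variable. The sketch below is an illustration under assumed names (map-items, *factor*), not Pico Lisp code. A library function happens to use the same symbol for one of its parameters as the caller uses in the function it passes in:

```lisp
;; *factor* is declared special, so every binding of it is dynamic,
;; as all bindings are in Pico Lisp.
(defvar *factor* 2)

;; Hypothetical library function whose third parameter uses the same
;; symbol. Binding it on entry hides the caller's value from FN.
(defun map-items (fn items *factor*)
  (mapcar fn items))

(defparameter *result*
  (let ((*factor* 10))
    ;; The caller expects each item to be multiplied by 10 ...
    (map-items (lambda (item) (* item *factor*)) '(1 2 3) 0)))
;; ... but FN runs while *factor* is rebound to 0, so *result* is (0 0 0).
```

Giving such parameters names that cannot collide with the caller's symbols removes the problem, which is the role transient symbols play in Pico Lisp.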
2.4. Myth 4: Property Lists Are Bad
Properties are an elegant, intuitive way to associate information with a symbol in addition to its value/function cell. They are extremely flexible, since the number and types of the data are not statically fixed.
Many people seem to think that property lists are too old-fashioned and primitive to be used today, and that more advanced data structures should be used instead. Although this is true in some cases, depending on the total number of properties in a symbol, the break-even point may be higher than expected.
Earlier versions of Pico Lisp experimented with hash tables and self-balancing binary trees for storing properties, but we found plain lists to be more effective. We have to take into account the cumulative effect on the whole system: the overhead of maintaining numerous internal data structures (see above) and more complicated search algorithms is often larger than that of a simple linear search. And when memory efficiency is also a concern, property lists clearly win.
Pico Lisp implements properties as a list of key-value pairs. The only concession to speed optimization is a "most recently used" scheme, which accelerates repeated accesses a little, but we have no concrete evidence that it was really necessary.
Another argument against properties is their alleged global scope. This is true only to the same extent that a field in a C structure or an instance variable in a Java object is global.
Of course, a property of a global symbol is also global, but in typical application development properties are stored in anonymous symbols, objects or database entities, which are accessible only in a well-defined context. Therefore a property "color" can be used with one meaning in one context and with a completely different meaning in another, without any mutual interference.
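Common Lisp also has symbol property lists, so the idea can be sketched there. Note the assumption: Common Lisp stores properties as a flat list of alternating keys and values rather than as the key-value pairs Pico Lisp uses, and the symbol names below are purely illustrative.

```lisp
;; Properties on a global symbol: visible wherever the symbol is known.
(setf (get 'apple 'color) 'red)
(setf (get 'apple 'weight) 150)

(get 'apple 'color)            ; the stored value, RED
(symbol-plist 'apple)          ; the underlying flat list of keys and values

;; Properties on an anonymous (uninterned) symbol are reachable only by
;; code that holds a reference to it. Here COLOR means something else.
(defparameter *wire* (make-symbol "WIRE"))
(setf (get *wire* 'color) 'phase-conductor)
```

The two color properties never interfere, because each lives on its own symbol. A lookup is a linear scan of a short list, which is the trade-off the text argues is usually cheap enough.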
3. Application server
On top of this simple Pico Lisp machine we developed a vertically integrated application server. It unifies a database engine (based on Pico Lisp's implementation of persistent objects as a first-class data type) and an abstract GUI (generating, for example, HTML or Java applets).
A key element of this unified system is a Lisp-based markup language that is used to implement the individual application modules.
Whenever a new database view, document, report or some other service is requested from the application server, a Lisp source file is loaded and executed on the fly. This is similar to a URL request followed by the delivery of an HTML file in a traditional web server.
However, the Lisp expressions evaluated in this scenario typically have the side effect of building and handling an interactive user interface.
These Lisp expressions describe the structure of the GUI components, their behavior in response to user actions, and their interaction with database objects. In short, they contain the complete description of a software module. To make this possible, we found it important to adhere strictly to the Locality Principle, and to use the mechanisms of "Prefix Classes" and "Relation Maintenance Daemons" (the latter two are described in another document).
3.1. The principle of Locality
As we said, business application development is a process of constant change. The Locality Principle has been a great help in such projects. It demands that all information relating to a module be kept together with that module in one place. This lets the programmer focus on the single place where everything is stored.
Of course, all this seems quite obvious, but in contrast, software design methodologies tell us to encapsulate behavior and data and hide them from the rest of the application. Usually this means that the application logic is written in one place (a source file), while the functions, classes and methods that implement this logic are defined somewhere else. This is certainly a good recommendation, but it brings many problems, because the programmer constantly has to move between different locations: modifications and context switches happen in several places at once. If some feature becomes obsolete, some modules may become obsolete too, but we forget to remove them.
We therefore believe it is best to create abstract libraries of functions, classes and methods that are as general as possible and remain constant over time and across applications, and to use them to build a strict markup language with a high degree of expressiveness for building applications.
This language needs a compact syntax and must be able to describe all static and dynamic aspects of the application locally, in one place, without the need to define behavior in separate files.
3.2. Lisp
And that is the main reason we have claimed from the outset that Lisp is the only language that suits us.
Only Lisp allows code and data to be treated uniformly, and this is the foundation of the Pico Lisp application development model. It makes intensive use of functional blocks and evaluated expressions, freely mixed with static data, which can be passed around and stored in internal data structures at run time.
As far as we know, this is not possible in other languages, at least not with the same ease and elegance. To some extent it can be done in scripting languages by interpreting text strings, but that solution is rather limited and clumsy. And, as described above, compiled Lisp systems can be too heavy and inflexible. For all these data structures and code fragments to work together smoothly, dynamic shallow binding is a great advantage, because expressions can be evaluated without having to set up binding environments.
Another reason is that Lisp allows complex data structures such as symbols and nested lists to be manipulated directly, without having to declare, allocate, initialize or free them explicitly. This makes code compact and readable and gives the programmer a powerful means of expression, doing in one line what might require a separate module in other languages.
In addition, since Pico Lisp makes no formal distinction between database objects and internal symbols, all these benefits also apply to database programming. The result is a direct link between GUI and database operations in a single local context, using the same code.
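The free mixing of static data and evaluated expressions described in this section can be sketched in Common Lisp as follows. This is not the Pico Lisp markup language; the form layout, the keywords and the function names are all assumptions made for illustration. Expressions are stored inside an ordinary data structure and evaluated only when the form is rendered.

```lisp
;; Hypothetical form description: static labels mixed with expressions
;; that are stored as data and evaluated only when the form is rendered.
(defparameter *order-form*
  '((:label "Customer" :value "ACME Corp.")
    (:label "Items"    :value (+ 3 4))
    (:label "Total"    :value (* 7 12))))

(defun render-field (field)
  (let ((value (getf field :value)))
    (format nil "~A: ~A"
            (getf field :label)
            ;; Strings are static data; lists are code to evaluate.
            (if (consp value) (eval value) value))))

(defun render-form (form)
  (mapcar #'render-field form))

;; Returns ("Customer: ACME Corp." "Items: 7" "Total: 84")
(render-form *order-form*)
```

The whole description of the form, both its static text and its computed parts, sits in one place, in the spirit of the Locality Principle.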
4. Conclusion
The Lisp community seems to suffer from a paranoia about "inefficient" Lisp. This is probably because for decades it was forced to defend the language against claims that "Lisp is slow" and "Lisp is bloated".
Some of this was true. But on today's hardware, execution speed is irrelevant for many practical applications. And in the cases where it does matter, coding a few critical functions in C usually solves the problem.
Let us now focus on more practical aspects. Some may be surprised at how small and fast a supposedly "ancient" Lisp system can be. We should therefore take care not to make Lisp truly "bloated" by overloading the core language with more and more features, and should dare to use simple solutions that give the programmer full flexibility.
Pico Lisp can be seen as a proof of concept for "Less may be more".