How I built a news aggregator
Today I want to share with the Habr community the story of how I developed the news rating aggregator TOP.ST.
Article based on information from habrahabr.ru
Start
It all started during the Christmas holidays, when I was already tired of resting and there was still plenty of time before work resumed. As usual, various ideas were swirling in my head, and I wanted to implement one of them. Not just sketch a concept and abandon it (I have lost count of such attempts), but go all the way from zero to launch without postponing it indefinitely.
The idea of a news rating aggregator is not new. There are both successful projects of this kind and less successful ones, each with its pros and cons. I love reading news, and I love reading it fresh, so I often use similar projects, and gradually I formed a vision of how I would implement the idea myself. So I got started.
First of all, I promised myself that I would not drag out development, as often happens with side projects, and would launch a working prototype in the shortest possible time. That meant using the most familiar technology stack so as not to waste time on the learning curve. At the time, that was PHP with the Phalcon framework for the backend, MySQL as the database, and Beanstalkd as the queue manager.
I started by planning the application architecture. I ended up with four parts, almost independent of one another:
- Web server. The endpoint that users interact with. It should be as lightweight as possible and perform only the most essential functions, or rather a single function: serving content. If necessary, there can be several of these servers.
- Worker. This part of the application is responsible for collecting news and ratings and doing their initial analysis. Workers should scale easily as the amount of processed information grows.
- Database, task scheduler, and main data processor. This is the heart of the application. It controls the state of the system, schedules tasks for the workers, and prepares data for the web server.
- Administration panel. The part of the application used to manage the system.
All of these parts communicate with one another through the Beanstalkd queue manager and are joined into a single virtual network via OpenVPN.
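To make the division of labor a little more concrete, here is a minimal sketch of how a scheduler and a worker might exchange jobs through a queue. It is only an illustration, not the project's actual code: it is written in Python rather than the stack described in this article, in-process queues stand in for Beanstalkd tubes, and the message fields and the `fetch_feed` helper are hypothetical.

```python
import json
import queue

# In-process queues stand in for Beanstalkd tubes (an assumption for illustration).
tasks = queue.Queue()    # scheduler -> workers
results = queue.Queue()  # workers -> main data processor


def schedule_fetch(feed_url):
    """Scheduler side: enqueue a job as a serialized message."""
    tasks.put(json.dumps({"type": "fetch_feed", "url": feed_url}))


def fetch_feed(url):
    """Hypothetical placeholder for downloading and parsing a feed."""
    return [{"title": "Example headline", "link": url + "/item/1"}]


def run_worker_once():
    """Worker side: take one job, process it, and report the result."""
    job = json.loads(tasks.get())
    if job["type"] == "fetch_feed":
        items = fetch_feed(job["url"])
        results.put(json.dumps({"url": job["url"], "items": items}))
    tasks.task_done()


schedule_fetch("https://example.com/rss")
run_worker_once()
report = json.loads(results.get())
print(report["items"][0]["link"])
```

Because the two sides share only serialized messages, a worker can run on any machine that can reach the queue, which is what makes it easy to add more of them.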
Process started
On the very first day of working on the scripts I had my first disappointment. I ran into serious performance problems and memory leaks in seemingly simple operations: parsing RSS feeds, parsing HTML, and some related tasks. As it turned out, many third-party libraries (even popular ones) are still not ready to run in daemon mode. When a script follows the pattern of start, do the work, return the content, and die, everything is fine. But in a long-running process, memory leaks and sudden crashes begin to show themselves. I realized that if I started dealing with all of this, I would be bogged down for a long time, so at that point I dusted off node.js.
I had fairly extensive experience with node.js, but as it happened, by then I had not written anything in it for a little over six months, so I had to get back up to speed. Contrary to my fears, two days later I was fully immersed in it again, as if there had been no break, and the work picked up with renewed momentum.
Design
Design does not come easily to me, since I am not a designer. I think I have a good sense of taste, I can tell a good interface from a bad one, I love it when everything is made beautifully, and somewhere in the back of my head I can even see "how it should be." But getting that vision out of my head is hard for me. Still, the design had to be done, and done reasonably well. I spent a lot of time trying to create a "light," pleasant interface, and at the time it even seemed beautiful to me. Now I am horrified to remember it.
[Screenshot: the first version]
After the first version launched and my head had cooled down a bit, I took another look at "it" and realized that it looked awful and that I would not use it myself. So I started over. I do not know how it happened, but I liked the result and still do. The site got a new face, and that is what greets visitors now.
First working version
It took me about a month and a half to build the first working version, one that could cautiously be released into the world. By that time I had already bought the domain and had free hosting with what I thought was suitable capacity. After deploying the system and collecting the initial data, I started debugging. It took about two weeks to identify the main pitfalls, debug the algorithms, and assemble the data sources.
The website was up and running, but capacity quickly ran short. A VPS with 4 GB of memory and 4 CPU cores clearly could not cope with the task. I faced a choice: optimize or add capacity. Optimization is a long, labor-intensive process whose duration is not always predictable. Looking ahead, I will say that I did do the optimization and am still doing it. In the end I managed to greatly reduce the overall load on the system, but that server still would not have been enough, so I decided that I needed more resources.
Moving
The prices for servers with the specifications I needed did not seem justified to me at the time. Put simply, I was not willing to pay that much, so I started looking for another way out.
Since most of the server load came from the database and from the part of the system responsible for the main calculations and task scheduling, I decided to move that part to a server at home. At the time I had a 2013 MacBook Air with a Core i7 and 8 GB of memory on board, which I no longer used after switching to a MacBook Pro Retina. I set up the server on it and connected it to the rest of the network over the VPN. Fortunately, I had a stable Internet connection that was fast enough for normal operation. The MacBook Air, largely thanks to its fast SSD, coped with the task perfectly.
Changing the structure of the database, which by then had grown to several gigabytes, became much less painful, and in general, physical access to the server opened up new possibilities, including faster backups. The site ran this way for about a month while I slowly polished and improved it. As the volume of processed data grew, the load on the MacBook Air increased considerably. It still handled its job without problems, but it began to heat up and, as a result, the fan got noisy. I felt sorry for it: it was not designed for this kind of work, although it endured it with dignity.
At that point I decided to move the server to a machine better suited to the purpose. I was tempted to buy powerful desktop hardware, but fear of power outages led me to buy a powerful laptop instead. The laptops in the line I chose cannot boast a good exterior: the screen, keyboard, and case leave much to be desired, but their price-to-hardware ratio is very good. I took the model with a 4-core Core i7 processor and 8 GB of memory. I immediately expanded the memory to 16 GB and installed a 180 GB Kingston KC300 SSD. It cost 35,000 rubles, which after the fall of the ruble seemed like a fairly budget-friendly solution.
The server has been running properly 24x7 for about four months and fully copes with its task. To cool the system a little, I raised it up so that air can circulate freely underneath the laptop, and I lowered the processor frequency. In this mode, even in summer, the CPU temperature does not rise above 65 degrees, which is quite a good figure.
During this time there were several short power and Internet outages, but since the website itself does not depend directly on this server, visitors were hardly affected. The only effect is that data stops updating during the downtime, which is not critical for short intervals, and at this stage I can live with it.
About the service
At the moment, the service monitors publications in 28 countries. The interface is translated into 10 languages, and the language is selected based on browser settings. The design is responsive and looks equally good on a desktop, a tablet, and a mobile phone. On mobile devices, you can save a shortcut to the site that then runs in application mode. Data on the site is updated in real time, and the interface is built with AngularJS.
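Selecting a language from browser settings usually comes down to reading the `Accept-Language` request header. The sketch below shows one simple way to do that, under stated assumptions: it is written in Python for illustration rather than in the project's stack, the `SUPPORTED` list and the default are hypothetical, and only the primary language subtag is matched.

```python
SUPPORTED = ["en", "ru", "de"]  # hypothetical subset of the supported languages
DEFAULT = "en"                  # hypothetical fallback


def pick_language(accept_language, supported=SUPPORTED, default=DEFAULT):
    """Return the supported language with the highest q-value, or the default."""
    candidates = []
    for position, part in enumerate(accept_language.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        params = params.strip()
        weight = 1.0  # a missing q parameter means full preference
        if params.startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                weight = 0.0  # treat a malformed weight as "not acceptable"
        # Keep only the primary subtag: "ru-RU" -> "ru"
        primary = tag.strip().lower().split("-")[0]
        candidates.append((-weight, position, primary))
    # Highest weight first; header order breaks ties.
    for neg_weight, _, primary in sorted(candidates):
        if neg_weight < 0 and primary in supported:
            return primary
    return default


print(pick_language("ru-RU,ru;q=0.9,en;q=0.8"))
```

A header that lists no supported language, or no header at all, falls back to the default, so the visitor always gets some translation of the interface.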
The service became a member of Microsoft's BizSpark program. The Azure hosting provided under the program has really helped, and continues to help, the project's development, for which I am very grateful to Microsoft. BizSpark is a truly valuable contribution to the development of startup projects.
Audience
The first announcement of the site appeared in the "site of the day" section of the Ferra.ru portal. I also posted the link several times on Reddit, Hacker News, and Twitter, and took part in the Podium section of the Zuckerberg Will Call project. There the service was received fairly well, and I got some valuable advice. For now, that is all the promotion there has been. Over time, a few reviews of the site appeared on foreign resources, and the first regular visitors from outside Russia came from there. A few days ago, a link to the site reached the top of the DesignerNews portal.
Overall, the analytics show that visitors like the project. In recent months, the average session duration has been 30 minutes, the bounce rate is around 10-12%, and returning users make up 65-70%. 55% of traffic is direct and 35% comes from links in reviews and blogs. The remaining 20% is social networks and search. Interested users come back to the site and become regular visitors, which is very gratifying, although the absolute number of visitors is lower than I would like.
Plans
I have many plans for improving the functionality and I am slowly implementing them, unfortunately not as quickly as I would like, because I work on the project in my spare time.
On the internal architecture side, I first want to move the whole system to Docker containers, refactor some pieces of code that turned out poorly, set up centralized log collection from all the working parts of the project, and of course improve the ranking algorithm and its speed.
As for visible changes, I plan to introduce categories or tags so that you can focus on particular news topics, add the ability to subscribe to news digests, finally solve the problem of broken and duplicate links, and try introducing comments.
The last one is the hardest: I need to figure out how to keep trolls and other ill-behaved users under control so that I do not end up building yet another dumping ground.
My dream is to learn to group news items on a common topic into stories, but for now that is postponed until better times.
Mobile applications are in the plans, and they will appear as soon as there is a clear need from the audience.
I have noticed that the speed of the project's development is directly proportional to feedback from users. Every time I see a link to the site on some blog, a review on a news site, or even an email from a satisfied user, the work moves forward with new energy. Feedback is a really important motivating factor.
Conclusion
I will wrap up here, because the article has already turned out longer than I expected. Thanks to everyone who read it. I will try to answer all questions in the comments; if you cannot comment, feel free to email me at mail@top.st. Welcome!