{"id":3467,"date":"2012-12-23T08:21:36","date_gmt":"2012-12-23T13:21:36","guid":{"rendered":"http:\/\/kodegeek.com\/blog\/?p=3467"},"modified":"2014-07-04T10:55:12","modified_gmt":"2014-07-04T14:55:12","slug":"jugando-con-mongodb-recolectando-eventos-mongo-tail-log","status":"publish","type":"post","link":"http:\/\/kodegeek.com\/blog\/2012\/12\/23\/jugando-con-mongodb-recolectando-eventos-mongo-tail-log\/","title":{"rendered":"Playing with MongoDB: collecting events (Mongo Tail-Log)"},"content":{"rendered":"<p>This is an idea I have been playing with for a while: capturing events using <a href=\"http:\/\/mongodb.org\" target=\"_blank\">MongoDB<\/a> as the database. I like that disk usage can be kept under control with a circular buffer (MongoDB calls these &#8216;<a href=\"http:\/\/docs.mongodb.org\/manual\/core\/capped-collections\/\">capped collections<\/a>&#8217;), and that defining the data schema is very simple.<\/p>\n<h1>Setup<\/h1>\n<p>The first step is to download and install MongoDB (I use OS X):<\/p>\n<pre lang=\"bash\">\r\ncd \/usr\/share && sudo mkdir mongodb && cd mongodb\r\nsudo tar -xzvf \/Users\/josevnz\/Downloads\/mongodb-osx-x86_64-2.2.2.tgz\r\nsudo ln -s mongodb-osx-x86_64-2.2.2 mongo\r\ncd mongo && sudo mkdir data etc log scripts\r\n<\/pre>\n<p>Next we create a basic configuration file:<\/p>\n<pre lang=\"bash\">\r\nMacintosh:mongo josevnz$ cat etc\/mongodb.conf \r\nfork = true\r\nbind_ip = 127.0.0.1\r\nport = 27017\r\nquiet = true\r\ndbpath = \/usr\/share\/mongodb\/mongo\/data\r\nlogpath = \/usr\/share\/mongodb\/mongo\/log\/mongodb.log\r\npidfilepath = \/usr\/share\/mongodb\/mongo\/mongodb.pid\r\nlogappend = true\r\njournal = true\r\n<\/pre>\n<p>And a couple of &#8216;scripts&#8217; to make our lives easier:<\/p>\n<pre lang=\"bash\">\r\nMacintosh:mongo josevnz$ cat etc\/mongodb.conf \r\nfork = true\r\nbind_ip = 127.0.0.1\r\nport = 27017\r\nquiet = true\r\ndbpath = 
\/usr\/share\/mongodb\/mongo\/data\r\nlogpath = \/usr\/share\/mongodb\/mongo\/log\/mongodb.log\r\npidfilepath = \/usr\/share\/mongodb\/mongo\/mongodb.pid\r\nlogappend = true\r\njournal = true\r\n<\/pre>\n<p>Then we start the server:<\/p>\n<pre lang=\"bash\">\r\nMacintosh:~ josevnz$ sudo mongod -f $MONGO_HOME\/etc\/mongodb.conf \r\nPassword:\r\nforked process: 2782\r\nall output going to: \/usr\/share\/mongodb\/mongo\/log\/mongodb.log\r\nchild process started successfully, parent exiting\r\n<\/pre>\n<h1>Preparing the collection<\/h1>\n<p>Now we need to prepare the place where the data will be stored.<\/p>\n<pre lang=\"bash\">\r\nMacintosh:mongo josevnz$ mongo\r\nMongoDB shell version: 2.2.2\r\nconnecting to: test\r\nWelcome to the MongoDB shell.\r\nFor interactive help, type \"help\".\r\nFor more comprehensive documentation, see\r\n\thttp:\/\/docs.mongodb.org\/\r\nQuestions? Try the support group\r\n\thttp:\/\/groups.google.com\/group\/mongodb-user\r\n> use logsdb\r\nswitched to db logsdb\r\n> db.createCollection(\"logs\", {capped:true, size:100000})\r\n{ \"ok\" : 1 }\r\n> db.logs.isCapped()\r\ntrue\r\n<\/pre>\n<h1>Client code that inserts the data<\/h1>\n<pre lang=\"python\">\r\n#!\/usr\/bin\/env jython\r\n# Simple class that simulates an event writer\r\n# Author: josevnz@kodegeek.com\r\n# BLOG: http:\/\/kodegeek.com\/blog\r\n# Assumes that you created a database called 'logsdb' and a collection called 'logs':\r\n# use logsdb\r\n# db.createCollection(\"logs\", {capped:true, size:100000})\r\n#\r\nfrom com.mongodb import Mongo, MongoException, WriteConcern, DB, DBCollection, BasicDBObject, DBObject, DBCursor, ServerAddress\r\nfrom java.util import Arrays, Date, Random\r\nfrom java.util.concurrent import Executors, TimeUnit\r\nfrom java.lang import Runnable, Thread\r\nimport sys, os\r\n\r\nclass EventWriter(Runnable):\r\n\r\n        def __init__(self, db):\r\n                self.db = db\r\n                self.col = db.getCollection(\"logs\")\r\n                
self.random = Random(1973)\r\n\r\n        def run(self):\r\n                number = self.random.nextLong()\r\n                event = BasicDBObject('datetime', Date().toString()).append('text', 'This is an event, random # %d' % number)\r\n                print \"New event: %s\" % event\r\n                self.col.insert(event)\r\n\r\ndef main(args):\r\n\r\n        initialDelay = 0\r\n        delay = 5\r\n        # Do not use 'localhost'; it makes the driver report a stupid error and hang\r\n        servers = Arrays.asList(ServerAddress(\"127.0.0.1\", 27017))\r\n        m = Mongo(servers)\r\n        # m.setWriteConcern(WriteConcern.JOURNALED)\r\n        m.setWriteConcern(WriteConcern.NONE) # Do not care if the write makes it or not\r\n        db = m.getDB( \"logsdb\" )\r\n        print \"Connected to %s\" % db.getName()\r\n\r\n        command = EventWriter(db)\r\n        # Use a single thread for this example; in reality the reporter uses a separate thread to avoid blocking the application\r\n        executor = Executors.newSingleThreadScheduledExecutor()\r\n        print \"Press Ctrl-C to abort this script, events will be written periodically into the database\"\r\n        future = executor.scheduleWithFixedDelay(command, initialDelay, delay, TimeUnit.SECONDS)\r\n        #sys.exit(0)\r\n\r\nif __name__ == \"__main__\":\r\n        main(sys.argv[1:])\r\n\r\n<\/pre>\n<p>The output looks like this:<br \/>\n<code><br \/>\nMacintosh:mongodb josevnz$ .\/log_writer.py<br \/>\nConnected to logsdb<br \/>\nPress Ctrl-C to abort this script, events will be written periodically into the database<br \/>\nNew event: { \"datetime\" : \"Thu Dec 13 10:13:21 EST 2012\" , \"text\" : \"This is an event, random # -6901132129250388696\"}<br \/>\nNew event: { \"datetime\" : \"Thu Dec 13 10:13:27 EST 2012\" , \"text\" : \"This is an event, random # 2141911474641068654\"}<br \/>\nNew event: { \"datetime\" : \"Thu Dec 13 10:13:32 EST 2012\" , \"text\" : \"This is an event, random # 
-7447082860282012741\"}<br \/>\nNew event: { \"datetime\" : \"Thu Dec 13 10:13:37 EST 2012\" , \"text\" : \"This is an event, random # 3042277681337134497\"}<br \/>\nNew event: { \"datetime\" : \"Thu Dec 13 10:13:42 EST 2012\" , \"text\" : \"This is an event, random # -2682038860783877385\"}<br \/>\nNew event: { \"datetime\" : \"Thu Dec 13 10:13:47 EST 2012\" , \"text\" : \"This is an event, random # -6576368686118448135\"}<br \/>\nNew event: { \"datetime\" : \"Thu Dec 13 10:13:52 EST 2012\" , \"text\" : \"This is an event, random # -294840040020254100\"}<br \/>\nNew event: { \"datetime\" : \"Thu Dec 13 10:13:57 EST 2012\" , \"text\" : \"This is an event, random # 4202626908153060298\"}<br \/>\nNew event: { \"datetime\" : \"Thu Dec 13 10:14:02 EST 2012\" , \"text\" : \"This is an event, random # -6313895213434337152\"}<br \/>\nNew event: { \"datetime\" : \"Thu Dec 13 10:14:07 EST 2012\" , \"text\" : \"This is an event, random # 983475561958631366\"}<br \/>\nNew event: { \"datetime\" : \"Thu Dec 13 10:14:12 EST 2012\" , \"text\" : \"This is an event, random # -6651143639772223084\"}<br \/>\nNew event: { \"datetime\" : \"Thu Dec 13 10:14:17 EST 2012\" , \"text\" : \"This is an event, random # -7909942155638967101\"}<br \/>\n<\/code><\/p>\n<p>In another window we verify that events are actually arriving, using the mongo shell:<\/p>\n<pre lang=\"javascript\">\r\nMacintosh:mongodb josevnz$ mongo\r\nMongoDB shell version: 2.2.2\r\nconnecting to: test\r\n> use logsdb\r\nswitched to db logsdb\r\n\/\/ Print the last record received\r\n> db.logs.find().skip(db.logs.count()-1).forEach(printjson)\r\n{\r\n\t\"_id\" : ObjectId(\"50d6ec7cef869f82b3f656ea\"),\r\n\t\"datetime\" : \"Sun Dec 23 06:35:24 EST 2012\",\r\n\t\"text\" : \"This is an event, random # -8356391245437638799\"\r\n}\r\n\r\n\/\/ Print the first record received\r\n> db.logs.findOne()\r\n{\r\n\t\"_id\" : ObjectId(\"50c9f092ef863d028e1ab74b\"),\r\n\t\"datetime\" : \"Thu Dec 13 10:13:21 EST 
2012\",\r\n\t\"text\" : \"This is an event, random # -6901132129250388696\"\r\n}\r\n\r\n<\/pre>\n<h1>Client code that continuously reads from the database<\/h1>\n<p>It always helps to have the <a href=\"http:\/\/docs.mongodb.org\/manual\/reference\/sql-comparison\/\">MongoDB to SQL<\/a> equivalence at hand. Here we want to mimic the behavior of the UNIX &#8216;tail&#8217; tool, so we use a feature called &#8216;<a href=\"http:\/\/www.mongodb.org\/display\/DOCS\/Tailable+Cursors\">tailable cursors<\/a>&#8217;:<\/p>\n<pre lang=\"Python\">\r\n#!\/usr\/bin\/env jython\r\n# Simple class that simulates an event reader\r\n# Author: josevnz@kodegeek.com\r\n# BLOG: http:\/\/kodegeek.com\/blog\r\n# Assumes that you created a database called 'logsdb' and a collection called 'logs':\r\n# use logsdb\r\n# db.createCollection(\"logs\", {capped:true, size:100000})\r\n# Or convert an existing one to capped: db.runCommand({\"convertToCapped\": \"logs\", size: 100000})\r\n# It is very important that you get a driver version more recent than 2.7.1 (Collection.isCapped is broken there)\r\n#\r\nfrom com.mongodb import Mongo, BasicDBObjectBuilder, DB, DBCollection, BasicDBObject, DBObject, ServerAddress, Bytes\r\nfrom java.util import Arrays, Date\r\nfrom java.util.concurrent import Executors, TimeUnit\r\nfrom java.lang import Runnable, Thread\r\nimport sys, os\r\n\r\nclass EventReader(Runnable):\r\n\r\n        def __init__(self, db):\r\n                self.db = db\r\n                if not db.collectionExists(\"logs\"):\r\n                        raise Exception(\"Logs doesn't exist, please create!\")\r\n                self.coll = db.getCollection(\"logs\")\r\n                if not self.coll.isCapped():\r\n                        raise Exception(\"Logs is not a capped collection!\")\r\n                self.sortBy = BasicDBObjectBuilder().start(\"$natural\", 1).get()\r\n                print \"Ready to read events...\"\r\n\r\n        def 
run(self):\r\n                lastVal = None # This could be refined to get the last event\r\n                cursor = self.coll.find(lastVal).sort(self.sortBy).addOption(Bytes.QUERYOPTION_TAILABLE).addOption(Bytes.QUERYOPTION_AWAITDATA)\r\n                while cursor.hasNext():\r\n                        print \"%s\" % cursor.next()\r\n\r\ndef main(args):\r\n\r\n        delay = 1\r\n\r\n        servers = Arrays.asList(ServerAddress(\"127.0.0.1\", 27017))\r\n        m = Mongo(servers)\r\n        db = m.getDB( \"logsdb\" )\r\n        print \"Connected to %s\" % db.getName()\r\n\r\n        command = EventReader(db)\r\n        executor = Executors.newSingleThreadScheduledExecutor()\r\n        print \"Press Ctrl-C to abort this script, reading events from the database\"\r\n        future = executor.schedule(command, delay, TimeUnit.SECONDS)\r\n        #sys.exit(0)\r\n\r\nif __name__ == \"__main__\":\r\n        main(sys.argv[1:])\r\n<\/pre>\n<p>I plan to build a tool that uses this at work; the code looks easy to use and shows a lot of promise \ud83d\ude42<\/p>\n","protected":false},"excerpt":{"rendered":"<p>This is an idea I have been playing with for a while: capturing events using MongoDB as the database. 
I like that disk usage can be kept under control with a circular buffer (MongoDB calls these &#8216;capped collections&#8217;), and that defining the data schema <a class=\"read-more\" href=\"http:\/\/kodegeek.com\/blog\/2012\/12\/23\/jugando-con-mongodb-recolectando-eventos-mongo-tail-log\/\">[&hellip;]<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[9,194,239],"tags":[757,321,734,735],"_links":{"self":[{"href":"http:\/\/kodegeek.com\/blog\/wp-json\/wp\/v2\/posts\/3467"}],"collection":[{"href":"http:\/\/kodegeek.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/kodegeek.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/kodegeek.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/kodegeek.com\/blog\/wp-json\/wp\/v2\/comments?post=3467"}],"version-history":[{"count":33,"href":"http:\/\/kodegeek.com\/blog\/wp-json\/wp\/v2\/posts\/3467\/revisions"}],"predecessor-version":[{"id":3674,"href":"http:\/\/kodegeek.com\/blog\/wp-json\/wp\/v2\/posts\/3467\/revisions\/3674"}],"wp:attachment":[{"href":"http:\/\/kodegeek.com\/blog\/wp-json\/wp\/v2\/media?parent=3467"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/kodegeek.com\/blog\/wp-json\/wp\/v2\/categories?post=3467"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/kodegeek.com\/blog\/wp-json\/wp\/v2\/tags?post=3467"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}