Apache programming resources
Processing unbounded streams of data in a distributed system sounds like a challenge. Fortunately, there is a tool that can make it easier. Łukasz will share his experience as a "Storm Trooper", a user of the Apache Storm framework, announced as the first streaming engine to break the 1-microsecond latency barrier. His story will start by describing the processing model: he'll tell you how to build a distributed application from spouts, bolts, and topologies. Then he'll move on to the components that make your apps run in a distributed way; that's the part where three characters, Nimbus, Supervisor, and ZooKeeper, join in to help build a cluster. With all that in place, he'll show you a demo app running on Apache Storm. As you know, Storm Troopers are famous for missing their targets, so Łukasz will wrap up by sharing the drawbacks and the things he himself missed when he first met this technology. After the presentation, you can start playing with stream processing or compare your current approach with the Apache Storm model. And who knows, maybe you'll become a Storm Trooper too.
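The spout/bolt/topology model the abstract describes can be sketched in a few lines of plain Python. This is an illustrative toy, not the real Apache Storm API (which is Java/Clojure based): a spout emits tuples, bolts transform them, and a small runner stands in for the topology wiring. All class and function names here are made up for the example.

```python
class SentenceSpout:
    """Source of the stream: emits one sentence per call, None when done."""
    def __init__(self, sentences):
        self.sentences = iter(sentences)

    def next_tuple(self):
        return next(self.sentences, None)

class SplitBolt:
    """Splits each sentence tuple into word tuples."""
    def process(self, sentence):
        return sentence.split()

class CountBolt:
    """Keeps a running count per word (stateful bolt)."""
    def __init__(self):
        self.counts = {}

    def process(self, word):
        self.counts[word] = self.counts.get(word, 0) + 1

def run_topology(spout, split_bolt, count_bolt):
    """Drive tuples from the spout through the bolts until exhausted."""
    while (sentence := spout.next_tuple()) is not None:
        for word in split_bolt.process(sentence):
            count_bolt.process(word)
    return count_bolt.counts

counts = run_topology(SentenceSpout(["to be or not to be"]),
                      SplitBolt(), CountBolt())
# counts == {"to": 2, "be": 2, "or": 1, "not": 1}
```

In real Storm the runner disappears: the topology is submitted to the cluster, and Nimbus and the Supervisors distribute spout and bolt instances across worker nodes.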
Founded by the creators of Apache Kafka, Confluent has built an event-streaming platform that lets companies easily access their data as real-time streams. The Confluent Platform is "Apache Kafka on steroids": built on top of Apache Kafka, Confluent provides all the functionality needed for a production-grade, mission-critical, secure deployment. We propose a session introducing the world of event streaming: from its integration capabilities with third-party systems (AWS, Hadoop, Elastic, Mongo, Debezium, MQTT, JMS ... ), through its processing capabilities with Kafka Streams and ksqlDB, to its management and deployment both on Kubernetes and as SaaS in Confluent Cloud.
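The kind of per-event transformation Kafka Streams and ksqlDB express (filter plus enrich) can be sketched over an in-memory list standing in for a Kafka topic. This is illustrative only; a real deployment would use the Kafka client or Streams APIs, and all field names below are invented for the example.

```python
# An in-memory stand-in for an "orders" Kafka topic.
orders_topic = [
    {"order_id": 1, "amount": 120.0, "country": "ES"},
    {"order_id": 2, "amount": 15.0,  "country": "FR"},
    {"order_id": 3, "amount": 300.0, "country": "ES"},
]

def enrich(event):
    """Add a derived 'large' flag to each order event."""
    return {**event, "large": event["amount"] >= 100.0}

# Similar in spirit to a ksqlDB statement such as:
#   CREATE STREAM large_orders AS SELECT * FROM orders WHERE amount >= 100;
large_orders = [enrich(e) for e in orders_topic if e["amount"] >= 100.0]
# large_orders keeps orders 1 and 3, each flagged large=True
```

The point of the streaming platform is that this logic runs continuously over an unbounded topic rather than once over a finite list.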
These are the best podcasts/talks I've seen or listened to recently: The Beautiful Mess (John Cutler) [Agile, Company Culture, Management, Product]: John explains how we must all embrace 'the beautiful mess' and learn to navigate change in order to be more successful. What Will The Next 10 Years Of Continuous Delivery Look Like? (Dave Farley, Jez Humble) [Agile, CD, Continuous Delivery, DevOps, Microservices, Technical Practices, Technology Strategy]: In the 10 years since the publication of the...
Apache Airflow is a workflow automation and scheduling system that can be used to author and manage data pipelines. Workflows are defined programmatically as directed acyclic graphs (DAGs) of tasks, written in Python. At Idealista we use it on a daily basis for data-ingestion pipelines. We'll do a thorough review of managing dependencies, handling retries, alerting, etc., and all the drawbacks.
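The two ideas the abstract names, a workflow as a DAG of tasks and per-task retries, can be sketched in plain Python. This is not the Airflow API (which expresses the same things with DAG objects and operators); it is a minimal toy runner, and it assumes the dependency graph is acyclic.

```python
def run_dag(tasks, deps, retries=2):
    """Run callables in dependency order; retry each up to `retries` extra times.

    tasks: {name: callable}; deps: {name: [upstream task names]}.
    Returns the order in which tasks completed. Assumes `deps` is acyclic.
    """
    done, order = set(), []
    while len(done) < len(tasks):
        for name, fn in tasks.items():
            # Skip tasks already run or whose upstreams haven't finished.
            if name in done or any(d not in done for d in deps.get(name, [])):
                continue
            for attempt in range(retries + 1):
                try:
                    fn()
                    break
                except Exception:
                    if attempt == retries:
                        raise  # retries exhausted: fail the whole run
            done.add(name)
            order.append(name)
    return order

log = []
order = run_dag(
    tasks={"extract": lambda: log.append("e"),
           "transform": lambda: log.append("t"),
           "load": lambda: log.append("l")},
    deps={"transform": ["extract"], "load": ["transform"]},
)
# order == ["extract", "transform", "load"]
```

Airflow adds what this toy lacks: scheduling, backfills, alerting on failure, and a UI over the task states.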
What if you listened to it on your way to work or while working out?: https://www.ivoox.com/46590288 ------------- In this talk I will try to translate the complex legal texts of software licenses into plain human language. I will analyze the content of the most widely used licenses, such as GPL v3, Mozilla, Apache, MIT, etc., trying to clarify the differences between them and, above all, explain what they oblige you to do, what steps you have to follow to comply with them, and the consequences of not doing so. ------------- All Commitconf 2019 videos at: https://lk.autentia.com/Commit19-YouTube Get to know Autentia! Twitter: https://goo.gl/MU5pUQ Instagram: https://lk.autentia.com/instagram LinkedIn: https://goo.gl/2On7Fj/ Facebook: https://goo.gl/o8HrWX
The coming decade promises to be extremely exciting for astronomers and data/computer scientists alike with the arrival of the Large Synoptic Survey Telescope (LSST), the James Webb Space Telescope, and others. These projects will produce huge amounts of data that need to be searched, correlated, analyzed and learned from in order to answer questions such as "What are Dark Energy and Dark Matter?", "How did our Universe form?", and "How many Earth-threatening asteroids are out there?" LSST, with its unique architecture, will go both "wide" and "deep", meaning that it will acquire images of large parts of the sky while capturing the most distant galaxies. It will continually scan the visible sky over a period of 10 years and will produce the first video of the Universe in history. These new and exciting times require new tools that will help astronomers perform these analytical tasks more efficiently. In collaboration with astronomers from the University of Washington, I built AXS (Astronomy Extensions for Spark), a tool based on Apache Spark designed for fast cross-matching of astronomical catalogs and easy astronomical data processing. In this talk I will go through the details of AXS's architecture and explain why it is so fast. #BIGTH19 #Analytics #MachineLearning #Spark Session presented at Big Things Conference 2019 by Petar Zečević, CTO at SV Group. 20th November 2019, Kinépolis, Madrid. Do you want to know more? https://www.bigthingsconference.com/
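Cross-matching, the operation AXS accelerates, means pairing up objects from two catalogs that lie within a small angular distance of each other. A minimal sketch, with made-up coordinates, a flat-sky small-angle approximation and a brute-force scan; AXS does the same thing at scale in Spark with partitioning schemes that avoid comparing every pair.

```python
import math

# (id, right ascension, declination) in degrees; values are illustrative.
cat_a = [("a1", 10.0010, 20.0010), ("a2", 35.5000, -5.1000)]
cat_b = [("b1", 10.0012, 20.0012), ("b2", 200.0000, 45.0000)]

def cross_match(cat_a, cat_b, radius_deg=2 / 3600):  # 2 arcseconds
    """Return (id_a, id_b) pairs closer than radius_deg on the sky."""
    matches = []
    for id_a, ra_a, dec_a in cat_a:
        for id_b, ra_b, dec_b in cat_b:
            # Flat-sky approximation: scale the RA difference by cos(dec).
            # Fine for the tiny separations cross-matching cares about.
            d_ra = (ra_a - ra_b) * math.cos(math.radians(dec_a))
            d_dec = dec_a - dec_b
            if math.hypot(d_ra, d_dec) <= radius_deg:
                matches.append((id_a, id_b))
    return matches

pairs = cross_match(cat_a, cat_b)
# pairs == [("a1", "b1")]: only a1 and b1 are within 2 arcsec of each other
```

The brute-force double loop is O(n*m), which is exactly why a billion-object survey needs a Spark-based tool rather than this.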
In this talk, Theofilos Kakantousis presents TFX on Hopsworks, a fully open-source platform for running TFX pipelines on any cloud or on-premises. Hopsworks is a project-based, multi-tenant platform for both data-parallel programming and horizontally scalable machine-learning pipelines. Hopsworks supports Apache Flink as a runner for Beam jobs, and TFX pipelines are supported through Hopsworks' Airflow integration. We will demonstrate how to build an ML pipeline with TFX, Beam's Python API and the Flink runner using Jupyter notebooks, explain how security is transparently enabled with short-lived TLS certificates, and go through all the pipeline steps, from data validation to transformation, model training with TensorFlow, model analysis, and model serving and monitoring with Kubernetes. #BIGTH19 #BigData #DeepLearning Session presented at Big Things Conference 2019 by Theofilos Kakantousis, Data Engineer & COO at Logical Clocks. 21st November 2019, Kinépolis, Madrid. Do you want to know more? https://www.bigthingsconference.com/
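The pipeline stages listed above (validate, transform, train, evaluate) can be sketched as chained plain functions. This is a toy stand-in, not TFX: TFX provides dedicated components for each stage (ExampleValidator, Transform, Trainer, Evaluator) and runs them on Beam/Flink at scale, while the "model" below is a deliberately trivial threshold.

```python
def validate(rows):
    """Drop rows with missing values (stand-in for data validation)."""
    return [r for r in rows if None not in r]

def transform(rows):
    """Scale the single feature into [0, 1] (stand-in for feature engineering)."""
    xs = [x for x, _ in rows]
    lo, hi = min(xs), max(xs)
    return [((x - lo) / (hi - lo), y) for x, y in rows]

def train(rows):
    """'Train' a trivial model: predict 1 when the feature >= this threshold."""
    return min(x for x, y in rows if y == 1)

def evaluate(rows, threshold):
    """Accuracy of the threshold model on (feature, label) rows."""
    preds = [1 if x >= threshold else 0 for x, _ in rows]
    return sum(p == y for p, (_, y) in zip(preds, rows)) / len(rows)

data = [(0, 0), (2, 0), (None, 1), (8, 1), (10, 1)]
clean = validate(data)           # the (None, 1) row is dropped
scaled = transform(clean)        # features now in [0, 1]
model = train(scaled)
accuracy = evaluate(scaled, model)
```

In a real TFX pipeline each stage also emits metadata and artifacts, which is what lets the platform track lineage across runs.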
According to Wikipedia, an event-driven architecture is a software architecture pattern that promotes the production, detection, consumption of, and reaction to events. There is a perfect pairing between microservice-based architectures, Domain-Driven Design (DDD) and event-driven architectures. In this talk we will review the design principles that catalyze this symbiosis, as well as practical examples in different areas, including governance. Many business use cases can be articulated on top of these principles, abstracting them from both the complexity and the variability of the technological stack. As a good part of the audience will already be dealing with events and microservices, we will also explain other key concepts:
- Designing a future-proof event taxonomy.
- Strategies for event enrichment, starting with the definition of that concept.
- Managing correlation or inference of events.
- The benefits of an event schema registry, using for example Apache Avro.
- Traceability of events by design.
- Data reconciliation patterns, and when to avoid them.
We will also take the opportunity to discuss common challenges (and some less common ones), frequent mistakes, and how to avoid or mitigate them. To conclude, we will explain some use cases that we are solving superbly on top of real-time events: communications, order management, Business Activity Monitoring (BAM), KYC, GDPR...
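Several of the concepts above can be sketched together in a toy event envelope: a dotted event type as a position in a taxonomy, a correlation id for traceability across services, and a minimal type check standing in for a schema registry such as one backed by Apache Avro. Every field and name here is illustrative, not a standard.

```python
import uuid

# A stand-in for a registered schema: required fields and their types.
ORDER_SCHEMA = {"order_id": int, "amount": float}

def make_event(event_type, payload, schema, correlation_id=None):
    """Validate payload against the schema and wrap it in an envelope."""
    for field, ftype in schema.items():
        if not isinstance(payload.get(field), ftype):
            raise ValueError(f"field {field!r} must be {ftype.__name__}")
    return {
        "type": event_type,  # dotted name: position in the event taxonomy
        "correlation_id": correlation_id or str(uuid.uuid4()),
        "payload": payload,
    }

evt = make_event("commerce.order.created",
                 {"order_id": 42, "amount": 99.5},
                 ORDER_SCHEMA)
# A downstream service reuses the correlation id, so the whole business
# flow can be traced across events by design.
evt2 = make_event("commerce.order.confirmed",
                  {"order_id": 42, "amount": 99.5},
                  ORDER_SCHEMA,
                  correlation_id=evt["correlation_id"])
```

A real schema registry adds what this toy omits: versioning, compatibility checks between versions, and binary serialization.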
In this talk we discovered how to simplify data analysis in the cloud with Apache Kylin. Apache Kylin delivers game-changing, extreme augmented OLAP technology for analyzing data instantly at petabyte scale, and it has been adopted by thousands of organizations worldwide. Founded by the creators of Apache Kylin, Kyligence is on a mission to accelerate its customers' productivity by automating data management, discovery, interaction, and insight generation, all without barriers. Kyligence provides an AI-augmented data platform, powered by Apache Kylin, for analysts and data engineers to build and manage their data services from on-premises to multi-cloud. Session presented at Big Things Conference 2019 by Luke Han, Co-founder and CEO of Kyligence. 20th November 2019, Kinépolis, Madrid.
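The core trick behind OLAP engines like Apache Kylin is pre-aggregation: group-bys over every combination of dimensions are computed ahead of time into cuboids, so a query becomes a cheap lookup instead of a scan. A minimal sketch over toy data; Kylin builds such cubes at petabyte scale and serves them through SQL.

```python
from itertools import combinations
from collections import defaultdict

rows = [
    {"country": "ES", "year": 2019, "sales": 100},
    {"country": "ES", "year": 2020, "sales": 150},
    {"country": "FR", "year": 2019, "sales": 80},
]

def build_cube(rows, dims, measure):
    """Precompute SUM(measure) for every subset of dims (one cuboid each)."""
    cube = {}
    for r in range(len(dims) + 1):
        for combo in combinations(dims, r):
            agg = defaultdict(int)
            for row in rows:
                agg[tuple(row[d] for d in combo)] += row[measure]
            cube[combo] = dict(agg)
    return cube

cube = build_cube(rows, ["country", "year"], "sales")
# "Total sales for ES" is now a dictionary lookup, not a scan:
# cube[("country",)][("ES",)] == 250, and the grand total is cube[()][()].
```

The trade-off Kylin manages is exactly the one visible here: the cube has one cuboid per dimension subset, so storage grows with 2^n and engines prune or build cuboids selectively.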