apache

Apache programming resources
In this installment of the series in which we build an API, we have covered having to decide which stack to use, choosing between custom development, a CMS, or a framework, the advantages of using environment variables to store sensitive information, and finally the use of namespaces to simplify referencing our own dependencies and those of third-party libraries. If you haven't come across this series yet, here are the links...
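The environment-variable point is language-agnostic; as a minimal Python sketch of the idea (the variable names are hypothetical, not from the series):

```python
import os

# Read a secret from the environment instead of hard-coding it in the source.
# DB_PASSWORD is a hypothetical variable name; any sensitive setting works the same way.
db_password = os.environ["DB_PASSWORD"]

# Non-sensitive settings can fall back to a safe default.
db_host = os.environ.get("DB_HOST", "localhost")

print(f"Connecting to {db_host} (password read from the environment)")
```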
This talk examines business perspectives on the Ray Project from RISELab, hailed as a successor to Apache Spark. Ray is a simple-to-use open source library for Python and Java that provides multiple patterns for distributed systems: mix and match as needed for a given business use case, without tightly coupling applications to the underlying frameworks. Warning: this talk may change the way your organization approaches AI. #BIGTH20 #RayProject Session presented at Big Things Conference 2020 by Paco Nathan, Managing Partner at Derwen, 16th November 2020, Home Edition.
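Ray's core pattern is turning ordinary functions into distributed tasks. A minimal sketch (the function and inputs are illustrative, not from the talk):

```python
import ray

ray.init()  # start (or connect to) a local Ray runtime

@ray.remote
def square(x):
    # An ordinary Python function, now schedulable across the cluster.
    return x * x

# Launch four tasks in parallel; each call returns a future immediately.
futures = [square.remote(i) for i in range(4)]

# Block until all results are ready.
print(ray.get(futures))  # [0, 1, 4, 9]
```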
Machine Learning (ML) is separated into model training and model inference. ML frameworks typically use a data lake like HDFS or S3 to process historical data and train analytic models. Model inference and monitoring at production scale in real time is another common challenge when using a data lake. But it's possible to completely avoid such a data store by using an event streaming architecture. This talk compares the modern approach to traditional batch and big data alternatives and explains benefits like the simplified architecture, the ability to reprocess events in the same order for training different models, and the possibility of building a scalable, mission-critical ML architecture for real-time predictions with far fewer headaches and problems. The talk explains how this can be achieved leveraging Apache Kafka, Tiered Storage and TensorFlow. Session presented at Big Things Conference 2020 by Kai Waehner, Field CTO at Confluent, 18th November 2020, Home Edition. Do you want to know more? https://www.bigthingsconference.com/
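The consume-and-predict loop at the heart of this architecture can be sketched in a few lines. A minimal sketch with Confluent's Python client and a saved Keras model (broker address, topic name, model path, and message schema are all assumptions for illustration):

```python
import json

import tensorflow as tf
from confluent_kafka import Consumer

# Hypothetical broker, group, and topic names.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "model-inference",
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["events"])

model = tf.keras.models.load_model("model.keras")  # previously trained model

while True:
    msg = consumer.poll(timeout=1.0)
    if msg is None or msg.error():
        continue
    event = json.loads(msg.value())            # e.g. {"x": [0.1, 0.2, 0.3]}
    prediction = model.predict([event["x"]])   # real-time inference per event
    print(prediction)
```

Because Kafka retains the event log (and Tiered Storage extends that retention cheaply), the same consumer loop can replay history from offset zero to train or evaluate a different model, which is the reprocessing benefit the talk mentions.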
The vast majority of the content created daily on the Internet is unstructured, and roughly 90% of it is text. In the era of the collaborative web, we use language constantly, for example to write a product review, comment on a photo, or post a tweet. In this talk we will look at some of the tools the Python ecosystem offers to understand, structure, and extract value from text, and we will see how the approach to text-processing tasks has evolved in recent years toward the current trend based on Transfer Learning. We will do so through a concrete use case: detecting offensive comments or insults directed at other users on social networks and forums. Bio: Rafa Haro currently works as a Search Architect at Copyright Clearance Center. Over more than 14 years of experience in software development, he has worked mainly at companies focused on Natural Language Processing, Semantic Technologies, and Intelligent Search. He also participates actively in several Open Source communities such as the Apache Software Foundation, where he is a committer and PMC member of two projects: Apache Stanbol and Apache Manifold.
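The Transfer Learning approach the talk describes is commonly applied today via pretrained language models. A minimal sketch with the Hugging Face transformers library (the model name is an assumption for illustration, not necessarily the one used in the talk):

```python
from transformers import pipeline

# "unitary/toxic-bert" is a publicly available toxicity model, used here
# only as an example of reusing a pretrained model for this task.
classifier = pipeline("text-classification", model="unitary/toxic-bert")

comments = [
    "Thanks, that was really helpful!",
    "You are an idiot and should shut up.",
]

for comment, result in zip(comments, classifier(comments)):
    print(f"{result['label']} ({result['score']:.2f}): {comment}")
```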
In this session, we will demonstrate how common vulnerabilities in the Java and JavaScript ecosystems are exploited on a daily basis by live hacking real-world application libraries. All the examples used are commonly known exploits, some more famous than others, such as the Apache Struts and Spring Break remote code execution vulnerabilities. By exploiting them to show you how you can be attacked, and then showing you how to protect yourself, you will gain a better understanding of why and how a security focus and DevSecOps are essential for every developer. About: Brian Vermeer, Developer Advocate - Snyk. Developer Advocate for Snyk and Software Engineer with over 10 years of hands-on experience in creating and maintaining software. He is passionate about Java, (pure) functional programming, and cybersecurity. Brian is an Oracle Groundbreaker Ambassador and a regular international speaker at mostly Java-related conferences like JavaOne, Oracle Code One, Devoxx BE, Devoxx UK, JFokus, JavaZone and many more. Besides all that, Brian is a military reservist in the Royal Netherlands Air Force and a Taekwondo master and teacher.
Processing unbounded streams of data in a distributed system sounds like a challenge. Fortunately, there is a tool that can make your way easier. Łukasz will share his experience as a "Storm Trooper", a user of the Apache Storm framework, announced as the first streaming engine to break the 1-microsecond latency barrier. His story starts by describing the processing model: he'll tell you how to build a distributed application using spouts, bolts, and topologies. Then he'll move on to the components that make your apps work in a distributed way. That's the part where three guys, Nimbuses, Supervisors, and Zookeepers, join in and help build a cluster. As a result, he'll be able to show you a demo app running on Apache Storm. As you know, Storm Troopers are famous for missing their targets. Łukasz will sum up the talk by sharing the drawbacks and the ideas he missed when he first met this technology. After the presentation, you can start playing with stream processing or compare your current approach with the Apache Storm model. And who knows, maybe you'll become a Storm Trooper too.
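Storm's native API is Java, but the spout/bolt/topology model maps onto Python as well. A minimal sketch using the third-party streamparse library (class names and the sample sentence are illustrative, not from the talk):

```python
from collections import Counter

from streamparse import Bolt, Spout, Topology


class SentenceSpout(Spout):
    outputs = ["sentence"]

    def next_tuple(self):
        # A real spout would pull from a source such as Kafka or a queue.
        self.emit(["the quick brown fox"])


class WordCountBolt(Bolt):
    outputs = ["word", "count"]

    def initialize(self, conf, ctx):
        self.counts = Counter()

    def process(self, tup):
        # Each incoming tuple carries one sentence; count its words.
        for word in tup.values[0].split():
            self.counts[word] += 1
            self.emit([word, self.counts[word]])


class WordCount(Topology):
    # Wire the spout into the bolt; Storm distributes both across the cluster.
    sentence_spout = SentenceSpout.spec()
    count_bolt = WordCountBolt.spec(inputs=[sentence_spout])
```

Nimbus schedules this topology across Supervisor nodes, with Zookeeper coordinating the cluster state, which is exactly the three-component split the talk describes.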
Founded by the creators of Apache Kafka, Confluent has built an Event Streaming platform that lets companies easily access their data as Real-Time Streams. The Confluent Platform is "Apache Kafka on steroids": built on top of Apache Kafka, Confluent provides all the capabilities needed for a production-grade, mission-critical, and secure deployment. We propose a session introducing the world of event streaming: from its integration capabilities with third-party systems (AWS, Hadoop, Elastic, Mongo, Debezium, MQTT, JMS ...), through its processing capabilities with Kafka Streams and ksqlDB, to its management and deployment both on Kubernetes and as SaaS on Confluent Cloud.
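To make the event streaming idea concrete, here is a minimal producer sketch with Confluent's Python client (the broker address and topic name are assumptions for illustration):

```python
from confluent_kafka import Producer

# Hypothetical broker address and topic name.
producer = Producer({"bootstrap.servers": "localhost:9092"})

def on_delivery(err, msg):
    # Invoked from poll()/flush() with the broker's verdict for each message.
    if err is not None:
        print(f"Delivery failed: {err}")
    else:
        print(f"Delivered to {msg.topic()} [{msg.partition()}]")

for i in range(5):
    producer.produce("orders", key=str(i), value=f'{{"order_id": {i}}}',
                     callback=on_delivery)

producer.flush()  # wait for outstanding deliveries before exiting
```

Once events flow into a topic like this, Kafka Streams or ksqlDB can process them continuously, and connectors handle the integrations with the third-party systems listed above.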
These are the best podcasts/talks I've seen or listened to recently: The Beautiful Mess (John Cutler) [Agile, Company Culture, Management, Product] John explains how we must all embrace 'the beautiful mess' and learn to navigate change in order to be more successful. What Will The Next 10 Years Of Continuous Delivery Look Like? (Dave Farley, Jez Humble) [Agile, CD, Continuous Delivery, Devops, Microservices, Technical Practices, Technology Strategy] In the 10 years since the publication of the...
Apache Airflow is a workflow automation and scheduling system that can be used to author and manage data pipelines. Workflows are defined programmatically as directed acyclic graphs (DAGs) of tasks, written in Python. At Idealista we use it on a daily basis for data ingestion pipelines. We'll do a thorough review of managing dependencies, handling retries, alerting, etc., along with all the drawbacks.
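A minimal DAG sketch showing the retry and alerting knobs mentioned above (the DAG name, task, schedule, and email address are illustrative, not Idealista's actual pipeline):

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def ingest():
    print("pulling data from the source...")  # placeholder ingestion step


with DAG(
    dag_id="ingestion_example",  # hypothetical name
    start_date=datetime(2020, 11, 1),
    schedule_interval="@daily",
    catchup=False,
    default_args={
        "retries": 3,                          # handling retries
        "retry_delay": timedelta(minutes=5),
        "email_on_failure": True,              # basic alerting
        "email": ["data-team@example.com"],
    },
) as dag:
    ingest_task = PythonOperator(task_id="ingest", python_callable=ingest)
```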