Webinar Recording

Feel The Power - Julia for Energy

Date Published

Jul 23, 2020

Speakers

Dr. Matt Bauman

Summary

Dr. Matt Bauman, a long-time Julia Computing team member and expert in the Julia programming language, delivers an in-depth presentation on Julia’s capabilities and applications within the energy sector. He outlines Julia’s growth from its inception in 2012 to its current status as a robust programming language widely adopted in industry, especially for energy-related problems. The talk covers real-world case studies of Julia applied in energy grid optimization, financial modeling, and HVAC system design, demonstrating Julia’s expressivity, speed, and ease of use compared to legacy tools like MATLAB.

Bauman then explores Julia’s technical advantages, such as its highly performant and multi-threaded data processing capabilities, a vast package ecosystem (including JuMP for optimization, Flux for deep learning, and DifferentialEquations.jl for modeling complex systems), and its unique ability to combine scientific computing and machine learning seamlessly. He highlights Julia’s support for scientific machine learning, where differential equations and neural networks are integrated to create more interpretable and data-efficient models, a critical advancement for energy modeling and simulation.

Further, Bauman discusses Julia’s reproducibility features through its package manager and project environments, enabling consistent results across platforms, including HPC clusters and cloud infrastructures. He introduces JuliaTeam and JuliaRun, platforms designed to manage packages, documentation, and scalable deployment of Julia code in secure or cloud environments, facilitating enterprise-grade adoption.

The presentation concludes with a live demonstration of running distributed Monte Carlo simulations in Julia, showcasing Julia’s simplicity in expressing parallel computations and its ability to efficiently scale workloads. Bauman also answers audience questions about Julia’s compilation, HPC integration, GPU support, and resources for further learning, emphasizing Julia’s growing role in scientific and industrial domains.

Highlights

  • Julia combines speed and expressivity, enabling users to write fast, maintainable code in one language without switching to low-level interfaces.

  • Real-world energy use cases include grid optimization, financial simulations, and HVAC design, showing Julia’s versatility in energy sector challenges.

  • Scientific machine learning in Julia blends physics-based differential equations with neural networks, improving model interpretability and data efficiency.

  • Julia’s multi-threaded CSV parsing and data processing outperform Python and R by 10-20x in benchmarks, accelerating large-scale data workflows.

  • The JuliaTeam and JuliaRun platforms facilitate reproducible projects, private package management, and effortless scaling on cloud and HPC clusters.

  • GPU integration in Julia is seamless, supporting CUDA and accelerating deep learning and differential equation solvers without complex overhead.

  • JuliaHub and package naming conventions (.jl suffix) simplify discovery and use of Julia’s extensive open-source ecosystem.

Key Insights

Unified Code Base Boosts Productivity and Collaboration

Julia’s ability to replace traditional multi-language workflows with a single, high-performance language is transformative. For example, Invenia’s transition from MATLAB and C MEX files to pure Julia code eliminated complexity and enabled broader team collaboration across skill levels. This democratization of performance reduces development overhead and accelerates innovation in energy analytics and optimization.

Extensive Ecosystem Catalyzes Diverse Energy Applications

With over 3,500 packages and counting, Julia’s ecosystem covers optimization (JuMP), machine learning (Flux), data manipulation (DataFrames), and differential equations (DifferentialEquations.jl), all natively integrated. This breadth allows energy companies to address problems ranging from grid stability to financial risk modeling using a coherent toolset, reducing the friction and integration challenges typical of heterogeneous stacks.

Scientific Machine Learning Bridges HPC and AI

Julia’s support for embedding neural networks within differential equation models represents a paradigm shift. By combining physics-based modeling with data-driven approaches, energy researchers can build models that require less data, are more interpretable, and maintain physical fidelity. This hybrid approach is especially critical for complex systems where pure machine learning or pure simulation approaches fall short.

Superior Data Handling Accelerates Energy Analytics Pipelines

Benchmark comparisons reveal Julia’s CSV parsing and data processing outperform widely used languages by an order of magnitude or more, especially when multi-threaded. Given energy sector data sets often reach hundreds of gigabytes, this efficiency translates directly into faster insights, more interactive workflows, and reduced computational costs, all vital for real-time or near-real-time decision-making.

Reproducibility Ensures Trustworthy and Collaborative Research

Julia’s package manager automatically tracks dependencies down to exact versions and hashes, enabling perfect reproducibility across machines and platforms. This is crucial in energy research where regulatory compliance, auditability, and scientific rigor demand that results be replicable years later or by external reviewers. The upcoming data artifact management further enhances reproducibility by tracking input data alongside code.

Seamless Scaling from Laptop to Cloud and HPC

JuliaTeam and JuliaRun provide an integrated environment to develop, manage, and deploy Julia applications at scale. Their support for private packages, firewall-friendly operation, and flexible cluster configurations means energy companies can run massive parallel jobs on-premises or in the cloud with minimal hassle. The live demo of a distributed Monte Carlo simulation illustrates how simple it is to scale workloads from a few cores to many nodes.

Robust GPU Support Makes Julia Competitive in Deep Learning

With native CUDA support through CUDA.jl and integration with Flux, Julia enables GPU acceleration for both deep learning and differential equation solvers. This positions Julia as a serious contender against established GPU-accelerated frameworks like TensorFlow and PyTorch, especially since Julia’s design allows for more expressive and composable models with less boilerplate and greater transparency.

Conclusion

Matt Bauman’s presentation offers a comprehensive view of how Julia is uniquely positioned to address the computational challenges of the energy sector. From optimized grid management to financial simulations and HVAC system design, Julia’s speed, expressivity, and unified ecosystem provide tangible benefits. The language’s scientific machine learning capabilities represent a novel approach to modeling that blends domain knowledge with data-driven techniques, enabling better, faster, and more interpretable solutions. Coupled with Julia’s reproducibility features and scalable deployment platforms, this toolkit empowers energy companies to innovate with confidence, agility, and efficiency. As energy problems grow in complexity and scale, Julia’s blend of performance, usability, and flexibility makes it an increasingly compelling choice for researchers and industry practitioners alike.

Transcript

Introduction and session overview

00:00:00 - 00:01:41 Matt Bauman introduces himself, sharing his experience with Julia Computing and the Julia programming language since 2014. He explains that Julia is well-suited for various domains, including energy. During the presentation, he encourages participants to ask questions via chat, which he will address either during the talk if relevant or at the end in a dedicated Q&A session.

Julia in energy: case studies overview

00:01:15 - 00:03:36 The speaker introduces the session agenda, which includes discussing case studies of successful Julia applications, highlighting Julia's speed and expressiveness across the energy ecosystem, and explaining how Julia can be deployed at scale. The energy domain is described as vast, covering areas from materials discovery to grid optimization and predictive analytics, with Julia positioned as a capable tool for these challenges. Additionally, the speaker confirms that slide decks and recordings will be shared after the session.

00:03:01 - 00:05:31 Matt Bauman of Julia Computing explains that Julia Computing supports the open-source Julia language, which was publicly released in 2012 and reached version 1.0 about two and a half years ago. Julia Computing was founded to provide enterprise customers with accountability and service for using Julia. The company has grown to employ over 30 people, including key contributors to the Julia ecosystem, and offers various packages and support for deploying Julia solutions. The speaker then transitions to sharing case studies of Julia's use in the energy sector, starting with the Canadian company Invenia.

Case study: Invenia's grid optimization

00:04:52 - 00:07:05 The company optimizes the electrical grid to reduce emissions and cut costs using a wide range of machine learning and data science techniques implemented entirely in Julia. They have been strong advocates and contributors to Julia since its early days, helping develop the language and numerous open-source packages. Transitioning from MATLAB, they found Julia to be more expressive and significantly faster, eliminating the need for cumbersome MATLAB MEX files. Julia enables maintaining a single, efficient codebase accessible to both novice machine learning practitioners and expert optimizers without requiring multiple programming languages.

Case study: AOT Energy in finance

00:06:32 - 00:08:57 The segment discusses how Julia is successfully used by AOT Energy at the intersection of energy and finance for tasks like options pricing and Monte Carlo simulations. Julia is praised for combining ease of use and expressivity with high performance, making it accessible even to non-professional coders. One of Julia's core principles is that user-created objects have first-class status and performance equivalent to built-in features, with no hidden optimizations, promoting democratization of performance. Julia Computing's collaboration with ARPA-E aims to develop physics models with a focus on rapid development and efficiency.

Case study: ARPA-E HVAC modeling

00:08:20 - 00:10:38 The segment discusses a novel approach to engineering HVAC design using Julia, highlighting its ability to model complex interactions such as room dynamics, vents, ducts, and the non-linear effects of doors opening and closing. Julia's differential equations ecosystem enables robust and rapid simulation, facilitating quick iteration in HVAC engineering. This work, supported by a recent grant, aims to significantly reduce the costs of HVAC systems in large buildings like skyscrapers. Although still under development, the project is promising and exemplifies Julia's speed, expressiveness, and user-friendly IDE that lowers barriers for users transitioning from other programming languages.

JuliaPro IDE and package ecosystem

00:10:05 - 00:11:10 The JuliaPro IDE by Julia Computing offers a familiar and user-friendly development environment, featuring an editor pane, workspace, plots, console, an embedded debugger with breakpoints, and profiling tools to enhance the coding experience.

00:10:37 - 00:11:55 Julia benefits from a vast and growing package ecosystem with over 3,500 packages and 17 million downloads. Notably, the JuMP package supports operations research by enabling users to solve constrained optimization problems through a domain-specific language that interfaces with various solvers.

00:11:16 - 00:12:23 JuMP facilitates working with multiple solvers for optimization problems in a unified language, allowing easy switching between solvers like Gurobi. Flux is introduced as Julia's deep learning package, comparable to TensorFlow or PyTorch, providing the essential tools to build neural networks.
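
The recap above mentions JuMP's solver-agnostic modeling language. As a hedged illustration (a toy dispatch-style problem invented for this recap, not code from the talk), a JuMP model with the open-source GLPK solver looks like this; switching to Gurobi would change only the optimizer line:

```julia
using JuMP, GLPK

model = Model(GLPK.Optimizer)      # swap in Gurobi.Optimizer to change solvers
@variable(model, 0 <= g1 <= 10)    # output of generator 1 (MW)
@variable(model, 0 <= g2 <= 10)    # output of generator 2 (MW)
@constraint(model, g1 + g2 >= 12)  # meet 12 MW of demand
@objective(model, Min, 3g1 + 5g2)  # minimize fuel cost
optimize!(model)
value(g1), value(g2)               # => (10.0, 2.0)
```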

00:11:50 - 00:13:04 Flux supports differentiable programming by efficiently computing gradients crucial for machine learning optimization. Since Julia and its packages are written in Julia, Flux leverages this to offer robust and fast automatic differentiation capabilities.

00:12:26 - 00:13:38 Flux's advanced automatic differentiation techniques surpass those in other ecosystems, enabling more reliable and faster gradient calculations. Additionally, Julia's data science ecosystem is extensive, with strong support for table handling and differential equations, critical components of many data workflows.
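
As a small, hedged illustration of that automatic differentiation (not code shown in the talk), Flux differentiates ordinary Julia functions directly:

```julia
using Flux

f(x) = 3x^2 + 2x + 1
gradient(f, 2.0)   # => (14.0,): the derivative 6x + 2 evaluated at x = 2
```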

Data processing and CSV benchmarks

00:13:02 - 00:15:25 The segment discusses the importance of loading data efficiently in data science workflows, highlighting the continued dominance of CSV files as a standard format for interoperability. It compares the speed of loading CSVs in Python, R, and Julia, noting that Julia significantly outperforms the others by 10 to 20 times, which is especially impactful when working with large datasets of hundreds of gigabytes. This performance advantage accelerates data processing tasks, making it easier and faster to begin analysis.

00:14:51 - 00:17:09 This section explains that unlike Python and R, which rely on CSV parsers written in C, Julia's CSV parser is implemented entirely in Julia. This offers users the benefit of both high performance and full access to the package's source code. A key factor behind Julia's speed is its advanced multi-threading capabilities, allowing it to utilize multiple CPU threads efficiently for faster data loading and processing, unlike Python's pandas parser.
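
A minimal sketch of that multi-threaded loading, assuming a hypothetical file measurements.csv and a session started with several threads (for example, julia -t 8); the ntasks keyword controls how many parallel parsing tasks CSV.jl uses:

```julia
using CSV, DataFrames

# CSV.jl splits the file into chunks and parses them on parallel tasks.
df = CSV.read("measurements.csv", DataFrame; ntasks = 8)
first(df, 5)
```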

00:16:32 - 00:19:56 The final segment covers benchmarks comparing Julia's data processing performance with other languages across common tasks like group-bys and joins. Julia performs on par with best-in-class tools such as Spark and ClickHouse, especially after the initial compilation overhead. Julia excels at handling medium-sized datasets (around 50 gigabytes) on a single machine with ample RAM, eliminating the need for distributed computing in many cases. It efficiently processes data with modest memory overhead, making it suitable for use on laptops, cloud instances, or high-performance nodes.

Differential equations ecosystem

00:19:23 - 00:21:01 The differential equations ecosystem discussed is considered the most robust in any programming language, backed by a comprehensive and fair comparison conducted by Christopher Rackauckas, a key contributor to and user of multiple differential equation tools. This ecosystem can interface with nearly all other major libraries, as shown in a detailed, albeit complex, comparative slide where each column represents a different language or ecosystem.

00:21:02 - 00:22:13 The slide categorizes various features such as types of ODE solving methods, efficiency, and flexibility, showing the differential equations system's extensive support and adaptability. It integrates with other languages to ensure robust, validated, and fast solutions. This integration forms a foundation for scientific machine learning, which is especially relevant in energy research, where traditional high-performance computing models remain important.
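
For readers who want the entry point, canonical DifferentialEquations.jl usage is a short sketch like the following (exponential decay, not an example from the talk):

```julia
using DifferentialEquations

f(u, p, t) = -1.01u                     # du/dt = -1.01u
prob = ODEProblem(f, 1.0, (0.0, 10.0))  # u(0) = 1 on t in [0, 10]
sol = solve(prob, Tsit5())              # 5th-order explicit Runge-Kutta
sol(5.0)                                # interpolated value of u at t = 5
```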

Scientific machine learning in energy

00:21:40 - 00:22:48 The discussion begins with the use of high-performance computing (HPC) for simulations of physics models and molecular dynamics. It highlights that the fastest supercomputers, tracked by the Top500 organization, are mostly sponsored by state departments of energy in nuclear-capable countries. These supercomputers are primarily used to simulate nuclear explosions, which has historically driven the development and deployment of large HPC clusters.

00:22:14 - 00:23:28 HPC has traditionally been focused on simulations, but in the past two decades, there has been a surge in learned models using large neural networks that rely on data rather than fundamental physics. This marks a shift towards machine learning approaches that attempt to capture complex behaviors entirely from data, diverging from traditional simulation methods.

00:22:51 - 00:23:58 A new approach called scientific machine learning is emerging, blending traditional physics-based models with machine learning. This method uses fundamental knowledge augmented by neural networks to build more capable and robust models, combining the strengths of both simulation and learned data-driven techniques.

00:23:24 - 00:24:29 Scientific machine learning can embed neural networks inside differential equations. Using the Julia language and its neural network library Flux, a neural network can be treated as a function with tunable parameters integrated into a differential equation. This allows modeling parts of a system where the physics are unknown by using the neural network to represent those unknown interactions.

00:23:57 - 00:24:58 By combining neural networks with differential equations, models can be smaller and require less data, while remaining interpretable. The known dynamics are expressed through differential equations, while the neural network models the unknown components. This hybrid approach improves model interpretability and data efficiency.
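
A hedged sketch of that hybrid structure: the known decay term is written explicitly and a small Flux network stands in for the unknown dynamics. The decay rate, network sizes, and function names are invented for illustration, and the training loop (typically via DiffEqFlux.jl or SciMLSensitivity.jl) is omitted:

```julia
using DifferentialEquations, Flux

nn = Chain(Dense(1 => 8, tanh), Dense(8 => 1))  # models the unknown physics
p, re = Flux.destructure(nn)                    # flat parameters + rebuilder

function rhs(u, p, t)
    known = -0.5u               # the physics we trust, written explicitly
    learned = re(p)([u])[1]     # neural correction for what we don't know
    return known + learned
end

prob = ODEProblem(rhs, 1.0, (0.0, 5.0), p)
sol = solve(prob, Tsit5())      # training would tune p against observed data
```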

00:24:28 - 00:25:28 Once the neural network is trained on existing data, it can be analyzed to understand how it transforms inputs, offering greater interpretability compared to very deep networks. This approach also reduces the amount of data needed for training, making it a more practical and insightful modeling technique.

00:24:58 - 00:25:56 Another application involves identifying coefficients within neural networks for equations like the shallow water or Burgers equations. Starting with random coefficients, the network can be trained to fit data and discover the correct coefficients, enabling the extraction of underlying physical parameters directly from data.

Neural networks with differential equations

00:25:27 - 00:27:13 The speaker demonstrates a trebuchet model represented by a set of differential equations, highlighting its fun and educational aspects. They discuss fitting the model to data, mentioning that the process took under 20 minutes with a small dataset, although they overfitted by training longer than necessary. The overall approach is tractable and easily performed on a laptop without high-performance computing resources.

00:27:11 - 00:28:46 The model training process includes helpful features like progress bars and time estimates, with an example estimating 23 minutes to complete. The speaker explains that while solving the differential equations for the trebuchet is complex and slow, training a neural network to approximate these equations can enable rapid predictions. This allows quick optimization of parameters such as counterweight, launch speed, and aiming angle to hit a target under varying conditions.

00:28:14 - 00:29:54 This neural network approximation approach extends beyond trebuchets to practical applications like power generation and turbine operation, where physics are well understood but computationally expensive. The data for training the network is generated by the ODE solver itself on demand. Because the system is differentiable, gradients can be computed efficiently, enabling fast and accurate parameter estimation based on physical models.
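
A hedged sketch of that surrogate pattern, with all names and sizes invented: the ODE solver manufactures (parameter, outcome) training pairs, and a small network learns the mapping so later predictions skip the expensive solve. The training API shown follows recent Flux versions:

```julia
using DifferentialEquations, Flux

decay(u, p, t) = -p[1] * u
outcome(k) = solve(ODEProblem(decay, 1.0, (0.0, 1.0), [k]), Tsit5())(1.0)

ks = rand(Float32, 1, 256) .* 2f0                         # random decay rates
ys = reshape(Float32[outcome(k) for k in vec(ks)], 1, :)  # labels from the solver

surrogate = Chain(Dense(1 => 16, tanh), Dense(16 => 1))
opt = Flux.setup(Adam(0.01), surrogate)
for _ in 1:500   # fit the surrogate to the solver's outputs
    grads = Flux.gradient(m -> Flux.mse(m(ks), ys), surrogate)
    Flux.update!(opt, surrogate, grads[1])
end
surrogate([0.7f0])   # fast approximate prediction, no ODE solve needed
```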

00:29:20 - 00:30:32 The neural network provides estimates for parameters such as counterweight to meet specific targets, which can then be validated by plugging them back into the differential equation model. The speaker concludes by mentioning an upcoming demonstration of Monte Carlo simulation, emphasizing its ease and speed within this framework.

Monte Carlo simulation example

00:29:55 - 00:33:17 The speaker explains a Monte Carlo simulation that estimates pi by throwing darts at a dartboard and comparing how many land inside a circle versus a square. This simple problem exemplifies embarrassingly parallel computations, allowing distribution of random number generation and calculations across many computers. The Julia code shown comprises six lines that define a function using a distributed for loop with a special macro to parallelize the task. Each iteration generates two random numbers to simulate dart positions, checks if they land inside the circle, and then uses a reduction operation to sum results efficiently. This concise example highlights Julia's capability for expressing distributed simulations easily, supporting complex computations, and promoting reproducible scientific analysis.
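
The talk's exact six lines are not reproduced in this recap, but a faithful sketch of the pattern described (a distributed for loop reduced with +) looks like:

```julia
using Distributed
addprocs(4)   # local workers; JuliaRun provisions these in the cloud instead

function estimate_pi(n)
    hits = @distributed (+) for i in 1:n
        x, y = rand(), rand()   # one dart in the unit square
        Int(x^2 + y^2 <= 1)     # 1 if it lands inside the quarter circle
    end
    return 4hits / n
end

estimate_pi(1_000_000_000)      # ≈ 3.14159
```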

Reproducibility and package management

00:32:44 - 00:33:55 Julia's package ecosystem automatically tracks all dependencies used within a project, ensuring project-level reproducibility. It maintains two key configuration files, a project file and a manifest file, which help track dependencies and their changes. Users typically don't need to edit these files manually, but they allow for version control and consistent environment setup.

00:33:19 - 00:34:16 The project file records top-level dependencies and ensures that only those packages are used within a project. Julia actively manages this to prevent usage of packages outside the project scope. Meanwhile, the manifest file is automatically generated and tracks all dependencies, including indirect ones, with precise versioning and content hashes to guarantee reproducibility.

00:33:48 - 00:35:04 Julia's manifest file details every dependency down to exact versions and hashes, including second, third, and further order dependencies. This comprehensive tracking allows users to always restore the exact environment needed, making package reproducibility reliable and robust. Users do not need to understand the internal workings of the manifest file.
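
In practice, recreating such an environment on another machine takes three standard Pkg commands (ordinary Julia usage, not something specific to the talk):

```julia
using Pkg
Pkg.activate(".")    # use the Project.toml in the current directory
Pkg.instantiate()    # install the exact versions pinned by Manifest.toml
Pkg.status()         # confirm the top-level dependencies
```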

00:34:26 - 00:35:30 Julia enables cross-platform reproducibility by allowing users to share their project and manifest files across different operating systems, such as macOS, Linux, and Windows. This ensures all dependencies are set up identically on any machine, facilitating collaboration and consistent results regardless of the environment.

00:34:59 - 00:36:02 Julia guarantees numerical stability and reproducibility down to the last bit across platforms and operating systems, which is a rare and challenging feat for programming languages. Additionally, Julia is developing new features like data artifacts to further enhance reproducibility and data management.

Julia deployment at scale with JuliaRun

00:35:34 - 00:36:39 The system tracks input data to ensure full reproducibility not only of source code but also of the data used in analyses. This cutting-edge technique allows specifying data reproducibly so that when computations run on the cloud, both the code and input data are precisely known and output data can be tagged accordingly.

00:36:07 - 00:37:29 These reproducibility features enable deploying Julia at scale on HPC clusters, whether on-premises or in the cloud. Powerful hardware is readily available from major cloud providers like Google, Microsoft, and AWS. JuliaTeam facilitates easy and quick deployment on these platforms.

00:36:48 - 00:38:00 JuliaTeam provides a single source of truth for all packages, supporting collaboration on package development. It mixes private development with the public ecosystem, allowing creation and deployment of private packages at scale on the cloud. The platform offers conveniences like unified package documentation search.

00:37:23 - 00:38:20 The documentation search covers private packages within an enterprise as well as the public ecosystem. Users can maintain private forks and, crucially, JuliaTeam works seamlessly in secure, firewalled environments, enabling package management even in air-gapped networks.

00:37:52 - 00:39:11 JuliaTeam overcomes firewall restrictions that typically block package installation in secure environments. It builds on the platform to track code location and enables effortless deployment configured for on-premises HPC clusters or direct cloud deployment, including spinning up cloud machines on demand.

00:38:36 - 00:39:57 JuliaTeam allows distribution of code and data to cloud machines for seamless computation. The interface provides access to all packages and their documentation, facilitating easy discovery and use of packages like DifferentialEquations.jl within the unified platform.

JuliaTeam: docs and private packages

00:39:22 - 00:41:38 The speaker demonstrates how to use documentation search within a firewalled environment, allowing access to both public and private packages locally. They highlight searching for a specific package called 'estimate pi' and reveal a private 'secret analysis' package with locally hosted, well-rendered documentation including LaTeX equations and plots. This example illustrates how documentation can be thoroughly integrated and displayed. The segment concludes by mentioning deployment of such applications in the cloud using Julia's application framework.

Cloud deployment and cluster configuration

00:41:05 - 00:42:46 The speaker demonstrates how to run a Julia job in the cloud by clicking 'run in cloud,' which automatically sets up the necessary cloud resources along with code, data, and artifacts for reproducibility. They explain the presence of project and manifest files to ensure consistent environments and mention tracking input data, although the current Monte Carlo simulation example does not require any. The user can adjust parameters such as the number of iterations to increase accuracy, and select either the release or development branches for running the job.

00:42:11 - 00:43:57 The interface allows detailed cluster configuration, including choosing the number of threads or cores per node, and estimating the cost of the job before running it. The user opts to run the job on five nodes with eight cores each, totaling 40 workers. After launching, the job status appears in a results table, showing inputs, outputs, and the running status, enabling easy monitoring of the cloud job's progress.

Job management and parallel performance

00:43:20 - 00:44:59 The speaker demonstrates a job running live, showing inputs such as the number of workers, threads, memory per worker, and iterations. They highlight previous runs with different worker configurations and explain how the task was efficiently parallelized to estimate pi with one billion iterations in under a second. Once the job completed, the workers were terminated. The demo concludes with thanks and an invitation for questions, noting the presence of the CEO of Julia Computing on the call.

Q&A: Julia applications and resources

00:44:29 - 00:46:59 The speaker addresses questions about Julia applications in economics and recommends visiting juliacomputing.com, especially the resources and webinars sections, for relevant content including a recent webinar on Julia for finance. They highlight available case studies and an energy ecosystem page as useful resources. The speaker acknowledges a question about using JuMP for energy dispatch optimization, noting it as a good use case but without an immediate example at hand, and suggests exploring it further.

Q&A: Compilation and distributed workloads

00:46:39 - 00:48:21 The discussion addresses how Julia handles compilation in distributed workloads. Julia compiles specialized code when a function is run for the first time, which can cause initial delays such as in generating the first plot. Recent improvements in Julia 1.5 significantly reduce this startup time from about 30 seconds to 10 seconds, with subsequent operations becoming nearly instantaneous.

00:47:47 - 00:49:44 In distributed computing, Julia compiles specialized functions on each worker by default, which can introduce overhead. To mitigate this, Julia supports precompilation, and JuliaRun automates distributing precompiled packages to workers to reduce delays. Additionally, Julia supports custom transports for HPC systems through cluster managers, enabling integration with various distributed compute environments.

Q&A: HPC cluster support and transports

00:49:07 - 00:50:52 The discussion focuses on specifying the cluster manager, queue system, and transport layer used between workers, highlighting the importance of both transport and queuing systems. MPI is emphasized as a key, well-supported component, heavily utilized in CliMA, the next-generation climate simulation project. The segment also touches on the challenges of HPC compute clusters, each with unique characteristics, and notes that JuliaRun primarily targets a cloud compute model while also supporting specialized interconnects for certain clusters. Viewers are encouraged to reach out with questions about these setups.
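
As one hedged example of the cluster-manager mechanism mentioned here, ClusterManagers.jl can launch workers through a Slurm queue; the partition name and time limit below are placeholders for a real site's configuration:

```julia
using Distributed, ClusterManagers

# Request 40 workers through Slurm; extra keywords are forwarded to srun.
addprocs_slurm(40; partition = "compute", time = "00:30:00")
nworkers()   # => 40 once the queue grants the allocation
```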

Q&A: Finding Julia users in organization

00:50:29 - 00:52:17 The discussion addresses how organizations using Julia can connect internally through Julia Computing, which can facilitate introductions if permission is granted, despite challenges like corporate secrecy. Additionally, a tip is shared for effectively searching Julia-related information online, such as focusing on package names ending with .jl on GitHub to find relevant resources rather than unrelated results.

Q&A: Searching Julia resources effectively

00:51:42 - 00:53:36 The speaker explains a useful convention in Julia package naming: packages typically carry a '.jl' suffix. This helps when searching for new techniques or tools, as adding '.jl' to a search term yields more relevant results. For example, searching 'mpi jl' instead of 'mpi julia' provides better matches. Both 'jl' and 'julia' work as search keywords for finding Julia code, especially for compute-related tools.

00:52:58 - 00:54:03 The speaker introduces juliahub.com as a public platform for searching the Julia ecosystem. The site covers the entire public Julia package ecosystem and allows searching documentation and packages. However, it currently lacks features for private or corporate packages, compute model integration, and behind-firewall operation, making it best suited to individual users browsing public resources.

Q&A: Julia packages for power systems

00:53:30 - 00:55:18 The speaker explains how to use PowerSystems.jl by accessing its documentation via Julia packages to understand its purpose in power systems modeling and analysis. They highlight effective search techniques for Julia packages, recommending JuliaHub and Google as primary tools. Additionally, they mention the intention to add PowerSystems.jl to an energy-related resource page. The segment ends with a transition towards a question about deep learning.

Q&A: Deep learning and GPU support

00:54:55 - 00:56:07 The speaker introduces Julia's GPU implementation for deep learning, highlighting its robust support for CUDA processing. The programming model is seamless, allowing users to specify arrays on the GPU easily.

00:56:08 - 00:57:14 All computations on GPU arrays, including broadcasting operations, are executed on the GPU, simplifying GPU computing. Various Julia packages natively support GPU arrays, enabling machine learning training with Flux and solving differential equations on the GPU, though not all differential equation solvers benefit equally from GPU acceleration.
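
A minimal sketch of that array-based GPU model using CUDA.jl (requires an NVIDIA GPU; not code from the talk):

```julia
using CUDA

x = cu(rand(Float32, 10_000))   # copy an array into GPU memory
y = 2 .* x .+ 1                 # broadcasting compiles to a GPU kernel
sum(y)                          # reductions also run on the device
```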

00:56:42 - 00:57:46 Julia's GPU capabilities allow it to compete with major frameworks like PyTorch and TensorFlow, which have much larger resources. Julia offers more expressive and easier-to-understand solutions. The session concludes with thanks to the audience as it reaches the top of the hour.

Session closing and final remarks

00:57:14 - 00:57:53 The speaker thanks the audience for their great questions and apologizes if any were missed. They encourage attendees to reach out to them or anyone at Julia Computing for further inquiries and express gratitude for their attendance.

Transcript

Introduction and session overview

00:00:00 - 00:01:41 Matt Bauman introduces himself, sharing his experience with Julia Computing and the Julia programming language since 2014. He explains that Julia is well-suited for various domains, including energy. During the presentation, he encourages participants to ask questions via chat, which he will address either during the talk if relevant or at the end in a dedicated Q&A session.

Julia in energy: case studies overview

00:01:15 - 00:03:36 The speaker introduces the session agenda, which includes discussing case studies of successful Julia applications, highlighting Julia's speed and expressiveness across the energy ecosystem, and explaining how Julia can be deployed at scale. The energy domain is described as vast, covering areas from materials discovery to grid optimization and predictive analytics, with Julia positioned as a capable tool for these challenges. Additionally, the speaker confirms that slide decks and recordings will be shared after the session.

00:03:01 - 00:05:31 Matt Bauman from Julia Computing, explains that Julia Computing supports the open-source Julia language, which was publicly released in 2012 and reached version 1.0 about two and a half years ago. Julia Computing was founded to provide enterprise customers with accountability and service for using Julia. The company has grown to employ over 30 people, including key contributors to the Julia ecosystem, and offers various packages and support for deploying Julia solutions. The speaker then transitions to sharing case studies of Julia's use in the energy sector, starting with the Canadian company Venia.

Case study: Venia's grid optimization

00:04:52 - 00:07:05 The company optimizes the electrical grid to reduce emissions and cut costs using a wide range of machine learning and data science techniques implemented entirely in Julia. They have been strong advocates and contributors to Julia since its early days, helping develop the language and numerous open-source packages. Transitioning from MATLAB, they found Julia to be more expressive and significantly faster, eliminating the need for cumbersome MATLAB mex files. Julia enables maintaining a single, efficient codebase accessible to both novice machine learning practitioners and expert optimizers without requiring multiple programming languages.

Case study: AOT Energy in finance

00:06:32 - 00:08:57 The segment discusses how Julia is successfully used by AOT Energy at the intersection of energy and finance for tasks like options pricing and Monte Carlo simulations. Julia is praised for combining ease of use and expressivity with high performance, making it accessible even to non-professional coders. One of Julia's core principles is that user-created objects have first-class status and performance equivalent to built-in features, with no hidden optimizations, promoting democratization of performance. Julia Computing's collaboration with ARPA-E aims to develop physics models with a focus on rapid development and efficiency.

Case study: ARPA-E HVAC modeling

00:08:20 - 00:10:38 The segment discusses a novel approach to engineering HVAC design using Julia, highlighting its ability to model complex interactions such as room dynamics, vents, ducts, and the non-linear effects of doors opening and closing. Julia's differential equations ecosystem enables robust and rapid simulation, facilitating quick iteration in HVAC engineering. This work, supported by a recent grant, aims to significantly reduce the costs of HVAC systems in large buildings like skyscrapers. Although still under development, the project is promising and exemplifies Julia's speed, expressiveness, and user-friendly IDE that lowers barriers for users transitioning from other programming languages.

Julia Pro IDE and package ecosystem

00:10:05 - 00:11:10 The Julia Pro IDE by Julia Computing offers a familiar and user-friendly development environment, featuring an editor pane, workspace, plots, console, an embedded debugger with breakpoints, and profiling tools to enhance the coding experience.

00:10:37 - 00:11:55 Julia benefits from a vast and growing package ecosystem with over 3,500 packages and 17 million downloads. Notably, the Jump package supports operations research by enabling users to solve constrained optimization problems through a domain-specific language interfacing with various solvers.

00:11:16 - 00:12:23 Jump facilitates working with multiple solvers for optimization problems in a unified language, allowing easy switching between solvers like Gurobi. Flux is introduced as Julia's deep learning package, comparable to TensorFlow or Torch, providing essential tools to build neural networks.

00:11:50 - 00:13:04 Flux supports differentiable programming by efficiently computing gradients crucial for machine learning optimization. Since Julia and its packages are written in Julia, Flux leverages this to offer robust and fast automatic differentiation capabilities.

00:12:26 - 00:13:38 Flux's advanced automatic differentiation techniques surpass those in other ecosystems, enabling more reliable and faster gradient calculations. Additionally, Julia's data science ecosystem is extensive, with strong support for table handling and differential equations, critical components of many data workflows.

Data processing and CSV benchmarks

00:13:02 - 00:15:25 The segment discusses the importance of loading data efficiently in data science workflows, highlighting the continued dominance of CSV files as a standard format for interoperability. It compares the speed of loading CSVs in Python, R, and Julia, noting that Julia significantly outperforms the others by 10 to 20 times, which is especially impactful when working with large datasets of hundreds of gigabytes. This performance advantage accelerates data processing tasks, making it easier and faster to begin analysis.

00:14:51 - 00:17:09 This section explains that unlike Python and R, which rely on CSV parsers written in C, Julia's CSV parser is implemented entirely in Julia. This offers users the benefit of both high performance and full access to the package's source code. A key factor behind Julia's speed is its advanced multi-threading capabilities, allowing it to utilize multiple CPU threads efficiently for faster data loading and processing, unlike Python's pandas parser.

00:16:32 - 00:19:56 The final segment covers benchmarks comparing Julia's data processing performance with other languages across common tasks like group-bys and joins. Julia performs on par with best-in-class tools such as Spark and ClickHouse, especially after the initial compilation overhead. Julia excels at handling medium-sized datasets (around 50 gigabytes) on a single machine with ample RAM, eliminating the need for distributed computing in many cases. It efficiently processes data with modest memory overhead, making it suitable for use on laptops, cloud instances, or high-performance nodes.

Differential equations ecosystem

00:19:23 - 00:21:01 The differential equations ecosystem discussed is considered the most robust across any programming language, backed by a comprehensive and fair comparison conducted by Christopher Caucus, a key contributor and user of multiple differential equation tools. This ecosystem can interface with nearly all other major libraries, as shown in a detailed, albeit complex, comparative slide where each column represents a different language or ecosystem.

00:21:02 - 00:22:13 The slide categorizes various features such as types of ODE solving methods, efficiency, and flexibility, showing the differential equations system's extensive support and adaptability. It integrates with other languages to ensure robust, validated, and fast solutions. This integration forms a foundation for scientific machine learning, which is especially relevant in energy research, where traditional high-performance computing models remain important.

Scientific machine learning in energy

00:21:40 - 00:22:48 The discussion begins with the use of high-performance computing (HPC) for simulations of physics models and molecular dynamics. It highlights that the fastest supercomputers, tracked by the Top500 organization, are mostly sponsored by state departments of energy in nuclear-capable countries. These supercomputers are primarily used to simulate nuclear explosions, which has historically driven the development and deployment of large HPC clusters.

00:22:14 - 00:23:28 HPC has traditionally been focused on simulations, but in the past two decades, there has been a surge in learned models using large neural networks that rely on data rather than fundamental physics. This marks a shift towards machine learning approaches that attempt to capture complex behaviors entirely from data, diverging from traditional simulation methods.

00:22:51 - 00:23:58 A new approach called scientific machine learning is emerging, blending traditional physics-based models with machine learning. This method uses fundamental knowledge augmented by neural networks to build more capable and robust models, combining the strengths of both simulation and learned data-driven techniques.

00:23:24 - 00:24:29 Scientific machine learning can embed neural networks inside differential equations. Using the Julia language and its neural network library Flux, a neural network can be treated as a function with tunable parameters integrated into a differential equation. This allows modeling parts of a system where the physics are unknown by using the neural network to represent those unknown interactions.

00:23:57 - 00:24:58 By combining neural networks with differential equations, models can be smaller and require less data, while remaining interpretable. The known dynamics are expressed through differential equations, while the neural network models the unknown components. This hybrid approach improves model interpretability and data efficiency.

00:24:28 - 00:25:28 Once the neural network is trained on existing data, it can be analyzed to understand how it transforms inputs, offering greater interpretability compared to very deep networks. This approach also reduces the amount of data needed for training, making it a more practical and insightful modeling technique.

00:24:58 - 00:25:56 Another application involves identifying coefficients within neural networks for equations like the shallow water or Burgers equations. Starting with random coefficients, the network can be trained to fit data and discover the correct coefficients, enabling the extraction of underlying physical parameters directly from data.

Neural networks with differential equations

00:25:27 - 00:27:13 The speaker demonstrates a trebuchet model represented by a set of differential equations, highlighting its fun and educational aspects. They discuss fitting data to the model, mentioning that the process took under 20 minutes with a small dataset, although they overfitted by training longer than necessary. The overall approach is tractable and easily performed on a laptop without high-performance computing resources.

00:27:11 - 00:28:46 The model training process includes helpful features like progress bars and time estimates, with an example estimating 23 minutes to complete. The speaker explains that while solving the differential equations for the trebuchet is complex and slow, training a neural network to approximate these equations can enable rapid predictions. This allows quick optimization of parameters such as counterweight, launch speed, and aiming angle to hit a target under varying conditions.

00:28:14 - 00:29:54 This neural network approximation approach extends beyond trebuchets to practical applications like power generation and turbine operation, where physics are well understood but computationally expensive. The data for training the network is generated by the ODE solver itself on demand. Because the system is differentiable, gradients can be computed efficiently, enabling fast and accurate parameter estimation based on physical models.

00:29:20 - 00:30:32 The neural network provides estimates for parameters such as counterweight to meet specific targets, which can then be validated by plugging them back into the differential equation model. The speaker concludes by mentioning an upcoming demonstration of Monte Carlo simulation, emphasizing its ease and speed within this framework.

Monte Carlo simulation example

00:29:55 - 00:33:17 The speaker explains a Monte Carlo simulation that estimates pi by throwing darts at a dartboard and comparing how many land inside a circle versus a square. This simple problem exemplifies embarrassingly parallel computations, allowing distribution of random number generation and calculations across many computers. The Julia code shown comprises six lines that define a function using a distributed for loop with a special macro to parallelize the task. Each iteration generates two random numbers to simulate dart positions, checks if they land inside the circle, and then uses a reduction operation to sum results efficiently. This concise example highlights Julia's capability for expressing distributed simulations easily, supporting complex computations, and promoting reproducible scientific analysis.

Reproducibility and package management

00:32:44 - 00:33:55 Julia's package ecosystem automatically tracks all dependencies used within a project, ensuring project-level reproducibility. It maintains two key configuration files, a project file and a manifest file, which help track dependencies and their changes. Users typically don't need to edit these files manually, but they allow for version control and consistent environment setup.

00:33:19 - 00:34:16 The project file records top-level dependencies and ensures that only those packages are used within a project. Julia actively manages this to prevent usage of packages outside the project scope. Meanwhile, the manifest file is automatically generated and tracks all dependencies, including indirect ones, with precise versioning and content hashes to guarantee reproducibility.

00:33:48 - 00:35:04 Julia's manifest file details every dependency down to exact versions and hashes, including second, third, and further order dependencies. This comprehensive tracking allows users to always restore the exact environment needed, making package reproducibility reliable and robust. Users do not need to understand the internal workings of the manifest file.

00:34:26 - 00:35:30 Julia enables cross-platform reproducibility by allowing users to share their project and manifest files across different operating systems, such as macOS, Linux, and Windows. This ensures all dependencies are set up identically on any machine, facilitating collaboration and consistent results regardless of the environment.

00:34:59 - 00:36:02 Julia guarantees numerical stability and reproducibility down to the last bit across platforms and operating systems, which is a rare and challenging feat for programming languages. Additionally, Julia is developing new features like data artifacts to further enhance reproducibility and data management.

Julia deployment at scale with JuliaRun

00:35:34 - 00:36:39 The system tracks input data to ensure pure reproducibility not only of source code but also of the data used in analyses. This cutting-edge technique allows specifying data reproducibly so that when computations run on the cloud, both the code and input data are precisely known and output data can be tagged accordingly.

00:36:07 - 00:37:29 These reproducibility features enable deploying Julia at scale on HPC clusters, whether on-premises or in the cloud. Consumer-level hardware from major cloud providers like Google, Microsoft, and AWS offers powerful resources. Julia Team facilitates easy and quick deployment on these platforms.

00:36:48 - 00:38:00 Julia Team provides a single source of truth for all packages, supporting collaboration on package development. It mixes private development with the public ecosystem, allowing creation and deployment of private packages at scale on the cloud. The platform offers conveniences like unified package documentation search.

00:37:23 - 00:38:20 The documentation search covers private packages within an enterprise as well as the public ecosystem. Users can maintain private forks and, crucially, Julia Team works seamlessly in secure, firewalled environments, enabling package management even in air-gapped networks.

00:37:52 - 00:39:11 Julia Team overcomes firewall restrictions that typically block package installation in secure environments. It builds on the platform to track code location and enables effortless deployment configured for on-premises HPC clusters or direct cloud deployment, including spinning up cloud machines on demand.00:38:36 - 00:39:57Julia Team allows distribution of code and data to cloud machines for seamless computation. The interface provides access to all packages and their documentation, facilitating easy discovery and use of packages like differential equations within the unified platform.

Julia Team: docs and private packages

00:39:22 - 00:41:38 The speaker demonstrates how to use documentation search within a firewalled environment, allowing access to both public and private packages locally. They highlight searching for a specific package called 'estimate pi' and reveal a private 'secret analysis' package with locally hosted, well-rendered documentation including LaTeX equations and plots. This example illustrates how documentation can be thoroughly integrated and displayed. The segment concludes by mentioning deployment of such applications in the cloud using Julia's application framework.

Cloud deployment and cluster configuration

00:41:05 - 00:42:46 The speaker demonstrates how to run a Julia job in the cloud by clicking 'run in cloud,' which automatically sets up the necessary cloud resources along with code, data, and artifacts for reproducibility. They explain the presence of project and manifest files to ensure consistent environments and mention tracking input data, although the current Monte Carlo simulation example does not require any. The user can adjust parameters such as the number of iterations to increase accuracy, and select either the release or development branches for running the job.

00:42:11 - 00:43:57 The interface allows detailed cluster configuration, including choosing the number of threads or cores per node, and estimating the cost of the job before running it. The user opts to run the job on five nodes with eight cores each, totaling 40 workers. After launching, the job status appears in a results table, showing inputs, outputs, and the running status, enabling easy monitoring of the cloud job's progress.

Job management and parallel performance

00:43:20 - 00:44:59 The speaker demonstrates a job running live, showing inputs such as the number of workers, threads, memory per worker, and iterations. They highlight previous runs with different worker configurations and explain how the task was efficiently parallelized to estimate pi with one billion iterations in under a second. Once the job completed, the workers were terminated. The demo concludes with thanks and an invitation for questions, noting the presence of the CEO of Julia Computing on the call.

Q&A: Julia applications and resources

00:44:29 - 00:46:59 The speaker addresses questions about Julia applications in economics and recommends visiting juliacomputing.com, especially the resources and webinars sections, for relevant content including a recent webinar on Julia for finance. They highlight available case studies and an energy ecosystem page as useful resources. The speaker acknowledges a question about using Jump for energy dispatch optimization, noting it as a good use case but without an immediate example, and suggests exploring it further.

Q&A: Compilation and distributed workloads

00:46:39 - 00:48:21 The discussion addresses how Julia handles compilation in distributed workloads. Julia compiles specialized code when a function is run for the first time, which can cause initial delays such as in generating the first plot. Recent improvements in Julia 1.5 significantly reduce this startup time from about 30 seconds to 10 seconds, with subsequent operations becoming nearly instantaneous.

00:47:47 - 00:49:44 In distributed computing, Julia compiles specialized functions on each worker by default, which can introduce overhead. To mitigate this, Julia supports pre-compilation, and Julia Run automates distributing precompiled packages to workers to reduce delays. Additionally, Julia supports custom transports for HPC systems through cluster managers, enabling integration with various distributed compute environments.

Q&A: HPC cluster support and transports

Transcript

Introduction and session overview

00:00:00 - 00:01:41 Matt Bauman introduces himself, sharing his experience with Julia Computing and the Julia programming language since 2014. He explains that Julia is well-suited for various domains, including energy. During the presentation, he encourages participants to ask questions via chat, which he will address either during the talk if relevant or at the end in a dedicated Q&A session.

Julia in energy: case studies overview

00:01:15 - 00:03:36 The speaker introduces the session agenda, which includes discussing case studies of successful Julia applications, highlighting Julia's speed and expressiveness across the energy ecosystem, and explaining how Julia can be deployed at scale. The energy domain is described as vast, covering areas from materials discovery to grid optimization and predictive analytics, with Julia positioned as a capable tool for these challenges. Additionally, the speaker confirms that slide decks and recordings will be shared after the session.

00:03:01 - 00:05:31 Matt Bauman of Julia Computing explains that Julia Computing supports the open-source Julia language, which was publicly released in 2012 and reached version 1.0 about two and a half years ago. Julia Computing was founded to provide enterprise customers with accountability and service for using Julia. The company has grown to employ over 30 people, including key contributors to the Julia ecosystem, and offers various packages and support for deploying Julia solutions. The speaker then transitions to case studies of Julia's use in the energy sector, starting with the Canadian company Invenia.

Case study: Invenia's grid optimization

00:04:52 - 00:07:05 The company optimizes the electrical grid to reduce emissions and cut costs using a wide range of machine learning and data science techniques implemented entirely in Julia. They have been strong advocates and contributors to Julia since its early days, helping develop the language and numerous open-source packages. Transitioning from MATLAB, they found Julia to be more expressive and significantly faster, eliminating the need for cumbersome MATLAB mex files. Julia enables maintaining a single, efficient codebase accessible to both novice machine learning practitioners and expert optimizers without requiring multiple programming languages.

Case study: AOT Energy in finance

00:06:32 - 00:08:57 The segment discusses how Julia is successfully used by AOT Energy at the intersection of energy and finance for tasks like options pricing and Monte Carlo simulations. Julia is praised for combining ease of use and expressivity with high performance, making it accessible even to non-professional coders. One of Julia's core principles is that user-created objects have first-class status and performance equivalent to built-in features, with no hidden optimizations, promoting democratization of performance. Julia Computing's collaboration with ARPA-E aims to develop physics models with a focus on rapid development and efficiency.

Case study: ARPA-E HVAC modeling

00:08:20 - 00:10:38 The segment discusses a novel approach to engineering HVAC design using Julia, highlighting its ability to model complex interactions such as room dynamics, vents, ducts, and the non-linear effects of doors opening and closing. Julia's differential equations ecosystem enables robust and rapid simulation, facilitating quick iteration in HVAC engineering. This work, supported by a recent grant, aims to significantly reduce the costs of HVAC systems in large buildings like skyscrapers. Although still under development, the project is promising and exemplifies Julia's speed, expressiveness, and user-friendly IDE that lowers barriers for users transitioning from other programming languages.

JuliaPro IDE and package ecosystem

00:10:05 - 00:11:10 The JuliaPro IDE by Julia Computing offers a familiar and user-friendly development environment, featuring an editor pane, workspace, plots, console, an embedded debugger with breakpoints, and profiling tools to enhance the coding experience.

00:10:37 - 00:11:55 Julia benefits from a vast and growing package ecosystem with over 3,500 packages and 17 million downloads. Notably, the JuMP package supports operations research by enabling users to solve constrained optimization problems through a domain-specific language that interfaces with a variety of solvers.
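
A minimal sketch of what that looks like (not code from the webinar; the open-source GLPK solver stands in for any supported backend):

```julia
using JuMP, GLPK

model = Model(GLPK.Optimizer)          # pick a solver backend
@variable(model, 0 <= x <= 4)
@variable(model, 0 <= y <= 3)
@constraint(model, 6x + 4y <= 24)      # constraints read like the math
@objective(model, Max, 5x + 4y)
optimize!(model)
value(x), value(y)                     # optimal decision variables (2.0, 3.0)
```

Swapping in a commercial backend is a one-line change, e.g. `Model(Gurobi.Optimizer)`.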

00:11:16 - 00:12:23 JuMP lets users express optimization problems once, in a unified language, and work with multiple solvers, switching easily between backends like Gurobi. Flux is introduced as Julia's deep learning package, comparable to TensorFlow or Torch, providing the essential tools to build neural networks.

00:11:50 - 00:13:04 Flux supports differentiable programming by efficiently computing gradients crucial for machine learning optimization. Since Julia and its packages are written in Julia, Flux leverages this to offer robust and fast automatic differentiation capabilities.
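
A small illustration of that capability (a sketch using only Flux's exported `gradient`; the function and layer sizes are arbitrary):

```julia
using Flux

# Differentiate an ordinary Julia function
f(x) = 3x^2 + 2x + 1
gradient(f, 2.0)             # (14.0,) since f'(x) = 6x + 2

# The same machinery differentiates whole models
m = Dense(3, 1)
x = rand(Float32, 3)
gradient(m -> sum(m(x)), m)  # gradients w.r.t. the layer's weights and bias
```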

00:12:26 - 00:13:38 Flux's advanced automatic differentiation techniques surpass those in other ecosystems, enabling more reliable and faster gradient calculations. Additionally, Julia's data science ecosystem is extensive, with strong support for table handling and differential equations, critical components of many data workflows.

Data processing and CSV benchmarks

00:13:02 - 00:15:25 The segment discusses the importance of loading data efficiently in data science workflows, highlighting the continued dominance of CSV files as a standard format for interoperability. It compares the speed of loading CSVs in Python, R, and Julia, noting that Julia significantly outperforms the others by 10 to 20 times, which is especially impactful when working with large datasets of hundreds of gigabytes. This performance advantage accelerates data processing tasks, making it easier and faster to begin analysis.

00:14:51 - 00:17:09 This section explains that unlike Python and R, which rely on CSV parsers written in C, Julia's CSV parser is implemented entirely in Julia. This offers users the benefit of both high performance and full access to the package's source code. A key factor behind Julia's speed is its advanced multi-threading capabilities, allowing it to utilize multiple CPU threads efficiently for faster data loading and processing, unlike Python's pandas parser.
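
In practice the threaded parser is invisible to the user; a sketch (`measurements.csv` is a hypothetical file, and Julia must be launched with multiple threads, e.g. `JULIA_NUM_THREADS=8`):

```julia
using CSV, DataFrames

# CSV.jl automatically splits the parse across available threads
df = CSV.read("measurements.csv", DataFrame)
```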

00:16:32 - 00:19:56 The final segment covers benchmarks comparing Julia's data processing performance with other languages across common tasks like group-bys and joins. Julia performs on par with best-in-class tools such as Spark and ClickHouse, especially after the initial compilation overhead. Julia excels at handling medium-sized datasets (around 50 gigabytes) on a single machine with ample RAM, eliminating the need for distributed computing in many cases. It efficiently processes data with modest memory overhead, making it suitable for use on laptops, cloud instances, or high-performance nodes.
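
The operations in those benchmarks are ordinary DataFrames.jl calls; a minimal sketch with toy data and illustrative column names:

```julia
using DataFrames

orders    = DataFrame(id = [1, 2, 1, 2], amount = [10.0, 20.0, 30.0, 40.0])
customers = DataFrame(id = [1, 2], name = ["a", "b"])

combine(groupby(orders, :id), :amount => sum)  # group-by aggregation
innerjoin(orders, customers, on = :id)         # join on a key column
```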

Differential equations ecosystem

00:19:23 - 00:21:01 The differential equations ecosystem discussed is considered the most robust in any programming language, backed by a comprehensive and fair comparison conducted by Christopher Rackauckas, a key contributor to and user of multiple differential equation tools. This ecosystem can interface with nearly all other major libraries, as shown in a detailed, albeit complex, comparative slide where each column represents a different language or ecosystem.

00:21:02 - 00:22:13 The slide categorizes various features such as types of ODE solving methods, efficiency, and flexibility, showing the differential equations system's extensive support and adaptability. It integrates with other languages to ensure robust, validated, and fast solutions. This integration forms a foundation for scientific machine learning, which is especially relevant in energy research, where traditional high-performance computing models remain important.
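
The entry point to the ecosystem is compact; a minimal sketch (exponential decay solved with the Tsit5 Runge-Kutta method):

```julia
using DifferentialEquations

f(u, p, t) = -1.5u                     # du/dt = -1.5u
prob = ODEProblem(f, 1.0, (0.0, 5.0))  # initial value 1.0 on t ∈ [0, 5]
sol = solve(prob, Tsit5())
sol(2.0)                               # dense interpolation at t = 2
```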

Scientific machine learning in energy

00:21:40 - 00:22:48 The discussion begins with the use of high-performance computing (HPC) for simulations of physics models and molecular dynamics. It notes that the fastest supercomputers, tracked by the TOP500 project, are mostly sponsored by the energy departments of nuclear-capable states. These supercomputers are primarily used to simulate nuclear explosions, which has historically driven the development and deployment of large HPC clusters.

00:22:14 - 00:23:28 HPC has traditionally been focused on simulations, but in the past two decades, there has been a surge in learned models using large neural networks that rely on data rather than fundamental physics. This marks a shift towards machine learning approaches that attempt to capture complex behaviors entirely from data, diverging from traditional simulation methods.

00:22:51 - 00:23:58 A new approach called scientific machine learning is emerging, blending traditional physics-based models with machine learning. This method uses fundamental knowledge augmented by neural networks to build more capable and robust models, combining the strengths of both simulation and learned data-driven techniques.

00:23:24 - 00:24:29 Scientific machine learning can embed neural networks inside differential equations. Using the Julia language and its neural network library Flux, a neural network can be treated as a function with tunable parameters integrated into a differential equation. This allows modeling parts of a system where the physics are unknown by using the neural network to represent those unknown interactions.

00:23:57 - 00:24:58 By combining neural networks with differential equations, models can be smaller and require less data, while remaining interpretable. The known dynamics are expressed through differential equations, while the neural network models the unknown components. This hybrid approach improves model interpretability and data efficiency.
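
A hedged sketch of how a neural network can sit inside an ODE right-hand side (the architecture, the assumed linear "known physics" term, and all constants are illustrative, not from the talk):

```julia
using DifferentialEquations, Flux

nn = Chain(Dense(2, 16, tanh), Dense(16, 2))  # stands in for unknown interactions
p, re = Flux.destructure(nn)                  # flatten weights into a parameter vector

function hybrid!(du, u, p, t)
    du .= -0.1 .* u .+ re(p)(u)  # known linear decay + learned residual dynamics
end

prob = ODEProblem(hybrid!, [1.0, 0.5], (0.0, 10.0), p)
sol = solve(prob, Tsit5())       # p can now be trained against data
```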

00:24:28 - 00:25:28 Once the neural network is trained on existing data, it can be analyzed to understand how it transforms inputs, offering greater interpretability compared to very deep networks. This approach also reduces the amount of data needed for training, making it a more practical and insightful modeling technique.

00:24:58 - 00:25:56 Another application involves identifying the unknown coefficients of equations like the shallow water or Burgers equations. Starting from random coefficients, the model can be trained to fit data and recover the correct values, enabling the extraction of underlying physical parameters directly from data.

Neural networks with differential equations

00:25:27 - 00:27:13 The speaker demonstrates a trebuchet model represented by a set of differential equations, highlighting its fun and educational aspects. They discuss fitting data to the model, mentioning that the process took under 20 minutes with a small dataset, although they overfitted by training longer than necessary. The overall approach is tractable and easily performed on a laptop without high-performance computing resources.

00:27:11 - 00:28:46 The model training process includes helpful features like progress bars and time estimates, with an example estimating 23 minutes to complete. The speaker explains that while solving the differential equations for the trebuchet is complex and slow, training a neural network to approximate these equations can enable rapid predictions. This allows quick optimization of parameters such as counterweight, launch speed, and aiming angle to hit a target under varying conditions.

00:28:14 - 00:29:54 This neural network approximation approach extends beyond trebuchets to practical applications like power generation and turbine operation, where physics are well understood but computationally expensive. The data for training the network is generated by the ODE solver itself on demand. Because the system is differentiable, gradients can be computed efficiently, enabling fast and accurate parameter estimation based on physical models.

00:29:20 - 00:30:32 The neural network provides estimates for parameters such as counterweight to meet specific targets, which can then be validated by plugging them back into the differential equation model. The speaker concludes by mentioning an upcoming demonstration of Monte Carlo simulation, emphasizing its ease and speed within this framework.

Monte Carlo simulation example

00:29:55 - 00:33:17 The speaker explains a Monte Carlo simulation that estimates pi by throwing darts at a dartboard and comparing how many land inside a circle versus a square. This simple problem exemplifies embarrassingly parallel computations, allowing distribution of random number generation and calculations across many computers. The Julia code shown comprises six lines that define a function using a distributed for loop with a special macro to parallelize the task. Each iteration generates two random numbers to simulate dart positions, checks if they land inside the circle, and then uses a reduction operation to sum results efficiently. This concise example highlights Julia's capability for expressing distributed simulations easily, supporting complex computations, and promoting reproducible scientific analysis.
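
A reconstruction of that style of code (not a verbatim copy of the slide; the worker count and iteration count are illustrative):

```julia
using Distributed
addprocs(4)  # local workers here; JuliaRun provisions these on a cluster

function estimate_pi(n)
    inside = @distributed (+) for i in 1:n
        x, y = rand(), rand()
        Int(x^2 + y^2 <= 1)  # 1 if the dart lands inside the quarter circle
    end
    4inside / n              # area ratio of quarter circle to unit square is π/4
end

estimate_pi(100_000_000)     # ≈ 3.14159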

Reproducibility and package management

00:32:44 - 00:33:55 Julia's package ecosystem automatically tracks all dependencies used within a project, ensuring project-level reproducibility. It maintains two key configuration files, a project file (Project.toml) and a manifest file (Manifest.toml), which track dependencies and their changes. Users typically don't need to edit these files manually, but they can be placed under version control for a consistent environment setup.

00:33:19 - 00:34:16 The project file records top-level dependencies and ensures that only those packages are used within a project. Julia actively manages this to prevent usage of packages outside the project scope. Meanwhile, the manifest file is automatically generated and tracks all dependencies, including indirect ones, with precise versioning and content hashes to guarantee reproducibility.

00:33:48 - 00:35:04 Julia's manifest file details every dependency down to exact versions and hashes, including second, third, and further order dependencies. This comprehensive tracking allows users to always restore the exact environment needed, making package reproducibility reliable and robust. Users do not need to understand the internal workings of the manifest file.

00:34:26 - 00:35:30 Julia enables cross-platform reproducibility by allowing users to share their project and manifest files across different operating systems, such as macOS, Linux, and Windows. This ensures all dependencies are set up identically on any machine, facilitating collaboration and consistent results regardless of the environment.
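
Restoring a shared environment on any machine takes two commands (standard Pkg usage; the `"."` assumes the project and manifest files sit in the current directory):

```julia
using Pkg
Pkg.activate(".")   # use the Project.toml in this directory
Pkg.instantiate()   # install the exact versions recorded in Manifest.toml
```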

00:34:59 - 00:36:02 Julia delivers numerical reproducibility down to the last bit across platforms and operating systems, which is a rare and challenging feat for a programming language. Additionally, Julia is developing new features like data artifacts to further enhance reproducibility and data management.

Julia deployment at scale with JuliaRun

00:35:34 - 00:36:39 The system tracks input data to ensure pure reproducibility not only of source code but also of the data used in analyses. This cutting-edge technique allows specifying data reproducibly so that when computations run on the cloud, both the code and input data are precisely known and output data can be tagged accordingly.

00:36:07 - 00:37:29 These reproducibility features enable deploying Julia at scale on HPC clusters, whether on-premises or in the cloud, where major providers like Google, Microsoft, and AWS make powerful hardware available on demand. JuliaTeam facilitates easy and quick deployment on these platforms.

00:36:48 - 00:38:00 JuliaTeam provides a single source of truth for all packages, supporting collaboration on package development. It blends private development with the public ecosystem, allowing private packages to be created and deployed at scale in the cloud. The platform offers conveniences like a unified package documentation search.

00:37:23 - 00:38:20 The documentation search covers private packages within an enterprise as well as the public ecosystem. Users can maintain private forks and, crucially, JuliaTeam works seamlessly in secure, firewalled environments, enabling package management even on air-gapped networks.

00:37:52 - 00:39:11 JuliaTeam overcomes the firewall restrictions that typically block package installation in secure environments. It builds on the platform to track where code lives and enables effortless deployment, whether configured for on-premises HPC clusters or direct cloud deployment, including spinning up cloud machines on demand.

00:38:36 - 00:39:57 JuliaTeam allows distribution of code and data to cloud machines for seamless computation. The interface provides access to all packages and their documentation, facilitating easy discovery and use of packages like DifferentialEquations.jl within the unified platform.

JuliaTeam: docs and private packages

00:39:22 - 00:41:38 The speaker demonstrates how to use documentation search within a firewalled environment, allowing access to both public and private packages locally. They highlight searching for a specific package called 'estimate pi' and reveal a private 'secret analysis' package with locally hosted, well-rendered documentation including LaTeX equations and plots. This example illustrates how documentation can be thoroughly integrated and displayed. The segment concludes by mentioning deployment of such applications in the cloud using Julia's application framework.

Cloud deployment and cluster configuration

00:41:05 - 00:42:46 The speaker demonstrates how to run a Julia job in the cloud by clicking 'run in cloud,' which automatically sets up the necessary cloud resources along with code, data, and artifacts for reproducibility. They explain the presence of project and manifest files to ensure consistent environments and mention tracking input data, although the current Monte Carlo simulation example does not require any. The user can adjust parameters such as the number of iterations to increase accuracy, and select either the release or development branches for running the job.

00:42:11 - 00:43:57 The interface allows detailed cluster configuration, including choosing the number of threads or cores per node, and estimating the cost of the job before running it. The user opts to run the job on five nodes with eight cores each, totaling 40 workers. After launching, the job status appears in a results table, showing inputs, outputs, and the running status, enabling easy monitoring of the cloud job's progress.

Job management and parallel performance

00:43:20 - 00:44:59 The speaker demonstrates a job running live, showing inputs such as the number of workers, threads, memory per worker, and iterations. They highlight previous runs with different worker configurations and explain how the task was efficiently parallelized to estimate pi with one billion iterations in under a second. Once the job completed, the workers were terminated. The demo concludes with thanks and an invitation for questions, noting the presence of the CEO of Julia Computing on the call.

Q&A: Julia applications and resources

00:44:29 - 00:46:59 The speaker addresses questions about Julia applications in economics and recommends visiting juliacomputing.com, especially the resources and webinars sections, for relevant content, including a recent webinar on Julia for finance. They highlight available case studies and an energy ecosystem page as useful resources. The speaker acknowledges a question about using JuMP for energy dispatch optimization, noting it as a good use case but without an immediate example at hand, and suggests exploring it further.

Q&A: Compilation and distributed workloads

00:46:39 - 00:48:21 The discussion addresses how Julia handles compilation in distributed workloads. Julia compiles specialized code when a function is run for the first time, which can cause initial delays such as in generating the first plot. Recent improvements in Julia 1.5 significantly reduce this startup time from about 30 seconds to 10 seconds, with subsequent operations becoming nearly instantaneous.

00:47:47 - 00:49:44 In distributed computing, Julia compiles specialized functions on each worker by default, which can introduce overhead. To mitigate this, Julia supports precompilation, and JuliaRun automates distributing precompiled packages to workers to reduce delays. Additionally, Julia supports custom transports for HPC systems through cluster managers, enabling integration with various distributed compute environments.
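
For instance, ClusterManagers.jl provides managers for common queue systems; a sketch assuming a Slurm cluster (`MyAnalysis` is a hypothetical, ideally precompiled, package):

```julia
using Distributed, ClusterManagers

addprocs(SlurmManager(40))    # request 40 workers through the Slurm queue
@everywhere using MyAnalysis  # load the project's code on every worker
```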

Q&A: HPC cluster support and transports

00:49:07 - 00:50:52 The discussion focuses on specifying the cluster manager, queue system, and transport to be used between nodes, highlighting the importance of both the transport and the queuing system. MPI is emphasized as a key component that is well supported and heavily used in the next-generation climate simulation project, CliMA. The segment also touches on the challenges of HPC compute clusters, each having unique characteristics, and notes that JuliaRun primarily targets a cloud compute model while also supporting specialized interconnects for certain clusters. Viewers are encouraged to reach out with questions about these setups.

Q&A: Finding Julia users in organization

00:50:29 - 00:52:17 The discussion addresses how organizations using Julia can connect internally through Julia Computing, which can facilitate introductions if permission is granted, despite challenges like corporate secrecy. Additionally, a tip is shared for effectively searching Julia-related information online, such as focusing on package names ending with .jl on GitHub to find relevant resources rather than unrelated results.

Q&A: Searching Julia resources effectively

00:51:42 - 00:53:36 The speaker explains a useful convention in Julia package naming, where packages typically have a '.jl' suffix. This helps when searching for new techniques or tools, as adding '.jl' to a search term can yield more relevant results. For example, searching 'mpi jl' instead of 'mpi julia' provides better matches. Both 'jl' and 'julia' as search keywords are effective for finding Julia code, especially for compute-related tools.

00:52:58 - 00:54:03 The speaker introduces juliahub.com as a public platform for searching the Julia ecosystem. This site contains the entire public Julia package ecosystem and allows searching documentation and packages. However, it currently lacks features for private or corporate packages, compute model integration, and firewall protection, making it ideal for individual users looking through public resources.

Q&A: Julia packages for power systems

00:53:30 - 00:55:18 The speaker explains how to use PowerSystems.jl by accessing its documentation via Julia packages to understand its purpose in power systems modeling and analysis. They highlight effective search techniques for Julia packages, recommending JuliaHub and Google as primary tools. Additionally, they mention the intention to add PowerSystems.jl to an energy-related resource page. The segment ends with a transition to a question about deep learning.

Q&A: Deep learning and GPU support

00:54:55 - 00:56:07 The speaker introduces Julia's GPU implementation for deep learning, highlighting its robust support for CUDA processing. The programming model is seamless, allowing users to specify arrays on the GPU easily.

00:56:08 - 00:57:14 All computations on GPU arrays, including broadcasting operations, are executed on the GPU, simplifying GPU computing. Various Julia packages natively support GPU arrays, enabling machine learning training with Flux and solving differential equations on the GPU, though not all differential equation solvers benefit equally from GPU acceleration.
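
A minimal sketch of this model with CUDA.jl (assumes an NVIDIA GPU and the CUDA.jl package):

```julia
using CUDA

x = CUDA.rand(Float32, 10_000)  # array allocated directly on the GPU
y = 2f0 .* x .+ 1f0             # the fused broadcast runs as a single GPU kernel
sum(y)                          # reductions execute on the GPU as well
```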

00:56:42 - 00:57:46 Julia's GPU capabilities allow it to compete with major frameworks like PyTorch and TensorFlow, which are backed by far larger resources, while offering more expressive and easier-to-understand solutions. The session concludes with thanks to the audience as it reaches the top of the hour.

Session closing and final remarks

00:57:14 - 00:57:53 The speaker thanks the audience for their great questions and apologizes if any were missed. They encourage attendees to reach out to them or anyone at Julia Computing for further inquiries and express gratitude for their attendance.

Contact Us

Want to get enterprise support, schedule a demo, or learn about how we can help build a custom solution? We are here to help.
