Latest Tutorials

Learn about the latest technologies from fellow newline community members!

  • React
  • Angular
  • Vue
  • Svelte
  • NextJS
  • Redux
  • Apollo
  • Storybook
  • D3
  • Testing Library
  • JavaScript
  • TypeScript
  • Node.js
  • Deno
  • Rust
  • Python
  • GraphQL
  • React
  • Angular
  • Vue
  • Svelte
  • NextJS
  • Redux
  • Apollo
  • Storybook
  • D3
  • Testing Library
  • JavaScript
  • TypeScript
  • Node.js
  • Deno
  • Rust
  • Python
  • GraphQL

    How to Implement Tensor Parallelism for Faster Inference

    Implementing tensor parallelism accelerates large language model (LLM) inference by distributing computations across GPUs, reducing latency for real-world applications. Below is a structured breakdown of key insights and practical considerations for developers: Benefits: Challenges:
    Thumbnail Image of Tutorial How to Implement Tensor Parallelism for Faster Inference

      Retrieval‑Augmented Model Enhances TRIZ‑Based Patent Entity Recognition

      The retrieval-augmented model outperforms traditional TRIZ-based patent entity recognition methods by integrating dynamic contextual data during analysis. Traditional approaches rely on static rule-based systems or limited training datasets, which struggle with evolving patent terminology and…
      Thumbnail Image of Tutorial Retrieval‑Augmented Model Enhances TRIZ‑Based Patent Entity Recognition

      I got a job offer, thanks in a big part to your teaching. They sent a test as part of the interview process, and this was a huge help to implement my own Node server.

      This has been a really good investment!

      Advance your career with newline Pro.

      Only $40 per month for unlimited access to over 60+ books, guides and courses!

      Learn More

        Using Sharpness-Aware Minimization to Boost Deep Learning Models

        Sharpness-Aware Minimization (SAM) is an optimization technique designed to improve the generalization of deep learning models by flattening the loss landscape during training. Unlike traditional methods like Stochastic Gradient Descent (SGD) or Adam, SAM explicitly balances minimizing the loss and…
        Thumbnail Image of Tutorial Using Sharpness-Aware Minimization to Boost Deep Learning Models

          Tensor Parallelism Checklist: Maximize GPU Utilization

          Tensor parallelism splits model computations across GPUs to boost efficiency. Below is a comparison of key techniques: Tensor parallelism improves training speed by 2–4x compared to single-GPU setups, as seen in vLLM benchmarks. It also enhances model accuracy by maintaining full-precision…
          Thumbnail Image of Tutorial Tensor Parallelism Checklist: Maximize GPU Utilization

            What Is Tensor Parallelism and How to Apply It

            Watch: Scale ANY Model: PyTorch DDP, ZeRO, Pipeline & Tensor Parallelism Made Simple (2025 Guide) by Zachary Mueller Tensor Parallelism (TP) is a distributed computing strategy that splits large model tensors across multiple GPUs to reduce memory usage and accelerate training/inference. Unlike Data…
            Thumbnail Image of Tutorial What Is Tensor Parallelism and How to Apply It