From: Zi Yan <ziy@nvidia.com>
To: David Rientjes <rientjes@google.com>
Cc: Alex Shi <alex.shi@linux.alibaba.com>,
	Hugh Dickins <hughd@google.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Song Liu <songliubraving@fb.com>,
	"Michal Hocko" <mhocko@suse.com>,
	Matthew Wilcox <willy@infradead.org>,
	Minchan Kim <minchan@kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>,
	Chris Kennelly <ckennelly@google.com>, <linux-mm@kvack.org>
Subject: Re: [RFC] Hugepage collapse in process context
Date: Wed, 17 Feb 2021 10:49:22 -0500
Message-ID: <C8C89F13-3F04-456B-BA76-DE2C378D30BF@nvidia.com>
In-Reply-To: <d098c392-273a-36a4-1a29-59731cdf5d3d@google.com>

On 16 Feb 2021, at 23:24, David Rientjes wrote:

> Hi everybody,
>
> Khugepaged is slow by default: it scans at most 4096 pages every 10s.
> That's normally fine as a system-wide setting, but some applications would
> benefit from a more aggressive approach (as long as they are willing to
> pay for it).
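
To put the default rate in perspective, here is a rough back-of-the-envelope
sketch. It assumes 4 KiB base pages and the defaults quoted above (4096 pages
per batch, one 10 s sleep per batch), and it ignores the time spent scanning
and collapsing, so it is only an approximate lower bound. The function name is
made up for illustration and is not a kernel interface.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Approximate time, in seconds, for khugepaged to walk `bytes` of memory
 * once, counting one sleep interval per batch of `pages_to_scan` pages.
 * Hypothetical helper for illustration only.
 */
uint64_t khugepaged_pass_seconds(uint64_t bytes, uint64_t page_size,
                                 uint64_t pages_to_scan, uint64_t sleep_secs)
{
    assert(page_size > 0 && pages_to_scan > 0);

    /* Round up to whole pages, then to whole batches. */
    uint64_t pages = (bytes + page_size - 1) / page_size;
    uint64_t batches = (pages + pages_to_scan - 1) / pages_to_scan;

    return batches * sleep_secs;
}
```

With these assumptions, one pass over 1 GiB takes 64 batches, or 640 seconds,
and one pass over 1 TiB takes 655360 seconds, which is roughly 7.6 days.
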
>
> Instead of adding priorities for eligible ranges of memory to khugepaged,
> temporarily speeding khugepaged up for the whole system, or sharding its
> work for memory belonging to a certain process, one approach would be to
> allow userspace to induce hugepage collapse.
>
> The benefit of this approach is that the collapse is done in process
> context, so its CPU time is charged to the process that induces it.
> Khugepaged is not involved.
>
> The idea is to allow userspace to induce hugepage collapse through the new
> process_madvise() call.  This allows us to collapse hugepages on behalf of
> current or another process for a vectored set of ranges.
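
As a minimal sketch of what userspace might do before such a call: a caller
would presumably want to trim each candidate range to whole hugepages and pack
the results into the iovec array that process_madvise() takes. The helper
names below are hypothetical, the 2 MiB hugepage size is an assumption
(PMD-sized THP on x86-64), and the collapse mode itself does not exist yet, so
the sketch stops at building the vector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Assumption: 2 MiB PMD-sized hugepages, as on x86-64. */
#define HPAGE_SIZE ((uintptr_t)2 << 20)

/*
 * Shrink [start, start + len) to its largest hugepage-aligned subrange.
 * Returns 1 and fills *out if the range contains at least one whole
 * hugepage, 0 otherwise. Assumes start + len does not overflow.
 */
int hugepage_aligned_range(uintptr_t start, size_t len, struct iovec *out)
{
    assert(out != NULL);

    uintptr_t end = start + len;
    uintptr_t first = (start + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1); /* round up */
    uintptr_t last = end & ~(HPAGE_SIZE - 1);                       /* round down */

    if (first >= last)
        return 0;

    out->iov_base = (void *)first;
    out->iov_len = last - first;
    return 1;
}

/*
 * Build the vector for a hypothetical collapse request from n candidate
 * ranges, skipping any that do not cover a whole hugepage.
 * `out` must have room for n entries. Returns the number of entries used.
 */
size_t build_collapse_vec(const struct iovec *in, size_t n, struct iovec *out)
{
    size_t used = 0;

    for (size_t i = 0; i < n; i++) {
        if (hugepage_aligned_range((uintptr_t)in[i].iov_base,
                                   in[i].iov_len, &out[used]))
            used++;
    }
    return used;
}
```

The resulting array and count would then be passed as the iovec arguments of
process_madvise(), together with a pidfd for the target process and whichever
advice value or flag ends up being chosen.
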
>
> This could be done through a new process_madvise() mode *or* it could be a
> flag to MADV_HUGEPAGE since process_madvise() allows for a flag parameter
> to be passed.  For example, MADV_F_SYNC.
>
> When done, this madvise call would allocate a hugepage on the right node
> and attempt to do the collapse in process context just as khugepaged would
> otherwise do.
>
> This would immediately be useful for a malloc implementation, for example,
> that has released its memory back to the system using MADV_DONTNEED and
> will subsequently refault the memory.  Rather than wait for khugepaged to
> come along 30m later, for example, and collapse this memory into a
> hugepage (which could take a much longer time on a very large system), an
> alternative would be to use this process_madvise() mode to induce the
> action up front.  In other words, say "I'm returning this memory to the
> application and it's going to be hot, so back it by a hugepage now rather
> than waiting until later."
>
> It would also be useful for read-only file-backed mappings for text
> segments.  Khugepaged should be happy: it's just less work done by generic
> kthreads that gets charged as an overall tax to everybody.
>
> Thoughts?

The idea sounds great to me.

One question on how it interacts with khugepaged: will the process be excluded
from khugepaged if this process_madvise() is used on it? Excluding it could save
khugepaged some scanning work when someone is already actively collapsing
hugepages for this process.


--
Best Regards,
Yan Zi
