
Merge tag 'mm-stable-2025-07-30-15-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:
"As usual, many cleanups. The below blurbiage describes 42 patchsets,
21 of which are partially or fully cleanup work ("cleans up",
"cleanup", "maintainability", "rationalizes", etc).

  I never knew the MM code was so dirty.

  "mm: ksm: prevent KSM from breaking merging of new VMAs" (Lorenzo Stoakes)
     addresses an issue with KSM's PR_SET_MEMORY_MERGE mode: newly
     mapped VMAs were not eligible for merging with existing adjacent
     VMAs.

  "mm/damon: introduce DAMON_STAT for simple and practical access monitoring" (SeongJae Park)
     adds a new kernel module which simplifies the setup and usage of
     DAMON in production environments.

  "stop passing a writeback_control to swap/shmem writeout" (Christoph Hellwig)
     is a cleanup to the writeback code which removes a couple of
     pointers from struct writeback_control.

  "drivers/base/node.c: optimization and cleanups" (Donet Tom)
     contains largely uncorrelated cleanups to the NUMA node setup and
     management code.

  "mm: userfaultfd: assorted fixes and cleanups" (Tal Zussman)
     does some maintenance work on the userfaultfd code.

  "Readahead tweaks for larger folios" (Ryan Roberts)
     implements some tuneups for pagecache readahead when it is reading
     into order>0 folios.

  "selftests/mm: Tweaks to the cow test" (Mark Brown)
     provides some cleanups and consistency improvements to the
     selftests code.

  "Optimize mremap() for large folios" (Dev Jain)
     does that. A 37% reduction in execution time was measured in a
     memset+mremap+munmap microbenchmark.

  "Remove zero_user()" (Matthew Wilcox)
     expunges zero_user() in favor of the more modern memzero_page().

  "mm/huge_memory: vmf_insert_folio_*() and vmf_insert_pfn_pud() fixes" (David Hildenbrand)
     addresses some warts which David noticed in the huge page code.
     These were not known to be causing any issues at this time.

"mm/damon: use alloc_migrate_target() for DAMOS_MIGRATE_{HOT,COLD}" (SeongJae Park)
     provides some cleanup and consolidation work in DAMON.

  "use vm_flags_t consistently" (Lorenzo Stoakes)
     uses vm_flags_t in places where we were inappropriately using other
     types.

  "mm/memfd: Reserve hugetlb folios before allocation" (Vivek Kasireddy)
     increases the reliability of large page allocation in the memfd
     code.

  "mm: Remove pXX_devmap page table bit and pfn_t type" (Alistair Popple)
     removes several now-unneeded PFN_* flags.

  "mm/damon: decouple sysfs from core" (SeongJae Park)
implements some cleanup and maintainability work in the DAMON
     sysfs layer.

  "madvise cleanup" (Lorenzo Stoakes)
     does quite a lot of cleanup/maintenance work in the madvise() code.

  "madvise anon_name cleanups" (Vlastimil Babka)
provides additional cleanups on top of Lorenzo's effort.

  "Implement numa node notifier" (Oscar Salvador)
     creates a standalone notifier for NUMA node memory state changes.
     Previously these were lumped under the more general memory
     on/offline notifier.

  "Make MIGRATE_ISOLATE a standalone bit" (Zi Yan)
     cleans up the pageblock isolation code and fixes a potential issue
     which doesn't seem to cause any problems in practice.

  "selftests/damon: add python and drgn based DAMON sysfs functionality tests" (SeongJae Park)
     adds additional drgn- and python-based DAMON selftests which are
     more comprehensive than the existing selftest suite.

  "Misc rework on hugetlb faulting path" (Oscar Salvador)
     fixes a rather obscure deadlock in the hugetlb fault code and
     follows that fix with a series of cleanups.

  "cma: factor out allocation logic from __cma_declare_contiguous_nid" (Mike Rapoport)
     rationalizes and cleans up the highmem-specific code in the CMA
     allocator.

  "mm/migration: rework movable_ops page migration (part 1)" (David Hildenbrand)
     provides cleanups and future-preparedness to the migration code.

  "mm/damon: add trace events for auto-tuned monitoring intervals and DAMOS quota" (SeongJae Park)
     adds some tracepoints to some DAMON auto-tuning code.

  "mm/damon: fix misc bugs in DAMON modules" (SeongJae Park)
     does that.

  "mm/damon: misc cleanups" (SeongJae Park)
     also does what it claims.

  "mm: folio_pte_batch() improvements" (David Hildenbrand)
     cleans up the large folio PTE batching code.

  "mm/damon/vaddr: Allow interleaving in migrate_{hot,cold} actions" (SeongJae Park)
     facilitates dynamic alteration of DAMON's inter-node allocation
     policy.

  "Remove unmap_and_put_page()" (Vishal Moola)
     provides a couple of page->folio conversions.

  "mm: per-node proactive reclaim" (Davidlohr Bueso)
     implements a per-node control of proactive reclaim - beyond the
     current memcg-based implementation.

  "mm/damon: remove damon_callback" (SeongJae Park)
     replaces the damon_callback interface with a more general and
     powerful damon_call()+damos_walk() interface.

  "mm/mremap: permit mremap() move of multiple VMAs" (Lorenzo Stoakes)
     implements a number of mremap cleanups (of course) in preparation
     for adding new mremap() functionality: newly permit the remapping
     of multiple VMAs when the user is specifying MREMAP_FIXED. It still
     excludes some specialized situations where this cannot be performed
     reliably.

  "drop hugetlb_free_pgd_range()" (Anthony Yznaga)
     switches some sparc hugetlb code over to the generic version and
     removes the thus-unneeded hugetlb_free_pgd_range().

  "mm/damon/sysfs: support periodic and automated stats update" (SeongJae Park)
     augments the present userspace-requested update of DAMON sysfs
     monitoring files. Automatic update is now provided, along with a
     tunable to control the update interval.

  "Some randome fixes and cleanups to swapfile" (Kemeng Shi)
does what it claims.

  "mm: introduce snapshot_page" (Luiz Capitulino and David Hildenbrand)
     provides (and uses) a means by which debug-style functions can grab
     a copy of a pageframe and inspect it locklessly without tripping
     over the races inherent in operating on the live pageframe
     directly.

  "use per-vma locks for /proc/pid/maps reads" (Suren Baghdasaryan)
     addresses the large contention issues which can be triggered by
     reads from that procfs file. Latencies are reduced by more than
     half in some situations. The series also introduces several new
     selftests for the /proc/pid/maps interface.

  "__folio_split() clean up" (Zi Yan)
     cleans up __folio_split()!

  "Optimize mprotect() for large folios" (Dev Jain)
     provides some quite large (>3x) speedups to mprotect() when dealing
     with large folios.

  "selftests/mm: reuse FORCE_READ to replace "asm volatile("" : "+r" (XXX));" and some cleanup" (wang lian)
     does some cleanup work in the selftests code.

  "tools/testing: expand mremap testing" (Lorenzo Stoakes)
     extends the mremap() selftest in several ways, including adding
     more checking of Lorenzo's recently added "permit mremap() move of
     multiple VMAs" feature.

  "selftests/damon/sysfs.py: test all parameters" (SeongJae Park)
extends the DAMON sysfs interface selftest so that it tests all
possible user-requested parameters, rather than the present minimal
subset"

* tag 'mm-stable-2025-07-30-15-25' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (370 commits)
  MAINTAINERS: add missing headers to mempory policy & migration section
  MAINTAINERS: add missing file to cgroup section
  MAINTAINERS: add MM MISC section, add missing files to MISC and CORE
  MAINTAINERS: add missing zsmalloc file
  MAINTAINERS: add missing files to page alloc section
  MAINTAINERS: add missing shrinker files
  MAINTAINERS: move memremap.[ch] to hotplug section
  MAINTAINERS: add missing mm_slot.h file THP section
  MAINTAINERS: add missing interval_tree.c to memory mapping section
  MAINTAINERS: add missing percpu-internal.h file to per-cpu section
  mm/page_alloc: remove trace_mm_alloc_contig_migrate_range_info()
  selftests/damon: introduce _common.sh to host shared function
  selftests/damon/sysfs.py: test runtime reduction of DAMON parameters
  selftests/damon/sysfs.py: test non-default parameters runtime commit
  selftests/damon/sysfs.py: generalize DAMON context commit assertion
  selftests/damon/sysfs.py: generalize monitoring attributes commit assertion
  selftests/damon/sysfs.py: generalize DAMOS schemes commit assertion
  selftests/damon/sysfs.py: test DAMOS filters commitment
  selftests/damon/sysfs.py: generalize DAMOS scheme commit assertion
  selftests/damon/sysfs.py: test DAMOS destinations commitment
  ...
Linus Torvalds 2025-07-31 14:57:54 -07:00
commit beace86e61
329 changed files with 10740 additions and 5808 deletions


@ -227,3 +227,12 @@ Contact: Jiaqi Yan <jiaqiyan@google.com>
Description:
Of the raw poisoned pages on a NUMA node, how many pages are
recovered by memory error recovery attempt.
What: /sys/devices/system/node/nodeX/reclaim
Date: June 2025
Contact: Linux Memory Management list <linux-mm@kvack.org>
Description:
Perform user-triggered proactive reclaim on a NUMA node.
This interface is equivalent to the memcg variant.
See Documentation/admin-guide/cgroup-v2.rst
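As a rough illustration (a sketch, not part of the patch), a program could
trigger reclaim on node 0 as below. The accepted value format is an
assumption here, mirroring the memcg memory.reclaim interface that this file
is documented to be equivalent to::

	/* Hedged sketch: ask node 0 to proactively reclaim 1 GiB. */
	#include <fcntl.h>
	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>

	int main(void)
	{
		const char *path = "/sys/devices/system/node/node0/reclaim";
		const char *request = "1073741824";	/* 1 GiB, in bytes */
		int fd = open(path, O_WRONLY);

		if (fd < 0) {
			perror("open");
			return 1;
		}
		if (write(fd, request, strlen(request)) < 0)
			perror("write");	/* e.g. reclaim fell short */
		close(fd);
		return 0;
	}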


@ -44,6 +44,13 @@ Contact: SeongJae Park <sj@kernel.org>
Description: Reading this file returns the pid of the kdamond if it is
running.
What: /sys/kernel/mm/damon/admin/kdamonds/<K>/refresh_ms
Date: Jul 2025
Contact: SeongJae Park <sj@kernel.org>
Description: Writing a value to this file sets the time interval for
automatic DAMON status file contents update. Writing '0'
disables the update. Reading this file returns the value.
What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/nr_contexts
Date: Mar 2022
Contact: SeongJae Park <sj@kernel.org>
@ -431,6 +438,28 @@ Description: Directory for DAMON operations set layer-handled DAMOS filters.
/sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/filters
directory.
What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/dests/nr_dests
Date: Jul 2025
Contact: SeongJae Park <sj@kernel.org>
Description: Writing a number 'N' to this file creates N directories,
named '0' to 'N-1', under the dests/ directory for setting the
action destinations of the scheme.
What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/dests/<D>/id
Date: Jul 2025
Contact: SeongJae Park <sj@kernel.org>
Description: Writing to and reading from this file sets and gets the id of
the DAMOS action destination. For DAMOS_MIGRATE_{HOT,COLD}
actions, the destination node's node id can be written and
read.
What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/dests/<D>/weight
Date: Jul 2025
Contact: SeongJae Park <sj@kernel.org>
Description: Writing to and reading from this file sets and gets the weight
of the DAMOS action destination, used when selecting the destination of
each action among the configured destinations.
What: /sys/kernel/mm/damon/admin/kdamonds/<K>/contexts/<C>/schemes/<S>/stats/nr_tried
Date: Mar 2022
Contact: SeongJae Park <sj@kernel.org>


@ -14,3 +14,4 @@ access monitoring and access-aware system operations.
usage
reclaim
lru_sort
stat


@ -0,0 +1,69 @@
.. SPDX-License-Identifier: GPL-2.0
===================================
Data Access Monitoring Results Stat
===================================
Data Access Monitoring Results Stat (DAMON_STAT) is a static kernel module
aimed at simple access pattern monitoring. It monitors accesses on the
system's entire physical memory using DAMON, and provides simplified access
monitoring statistics, namely idle time percentiles and estimated memory
bandwidth.
Monitoring Accuracy and Overhead
================================
DAMON_STAT uses monitoring intervals :ref:`auto-tuning
<damon_design_monitoring_intervals_autotuning>` to keep its accuracy high and
its overhead minimal. It auto-tunes the intervals aiming for 4 % of
observable access events to be captured in each snapshot, while limiting the
resulting sampling interval to a minimum of 5 milliseconds and a maximum of 10
seconds. On a few production server systems, it consumed only 0.x % of a
single CPU's time while capturing access patterns of reasonable quality.
Interface: Module Parameters
============================
To use this feature, you should first ensure your system is running on a kernel
that is built with ``CONFIG_DAMON_STAT=y``. The feature can be enabled by
default at build time, by setting ``CONFIG_DAMON_STAT_ENABLED_DEFAULT`` true.
To let sysadmins enable or disable it at boot and/or runtime, and read the
monitoring results, DAMON_STAT provides module parameters. The following
sections describe the parameters.
enabled
-------
Enable or disable DAMON_STAT.
You can enable DAMON_STAT by setting the value of this parameter to ``Y``.
Setting it to ``N`` disables DAMON_STAT. The default value is set by the
``CONFIG_DAMON_STAT_ENABLED_DEFAULT`` build config option.
estimated_memory_bandwidth
--------------------------
Estimated memory bandwidth consumption (bytes per second) of the system.
DAMON_STAT reads observed access events from the current DAMON results
snapshot and converts them to an estimate of memory bandwidth consumption in
bytes per second. The resulting metric is exposed to the user via this
read-only parameter. Because DAMON uses sampling, this is only an estimate of
access intensity rather than an accurate memory bandwidth measurement.
memory_idle_ms_percentiles
--------------------------
Per-byte idle time (milliseconds) percentiles of the system.
DAMON_STAT calculates how long each byte of the memory has not been accessed
(its idle time), based on the current DAMON results snapshot. If DAMON finds
a region with access frequency (nr_accesses) larger than zero, every byte of
the region gets zero idle time. If a region has zero access frequency, the
time for which the region has kept that zero access frequency (its age)
becomes the idle time of every byte of the region. DAMON_STAT then exposes
percentiles of the idle time values via this read-only parameter. Reading the
parameter returns 101 idle time values in milliseconds, separated by commas,
representing the 0th, 1st, 2nd, ..., 99th and 100th percentile idle times.
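As a sketch (assuming the standard module parameter location under
/sys/module/damon_stat/parameters/, which is not spelled out above), the 101
comma-separated values can be parsed like this::

	/* Hedged sketch: read and parse DAMON_STAT idle time percentiles. */
	#include <stdio.h>

	int main(void)
	{
		FILE *f = fopen("/sys/module/damon_stat/parameters/"
				"memory_idle_ms_percentiles", "r");
		unsigned long long idle_ms[101];
		int i;

		if (!f) {
			perror("fopen");
			return 1;
		}
		for (i = 0; i < 101; i++)
			if (fscanf(f, "%llu,", &idle_ms[i]) != 1)
				break;
		fclose(f);
		if (i == 101)	/* all percentiles parsed */
			printf("median idle time: %llu ms\n", idle_ms[50]);
		return 0;
	}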


@ -59,7 +59,7 @@ comma (",").
:ref:`/sys/kernel/mm/damon <sysfs_root>`/admin
:ref:`kdamonds <sysfs_kdamonds>`/nr_kdamonds
│ │ :ref:`0 <sysfs_kdamond>`/state,pid
│ │ :ref:`0 <sysfs_kdamond>`/state,pid,refresh_ms
│ │ │ :ref:`contexts <sysfs_contexts>`/nr_contexts
│ │ │ │ :ref:`0 <sysfs_context>`/avail_operations,operations
│ │ │ │ │ :ref:`monitoring_attrs <sysfs_monitoring_attrs>`/
@ -85,6 +85,8 @@ comma (",").
│ │ │ │ │ │ │ :ref:`watermarks <sysfs_watermarks>`/metric,interval_us,high,mid,low
│ │ │ │ │ │ │ :ref:`{core_,ops_,}filters <sysfs_filters>`/nr_filters
│ │ │ │ │ │ │ │ 0/type,matching,allow,memcg_path,addr_start,addr_end,target_idx,min,max
│ │ │ │ │ │ │ :ref:`dests <damon_sysfs_dests>`/nr_dests
│ │ │ │ │ │ │ │ 0/id,weight
│ │ │ │ │ │ │ :ref:`stats <sysfs_schemes_stats>`/nr_tried,sz_tried,nr_applied,sz_applied,sz_ops_filter_passed,qt_exceeds
│ │ │ │ │ │ │ :ref:`tried_regions <sysfs_schemes_tried_regions>`/total_bytes
│ │ │ │ │ │ │ │ 0/start,end,nr_accesses,age,sz_filter_passed
@ -121,8 +123,8 @@ kdamond.
kdamonds/<N>/
-------------
In each kdamond directory, two files (``state`` and ``pid``) and one directory
(``contexts``) exist.
In each kdamond directory, three files (``state``, ``pid`` and ``refresh_ms``)
and one directory (``contexts``) exist.
Reading ``state`` returns ``on`` if the kdamond is currently running, or
``off`` if it is not running.
@ -159,6 +161,13 @@ Users can write below commands for the kdamond to the ``state`` file.
If the state is ``on``, reading ``pid`` shows the pid of the kdamond thread.
Users can ask the kernel to periodically update the files showing auto-tuned
parameters and DAMOS stats, instead of manually writing keywords such as
``update_tuned_intervals`` to the ``state`` file. For this, write the desired
update time interval in milliseconds to the ``refresh_ms`` file. Writing zero
disables the periodic update. Reading the file shows the currently set time
interval.
``contexts`` directory contains files for controlling the monitoring contexts
that this kdamond will execute.
@ -307,10 +316,10 @@ to ``N-1``. Each directory represents each DAMON-based operation scheme.
schemes/<N>/
------------
In each scheme directory, seven directories (``access_pattern``, ``quotas``,
``watermarks``, ``core_filters``, ``ops_filters``, ``filters``, ``stats``, and
``tried_regions``) and three files (``action``, ``target_nid`` and
``apply_interval``) exist.
In each scheme directory, eight directories (``access_pattern``, ``quotas``,
``watermarks``, ``core_filters``, ``ops_filters``, ``filters``, ``dests``,
``stats``, and ``tried_regions``) and three files (``action``, ``target_nid``
and ``apply_interval``) exist.
The ``action`` file is for setting and getting the scheme's :ref:`action
<damon_design_damos_action>`. The keywords that can be written to and read
@ -484,6 +493,29 @@ Refer to the :ref:`DAMOS filters design documentation
of different ``allow`` works, when each of the filters are supported, and
differences on stats.
.. _damon_sysfs_dests:
schemes/<N>/dests/
------------------
Directory for specifying the destinations of the given DAMON-based operation
scheme's action. This directory is ignored if the action of the given scheme
does not support multiple destinations. Only ``DAMOS_MIGRATE_{HOT,COLD}``
actions support multiple destinations.
In the beginning, the directory has only one file, ``nr_dests``. Writing a
number (``N``) to the file creates the number of child directories named ``0``
to ``N-1``. Each directory represents each action destination.
Each destination directory contains two files, namely ``id`` and ``weight``.
Users can write and read the identifier of the destination via the ``id``
file. For ``DAMOS_MIGRATE_{HOT,COLD}`` actions, the migration destination
node's id should be written to the ``id`` file. Users can write and read the
weight of the destination among the given destinations via the ``weight``
file. The weight can be an arbitrary integer. When DAMOS applies the action
to each entity of the memory region, it selects the destination based on the
relative weights of the destinations.
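The weight semantics can be pictured with a small user-space sketch
(illustrative only; this is not the kernel's implementation): a destination
is chosen with probability proportional to its weight::

	/* Illustrative-only weighted pick among configured destinations. */
	#include <stdio.h>
	#include <stdlib.h>
	#include <time.h>

	struct dest { int id; unsigned int weight; };

	static int pick_dest(const struct dest *d, int nr)
	{
		unsigned int total = 0, r;
		int i;

		for (i = 0; i < nr; i++)
			total += d[i].weight;
		r = rand() % total;
		for (i = 0; i < nr; i++) {
			if (r < d[i].weight)
				return d[i].id;
			r -= d[i].weight;
		}
		return d[nr - 1].id;
	}

	int main(void)
	{
		/* Node 0 with weight 3, node 1 with weight 1. */
		struct dest dests[] = { { 0, 3 }, { 1, 1 } };

		srand(time(NULL));
		/* Roughly three of four picks land on node 0. */
		printf("picked node %d\n", pick_dest(dests, 2));
		return 0;
	}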
.. _sysfs_schemes_stats:
schemes/<N>/stats/


@ -107,7 +107,7 @@ sysfs
Global THP controls
-------------------
Transparent Hugepage Support for anonymous memory can be entirely disabled
Transparent Hugepage Support for anonymous memory can be disabled
(mostly for debugging purposes) or only enabled inside MADV_HUGEPAGE
regions (to avoid the risk of consuming more memory resources) or enabled
system wide. This can be achieved per-supported-THP-size with one of::
@ -119,6 +119,11 @@ system wide. This can be achieved per-supported-THP-size with one of::
where <size> is the hugepage size being addressed, the available sizes
for which vary by system.
.. note:: Setting "never" in all sysfs THP controls does **not** disable
Transparent Huge Pages globally. This is because ``madvise(...,
MADV_COLLAPSE)`` ignores these settings and collapses ranges to
PMD-sized huge pages unconditionally.
For example::
echo always >/sys/kernel/mm/transparent_hugepage/hugepages-2048kB/enabled
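The note above can be demonstrated from user space with a sketch like the
following (hedged; MADV_COLLAPSE is available since Linux 6.1, and the
request may still fail for unrelated reasons such as alignment or memory
pressure)::

	/* Hedged sketch: request collapse into a PMD-sized huge page,
	 * which works even when the sysfs THP controls say "never". */
	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>

	#ifndef MADV_COLLAPSE
	#define MADV_COLLAPSE 25	/* from asm-generic/mman-common.h */
	#endif

	int main(void)
	{
		size_t len = 2UL << 20;	/* one 2 MiB PMD-sized region */
		char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			return 1;
		memset(p, 1, len);	/* fault the range in first */
		if (madvise(p, len, MADV_COLLAPSE))
			perror("madvise(MADV_COLLAPSE)");
		return 0;
	}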
@ -187,7 +192,9 @@ madvise
behaviour.
never
should be self-explanatory.
should be self-explanatory. Note that ``madvise(...,
MADV_COLLAPSE)`` can still cause transparent huge pages to be
obtained even if this mode is specified everywhere.
By default kernel tries to use huge, PMD-mappable zero page on read
page fault to anonymous mapping. It's possible to disable huge zero
@ -378,7 +385,9 @@ always
Attempt to allocate huge pages every time we need a new page;
never
Do not allocate huge pages;
Do not allocate huge pages. Note that ``madvise(..., MADV_COLLAPSE)``
can still cause transparent huge pages to be obtained even if this mode
is specified everywhere;
within_size
Only allocate huge page if it will be fully within i_size.
@ -434,7 +443,9 @@ inherit
have enabled="inherit" and all other hugepage sizes have enabled="never";
never
Do not allocate <size> huge pages;
Do not allocate <size> huge pages. Note that ``madvise(...,
MADV_COLLAPSE)`` can still cause transparent huge pages to be obtained
even if this mode is specified everywhere;
within_size
Only allocate <size> huge page if it will be fully within i_size.


@ -9,6 +9,9 @@ Memory hotplug event notifier
Hotplugging events are sent to a notification queue.
Memory notifier
----------------
There are six types of notification defined in ``include/linux/memory.h``:
MEM_GOING_ONLINE
@ -56,20 +59,18 @@ The third argument (arg) passes a pointer of struct memory_notify::
struct memory_notify {
unsigned long start_pfn;
unsigned long nr_pages;
int status_change_nid_normal;
int status_change_nid;
}
- start_pfn is start_pfn of online/offline memory.
- nr_pages is # of pages of online/offline memory.
- status_change_nid_normal is set node id when N_NORMAL_MEMORY of nodemask
is (will be) set/clear, if this is -1, then nodemask status is not changed.
- status_change_nid is set node id when N_MEMORY of nodemask is (will be)
set/clear. It means a new(memoryless) node gets new memory by online and a
node loses all memory. If this is -1, then nodemask status is not changed.
If status_changed_nid* >= 0, callback should create/discard structures for the
node if necessary.
It is possible to get notified for MEM_CANCEL_ONLINE without having been notified
for MEM_GOING_ONLINE, and the same applies to MEM_CANCEL_OFFLINE and
MEM_GOING_OFFLINE.
This can happen when a consumer fails, meaning we break the callchain and we
stop calling the remaining consumers of the notifier.
It is then important that users of memory_notify make no assumptions and are
prepared to handle such cases.
The callback routine shall return one of the values
NOTIFY_DONE, NOTIFY_OK, NOTIFY_BAD, NOTIFY_STOP
@ -83,6 +84,78 @@ further processing of the notification queue.
NOTIFY_STOP stops further processing of the notification queue.
Numa node notifier
------------------
There are six types of notification defined in ``include/linux/node.h``:
NODE_ADDING_FIRST_MEMORY
Generated before memory becomes available to this node for the first time.
NODE_CANCEL_ADDING_FIRST_MEMORY
Generated if NODE_ADDING_FIRST_MEMORY fails.
NODE_ADDED_FIRST_MEMORY
Generated when memory has become available to this node for the first time.
NODE_REMOVING_LAST_MEMORY
Generated when the last memory available to this node is about to be offlined.
NODE_CANCEL_REMOVING_LAST_MEMORY
Generated if NODE_REMOVING_LAST_MEMORY fails.
NODE_REMOVED_LAST_MEMORY
Generated when the last memory available to this node has been offlined.
A callback routine can be registered by calling::
hotplug_node_notifier(callback_func, priority)
Callback functions with higher values of priority are called before callback
functions with lower values.
A callback function must have the following prototype::
int callback_func(
struct notifier_block *self, unsigned long action, void *arg);
The first argument of the callback function (self) is a pointer to the block
of the notifier chain that points to the callback function itself.
The second argument (action) is one of the event types described above.
The third argument (arg) passes a pointer of struct node_notify::
struct node_notify {
int nid;
}
- nid is the node to which we are adding memory, or from which we are removing it.
It is possible to get notified for NODE_CANCEL_ADDING_FIRST_MEMORY without
having been notified for NODE_ADDING_FIRST_MEMORY, and the same applies to
NODE_CANCEL_REMOVING_LAST_MEMORY and NODE_REMOVING_LAST_MEMORY.
This can happen when a consumer fails, meaning we break the callchain and we
stop calling the remaining consumers of the notifier.
It is then important that users of node_notify make no assumptions and are
prepared to handle such cases.
The callback routine shall return one of the values
NOTIFY_DONE, NOTIFY_OK, NOTIFY_BAD, NOTIFY_STOP
defined in ``include/linux/notifier.h``
NOTIFY_DONE and NOTIFY_OK have no effect on the further processing.
NOTIFY_BAD is used as response to the NODE_ADDING_FIRST_MEMORY,
NODE_REMOVING_LAST_MEMORY, NODE_ADDED_FIRST_MEMORY or
NODE_REMOVED_LAST_MEMORY action to cancel hotplugging.
It stops further processing of the notification queue.
NOTIFY_STOP stops further processing of the notification queue.
Please note that we should not fail for NODE_ADDED_FIRST_MEMORY /
NODE_REMOVED_LAST_MEMORY, as the memory_hotplug code cannot roll back at that
point anymore.
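A minimal consumer might look as below (a sketch using only the interfaces
described above)::

	/* Hedged sketch of a numa node notifier consumer. */
	#include <linux/init.h>
	#include <linux/node.h>
	#include <linux/notifier.h>
	#include <linux/printk.h>

	static int example_node_callback(struct notifier_block *self,
					 unsigned long action, void *arg)
	{
		struct node_notify *nn = arg;

		switch (action) {
		case NODE_ADDED_FIRST_MEMORY:
			pr_info("node %d gained first memory\n", nn->nid);
			break;
		case NODE_REMOVED_LAST_MEMORY:
			pr_info("node %d lost its last memory\n", nn->nid);
			break;
		default:
			break;
		}
		return NOTIFY_OK;
	}

	static int __init example_init(void)
	{
		hotplug_node_notifier(example_node_callback, 0);
		return 0;
	}
	early_initcall(example_init);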
Locking Internals
=================


@ -1196,12 +1196,14 @@ SecPageTables
Memory consumed by secondary page tables, this currently includes
KVM mmu and IOMMU allocations on x86 and arm64.
NFS_Unstable
Always zero. Previous counted pages which had been written to
Always zero. Previously counted pages which had been written to
the server, but had not been committed to stable storage.
Bounce
Memory used for block device "bounce buffers"
Always zero. Previously memory used for block device
"bounce buffers".
WritebackTmp
Memory used by FUSE for temporary writeback buffers
Always zero. Previously memory used by FUSE for temporary
writeback buffers.
CommitLimit
Based on the overcommit ratio ('vm.overcommit_ratio'),
this is the total amount of memory currently available to

View File

@ -30,8 +30,6 @@ PTE Page Table Helpers
+---------------------------+--------------------------------------------------+
| pte_protnone | Tests a PROT_NONE PTE |
+---------------------------+--------------------------------------------------+
| pte_devmap | Tests a ZONE_DEVICE mapped PTE |
+---------------------------+--------------------------------------------------+
| pte_soft_dirty | Tests a soft dirty PTE |
+---------------------------+--------------------------------------------------+
| pte_swp_soft_dirty | Tests a soft dirty swapped PTE |
@ -104,8 +102,6 @@ PMD Page Table Helpers
+---------------------------+--------------------------------------------------+
| pmd_protnone | Tests a PROT_NONE PMD |
+---------------------------+--------------------------------------------------+
| pmd_devmap | Tests a ZONE_DEVICE mapped PMD |
+---------------------------+--------------------------------------------------+
| pmd_soft_dirty | Tests a soft dirty PMD |
+---------------------------+--------------------------------------------------+
| pmd_swp_soft_dirty | Tests a soft dirty swapped PMD |
@ -177,8 +173,6 @@ PUD Page Table Helpers
+---------------------------+--------------------------------------------------+
| pud_write | Tests a writable PUD |
+---------------------------+--------------------------------------------------+
| pud_devmap | Tests a ZONE_DEVICE mapped PUD |
+---------------------------+--------------------------------------------------+
| pud_mkyoung | Creates a young PUD |
+---------------------------+--------------------------------------------------+
| pud_mkold | Creates an old PUD |
@ -242,13 +236,13 @@ SWAP Page Table Helpers
========================
+---------------------------+--------------------------------------------------+
| __pte_to_swp_entry | Creates a swapped entry (arch) from a mapped PTE |
| __pte_to_swp_entry | Creates a swp_entry_t (arch) from a swap PTE |
+---------------------------+--------------------------------------------------+
| __swp_to_pte_entry | Creates a mapped PTE from a swapped entry (arch) |
| __swp_entry_to_pte | Creates a swap PTE from a swp_entry_t (arch) |
+---------------------------+--------------------------------------------------+
| __pmd_to_swp_entry | Creates a swapped entry (arch) from a mapped PMD |
| __pmd_to_swp_entry | Creates a swp_entry_t (arch) from a swap PMD |
+---------------------------+--------------------------------------------------+
| __swp_to_pmd_entry | Creates a mapped PMD from a swapped entry (arch) |
| __swp_entry_to_pmd | Creates a swap PMD from a swp_entry_t (arch) |
+---------------------------+--------------------------------------------------+
| is_migration_entry | Tests a migration (read or write) swapped entry |
+-------------------------------+----------------------------------------------+


@ -452,9 +452,9 @@ that supports each action are as below.
- ``lru_deprio``: Deprioritize the region on its LRU lists.
Supported by ``paddr`` operations set.
- ``migrate_hot``: Migrate the regions prioritizing warmer regions.
Supported by ``paddr`` operations set.
Supported by ``vaddr``, ``fvaddr`` and ``paddr`` operations set.
- ``migrate_cold``: Migrate the regions prioritizing colder regions.
Supported by ``paddr`` operations set.
Supported by ``vaddr``, ``fvaddr`` and ``paddr`` operations set.
- ``stat``: Do nothing but count the statistics.
Supported by all operations sets.


@ -7,9 +7,9 @@ The DAMON subsystem covers the files that are listed in 'DATA ACCESS MONITOR'
section of 'MAINTAINERS' file.
The mailing lists for the subsystem are damon@lists.linux.dev and
linux-mm@kvack.org. Patches should be made against the `mm-unstable tree
<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ whenever possible and posted
to the mailing lists.
linux-mm@kvack.org. Patches should be made against the `mm-new tree
<https://git.kernel.org/akpm/mm/h/mm-new>`_ whenever possible and posted to the
mailing lists.
SCM Trees
---------
@ -17,17 +17,19 @@ SCM Trees
There are multiple Linux trees for DAMON development. Patches under
development or testing are queued in `damon/next
<https://git.kernel.org/sj/h/damon/next>`_ by the DAMON maintainer.
Sufficiently reviewed patches will be queued in `mm-unstable
<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ by the memory management
subsystem maintainer. After more sufficient tests, the patches will be queued
in `mm-stable <https://git.kernel.org/akpm/mm/h/mm-stable>`_, and finally
pull-requested to the mainline by the memory management subsystem maintainer.
Sufficiently reviewed patches will be queued in `mm-new
<https://git.kernel.org/akpm/mm/h/mm-new>`_ by the memory management subsystem
maintainer. As further testing is completed, the patches will move to
`mm-unstable <https://git.kernel.org/akpm/mm/h/mm-unstable>`_ and then to
`mm-stable <https://git.kernel.org/akpm/mm/h/mm-stable>`_, and will finally be
pull-requested to the mainline by the memory management subsystem maintainer.
Note again the patches for `mm-unstable tree
<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ are queued by the memory
management subsystem maintainer. If the patches requires some patches in
`damon/next tree <https://git.kernel.org/sj/h/damon/next>`_ which not yet merged
in mm-unstable, please make sure the requirement is clearly specified.
Note again the patches for `mm-new tree
<https://git.kernel.org/akpm/mm/h/mm-new>`_ are queued by the memory management
subsystem maintainer. If the patches require some patches in `damon/next tree
<https://git.kernel.org/sj/h/damon/next>`_ which are not yet merged in mm-new,
please make sure the requirement is clearly specified.
Submit checklist addendum
-------------------------
@ -53,8 +55,9 @@ Further doing below and putting the results will be helpful.
Key cycle dates
---------------
Patches can be sent anytime. Key cycle dates of the `mm-unstable
<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ and `mm-stable
Patches can be sent anytime. Key cycle dates of the `mm-new
<https://git.kernel.org/akpm/mm/h/mm-new>`_, `mm-unstable
<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ and `mm-stable
<https://git.kernel.org/akpm/mm/h/mm-stable>`_ trees depend on the memory
management subsystem maintainer.


@ -146,18 +146,33 @@ Steps:
18. The new page is moved to the LRU and can be scanned by the swapper,
etc. again.
Non-LRU page migration
======================
movable_ops page migration
==========================
Although migration originally aimed for reducing the latency of memory
accesses for NUMA, compaction also uses migration to create high-order
pages. For compaction purposes, it is also useful to be able to move
non-LRU pages, such as zsmalloc and virtio-balloon pages.
Selected typed, non-folio pages (e.g., pages inflated in a memory balloon,
zsmalloc pages) can be migrated using the movable_ops migration framework.
If a driver wants to make its pages movable, it should define a struct
movable_operations. It then needs to call __SetPageMovable() on each
page that it may be able to move. This uses the ``page->mapping`` field,
so this field is not available for the driver to use for other purposes.
The "struct movable_operations" provide callbacks specific to a page type
for isolating, migrating and un-isolating (putback) these pages.
Once a page is indicated as having movable_ops, that condition must not
change until the page has been freed back to the buddy. This includes not
changing/clearing the page type and not changing/clearing the
PG_movable_ops page flag.
Arbitrary drivers cannot currently make use of this framework, as it
requires:
(a) a page type
(b) indicating them as possibly having movable_ops in page_has_movable_ops()
based on the page type
(c) returning the movable_ops from page_movable_ops() based on the page
type
(d) not reusing the PG_movable_ops and PG_movable_ops_isolated page flags
for other purposes
For example, balloon drivers can make use of this framework through the
balloon-compaction infrastructure residing in the core kernel.
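For orientation, a skeleton of the callback structure a driver fills in might
look as below (a sketch; the callback signatures follow
include/linux/migrate.h, and the bodies are placeholders)::

	#include <linux/migrate.h>

	static bool demo_isolate_page(struct page *page, isolate_mode_t mode)
	{
		/* Pin driver metadata so the page cannot go away meanwhile. */
		return true;
	}

	static int demo_migrate_page(struct page *dst, struct page *src,
				     enum migrate_mode mode)
	{
		/* Copy contents and driver metadata from src to dst. */
		return MIGRATEPAGE_SUCCESS;
	}

	static void demo_putback_page(struct page *page)
	{
		/* Undo isolation after a failed or aborted migration. */
	}

	static const struct movable_operations demo_mops = {
		.isolate_page	= demo_isolate_page,
		.migrate_page	= demo_migrate_page,
		.putback_page	= demo_putback_page,
	};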
Monitoring Migration
=====================


@ -584,7 +584,7 @@ Compaction control
``compact_blockskip_flush``
Set to true when compaction migration scanner and free scanner meet, which
means the ``PB_migrate_skip`` bits should be cleared.
means the ``PB_compact_skip`` bits should be cleared.
``contiguous``
Set to true when the zone is contiguous (in other words, no hole).


@ -303,7 +303,9 @@ There are four key operations typically performed on page tables:
1. **Traversing** page tables - Simply reading page tables in order to traverse
them. This only requires that the VMA is kept stable, so a lock which
establishes this suffices for traversal (there are also lockless variants
which eliminate even this requirement, such as :c:func:`!gup_fast`).
which eliminate even this requirement, such as :c:func:`!gup_fast`). There is
also a special case of page table traversal for non-VMA regions which we
consider separately below.
2. **Installing** page table mappings - Whether creating a new mapping or
modifying an existing one in such a way as to change its identity. This
requires that the VMA is kept stable via an mmap or VMA lock (explicitly not
@ -335,15 +337,13 @@ ahead and perform these operations on page tables (though internally, kernel
operations that perform writes also acquire internal page table locks to
serialise - see the page table implementation detail section for more details).
.. note:: We free empty PTE tables on zap under the RCU lock - this does not
change the aforementioned locking requirements around zapping.
When **installing** page table entries, the mmap or VMA lock must be held to
keep the VMA stable. We explore why this is in the page table locking details
section below.
.. warning:: Page tables are normally only traversed in regions covered by VMAs.
If you want to traverse page tables in areas that might not be
covered by VMAs, heavier locking is required.
See :c:func:`!walk_page_range_novma` for details.
**Freeing** page tables is an entirely internal memory management operation and
has special requirements (see the page freeing section below for more details).
@ -355,6 +355,44 @@ has special requirements (see the page freeing section below for more details).
from the reverse mappings, but no other VMAs can be permitted to be
accessible and span the specified range.
Traversing non-VMA page tables
------------------------------
We've focused above on traversal of page tables belonging to VMAs. It is also
possible to traverse page tables which are not represented by VMAs.
Kernel page table mappings are generally managed by whatever part of the
kernel established them, and the aforementioned locking rules do not apply -
for instance, vmalloc has its own set of locks which are utilised for
establishing and tearing down its page tables.
However, for convenience we provide the :c:func:`!walk_kernel_page_table_range`
function which is synchronised via the mmap lock on the :c:macro:`!init_mm`
kernel instantiation of the :c:struct:`!struct mm_struct` metadata object.
If an operation requires exclusive access, a write lock is used, but if not, a
read lock suffices - we assert only that at least a read lock has been acquired.
Since, aside from vmalloc and memory hot plug, kernel page tables are not
torn down all that often, this usually suffices; however, any caller of this
functionality must ensure that any additionally required locks are acquired
in advance.
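A sketch of such a traversal (hedged; the exact
:c:func:`!walk_kernel_page_table_range` signature should be checked against
mm/pagewalk.c) might count present PTEs in a kernel virtual range::

	#include <linux/mmap_lock.h>
	#include <linux/pagewalk.h>

	static int count_pte(pte_t *pte, unsigned long addr,
			     unsigned long next, struct mm_walk *walk)
	{
		unsigned long *count = walk->private;

		if (pte_present(ptep_get(pte)))
			(*count)++;
		return 0;
	}

	static const struct mm_walk_ops count_ops = {
		.pte_entry = count_pte,
	};

	static unsigned long count_kernel_ptes(unsigned long start,
					       unsigned long end)
	{
		unsigned long count = 0;

		/* At least the mmap read lock on init_mm, per the text above. */
		mmap_read_lock(&init_mm);
		walk_kernel_page_table_range(start, end, &count_ops, NULL,
					     &count);
		mmap_read_unlock(&init_mm);
		return count;
	}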
We also permit a truly unusual case: the traversal of non-VMA ranges in
**userland**, as provided for by :c:func:`!walk_page_range_debug`.
This has only one user - the general page table dumping logic (implemented in
:c:macro:`!mm/ptdump.c`) - which seeks to expose all mappings for debug purposes
even if they are highly unusual (possibly architecture-specific) and are not
backed by a VMA.
We must take great care in this case, as the :c:func:`!munmap` implementation
detaches VMAs under an mmap write lock before tearing down page tables under a
downgraded mmap read lock.
This means such a traversal could race with munmap, and thus an mmap
**write** lock is required.
Lock ordering
-------------
@ -461,6 +499,10 @@ Locking Implementation Details
Page table locking details
--------------------------
.. note:: This section explores page table locking requirements for page tables
encompassed by a VMA. See the above section on non-VMA page table
traversal for details on how we handle that case.
In addition to the locks described in the terminology section above, we have
additional locks dedicated to page tables:


@ -62,7 +62,6 @@ a pointer to struct memory_notify::
struct memory_notify {
unsigned long start_pfn;
unsigned long nr_pages;
int status_change_nid_normal;
int status_change_nid;
}
@ -70,8 +69,6 @@ a pointer to struct memory_notify::
- nr_pages is the number of pages of online/offline memory.
- status_change_nid_normal is the node id set when N_NORMAL_MEMORY of the
nodemask is (to be) set/cleared; if this is -1, the nodemask status is not changed.
- status_change_nid is the node id set when N_MEMORY of the nodemask is (to
be) set/cleared. It means a new (previously memoryless) node gains new memory
by going online, while a node loses all its memory


@ -6269,9 +6269,11 @@ L: cgroups@vger.kernel.org
L: linux-mm@kvack.org
S: Maintained
F: include/linux/memcontrol.h
F: include/linux/page_counter.h
F: mm/memcontrol.c
F: mm/memcontrol-v1.c
F: mm/memcontrol-v1.h
F: mm/page_counter.c
F: mm/swap_cgroup.c
F: samples/cgroup/*
F: tools/testing/selftests/cgroup/memcg_protection.m
@ -15928,6 +15930,8 @@ F: Documentation/admin-guide/mm/memory-hotplug.rst
F: Documentation/core-api/memory-hotplug.rst
F: drivers/base/memory.c
F: include/linux/memory_hotplug.h
F: include/linux/memremap.h
F: mm/memremap.c
F: mm/memory_hotplug.c
F: tools/testing/selftests/memory-hotplug/
@ -15938,23 +15942,8 @@ S: Maintained
W: http://www.linux-mm.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
T: quilt git://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new
F: Documentation/admin-guide/mm/
F: Documentation/mm/
F: include/linux/gfp.h
F: include/linux/gfp_types.h
F: include/linux/memory_hotplug.h
F: include/linux/memory-tiers.h
F: include/linux/mempolicy.h
F: include/linux/mempool.h
F: include/linux/memremap.h
F: include/linux/mmzone.h
F: include/linux/mmu_notifier.h
F: include/linux/pagewalk.h
F: include/trace/events/ksm.h
F: mm/
F: tools/mm/
F: tools/testing/selftests/mm/
N: include/linux/page[-_]*
MEMORY MANAGEMENT - CORE
M: Andrew Morton <akpm@linux-foundation.org>
@ -15969,18 +15958,40 @@ L: linux-mm@kvack.org
S: Maintained
W: http://www.linux-mm.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
F: include/linux/gfp.h
F: include/linux/gfp_types.h
F: include/linux/highmem.h
F: include/linux/memory.h
F: include/linux/mm.h
F: include/linux/mm_*.h
F: include/linux/mmzone.h
F: include/linux/mmdebug.h
F: include/linux/mmu_notifier.h
F: include/linux/pagewalk.h
F: include/linux/pgtable.h
F: include/linux/ptdump.h
F: include/linux/vmpressure.h
F: include/linux/vmstat.h
F: kernel/fork.c
F: mm/Kconfig
F: mm/debug.c
F: mm/folio-compat.c
F: mm/highmem.c
F: mm/init-mm.c
F: mm/internal.h
F: mm/maccess.c
F: mm/memory.c
F: mm/mmu_notifier.c
F: mm/mmzone.c
F: mm/pagewalk.c
F: mm/pgtable-generic.c
F: mm/ptdump.c
F: mm/sparse-vmemmap.c
F: mm/sparse.c
F: mm/util.c
F: mm/vmpressure.c
F: mm/vmstat.c
N: include/linux/page[-_]*
MEMORY MANAGEMENT - EXECMEM
M: Andrew Morton <akpm@linux-foundation.org>
@ -16020,6 +16031,7 @@ F: Documentation/mm/ksm.rst
F: include/linux/ksm.h
F: include/trace/events/ksm.h
F: mm/ksm.c
F: mm/mm_slot.h
MEMORY MANAGEMENT - MEMORY POLICY AND MIGRATION
M: Andrew Morton <akpm@linux-foundation.org>
@ -16037,11 +16049,49 @@ S: Maintained
W: http://www.linux-mm.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
F: include/linux/mempolicy.h
F: include/uapi/linux/mempolicy.h
F: include/linux/migrate.h
F: include/linux/migrate_mode.h
F: mm/mempolicy.c
F: mm/migrate.c
F: mm/migrate_device.c
MEMORY MANAGEMENT - MISC
M: Andrew Morton <akpm@linux-foundation.org>
M: David Hildenbrand <david@redhat.com>
R: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
R: Liam R. Howlett <Liam.Howlett@oracle.com>
R: Vlastimil Babka <vbabka@suse.cz>
R: Mike Rapoport <rppt@kernel.org>
R: Suren Baghdasaryan <surenb@google.com>
R: Michal Hocko <mhocko@suse.com>
L: linux-mm@kvack.org
S: Maintained
W: http://www.linux-mm.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
F: Documentation/admin-guide/mm/
F: Documentation/mm/
F: include/linux/cma.h
F: include/linux/dmapool.h
F: include/linux/ioremap.h
F: include/linux/memory-tiers.h
F: include/linux/page_idle.h
F: mm/backing-dev.c
F: mm/cma.c
F: mm/cma_debug.c
F: mm/cma_sysfs.c
F: mm/dmapool.c
F: mm/dmapool_test.c
F: mm/early_ioremap.c
F: mm/fadvise.c
F: mm/ioremap.c
F: mm/mapping_dirty_helpers.c
F: mm/memory-tiers.c
F: mm/page_idle.c
F: mm/pgalloc-track.h
F: mm/process_vm_access.c
F: tools/testing/selftests/mm/
MEMORY MANAGEMENT - NUMA MEMBLOCKS AND NUMA EMULATION
M: Andrew Morton <akpm@linux-foundation.org>
M: Mike Rapoport <rppt@kernel.org>
@ -16078,6 +16128,7 @@ F: include/linux/gfp.h
F: include/linux/page-isolation.h
F: mm/compaction.c
F: mm/debug_page_alloc.c
F: mm/debug_page_ref.c
F: mm/fail_page_alloc.c
F: mm/page_alloc.c
F: mm/page_ext.c
@ -16086,8 +16137,10 @@ F: mm/page_isolation.c
F: mm/page_owner.c
F: mm/page_poison.c
F: mm/page_reporting.c
F: mm/page_reporting.h
F: mm/show_mem.c
F: mm/shuffle.c
F: mm/shuffle.h
MEMORY MANAGEMENT - RECLAIM
M: Andrew Morton <akpm@linux-foundation.org>
@ -16165,6 +16218,7 @@ F: include/linux/khugepaged.h
F: include/trace/events/huge_memory.h
F: mm/huge_memory.c
F: mm/khugepaged.c
F: mm/mm_slot.h
F: tools/testing/selftests/mm/khugepaged.c
F: tools/testing/selftests/mm/split_huge_page_test.c
F: tools/testing/selftests/mm/transhuge-stress.c
@ -16207,6 +16261,7 @@ S: Maintained
W: http://www.linux-mm.org
T: git git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
F: include/trace/events/mmap.h
F: mm/interval_tree.c
F: mm/mincore.c
F: mm/mlock.c
F: mm/mmap.c
@ -19029,6 +19084,11 @@ F: Documentation/mm/page_table_check.rst
F: include/linux/page_table_check.h
F: mm/page_table_check.c
PAGE STATE DEBUG SCRIPT
M: Ye Liu <liuye@kylinos.cn>
S: Maintained
F: tools/mm/show_page_info.py
PANASONIC LAPTOP ACPI EXTRAS DRIVER
M: Kenneth Chan <kenneth.t.chan@gmail.com>
L: platform-driver-x86@vger.kernel.org
@ -19665,6 +19725,7 @@ F: arch/*/include/asm/percpu.h
F: include/linux/percpu*.h
F: lib/percpu*.c
F: mm/percpu*.c
F: mm/percpu-internal.h
PER-TASK DELAY ACCOUNTING
M: Balbir Singh <bsingharora@gmail.com>
@ -22866,7 +22927,9 @@ R: Muchun Song <muchun.song@linux.dev>
L: linux-mm@kvack.org
S: Maintained
F: Documentation/admin-guide/mm/shrinker_debugfs.rst
F: include/linux/list_lru.h
F: include/linux/shrinker.h
F: mm/list_lru.c
F: mm/shrinker.c
F: mm/shrinker_debug.c
@ -27700,6 +27763,7 @@ L: linux-mm@kvack.org
S: Maintained
F: Documentation/mm/zsmalloc.rst
F: include/linux/zsmalloc.h
F: mm/zpdesc.h
F: mm/zsmalloc.c
ZSTD


@ -7,6 +7,7 @@ config ALPHA
select ARCH_HAS_DMA_OPS if PCI
select ARCH_MIGHT_HAVE_PC_PARPORT
select ARCH_MIGHT_HAVE_PC_SERIO
select ARCH_MODULE_NEEDS_WEAK_PER_CPU if SMP
select ARCH_NO_PREEMPT
select ARCH_NO_SG_CHAIN
select ARCH_USE_CMPXCHG_LOCKREF


@ -9,10 +9,9 @@
* way above 4G.
*
* Always use weak definitions for percpu variables in modules.
* Therefore, we have enabled CONFIG_ARCH_MODULE_NEEDS_WEAK_PER_CPU
* in the Kconfig.
*/
#if defined(MODULE) && defined(CONFIG_SMP)
#define ARCH_NEEDS_WEAK_PER_CPU
#endif
#include <asm-generic/percpu.h>


@ -268,7 +268,7 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
int sig, code;
vm_fault_t fault;
unsigned int flags = FAULT_FLAG_DEFAULT;
unsigned long vm_flags = VM_ACCESS_FLAGS;
vm_flags_t vm_flags = VM_ACCESS_FLAGS;
if (kprobe_page_fault(regs, fsr))
return 0;


@ -42,7 +42,6 @@ config ARM64
select ARCH_HAS_NONLEAF_PMD_YOUNG if ARM64_HAFT
select ARCH_HAS_PREEMPT_LAZY
select ARCH_HAS_PTDUMP
select ARCH_HAS_PTE_DEVMAP
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_HW_PTE_YOUNG
select ARCH_HAS_SETUP_DMA_OPS


@ -11,10 +11,10 @@
#include <linux/shmem_fs.h>
#include <linux/types.h>
static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
static inline vm_flags_t arch_calc_vm_prot_bits(unsigned long prot,
unsigned long pkey)
{
unsigned long ret = 0;
vm_flags_t ret = 0;
if (system_supports_bti() && (prot & PROT_BTI))
ret |= VM_ARM64_BTI;
@ -34,8 +34,8 @@ static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
}
#define arch_calc_vm_prot_bits(prot, pkey) arch_calc_vm_prot_bits(prot, pkey)
static inline unsigned long arch_calc_vm_flag_bits(struct file *file,
unsigned long flags)
static inline vm_flags_t arch_calc_vm_flag_bits(struct file *file,
unsigned long flags)
{
/*
* Only allow MTE on anonymous mappings as these are guaranteed to be
@ -68,7 +68,7 @@ static inline bool arch_validate_prot(unsigned long prot,
}
#define arch_validate_prot(prot, addr) arch_validate_prot(prot, addr)
static inline bool arch_validate_flags(unsigned long vm_flags)
static inline bool arch_validate_flags(vm_flags_t vm_flags)
{
if (system_supports_mte()) {
/*


@ -17,7 +17,6 @@
#define PTE_SWP_EXCLUSIVE (_AT(pteval_t, 1) << 2) /* only for swp ptes */
#define PTE_DIRTY (_AT(pteval_t, 1) << 55)
#define PTE_SPECIAL (_AT(pteval_t, 1) << 56)
#define PTE_DEVMAP (_AT(pteval_t, 1) << 57)
/*
* PTE_PRESENT_INVALID=1 & PTE_VALID=0 indicates that the pte's fields should be


@ -190,7 +190,6 @@ static inline pteval_t __phys_to_pte_val(phys_addr_t phys)
#define pte_user(pte) (!!(pte_val(pte) & PTE_USER))
#define pte_user_exec(pte) (!(pte_val(pte) & PTE_UXN))
#define pte_cont(pte) (!!(pte_val(pte) & PTE_CONT))
#define pte_devmap(pte) (!!(pte_val(pte) & PTE_DEVMAP))
#define pte_tagged(pte) ((pte_val(pte) & PTE_ATTRINDX_MASK) == \
PTE_ATTRINDX(MT_NORMAL_TAGGED))
@ -372,11 +371,6 @@ static inline pmd_t pmd_mkcont(pmd_t pmd)
return __pmd(pmd_val(pmd) | PMD_SECT_CONT);
}
static inline pte_t pte_mkdevmap(pte_t pte)
{
return set_pte_bit(pte, __pgprot(PTE_DEVMAP | PTE_SPECIAL));
}
#ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP
static inline int pte_uffd_wp(pte_t pte)
{
@ -653,14 +647,6 @@ static inline pmd_t pmd_mkhuge(pmd_t pmd)
return __pmd((pmd_val(pmd) & ~mask) | val);
}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define pmd_devmap(pmd) pte_devmap(pmd_pte(pmd))
#endif
static inline pmd_t pmd_mkdevmap(pmd_t pmd)
{
return pte_pmd(set_pte_bit(pmd_pte(pmd), __pgprot(PTE_DEVMAP)));
}
#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
#define pmd_special(pte) (!!((pmd_val(pte) & PTE_SPECIAL)))
static inline pmd_t pmd_mkspecial(pmd_t pmd)
@ -1302,16 +1288,6 @@ static inline int pmdp_set_access_flags(struct vm_area_struct *vma,
return __ptep_set_access_flags(vma, address, (pte_t *)pmdp,
pmd_pte(entry), dirty);
}
static inline int pud_devmap(pud_t pud)
{
return 0;
}
static inline int pgd_devmap(pgd_t pgd)
{
return 0;
}
#endif
#ifdef CONFIG_PAGE_TABLE_CHECK
@@ -1643,6 +1619,14 @@ static inline void update_mmu_cache_range(struct vm_fault *vmf,
*/
#define arch_wants_old_prefaulted_pte cpu_has_hw_af
/*
* Request exec memory is read into pagecache in at least 64K folios. This size
* can be contpte-mapped when 4K base pages are in use (16 pages into 1 iTLB
* entry), and HPA can coalesce it (4 pages into 1 TLB entry) when 16K base
* pages are in use.
*/
#define exec_folio_order() ilog2(SZ_64K >> PAGE_SHIFT)
static inline bool pud_sect_supported(void)
{
return PAGE_SIZE == SZ_4K;
@@ -1659,6 +1643,16 @@ extern void ptep_modify_prot_commit(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep,
pte_t old_pte, pte_t new_pte);
#define modify_prot_start_ptes modify_prot_start_ptes
extern pte_t modify_prot_start_ptes(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep,
unsigned int nr);
#define modify_prot_commit_ptes modify_prot_commit_ptes
extern void modify_prot_commit_ptes(struct vm_area_struct *vma, unsigned long addr,
pte_t *ptep, pte_t old_pte, pte_t pte,
unsigned int nr);
#ifdef CONFIG_ARM64_CONTPTE
/*

@@ -322,17 +322,6 @@ static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
return true;
}
/*
* If mprotect/munmap/etc occurs during TLB batched flushing, we need to ensure
* all the previously issued TLBIs targeting mm have completed. But since we
* can be executing on a remote CPU, a DSB cannot guarantee this like it can
* for arch_tlbbatch_flush(). Our only option is to flush the entire mm.
*/
static inline void arch_flush_tlb_batched_pending(struct mm_struct *mm)
{
flush_tlb_mm(mm);
}
/*
* To support TLB batched flush for multiple pages unmapping, we only send
* the TLBI for each page in arch_tlbbatch_add_pending() and wait for the

@@ -555,7 +555,7 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
const struct fault_info *inf;
struct mm_struct *mm = current->mm;
vm_fault_t fault;
unsigned long vm_flags;
vm_flags_t vm_flags;
unsigned int mm_flags = FAULT_FLAG_DEFAULT;
unsigned long addr = untagged_addr(far);
struct vm_area_struct *vma;

@@ -81,7 +81,7 @@ static int __init adjust_protection_map(void)
}
arch_initcall(adjust_protection_map);
pgprot_t vm_get_page_prot(unsigned long vm_flags)
pgprot_t vm_get_page_prot(vm_flags_t vm_flags)
{
ptdesc_t prot;

@@ -26,6 +26,7 @@
#include <linux/set_memory.h>
#include <linux/kfence.h>
#include <linux/pkeys.h>
#include <linux/mm_inline.h>
#include <asm/barrier.h>
#include <asm/cputype.h>
@@ -720,7 +721,7 @@ void mark_rodata_ro(void)
static void __init declare_vma(struct vm_struct *vma,
void *va_start, void *va_end,
unsigned long vm_flags)
vm_flags_t vm_flags)
{
phys_addr_t pa_start = __pa_symbol(va_start);
unsigned long size = va_end - va_start;
@@ -1524,24 +1525,41 @@ static int __init prevent_bootmem_remove_init(void)
early_initcall(prevent_bootmem_remove_init);
#endif
pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)
pte_t modify_prot_start_ptes(struct vm_area_struct *vma, unsigned long addr,
pte_t *ptep, unsigned int nr)
{
pte_t pte = get_and_clear_full_ptes(vma->vm_mm, addr, ptep, nr, /* full = */ 0);
if (alternative_has_cap_unlikely(ARM64_WORKAROUND_2645198)) {
/*
* Break-before-make (BBM) is required for all user space mappings
* when the permission changes from executable to non-executable
* in cases where cpu is affected with errata #2645198.
*/
if (pte_user_exec(ptep_get(ptep)))
return ptep_clear_flush(vma, addr, ptep);
if (pte_accessible(vma->vm_mm, pte) && pte_user_exec(pte))
__flush_tlb_range(vma, addr, nr * PAGE_SIZE,
PAGE_SIZE, true, 3);
}
return ptep_get_and_clear(vma->vm_mm, addr, ptep);
return pte;
}
pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep)
{
return modify_prot_start_ptes(vma, addr, ptep, 1);
}
void modify_prot_commit_ptes(struct vm_area_struct *vma, unsigned long addr,
pte_t *ptep, pte_t old_pte, pte_t pte,
unsigned int nr)
{
set_ptes(vma->vm_mm, addr, ptep, pte, nr);
}
void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep,
pte_t old_pte, pte_t pte)
{
set_pte_at(vma->vm_mm, addr, ptep, pte);
modify_prot_commit_ptes(vma, addr, ptep, old_pte, pte, 1);
}
/*

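The two helpers above pair as start/commit around a permission change. A sketch of a caller batching nr contiguous PTEs (illustrative only; locking and validation elided):

static void change_pte_batch(struct vm_area_struct *vma, unsigned long addr,
                             pte_t *ptep, unsigned int nr, pgprot_t newprot)
{
        /* One get-and-clear for the whole batch; at most one TLB flush
         * for the errata case handled above. */
        pte_t old_pte = modify_prot_start_ptes(vma, addr, ptep, nr);
        pte_t new_pte = pte_modify(old_pte, newprot);

        /* One batched set_ptes() to commit. */
        modify_prot_commit_ptes(vma, addr, ptep, old_pte, new_pte, nr);
}
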
@@ -1,6 +1,5 @@
// SPDX-License-Identifier: GPL-2.0
#include <linux/debugfs.h>
#include <linux/memory_hotplug.h>
#include <linux/seq_file.h>
#include <asm/ptdump.h>
@@ -9,9 +8,7 @@ static int ptdump_show(struct seq_file *m, void *v)
{
struct ptdump_info *info = m->private;
get_online_mems();
ptdump_walk(m, info);
put_online_mems();
return 0;
}
DEFINE_SHOW_ATTRIBUTE(ptdump);

@@ -24,7 +24,6 @@ config LOONGARCH
select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PREEMPT_LAZY
select ARCH_HAS_PTE_DEVMAP
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_SET_MEMORY
select ARCH_HAS_SET_DIRECT_MAP

@@ -10,20 +10,6 @@
uint64_t pmd_to_entrylo(unsigned long pmd_val);
#define __HAVE_ARCH_PREPARE_HUGEPAGE_RANGE
static inline int prepare_hugepage_range(struct file *file,
unsigned long addr,
unsigned long len)
{
unsigned long task_size = STACK_TOP;
if (len > task_size)
return -ENOMEM;
if (task_size - len < addr)
return -EINVAL;
return 0;
}
#define __HAVE_ARCH_HUGE_PTE_CLEAR
static inline void huge_pte_clear(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, unsigned long sz)

@@ -22,7 +22,6 @@
#define _PAGE_PFN_SHIFT 12
#define _PAGE_SWP_EXCLUSIVE_SHIFT 23
#define _PAGE_PFN_END_SHIFT 48
#define _PAGE_DEVMAP_SHIFT 59
#define _PAGE_PRESENT_INVALID_SHIFT 60
#define _PAGE_NO_READ_SHIFT 61
#define _PAGE_NO_EXEC_SHIFT 62
@@ -36,7 +35,6 @@
#define _PAGE_MODIFIED (_ULCAST_(1) << _PAGE_MODIFIED_SHIFT)
#define _PAGE_PROTNONE (_ULCAST_(1) << _PAGE_PROTNONE_SHIFT)
#define _PAGE_SPECIAL (_ULCAST_(1) << _PAGE_SPECIAL_SHIFT)
#define _PAGE_DEVMAP (_ULCAST_(1) << _PAGE_DEVMAP_SHIFT)
/* We borrow bit 23 to store the exclusive marker in swap PTEs. */
#define _PAGE_SWP_EXCLUSIVE (_ULCAST_(1) << _PAGE_SWP_EXCLUSIVE_SHIFT)
@@ -76,8 +74,8 @@
#define __READABLE (_PAGE_VALID)
#define __WRITEABLE (_PAGE_DIRTY | _PAGE_WRITE)
#define _PAGE_CHG_MASK (_PAGE_MODIFIED | _PAGE_SPECIAL | _PAGE_DEVMAP | _PFN_MASK | _CACHE_MASK | _PAGE_PLV)
#define _HPAGE_CHG_MASK (_PAGE_MODIFIED | _PAGE_SPECIAL | _PAGE_DEVMAP | _PFN_MASK | _CACHE_MASK | _PAGE_PLV | _PAGE_HUGE)
#define _PAGE_CHG_MASK (_PAGE_MODIFIED | _PAGE_SPECIAL | _PFN_MASK | _CACHE_MASK | _PAGE_PLV)
#define _HPAGE_CHG_MASK (_PAGE_MODIFIED | _PAGE_SPECIAL | _PFN_MASK | _CACHE_MASK | _PAGE_PLV | _PAGE_HUGE)
#define PAGE_NONE __pgprot(_PAGE_PROTNONE | _PAGE_NO_READ | \
_PAGE_USER | _CACHE_CC)

@@ -409,9 +409,6 @@ static inline int pte_special(pte_t pte) { return pte_val(pte) & _PAGE_SPECIAL;
static inline pte_t pte_mkspecial(pte_t pte) { pte_val(pte) |= _PAGE_SPECIAL; return pte; }
#endif /* CONFIG_ARCH_HAS_PTE_SPECIAL */
static inline int pte_devmap(pte_t pte) { return !!(pte_val(pte) & _PAGE_DEVMAP); }
static inline pte_t pte_mkdevmap(pte_t pte) { pte_val(pte) |= _PAGE_DEVMAP; return pte; }
#define pte_accessible pte_accessible
static inline unsigned long pte_accessible(struct mm_struct *mm, pte_t a)
{
@@ -540,17 +537,6 @@ static inline pmd_t pmd_mkyoung(pmd_t pmd)
return pmd;
}
static inline int pmd_devmap(pmd_t pmd)
{
return !!(pmd_val(pmd) & _PAGE_DEVMAP);
}
static inline pmd_t pmd_mkdevmap(pmd_t pmd)
{
pmd_val(pmd) |= _PAGE_DEVMAP;
return pmd;
}
static inline struct page *pmd_page(pmd_t pmd)
{
if (pmd_trans_huge(pmd))
@@ -606,11 +592,6 @@ static inline long pmd_protnone(pmd_t pmd)
#define pmd_leaf(pmd) ((pmd_val(pmd) & _PAGE_HUGE) != 0)
#define pud_leaf(pud) ((pud_val(pud) & _PAGE_HUGE) != 0)
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define pud_devmap(pud) (0)
#define pgd_devmap(pgd) (0)
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
/*
* We provide our own get_unmapped area to cope with the virtual aliasing
* constraints placed on us by the cache architecture.

@@ -118,7 +118,7 @@ static int __set_memory(unsigned long addr, int numpages, pgprot_t set_mask, pgp
return 0;
mmap_write_lock(&init_mm);
ret = walk_page_range_novma(&init_mm, start, end, &pageattr_ops, NULL, &masks);
ret = walk_kernel_page_table_range(start, end, &pageattr_ops, NULL, &masks);
mmap_write_unlock(&init_mm);
flush_tlb_kernel_range(start, end);

@@ -11,20 +11,6 @@
#include <asm/page.h>
#define __HAVE_ARCH_PREPARE_HUGEPAGE_RANGE
static inline int prepare_hugepage_range(struct file *file,
unsigned long addr,
unsigned long len)
{
unsigned long task_size = STACK_TOP;
if (len > task_size)
return -ENOMEM;
if (task_size - len < addr)
return -EINVAL;
return 0;
}
#define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR
static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
unsigned long addr, pte_t *ptep,

@@ -72,7 +72,7 @@ void *arch_dma_set_uncached(void *cpu_addr, size_t size)
* them and setting the cache-inhibit bit.
*/
mmap_write_lock(&init_mm);
error = walk_page_range_novma(&init_mm, va, va + size,
error = walk_kernel_page_table_range(va, va + size,
&set_nocache_walk_ops, NULL, NULL);
mmap_write_unlock(&init_mm);
@@ -87,7 +87,7 @@ void arch_dma_clear_uncached(void *cpu_addr, size_t size)
mmap_write_lock(&init_mm);
/* walk_page_range shouldn't be able to fail here */
WARN_ON(walk_page_range_novma(&init_mm, va, va + size,
WARN_ON(walk_kernel_page_table_range(va, va + size,
&clear_nocache_walk_ops, NULL, NULL));
mmap_write_unlock(&init_mm);
}

@@ -147,7 +147,6 @@ config PPC
select ARCH_HAS_PMEM_API
select ARCH_HAS_PREEMPT_LAZY
select ARCH_HAS_PTDUMP
select ARCH_HAS_PTE_DEVMAP if PPC_BOOK3S_64
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
select ARCH_HAS_SET_MEMORY

@@ -168,12 +168,6 @@ extern pmd_t hash__pmdp_huge_get_and_clear(struct mm_struct *mm,
extern int hash__has_transparent_hugepage(void);
#endif
static inline pmd_t hash__pmd_mkdevmap(pmd_t pmd)
{
BUG();
return pmd;
}
#endif /* !__ASSEMBLY__ */
#endif /* _ASM_POWERPC_BOOK3S_64_HASH_4K_H */

@@ -259,7 +259,7 @@ static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
*/
static inline int hash__pmd_trans_huge(pmd_t pmd)
{
return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE | _PAGE_DEVMAP)) ==
return !!((pmd_val(pmd) & (_PAGE_PTE | H_PAGE_THP_HUGE)) ==
(_PAGE_PTE | H_PAGE_THP_HUGE));
}
@@ -281,11 +281,6 @@ extern pmd_t hash__pmdp_huge_get_and_clear(struct mm_struct *mm,
extern int hash__has_transparent_hugepage(void);
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
static inline pmd_t hash__pmd_mkdevmap(pmd_t pmd)
{
return __pmd(pmd_val(pmd) | (_PAGE_PTE | H_PAGE_THP_HUGE | _PAGE_DEVMAP));
}
#endif /* __ASSEMBLY__ */
#endif /* _ASM_POWERPC_BOOK3S_64_HASH_64K_H */

@@ -88,7 +88,6 @@
#define _PAGE_SOFT_DIRTY _RPAGE_SW3 /* software: software dirty tracking */
#define _PAGE_SPECIAL _RPAGE_SW2 /* software: special page */
#define _PAGE_DEVMAP _RPAGE_SW1 /* software: ZONE_DEVICE page */
/*
* Drivers request for cache inhibited pte mapping using _PAGE_NO_CACHE
@@ -109,7 +108,7 @@
*/
#define _HPAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
_PAGE_ACCESSED | H_PAGE_THP_HUGE | _PAGE_PTE | \
_PAGE_SOFT_DIRTY | _PAGE_DEVMAP)
_PAGE_SOFT_DIRTY)
/*
* user access blocked by key
*/
@@ -123,7 +122,7 @@
*/
#define _PAGE_CHG_MASK (PTE_RPN_MASK | _PAGE_HPTEFLAGS | _PAGE_DIRTY | \
_PAGE_ACCESSED | _PAGE_SPECIAL | _PAGE_PTE | \
_PAGE_SOFT_DIRTY | _PAGE_DEVMAP)
_PAGE_SOFT_DIRTY)
/*
* We define 2 sets of base prot bits, one for basic pages (ie,
@@ -609,24 +608,6 @@ static inline pte_t pte_mkhuge(pte_t pte)
return pte;
}
static inline pte_t pte_mkdevmap(pte_t pte)
{
return __pte_raw(pte_raw(pte) | cpu_to_be64(_PAGE_SPECIAL | _PAGE_DEVMAP));
}
/*
* This is potentially called with a pmd as the argument, in which case it's not
* safe to check _PAGE_DEVMAP unless we also confirm that _PAGE_PTE is set.
* That's because the bit we use for _PAGE_DEVMAP is not reserved for software
* use in page directory entries (ie. non-ptes).
*/
static inline int pte_devmap(pte_t pte)
{
__be64 mask = cpu_to_be64(_PAGE_DEVMAP | _PAGE_PTE);
return (pte_raw(pte) & mask) == mask;
}
static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
{
/* FIXME!! check whether this need to be a conditional */
@@ -1379,36 +1360,6 @@ static inline bool arch_needs_pgtable_deposit(void)
}
extern void serialize_against_pte_lookup(struct mm_struct *mm);
static inline pmd_t pmd_mkdevmap(pmd_t pmd)
{
if (radix_enabled())
return radix__pmd_mkdevmap(pmd);
return hash__pmd_mkdevmap(pmd);
}
static inline pud_t pud_mkdevmap(pud_t pud)
{
if (radix_enabled())
return radix__pud_mkdevmap(pud);
BUG();
return pud;
}
static inline int pmd_devmap(pmd_t pmd)
{
return pte_devmap(pmd_pte(pmd));
}
static inline int pud_devmap(pud_t pud)
{
return pte_devmap(pud_pte(pud));
}
static inline int pgd_devmap(pgd_t pgd)
{
return 0;
}
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
#define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION

@@ -5,7 +5,7 @@
#include <asm/book3s/64/hash-pkey.h>
static inline u64 vmflag_to_pte_pkey_bits(u64 vm_flags)
static inline u64 vmflag_to_pte_pkey_bits(vm_flags_t vm_flags)
{
if (!mmu_has_feature(MMU_FTR_PKEY))
return 0x0UL;

@@ -264,7 +264,7 @@ static inline int radix__p4d_bad(p4d_t p4d)
static inline int radix__pmd_trans_huge(pmd_t pmd)
{
return (pmd_val(pmd) & (_PAGE_PTE | _PAGE_DEVMAP)) == _PAGE_PTE;
return (pmd_val(pmd) & _PAGE_PTE) == _PAGE_PTE;
}
static inline pmd_t radix__pmd_mkhuge(pmd_t pmd)
@@ -274,7 +274,7 @@ static inline pmd_t radix__pmd_mkhuge(pmd_t pmd)
static inline int radix__pud_trans_huge(pud_t pud)
{
return (pud_val(pud) & (_PAGE_PTE | _PAGE_DEVMAP)) == _PAGE_PTE;
return (pud_val(pud) & _PAGE_PTE) == _PAGE_PTE;
}
static inline pud_t radix__pud_mkhuge(pud_t pud)
@@ -315,16 +315,6 @@ static inline int radix__has_transparent_pud_hugepage(void)
}
#endif
static inline pmd_t radix__pmd_mkdevmap(pmd_t pmd)
{
return __pmd(pmd_val(pmd) | (_PAGE_PTE | _PAGE_DEVMAP));
}
static inline pud_t radix__pud_mkdevmap(pud_t pud)
{
return __pud(pud_val(pud) | (_PAGE_PTE | _PAGE_DEVMAP));
}
struct vmem_altmap;
struct dev_pagemap;
extern int __meminit radix__vmemmap_create_mapping(unsigned long start,

@@ -14,7 +14,7 @@
#include <asm/cpu_has_feature.h>
#include <asm/firmware.h>
static inline unsigned long arch_calc_vm_prot_bits(unsigned long prot,
static inline vm_flags_t arch_calc_vm_prot_bits(unsigned long prot,
unsigned long pkey)
{
#ifdef CONFIG_PPC_MEM_KEYS

@@ -30,9 +30,9 @@ extern u32 reserved_allocation_mask; /* bits set for reserved keys */
#endif
static inline u64 pkey_to_vmflag_bits(u16 pkey)
static inline vm_flags_t pkey_to_vmflag_bits(u16 pkey)
{
return (((u64)pkey << VM_PKEY_SHIFT) & ARCH_VM_PKEY_FLAGS);
return (((vm_flags_t)pkey << VM_PKEY_SHIFT) & ARCH_VM_PKEY_FLAGS);
}
static inline int vma_pkey(struct vm_area_struct *vma)

@@ -393,7 +393,7 @@ static int kvmppc_memslot_page_merge(struct kvm *kvm,
{
unsigned long gfn = memslot->base_gfn;
unsigned long end, start = gfn_to_hva(kvm, gfn);
unsigned long vm_flags;
vm_flags_t vm_flags;
int ret = 0;
struct vm_area_struct *vma;
int merge_flag = (merge) ? MADV_MERGEABLE : MADV_UNMERGEABLE;

@@ -54,7 +54,7 @@ int __hash_page_thp(unsigned long ea, unsigned long access, unsigned long vsid,
/*
* Make sure this is thp or devmap entry
*/
if (!(old_pmd & (H_PAGE_THP_HUGE | _PAGE_DEVMAP)))
if (!(old_pmd & H_PAGE_THP_HUGE))
return 0;
rflags = htab_convert_pte_flags(new_pmd, flags);

@@ -195,7 +195,7 @@ unsigned long hash__pmd_hugepage_update(struct mm_struct *mm, unsigned long addr
unsigned long old;
#ifdef CONFIG_DEBUG_VM
WARN_ON(!hash__pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
WARN_ON(!hash__pmd_trans_huge(*pmdp));
assert_spin_locked(pmd_lockptr(mm, pmdp));
#endif
@@ -227,7 +227,6 @@ pmd_t hash__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long addres
VM_BUG_ON(address & ~HPAGE_PMD_MASK);
VM_BUG_ON(pmd_trans_huge(*pmdp));
VM_BUG_ON(pmd_devmap(*pmdp));
pmd = *pmdp;
pmd_clear(pmdp);

@@ -74,7 +74,7 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid,
} while(!pte_xchg(ptep, __pte(old_pte), __pte(new_pte)));
/* Make sure this is a hugetlb entry */
if (old_pte & (H_PAGE_THP_HUGE | _PAGE_DEVMAP))
if (old_pte & H_PAGE_THP_HUGE)
return 0;
rflags = htab_convert_pte_flags(new_pte, flags);

@@ -62,7 +62,7 @@ int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address,
{
int changed;
#ifdef CONFIG_DEBUG_VM
WARN_ON(!pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
WARN_ON(!pmd_trans_huge(*pmdp));
assert_spin_locked(pmd_lockptr(vma->vm_mm, pmdp));
#endif
changed = !pmd_same(*(pmdp), entry);
@@ -82,7 +82,6 @@ int pudp_set_access_flags(struct vm_area_struct *vma, unsigned long address,
{
int changed;
#ifdef CONFIG_DEBUG_VM
WARN_ON(!pud_devmap(*pudp));
assert_spin_locked(pud_lockptr(vma->vm_mm, pudp));
#endif
changed = !pud_same(*(pudp), entry);
@@ -204,8 +203,8 @@ pmd_t pmdp_huge_get_and_clear_full(struct vm_area_struct *vma,
{
pmd_t pmd;
VM_BUG_ON(addr & ~HPAGE_PMD_MASK);
VM_BUG_ON((pmd_present(*pmdp) && !pmd_trans_huge(*pmdp) &&
!pmd_devmap(*pmdp)) || !pmd_present(*pmdp));
VM_BUG_ON((pmd_present(*pmdp) && !pmd_trans_huge(*pmdp)) ||
!pmd_present(*pmdp));
pmd = pmdp_huge_get_and_clear(vma->vm_mm, addr, pmdp);
/*
* if it not a fullmm flush, then we can possibly end up converting
@@ -223,8 +222,7 @@ pud_t pudp_huge_get_and_clear_full(struct vm_area_struct *vma,
pud_t pud;
VM_BUG_ON(addr & ~HPAGE_PMD_MASK);
VM_BUG_ON((pud_present(*pudp) && !pud_devmap(*pudp)) ||
!pud_present(*pudp));
VM_BUG_ON(!pud_present(*pudp));
pud = pudp_huge_get_and_clear(vma->vm_mm, addr, pudp);
/*
* if it not a fullmm flush, then we can possibly end up converting
@@ -644,7 +642,7 @@ unsigned long memremap_compat_align(void)
EXPORT_SYMBOL_GPL(memremap_compat_align);
#endif
pgprot_t vm_get_page_prot(unsigned long vm_flags)
pgprot_t vm_get_page_prot(vm_flags_t vm_flags)
{
unsigned long prot;

@@ -1433,7 +1433,7 @@ unsigned long radix__pmd_hugepage_update(struct mm_struct *mm, unsigned long add
unsigned long old;
#ifdef CONFIG_DEBUG_VM
WARN_ON(!radix__pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
WARN_ON(!radix__pmd_trans_huge(*pmdp));
assert_spin_locked(pmd_lockptr(mm, pmdp));
#endif
@@ -1450,7 +1450,7 @@ unsigned long radix__pud_hugepage_update(struct mm_struct *mm, unsigned long add
unsigned long old;
#ifdef CONFIG_DEBUG_VM
WARN_ON(!pud_devmap(*pudp));
WARN_ON(!pud_trans_huge(*pudp));
assert_spin_locked(pud_lockptr(mm, pudp));
#endif
@@ -1468,7 +1468,6 @@ pmd_t radix__pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long addre
VM_BUG_ON(address & ~HPAGE_PMD_MASK);
VM_BUG_ON(radix__pmd_trans_huge(*pmdp));
VM_BUG_ON(pmd_devmap(*pmdp));
/*
* khugepaged calls this for normal pmd
*/

@@ -509,7 +509,7 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea,
return NULL;
#endif
if (pmd_trans_huge(pmd) || pmd_devmap(pmd)) {
if (pmd_trans_huge(pmd)) {
if (is_thp)
*is_thp = true;
ret_pte = (pte_t *)pmdp;

@@ -532,7 +532,6 @@ static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
spin_lock_irqsave(&b_dev_info->pages_lock, flags);
balloon_page_insert(b_dev_info, newpage);
balloon_page_delete(page);
b_dev_info->isolated_pages--;
spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);
@@ -542,6 +541,7 @@ static int cmm_migratepage(struct balloon_dev_info *b_dev_info,
*/
plpar_page_set_active(page);
balloon_page_finalize(page);
/* balloon page list reference */
put_page(page);

@@ -29,7 +29,7 @@ struct pci_controller *init_phb_dynamic(struct device_node *dn)
nid = of_node_to_nid(dn);
if (likely((nid) >= 0)) {
if (!node_online(nid)) {
if (__register_one_node(nid)) {
if (register_one_node(nid)) {
pr_err("PCI: Failed to register node %d\n", nid);
} else {
update_numa_distance(dn);

@@ -43,7 +43,6 @@ config RISCV
select ARCH_HAS_PREEMPT_LAZY
select ARCH_HAS_PREPARE_SYNC_CORE_CMD
select ARCH_HAS_PTDUMP if MMU
select ARCH_HAS_PTE_DEVMAP if 64BIT && MMU
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_SET_DIRECT_MAP if MMU
select ARCH_HAS_SET_MEMORY if MMU

@@ -397,24 +397,8 @@ static inline struct page *pgd_page(pgd_t pgd)
p4d_t *p4d_offset(pgd_t *pgd, unsigned long address);
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
static inline int pte_devmap(pte_t pte);
static inline pte_t pmd_pte(pmd_t pmd);
static inline pte_t pud_pte(pud_t pud);
static inline int pmd_devmap(pmd_t pmd)
{
return pte_devmap(pmd_pte(pmd));
}
static inline int pud_devmap(pud_t pud)
{
return pte_devmap(pud_pte(pud));
}
static inline int pgd_devmap(pgd_t pgd)
{
return 0;
}
#endif
#endif /* _ASM_RISCV_PGTABLE_64_H */

@@ -19,7 +19,6 @@
#define _PAGE_SOFT (3 << 8) /* Reserved for software */
#define _PAGE_SPECIAL (1 << 8) /* RSW: 0x1 */
#define _PAGE_DEVMAP (1 << 9) /* RSW, devmap */
#define _PAGE_TABLE _PAGE_PRESENT
/*

@@ -409,13 +409,6 @@ static inline int pte_special(pte_t pte)
return pte_val(pte) & _PAGE_SPECIAL;
}
#ifdef CONFIG_ARCH_HAS_PTE_DEVMAP
static inline int pte_devmap(pte_t pte)
{
return pte_val(pte) & _PAGE_DEVMAP;
}
#endif
/* static inline pte_t pte_rdprotect(pte_t pte) */
static inline pte_t pte_wrprotect(pte_t pte)
@@ -457,11 +450,6 @@ static inline pte_t pte_mkspecial(pte_t pte)
return __pte(pte_val(pte) | _PAGE_SPECIAL);
}
static inline pte_t pte_mkdevmap(pte_t pte)
{
return __pte(pte_val(pte) | _PAGE_DEVMAP);
}
static inline pte_t pte_mkhuge(pte_t pte)
{
return pte;
@@ -790,11 +778,6 @@ static inline pmd_t pmd_mkdirty(pmd_t pmd)
return pte_pmd(pte_mkdirty(pmd_pte(pmd)));
}
static inline pmd_t pmd_mkdevmap(pmd_t pmd)
{
return pte_pmd(pte_mkdevmap(pmd_pte(pmd)));
}
#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
static inline bool pmd_special(pmd_t pmd)
{
@@ -946,11 +929,6 @@ static inline pud_t pud_mkhuge(pud_t pud)
return pud;
}
static inline pud_t pud_mkdevmap(pud_t pud)
{
return pte_pud(pte_mkdevmap(pud_pte(pud)));
}
static inline int pudp_set_access_flags(struct vm_area_struct *vma,
unsigned long address, pud_t *pudp,
pud_t entry, int dirty)

@@ -63,7 +63,6 @@ void flush_pud_tlb_range(struct vm_area_struct *vma, unsigned long start,
bool arch_tlbbatch_should_defer(struct mm_struct *mm);
void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
struct mm_struct *mm, unsigned long start, unsigned long end);
void arch_flush_tlb_batched_pending(struct mm_struct *mm);
void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);
extern unsigned long tlb_flush_all_threshold;

@@ -299,7 +299,7 @@ static int __set_memory(unsigned long addr, int numpages, pgprot_t set_mask,
if (ret)
goto unlock;
ret = walk_page_range_novma(&init_mm, lm_start, lm_end,
ret = walk_kernel_page_table_range(lm_start, lm_end,
&pageattr_ops, NULL, &masks);
if (ret)
goto unlock;
@@ -317,13 +317,13 @@ static int __set_memory(unsigned long addr, int numpages, pgprot_t set_mask,
if (ret)
goto unlock;
ret = walk_page_range_novma(&init_mm, lm_start, lm_end,
ret = walk_kernel_page_table_range(lm_start, lm_end,
&pageattr_ops, NULL, &masks);
if (ret)
goto unlock;
}
ret = walk_page_range_novma(&init_mm, start, end, &pageattr_ops, NULL,
ret = walk_kernel_page_table_range(start, end, &pageattr_ops, NULL,
&masks);
unlock:
@@ -335,7 +335,7 @@ unlock:
*/
flush_tlb_all();
#else
ret = walk_page_range_novma(&init_mm, start, end, &pageattr_ops, NULL,
ret = walk_kernel_page_table_range(start, end, &pageattr_ops, NULL,
&masks);
mmap_write_unlock(&init_mm);

@@ -6,7 +6,6 @@
#include <linux/efi.h>
#include <linux/init.h>
#include <linux/debugfs.h>
#include <linux/memory_hotplug.h>
#include <linux/seq_file.h>
#include <linux/ptdump.h>
@@ -413,9 +412,7 @@ bool ptdump_check_wx(void)
static int ptdump_show(struct seq_file *m, void *v)
{
get_online_mems();
ptdump_walk(m, m->private);
put_online_mems();
return 0;
}

@@ -234,11 +234,6 @@ void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
mmu_notifier_arch_invalidate_secondary_tlbs(mm, start, end);
}
void arch_flush_tlb_batched_pending(struct mm_struct *mm)
{
flush_tlb_mm(mm);
}
void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
{
__flush_tlb_range(NULL, &batch->cpumask,

@@ -131,6 +131,7 @@ config S390
select ARCH_INLINE_WRITE_UNLOCK_IRQ
select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE
select ARCH_MHP_MEMMAP_ON_MEMORY_ENABLE
select ARCH_MODULE_NEEDS_WEAK_PER_CPU
select ARCH_STACKWALK
select ARCH_SUPPORTS_ATOMIC_RMW
select ARCH_SUPPORTS_DEBUG_PAGEALLOC

@@ -16,10 +16,9 @@
* For 64 bit module code, the module may be more than 4G above the
* per cpu area, use weak definitions to force the compiler to
* generate external references.
* Therefore, we have enabled CONFIG_ARCH_MODULE_NEEDS_WEAK_PER_CPU
* in the Kconfig.
*/
#if defined(MODULE)
#define ARCH_NEEDS_WEAK_PER_CPU
#endif
/*
* We use a compare-and-swap loop since that uses less cpu cycles than

@@ -247,11 +247,9 @@ static int ptdump_show(struct seq_file *m, void *v)
.marker = markers,
};
get_online_mems();
mutex_lock(&cpa_mutex);
ptdump_walk_pgd(&st.ptdump, &init_mm, NULL);
mutex_unlock(&cpa_mutex);
put_online_mems();
return 0;
}
DEFINE_SHOW_ATTRIBUTE(ptdump);

@@ -96,6 +96,7 @@ config SPARC64
select HAVE_ARCH_AUDITSYSCALL
select ARCH_SUPPORTS_ATOMIC_RMW
select ARCH_SUPPORTS_DEBUG_PAGEALLOC
select ARCH_SUPPORTS_HUGETLBFS
select HAVE_NMI
select HAVE_REGS_AND_STACK_ACCESS_API
select ARCH_USE_QUEUED_RWLOCKS

@@ -50,11 +50,6 @@ static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma,
return changed;
}
#define __HAVE_ARCH_HUGETLB_FREE_PGD_RANGE
void hugetlb_free_pgd_range(struct mmu_gather *tlb, unsigned long addr,
unsigned long end, unsigned long floor,
unsigned long ceiling);
#include <asm-generic/hugetlb.h>
#endif /* _ASM_SPARC64_HUGETLB_H */

@@ -28,7 +28,7 @@ static inline void ipi_set_tstate_mcde(void *arg)
}
#define arch_calc_vm_prot_bits(prot, pkey) sparc_calc_vm_prot_bits(prot)
static inline unsigned long sparc_calc_vm_prot_bits(unsigned long prot)
static inline vm_flags_t sparc_calc_vm_prot_bits(unsigned long prot)
{
if (adi_capable() && (prot & PROT_ADI)) {
struct pt_regs *regs;
@@ -58,7 +58,7 @@ static inline int sparc_validate_prot(unsigned long prot, unsigned long addr)
/* arch_validate_flags() - Ensure combination of flags is valid for a
* VMA.
*/
static inline bool arch_validate_flags(unsigned long vm_flags)
static inline bool arch_validate_flags(vm_flags_t vm_flags)
{
/* If ADI is being enabled on this VMA, check for ADI
* capability on the platform and ensure VMA is suitable

@@ -295,122 +295,3 @@ pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
return entry;
}
static void hugetlb_free_pte_range(struct mmu_gather *tlb, pmd_t *pmd,
unsigned long addr)
{
pgtable_t token = pmd_pgtable(*pmd);
pmd_clear(pmd);
pte_free_tlb(tlb, token, addr);
mm_dec_nr_ptes(tlb->mm);
}
static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
unsigned long addr, unsigned long end,
unsigned long floor, unsigned long ceiling)
{
pmd_t *pmd;
unsigned long next;
unsigned long start;
start = addr;
pmd = pmd_offset(pud, addr);
do {
next = pmd_addr_end(addr, end);
if (pmd_none(*pmd))
continue;
if (is_hugetlb_pmd(*pmd))
pmd_clear(pmd);
else
hugetlb_free_pte_range(tlb, pmd, addr);
} while (pmd++, addr = next, addr != end);
start &= PUD_MASK;
if (start < floor)
return;
if (ceiling) {
ceiling &= PUD_MASK;
if (!ceiling)
return;
}
if (end - 1 > ceiling - 1)
return;
pmd = pmd_offset(pud, start);
pud_clear(pud);
pmd_free_tlb(tlb, pmd, start);
mm_dec_nr_pmds(tlb->mm);
}
static void hugetlb_free_pud_range(struct mmu_gather *tlb, p4d_t *p4d,
unsigned long addr, unsigned long end,
unsigned long floor, unsigned long ceiling)
{
pud_t *pud;
unsigned long next;
unsigned long start;
start = addr;
pud = pud_offset(p4d, addr);
do {
next = pud_addr_end(addr, end);
if (pud_none_or_clear_bad(pud))
continue;
if (is_hugetlb_pud(*pud))
pud_clear(pud);
else
hugetlb_free_pmd_range(tlb, pud, addr, next, floor,
ceiling);
} while (pud++, addr = next, addr != end);
start &= PGDIR_MASK;
if (start < floor)
return;
if (ceiling) {
ceiling &= PGDIR_MASK;
if (!ceiling)
return;
}
if (end - 1 > ceiling - 1)
return;
pud = pud_offset(p4d, start);
p4d_clear(p4d);
pud_free_tlb(tlb, pud, start);
mm_dec_nr_puds(tlb->mm);
}
void hugetlb_free_pgd_range(struct mmu_gather *tlb,
unsigned long addr, unsigned long end,
unsigned long floor, unsigned long ceiling)
{
pgd_t *pgd;
p4d_t *p4d;
unsigned long next;
addr &= PMD_MASK;
if (addr < floor) {
addr += PMD_SIZE;
if (!addr)
return;
}
if (ceiling) {
ceiling &= PMD_MASK;
if (!ceiling)
return;
}
if (end - 1 > ceiling - 1)
end -= PMD_SIZE;
if (addr > end - 1)
return;
pgd = pgd_offset(tlb->mm, addr);
p4d = p4d_offset(pgd, addr);
do {
next = p4d_addr_end(addr, end);
if (p4d_none_or_clear_bad(p4d))
continue;
hugetlb_free_pud_range(tlb, p4d, addr, next, floor, ceiling);
} while (p4d++, addr = next, addr != end);
}

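With __HAVE_ARCH_HUGETLB_FREE_PGD_RANGE dropped in the header above and this open-coded walker removed, sparc64 falls back to the generic path. A sketch of the fallback this relies on, assuming the asm-generic/hugetlb.h definition:

static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb,
                unsigned long addr, unsigned long end,
                unsigned long floor, unsigned long ceiling)
{
        free_pgd_range(tlb, addr, end, floor, ceiling);
}
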
@@ -3201,7 +3201,7 @@ void copy_highpage(struct page *to, struct page *from)
}
EXPORT_SYMBOL(copy_highpage);
pgprot_t vm_get_page_prot(unsigned long vm_flags)
pgprot_t vm_get_page_prot(vm_flags_t vm_flags)
{
unsigned long prot = pgprot_val(protection_map[vm_flags &
(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]);

@@ -99,7 +99,6 @@ config X86
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PMEM_API if X86_64
select ARCH_HAS_PREEMPT_LAZY
select ARCH_HAS_PTE_DEVMAP if X86_64
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_HW_PTE_YOUNG
select ARCH_HAS_NONLEAF_PMD_YOUNG if PGTABLE_LEVELS > 2
@@ -124,6 +123,7 @@ config X86
select ARCH_SUPPORTS_ACPI
select ARCH_SUPPORTS_ATOMIC_RMW
select ARCH_SUPPORTS_DEBUG_PAGEALLOC
select ARCH_SUPPORTS_HUGETLBFS
select ARCH_SUPPORTS_PAGE_TABLE_CHECK if X86_64
select ARCH_SUPPORTS_NUMA_BALANCING if X86_64
select ARCH_SUPPORTS_KMAP_LOCAL_FORCE_MAP if NR_CPUS <= 4096

@@ -301,16 +301,15 @@ static inline bool pmd_leaf(pmd_t pte)
}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
/* NOTE: when predicate huge page, consider also pmd_devmap, or use pmd_leaf */
static inline int pmd_trans_huge(pmd_t pmd)
{
return (pmd_val(pmd) & (_PAGE_PSE|_PAGE_DEVMAP)) == _PAGE_PSE;
return (pmd_val(pmd) & _PAGE_PSE) == _PAGE_PSE;
}
#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
static inline int pud_trans_huge(pud_t pud)
{
return (pud_val(pud) & (_PAGE_PSE|_PAGE_DEVMAP)) == _PAGE_PSE;
return (pud_val(pud) & _PAGE_PSE) == _PAGE_PSE;
}
#endif
@@ -320,24 +319,6 @@ static inline int has_transparent_hugepage(void)
return boot_cpu_has(X86_FEATURE_PSE);
}
#ifdef CONFIG_ARCH_HAS_PTE_DEVMAP
static inline int pmd_devmap(pmd_t pmd)
{
return !!(pmd_val(pmd) & _PAGE_DEVMAP);
}
#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
static inline int pud_devmap(pud_t pud)
{
return !!(pud_val(pud) & _PAGE_DEVMAP);
}
#else
static inline int pud_devmap(pud_t pud)
{
return 0;
}
#endif
#ifdef CONFIG_ARCH_SUPPORTS_PMD_PFNMAP
static inline bool pmd_special(pmd_t pmd)
{
@@ -361,12 +342,6 @@ static inline pud_t pud_mkspecial(pud_t pud)
return pud_set_flags(pud, _PAGE_SPECIAL);
}
#endif /* CONFIG_ARCH_SUPPORTS_PUD_PFNMAP */
static inline int pgd_devmap(pgd_t pgd)
{
return 0;
}
#endif
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
static inline pte_t pte_set_flags(pte_t pte, pteval_t set)
@@ -527,11 +502,6 @@ static inline pte_t pte_mkspecial(pte_t pte)
return pte_set_flags(pte, _PAGE_SPECIAL);
}
static inline pte_t pte_mkdevmap(pte_t pte)
{
return pte_set_flags(pte, _PAGE_SPECIAL|_PAGE_DEVMAP);
}
/* See comments above mksaveddirty_shift() */
static inline pmd_t pmd_mksaveddirty(pmd_t pmd)
{
@@ -603,11 +573,6 @@ static inline pmd_t pmd_mkwrite_shstk(pmd_t pmd)
return pmd_set_flags(pmd, _PAGE_DIRTY);
}
static inline pmd_t pmd_mkdevmap(pmd_t pmd)
{
return pmd_set_flags(pmd, _PAGE_DEVMAP);
}
static inline pmd_t pmd_mkhuge(pmd_t pmd)
{
return pmd_set_flags(pmd, _PAGE_PSE);
@@ -673,11 +638,6 @@ static inline pud_t pud_mkdirty(pud_t pud)
return pud_mksaveddirty(pud);
}
static inline pud_t pud_mkdevmap(pud_t pud)
{
return pud_set_flags(pud, _PAGE_DEVMAP);
}
static inline pud_t pud_mkhuge(pud_t pud)
{
return pud_set_flags(pud, _PAGE_PSE);
@@ -1008,13 +968,6 @@ static inline int pte_present(pte_t a)
return pte_flags(a) & (_PAGE_PRESENT | _PAGE_PROTNONE);
}
#ifdef CONFIG_ARCH_HAS_PTE_DEVMAP
static inline int pte_devmap(pte_t a)
{
return (pte_flags(a) & _PAGE_DEVMAP) == _PAGE_DEVMAP;
}
#endif
#define pte_accessible pte_accessible
static inline bool pte_accessible(struct mm_struct *mm, pte_t a)
{

@@ -34,7 +34,6 @@
#define _PAGE_BIT_UFFD_WP _PAGE_BIT_SOFTW2 /* userfaultfd wrprotected */
#define _PAGE_BIT_SOFT_DIRTY _PAGE_BIT_SOFTW3 /* software dirty tracking */
#define _PAGE_BIT_KERNEL_4K _PAGE_BIT_SOFTW3 /* page must not be converted to large */
#define _PAGE_BIT_DEVMAP _PAGE_BIT_SOFTW4
#ifdef CONFIG_X86_64
#define _PAGE_BIT_SAVED_DIRTY _PAGE_BIT_SOFTW5 /* Saved Dirty bit (leaf) */
@@ -121,11 +120,9 @@
#if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE)
#define _PAGE_NX (_AT(pteval_t, 1) << _PAGE_BIT_NX)
#define _PAGE_DEVMAP (_AT(u64, 1) << _PAGE_BIT_DEVMAP)
#define _PAGE_SOFTW4 (_AT(pteval_t, 1) << _PAGE_BIT_SOFTW4)
#else
#define _PAGE_NX (_AT(pteval_t, 0))
#define _PAGE_DEVMAP (_AT(pteval_t, 0))
#define _PAGE_SOFTW4 (_AT(pteval_t, 0))
#endif
@@ -154,7 +151,7 @@
#define _COMMON_PAGE_CHG_MASK (PTE_PFN_MASK | _PAGE_PCD | _PAGE_PWT | \
_PAGE_SPECIAL | _PAGE_ACCESSED | \
_PAGE_DIRTY_BITS | _PAGE_SOFT_DIRTY | \
_PAGE_DEVMAP | _PAGE_CC | _PAGE_UFFD_WP)
_PAGE_CC | _PAGE_UFFD_WP)
#define _PAGE_CHG_MASK (_COMMON_PAGE_CHG_MASK | _PAGE_PAT)
#define _HPAGE_CHG_MASK (_COMMON_PAGE_CHG_MASK | _PAGE_PSE | _PAGE_PAT_LARGE)

@@ -356,11 +356,6 @@ static inline void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *b
mmu_notifier_arch_invalidate_secondary_tlbs(mm, 0, -1UL);
}
static inline void arch_flush_tlb_batched_pending(struct mm_struct *mm)
{
flush_tlb_mm(mm);
}
extern void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);
static inline bool pte_flags_need_flush(unsigned long oldflags,

@@ -279,7 +279,7 @@ static struct sgx_encl_page *__sgx_encl_load_page(struct sgx_encl *encl,
static struct sgx_encl_page *sgx_encl_load_page_in_vma(struct sgx_encl *encl,
unsigned long addr,
unsigned long vm_flags)
vm_flags_t vm_flags)
{
unsigned long vm_prot_bits = vm_flags & VM_ACCESS_FLAGS;
struct sgx_encl_page *entry;
@@ -520,9 +520,9 @@ static void sgx_vma_open(struct vm_area_struct *vma)
* Return: 0 on success, -EACCES otherwise
*/
int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
unsigned long end, unsigned long vm_flags)
unsigned long end, vm_flags_t vm_flags)
{
unsigned long vm_prot_bits = vm_flags & VM_ACCESS_FLAGS;
vm_flags_t vm_prot_bits = vm_flags & VM_ACCESS_FLAGS;
struct sgx_encl_page *page;
unsigned long count = 0;
int ret = 0;
@@ -605,7 +605,7 @@ static int sgx_encl_debug_write(struct sgx_encl *encl, struct sgx_encl_page *pag
*/
static struct sgx_encl_page *sgx_encl_reserve_page(struct sgx_encl *encl,
unsigned long addr,
unsigned long vm_flags)
vm_flags_t vm_flags)
{
struct sgx_encl_page *entry;

@@ -101,7 +101,7 @@ static inline int sgx_encl_find(struct mm_struct *mm, unsigned long addr,
}
int sgx_encl_may_map(struct sgx_encl *encl, unsigned long start,
unsigned long end, unsigned long vm_flags);
unsigned long end, vm_flags_t vm_flags);
bool current_is_ksgxd(void);
void sgx_encl_release(struct kref *ref);

@@ -36,7 +36,6 @@
#include <linux/debugfs.h>
#include <linux/ioport.h>
#include <linux/kernel.h>
#include <linux/pfn_t.h>
#include <linux/slab.h>
#include <linux/io.h>
#include <linux/mm.h>

@@ -32,7 +32,7 @@ void add_encrypt_protection_map(void)
protection_map[i] = pgprot_encrypted(protection_map[i]);
}
pgprot_t vm_get_page_prot(unsigned long vm_flags)
pgprot_t vm_get_page_prot(vm_flags_t vm_flags)
{
unsigned long val = pgprot_val(protection_map[vm_flags &
(VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]);

@@ -653,13 +653,13 @@ static void bio_truncate(struct bio *bio, unsigned new_size)
bio_for_each_segment(bv, bio, iter) {
if (done + bv.bv_len > new_size) {
unsigned offset;
size_t offset;
if (!truncated)
offset = new_size - done;
else
offset = 0;
zero_user(bv.bv_page, bv.bv_offset + offset,
memzero_page(bv.bv_page, bv.bv_offset + offset,
bv.bv_len - offset);
truncated = true;
}

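memzero_page(), the replacement used here and in the conversions below, zeroes len bytes at offset in a possibly-highmem page exactly as zero_user() did; its size_t parameters are why the local above becomes size_t. A sketch of what the helper amounts to, assuming the include/linux/highmem.h definition:

static inline void memzero_page(struct page *page, size_t offset, size_t len)
{
        char *addr = kmap_local_page(page);

        memset(addr + offset, 0, len);
        flush_dcache_page(page);
        kunmap_local(addr);
}
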
@@ -962,10 +962,10 @@ static int hmat_callback(struct notifier_block *self,
unsigned long action, void *arg)
{
struct memory_target *target;
struct memory_notify *mnb = arg;
int pxm, nid = mnb->status_change_nid;
struct node_notify *nn = arg;
int pxm, nid = nn->nid;
if (nid == NUMA_NO_NODE || action != MEM_ONLINE)
if (action != NODE_ADDED_FIRST_MEMORY)
return NOTIFY_OK;
pxm = node_to_pxm(nid);
@@ -1118,7 +1118,7 @@ static __init int hmat_init(void)
hmat_register_targets();
/* Keep the table and structures if the notifier may use them */
if (hotplug_memory_notifier(hmat_callback, HMAT_CALLBACK_PRI))
if (hotplug_node_notifier(hmat_callback, HMAT_CALLBACK_PRI))
goto out_put;
if (!hmat_set_default_dram_perf())

@@ -22,6 +22,7 @@
#include <linux/stat.h>
#include <linux/slab.h>
#include <linux/xarray.h>
#include <linux/export.h>
#include <linux/atomic.h>
#include <linux/uaccess.h>
@@ -48,22 +49,8 @@ int mhp_online_type_from_str(const char *str)
#define to_memory_block(dev) container_of(dev, struct memory_block, dev)
static int sections_per_block;
static inline unsigned long memory_block_id(unsigned long section_nr)
{
return section_nr / sections_per_block;
}
static inline unsigned long pfn_to_block_id(unsigned long pfn)
{
return memory_block_id(pfn_to_section_nr(pfn));
}
static inline unsigned long phys_to_block_id(unsigned long phys)
{
return pfn_to_block_id(PFN_DOWN(phys));
}
int sections_per_block;
EXPORT_SYMBOL(sections_per_block);
static int memory_subsys_online(struct device *dev);
static int memory_subsys_offline(struct device *dev);
@@ -683,7 +670,7 @@ int __weak arch_get_memory_phys_device(unsigned long start_pfn)
*
* Called under device_hotplug_lock.
*/
static struct memory_block *find_memory_block_by_id(unsigned long block_id)
struct memory_block *find_memory_block_by_id(unsigned long block_id)
{
struct memory_block *mem;

@@ -21,6 +21,7 @@
#include <linux/pm_runtime.h>
#include <linux/swap.h>
#include <linux/slab.h>
#include <linux/memblock.h>
static const struct bus_type node_subsys = {
.name = "node",
@@ -111,6 +112,27 @@ static const struct attribute_group *node_access_node_groups[] = {
NULL,
};
#ifdef CONFIG_MEMORY_HOTPLUG
static BLOCKING_NOTIFIER_HEAD(node_chain);
int register_node_notifier(struct notifier_block *nb)
{
return blocking_notifier_chain_register(&node_chain, nb);
}
EXPORT_SYMBOL(register_node_notifier);
void unregister_node_notifier(struct notifier_block *nb)
{
blocking_notifier_chain_unregister(&node_chain, nb);
}
EXPORT_SYMBOL(unregister_node_notifier);
int node_notify(unsigned long val, void *v)
{
return blocking_notifier_call_chain(&node_chain, val, v);
}
#endif
static void node_remove_accesses(struct node *node)
{
struct node_access_nodes *c, *cnext;
@@ -478,7 +500,7 @@ static ssize_t node_read_meminfo(struct device *dev,
nid, K(node_page_state(pgdat, NR_SECONDARY_PAGETABLE)),
nid, 0UL,
nid, 0UL,
nid, K(node_page_state(pgdat, NR_WRITEBACK_TEMP)),
nid, 0UL,
nid, K(sreclaimable +
node_page_state(pgdat, NR_KERNEL_MISC_RECLAIMABLE)),
nid, K(sreclaimable + sunreclaimable),
@@ -637,6 +659,7 @@ static int register_node(struct node *node, int num)
} else {
hugetlb_register_node(node);
compaction_register_node(node);
reclaim_register_node(node);
}
return error;
@@ -653,6 +676,7 @@ void unregister_node(struct node *node)
{
hugetlb_unregister_node(node);
compaction_unregister_node(node);
reclaim_unregister_node(node);
node_remove_accesses(node);
node_remove_caches(node);
device_unregister(&node->dev);
@@ -756,15 +780,6 @@ int unregister_cpu_under_node(unsigned int cpu, unsigned int nid)
}
#ifdef CONFIG_MEMORY_HOTPLUG
static int __ref get_nid_for_pfn(unsigned long pfn)
{
#ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
if (system_state < SYSTEM_RUNNING)
return early_pfn_to_nid(pfn);
#endif
return pfn_to_nid(pfn);
}
static void do_register_memory_block_under_node(int nid,
struct memory_block *mem_blk,
enum meminit_context context)
@@ -791,46 +806,6 @@ static void do_register_memory_block_under_node(int nid,
ret);
}
/* register memory section under specified node if it spans that node */
static int register_mem_block_under_node_early(struct memory_block *mem_blk,
void *arg)
{
unsigned long memory_block_pfns = memory_block_size_bytes() / PAGE_SIZE;
unsigned long start_pfn = section_nr_to_pfn(mem_blk->start_section_nr);
unsigned long end_pfn = start_pfn + memory_block_pfns - 1;
int nid = *(int *)arg;
unsigned long pfn;
for (pfn = start_pfn; pfn <= end_pfn; pfn++) {
int page_nid;
/*
* memory block could have several absent sections from start.
* skip pfn range from absent section
*/
if (!pfn_in_present_section(pfn)) {
pfn = round_down(pfn + PAGES_PER_SECTION,
PAGES_PER_SECTION) - 1;
continue;
}
/*
* We need to check if page belongs to nid only at the boot
* case because node's ranges can be interleaved.
*/
page_nid = get_nid_for_pfn(pfn);
if (page_nid < 0)
continue;
if (page_nid != nid)
continue;
do_register_memory_block_under_node(nid, mem_blk, MEMINIT_EARLY);
return 0;
}
/* mem section does not span the specified node */
return 0;
}
/*
* During hotplug we know that all pages in the memory block belong to the same
* node.
@@ -859,24 +834,44 @@ void unregister_memory_block_under_nodes(struct memory_block *mem_blk)
kobject_name(&node_devices[mem_blk->nid]->dev.kobj));
}
void register_memory_blocks_under_node(int nid, unsigned long start_pfn,
unsigned long end_pfn,
enum meminit_context context)
/* register all memory blocks under the corresponding nodes */
static void register_memory_blocks_under_nodes(void)
{
walk_memory_blocks_func_t func;
struct memblock_region *r;
if (context == MEMINIT_HOTPLUG)
func = register_mem_block_under_node_hotplug;
else
func = register_mem_block_under_node_early;
for_each_mem_region(r) {
const unsigned long start_block_id = phys_to_block_id(r->base);
const unsigned long end_block_id = phys_to_block_id(r->base + r->size - 1);
const int nid = memblock_get_region_node(r);
unsigned long block_id;
if (!node_online(nid))
continue;
for (block_id = start_block_id; block_id <= end_block_id; block_id++) {
struct memory_block *mem;
mem = find_memory_block_by_id(block_id);
if (!mem)
continue;
do_register_memory_block_under_node(nid, mem, MEMINIT_EARLY);
put_device(&mem->dev);
}
}
}
void register_memory_blocks_under_node_hotplug(int nid, unsigned long start_pfn,
unsigned long end_pfn)
{
walk_memory_blocks(PFN_PHYS(start_pfn), PFN_PHYS(end_pfn - start_pfn),
(void *)&nid, func);
(void *)&nid, register_mem_block_under_node_hotplug);
return;
}
#endif /* CONFIG_MEMORY_HOTPLUG */
int __register_one_node(int nid)
int register_one_node(int nid)
{
int error;
int cpu;
@@ -980,11 +975,13 @@ void __init node_dev_init(void)
/*
* Create all node devices, which will properly link the node
* to applicable memory block devices and already created cpu devices.
* to already created cpu devices.
*/
for_each_online_node(i) {
ret = register_one_node(i);
ret = register_one_node(i);
if (ret)
panic("%s() failed to add node: %d\n", __func__, ret);
}
register_memory_blocks_under_nodes();
}

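A sketch of a consumer of the new node notifier chain added above, mirroring the hmat and CXL conversions elsewhere in this diff (the callback and notifier-block names are illustrative):

static int demo_node_callback(struct notifier_block *nb,
                              unsigned long action, void *arg)
{
        struct node_notify *nn = arg;

        if (action != NODE_ADDED_FIRST_MEMORY)
                return NOTIFY_OK;

        pr_info("node %d gained its first memory\n", nn->nid);
        return NOTIFY_OK;
}

static struct notifier_block demo_nb = {
        .notifier_call = demo_node_callback,
};

static int __init demo_init(void)
{
        return register_node_notifier(&demo_nb);
}
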
@@ -1179,7 +1179,7 @@ static int copy_from_nullb(struct nullb *nullb, struct page *dest,
memcpy_page(dest, off + count, t_page->page, offset,
temp);
else
zero_user(dest, off + count, temp);
memzero_page(dest, off + count, temp);
count += temp;
sector += temp >> SECTOR_SHIFT;

@@ -2451,12 +2451,12 @@ static int cxl_region_perf_attrs_callback(struct notifier_block *nb,
unsigned long action, void *arg)
{
struct cxl_region *cxlr = container_of(nb, struct cxl_region,
memory_notifier);
struct memory_notify *mnb = arg;
int nid = mnb->status_change_nid;
node_notifier);
struct node_notify *nn = arg;
int nid = nn->nid;
int region_nid;
if (nid == NUMA_NO_NODE || action != MEM_ONLINE)
if (action != NODE_ADDED_FIRST_MEMORY)
return NOTIFY_DONE;
/*
@@ -3527,7 +3527,7 @@ static void shutdown_notifiers(void *_cxlr)
{
struct cxl_region *cxlr = _cxlr;
unregister_memory_notifier(&cxlr->memory_notifier);
unregister_node_notifier(&cxlr->node_notifier);
unregister_mt_adistance_algorithm(&cxlr->adist_notifier);
}
@@ -3566,9 +3566,9 @@ out:
if (rc)
return rc;
cxlr->memory_notifier.notifier_call = cxl_region_perf_attrs_callback;
cxlr->memory_notifier.priority = CXL_CALLBACK_PRI;
register_memory_notifier(&cxlr->memory_notifier);
cxlr->node_notifier.notifier_call = cxl_region_perf_attrs_callback;
cxlr->node_notifier.priority = CXL_CALLBACK_PRI;
register_node_notifier(&cxlr->node_notifier);
cxlr->adist_notifier.notifier_call = cxl_region_calculate_adistance;
cxlr->adist_notifier.priority = 100;

@@ -514,7 +514,7 @@ enum cxl_partition_mode {
* @flags: Region state flags
* @params: active + config params for the region
* @coord: QoS access coordinates for the region
* @memory_notifier: notifier for setting the access coordinates to node
* @node_notifier: notifier for setting the access coordinates to node
* @adist_notifier: notifier for calculating the abstract distance of node
*/
struct cxl_region {
@@ -527,7 +527,7 @@ struct cxl_region {
unsigned long flags;
struct cxl_region_params params;
struct access_coordinate coord[ACCESS_COORDINATE_MAX];
struct notifier_block memory_notifier;
struct notifier_block node_notifier;
struct notifier_block adist_notifier;
};

@@ -4,7 +4,6 @@
#include <linux/pagemap.h>
#include <linux/module.h>
#include <linux/device.h>
#include <linux/pfn_t.h>
#include <linux/cdev.h>
#include <linux/slab.h>
#include <linux/dax.h>
@@ -73,7 +72,7 @@ __weak phys_addr_t dax_pgoff_to_phys(struct dev_dax *dev_dax, pgoff_t pgoff,
return -1;
}
static void dax_set_mapping(struct vm_fault *vmf, pfn_t pfn,
static void dax_set_mapping(struct vm_fault *vmf, unsigned long pfn,
unsigned long fault_size)
{
unsigned long i, nr_pages = fault_size / PAGE_SIZE;
@@ -89,7 +88,7 @@ static void dax_set_mapping(struct vm_fault *vmf, pfn_t pfn,
ALIGN_DOWN(vmf->address, fault_size));
for (i = 0; i < nr_pages; i++) {
struct folio *folio = pfn_folio(pfn_t_to_pfn(pfn) + i);
struct folio *folio = pfn_folio(pfn + i);
if (folio->mapping)
continue;
@@ -104,7 +103,7 @@ static vm_fault_t __dev_dax_pte_fault(struct dev_dax *dev_dax,
{
struct device *dev = &dev_dax->dev;
phys_addr_t phys;
pfn_t pfn;
unsigned long pfn;
unsigned int fault_size = PAGE_SIZE;
if (check_vma(dev_dax, vmf->vma, __func__))
@@ -125,11 +124,11 @@ static vm_fault_t __dev_dax_pte_fault(struct dev_dax *dev_dax,
return VM_FAULT_SIGBUS;
}
pfn = phys_to_pfn_t(phys, 0);
pfn = PHYS_PFN(phys);
dax_set_mapping(vmf, pfn, fault_size);
return vmf_insert_page_mkwrite(vmf, pfn_t_to_page(pfn),
return vmf_insert_page_mkwrite(vmf, pfn_to_page(pfn),
vmf->flags & FAULT_FLAG_WRITE);
}
@@ -140,7 +139,7 @@ static vm_fault_t __dev_dax_pmd_fault(struct dev_dax *dev_dax,
struct device *dev = &dev_dax->dev;
phys_addr_t phys;
pgoff_t pgoff;
pfn_t pfn;
unsigned long pfn;
unsigned int fault_size = PMD_SIZE;
if (check_vma(dev_dax, vmf->vma, __func__))
@@ -169,11 +168,11 @@ static vm_fault_t __dev_dax_pmd_fault(struct dev_dax *dev_dax,
return VM_FAULT_SIGBUS;
}
pfn = phys_to_pfn_t(phys, 0);
pfn = PHYS_PFN(phys);
dax_set_mapping(vmf, pfn, fault_size);
return vmf_insert_folio_pmd(vmf, page_folio(pfn_t_to_page(pfn)),
return vmf_insert_folio_pmd(vmf, page_folio(pfn_to_page(pfn)),
vmf->flags & FAULT_FLAG_WRITE);
}
@@ -185,7 +184,7 @@ static vm_fault_t __dev_dax_pud_fault(struct dev_dax *dev_dax,
struct device *dev = &dev_dax->dev;
phys_addr_t phys;
pgoff_t pgoff;
pfn_t pfn;
unsigned long pfn;
unsigned int fault_size = PUD_SIZE;
@@ -215,11 +214,11 @@ static vm_fault_t __dev_dax_pud_fault(struct dev_dax *dev_dax,
return VM_FAULT_SIGBUS;
}
pfn = phys_to_pfn_t(phys, 0);
pfn = PHYS_PFN(phys);
dax_set_mapping(vmf, pfn, fault_size);
return vmf_insert_folio_pud(vmf, page_folio(pfn_t_to_page(pfn)),
return vmf_insert_folio_pud(vmf, page_folio(pfn_to_page(pfn)),
vmf->flags & FAULT_FLAG_WRITE);
}
#else

@@ -2,7 +2,6 @@
#include <linux/platform_device.h>
#include <linux/memregion.h>
#include <linux/module.h>
#include <linux/pfn_t.h>
#include <linux/dax.h>
#include "../bus.h"

@@ -5,7 +5,6 @@
#include <linux/memory.h>
#include <linux/module.h>
#include <linux/device.h>
#include <linux/pfn_t.h>
#include <linux/slab.h>
#include <linux/dax.h>
#include <linux/fs.h>

@@ -2,7 +2,6 @@
/* Copyright(c) 2016 - 2018 Intel Corporation. All rights reserved. */
#include <linux/memremap.h>
#include <linux/module.h>
#include <linux/pfn_t.h>
#include "../nvdimm/pfn.h"
#include "../nvdimm/nd.h"
#include "bus.h"

@@ -7,7 +7,6 @@
#include <linux/mount.h>
#include <linux/pseudo_fs.h>
#include <linux/magic.h>
#include <linux/pfn_t.h>
#include <linux/cdev.h>
#include <linux/slab.h>
#include <linux/uio.h>
@@ -148,7 +147,7 @@ enum dax_device_flags {
* pages accessible at the device relative @pgoff.
*/
long dax_direct_access(struct dax_device *dax_dev, pgoff_t pgoff, long nr_pages,
enum dax_access_mode mode, void **kaddr, pfn_t *pfn)
enum dax_access_mode mode, void **kaddr, unsigned long *pfn)
{
long avail;

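A sketch of a dax_direct_access() caller under the new signature, where the out-parameter is a plain pfn rather than a pfn_t (DAX_ACCESS is the existing enum dax_access_mode value; surrounding code elided):

void *kaddr;
unsigned long pfn;      /* was pfn_t */
long nr;

nr = dax_direct_access(dax_dev, pgoff, 1, DAX_ACCESS, &kaddr, &pfn);
if (nr < 0)
        return nr;      /* no pfn_t_to_pfn() unwrapping needed any more */
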
@@ -7,7 +7,6 @@
#include <linux/dma-buf.h>
#include <linux/pfn_t.h>
#include <linux/shmem_fs.h>
#include <linux/module.h>

@@ -6,7 +6,6 @@
**************************************************************************/
#include <linux/fb.h>
#include <linux/pfn_t.h>
#include <drm/drm_crtc_helper.h>
#include <drm/drm_drv.h>
@@ -33,7 +32,7 @@ static vm_fault_t psb_fbdev_vm_fault(struct vm_fault *vmf)
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
for (i = 0; i < page_num; ++i) {
err = vmf_insert_mixed(vma, address, __pfn_to_pfn_t(pfn, PFN_DEV));
err = vmf_insert_mixed(vma, address, pfn);
if (unlikely(err & VM_FAULT_ERROR))
break;
address += PAGE_SIZE;

@@ -5,7 +5,6 @@
#include <linux/anon_inodes.h>
#include <linux/mman.h>
#include <linux/pfn_t.h>
#include <linux/sizes.h>
#include <drm/drm_cache.h>

@@ -303,7 +303,6 @@ void __shmem_writeback(size_t size, struct address_space *mapping)
.nr_to_write = SWAP_CLUSTER_MAX,
.range_start = 0,
.range_end = LLONG_MAX,
.for_reclaim = 1,
};
struct folio *folio = NULL;
int error = 0;
@@ -318,7 +317,7 @@ void __shmem_writeback(size_t size, struct address_space *mapping)
if (folio_mapped(folio))
folio_redirty_for_writepage(&wbc, folio);
else
error = shmem_writeout(folio, &wbc);
error = shmem_writeout(folio, NULL, NULL);
}
}

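After this series, shmem_writeout() takes the swap plug and folio list that the writeback_control used to carry; both may be NULL, as in the call above. A sketch of the new call shape, assuming the prototype int shmem_writeout(struct folio *folio, struct swap_iocb **plug, struct list_head *folio_list):

struct swap_iocb *plug = NULL;
LIST_HEAD(folio_list);

/* batched swap writeout: */
error = shmem_writeout(folio, &plug, &folio_list);

/* simple one-off writeout, as in __shmem_writeback() above: */
error = shmem_writeout(folio, NULL, NULL);
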
@@ -9,7 +9,6 @@
#include <linux/spinlock.h>
#include <linux/shmem_fs.h>
#include <linux/dma-buf.h>
#include <linux/pfn_t.h>
#include <drm/drm_prime.h>
#include <drm/drm_file.h>

@@ -8,7 +8,6 @@
#include <linux/seq_file.h>
#include <linux/shmem_fs.h>
#include <linux/spinlock.h>
#include <linux/pfn_t.h>
#include <linux/vmalloc.h>
#include <drm/drm_prime.h>
@@ -371,8 +370,7 @@ static vm_fault_t omap_gem_fault_1d(struct drm_gem_object *obj,
VERB("Inserting %p pfn %lx, pa %lx", (void *)vmf->address,
pfn, pfn << PAGE_SHIFT);
return vmf_insert_mixed(vma, vmf->address,
__pfn_to_pfn_t(pfn, PFN_DEV));
return vmf_insert_mixed(vma, vmf->address, pfn);
}
/* Special handling for the case of faulting in 2d tiled buffers */
@@ -467,8 +465,7 @@ static vm_fault_t omap_gem_fault_2d(struct drm_gem_object *obj,
pfn, pfn << PAGE_SHIFT);
for (i = n; i > 0; i--) {
ret = vmf_insert_mixed(vma,
vaddr, __pfn_to_pfn_t(pfn, PFN_DEV));
ret = vmf_insert_mixed(vma, vaddr, pfn);
if (ret & VM_FAULT_ERROR)
break;
pfn += priv->usergart[fmt].stride_pfn;

@@ -114,15 +114,8 @@ ttm_backup_backup_page(struct file *backup, struct page *page,
if (writeback && !folio_mapped(to_folio) &&
folio_clear_dirty_for_io(to_folio)) {
struct writeback_control wbc = {
.sync_mode = WB_SYNC_NONE,
.nr_to_write = SWAP_CLUSTER_MAX,
.range_start = 0,
.range_end = LLONG_MAX,
.for_reclaim = 1,
};
folio_set_reclaim(to_folio);
ret = shmem_writeout(to_folio, &wbc);
ret = shmem_writeout(to_folio, NULL, NULL);
if (!folio_test_writeback(to_folio))
folio_clear_reclaim(to_folio);
/*

@@ -16,7 +16,6 @@
*/
#include <linux/dma-buf.h>
#include <linux/pfn_t.h>
#include <linux/vmalloc.h>
#include "v3d_drv.h"

@@ -19,7 +19,6 @@
#include <linux/io.h>
#include <linux/workqueue.h>
#include <linux/dma-mapping.h>
#include <linux/pfn_t.h>
#ifdef CONFIG_X86
#include <asm/set_memory.h>
@@ -1618,7 +1617,7 @@ static vm_fault_t msc_mmap_fault(struct vm_fault *vmf)
return VM_FAULT_SIGBUS;
get_page(page);
return vmf_insert_mixed(vmf->vma, vmf->address, page_to_pfn_t(page));
return vmf_insert_mixed(vmf->vma, vmf->address, page_to_pfn(page));
}
static const struct vm_operations_struct msc_mmap_ops = {

@@ -170,7 +170,7 @@ static struct dax_device *linear_dax_pgoff(struct dm_target *ti, pgoff_t *pgoff)
static long linear_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
long nr_pages, enum dax_access_mode mode, void **kaddr,
pfn_t *pfn)
unsigned long *pfn)
{
struct dax_device *dax_dev = linear_dax_pgoff(ti, &pgoff);

@@ -893,7 +893,7 @@ static struct dax_device *log_writes_dax_pgoff(struct dm_target *ti,
static long log_writes_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
long nr_pages, enum dax_access_mode mode, void **kaddr,
pfn_t *pfn)
unsigned long *pfn)
{
struct dax_device *dax_dev = log_writes_dax_pgoff(ti, &pgoff);

@@ -316,7 +316,7 @@ static struct dax_device *stripe_dax_pgoff(struct dm_target *ti, pgoff_t *pgoff)
static long stripe_dax_direct_access(struct dm_target *ti, pgoff_t pgoff,
long nr_pages, enum dax_access_mode mode, void **kaddr,
pfn_t *pfn)
unsigned long *pfn)
{
struct dax_device *dax_dev = stripe_dax_pgoff(ti, &pgoff);
