% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/dataset-scan.R
\name{map_batches}
\alias{map_batches}
\title{Apply a function to a stream of RecordBatches}
\usage{
map_batches(X, FUN, ..., .data.frame = TRUE)
}
\arguments{
\item{X}{A \code{Dataset} or \code{arrow_dplyr_query} object, as returned by the
\code{dplyr} methods on \code{Dataset}.}

\item{FUN}{A function or \code{purrr}-style lambda expression to apply to each batch}

\item{...}{Additional arguments passed to \code{FUN}}

\item{.data.frame}{logical: collect the resulting chunks into a single
\code{data.frame}? Default \code{TRUE}}
}
\description{
As an alternative to calling \code{collect()} on a \code{Dataset} query, you can
use this function to access the stream of \code{RecordBatch}es in the \code{Dataset}.
This lets you aggregate on each chunk and pull the intermediate results into
a \code{data.frame} for further aggregation, even if you couldn't fit the whole
\code{Dataset} result in memory.
}
\details{
This is experimental and not recommended for production use.
}
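\examples{
\dontrun{
# A minimal sketch of the per-batch aggregation described above. It is an
# illustration, not taken from the package sources. Assumptions: the arrow
# package is installed with dataset support, and the data (mtcars), the
# temporary path, and the names `path`, `ds`, and `partial` are hypothetical
# choices made only for this example.
library(arrow)

# Write a small partitioned dataset to a temporary directory
path <- tempfile()
write_dataset(mtcars, path, partitioning = "cyl")
ds <- open_dataset(path)

# Reduce each RecordBatch to a one-row data.frame of partial results.
# With the default .data.frame = TRUE these rows are collected into a
# single data.frame.
partial <- map_batches(ds, function(batch) {
  df <- as.data.frame(batch)
  data.frame(n = nrow(df), sum_mpg = sum(df$mpg))
})

# Finish the aggregation on the small intermediate result:
# the overall mean of mpg, computed without collecting the whole dataset
sum(partial$sum_mpg) / sum(partial$n)
}
}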