Diffstat (limited to '')
-rw-r--r-- | src/arrow/r/man/RecordBatch.Rd | 92 |
1 files changed, 92 insertions, 0 deletions
diff --git a/src/arrow/r/man/RecordBatch.Rd b/src/arrow/r/man/RecordBatch.Rd
new file mode 100644
index 000000000..ff08c2158
--- /dev/null
+++ b/src/arrow/r/man/RecordBatch.Rd
@@ -0,0 +1,92 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/record-batch.R
+\docType{class}
+\name{RecordBatch}
+\alias{RecordBatch}
+\alias{record_batch}
+\title{RecordBatch class}
+\usage{
+record_batch(..., schema = NULL)
+}
+\arguments{
+\item{...}{A \code{data.frame} or a named set of Arrays or vectors. If given a
+mixture of data.frames and vectors, the inputs will be autospliced together
+(see examples). Alternatively, you can provide a single Arrow IPC
+\code{InputStream}, \code{Message}, \code{Buffer}, or R \code{raw} object containing a \code{Buffer}.}
+
+\item{schema}{a \link{Schema}, or \code{NULL} (the default) to infer the schema from
+the data in \code{...}. When providing an Arrow IPC buffer, \code{schema} is required.}
+}
+\description{
+A record batch is a collection of equal-length arrays matching
+a particular \link{Schema}. It is a table-like data structure that is semantically
+a sequence of \link[=Field]{fields}, each a contiguous Arrow \link{Array}.
+}
+\section{S3 Methods and Usage}{
+
+Record batches are data-frame-like, and many methods you expect to work on
+a \code{data.frame} are implemented for \code{RecordBatch}. This includes \code{[}, \code{[[},
+\code{$}, \code{names}, \code{dim}, \code{nrow}, \code{ncol}, \code{head}, and \code{tail}. You can also pull
+the data from an Arrow record batch into R with \code{as.data.frame()}. See the
+examples.
+
+A caveat about the \code{$} method: because \code{RecordBatch} is an \code{R6} object,
+\code{$} is also used to access the object's methods (see below). Methods take
+precedence over the table's columns. So, \code{batch$Slice} would return the
+"Slice" method function even if there were a column in the table called
+"Slice".
+}
+
+\section{R6 Methods}{
+
+In addition to the more R-friendly S3 methods, a \code{RecordBatch} object has
+the following R6 methods that map onto the underlying C++ methods:
+\itemize{
+\item \verb{$Equals(other)}: Returns \code{TRUE} if the \code{other} record batch is equal
+\item \verb{$column(i)}: Extract an \code{Array} by integer position from the batch
+\item \verb{$column_name(i)}: Get a column's name by integer position
+\item \verb{$names()}: Get all column names (called by \code{names(batch)})
+\item \verb{$RenameColumns(value)}: Set all column names (called by \code{names(batch) <- value})
+\item \verb{$GetColumnByName(name)}: Extract an \code{Array} by string name
+\item \verb{$RemoveColumn(i)}: Drops a column from the batch by integer position
+\item \verb{$SelectColumns(indices)}: Return a new record batch with a selection of columns, expressed as 0-based integers.
+\item \verb{$Slice(offset, length = NULL)}: Create a zero-copy view starting at the
+indicated integer offset and going for the given length, or to the end
+of the table if \code{NULL}, the default.
+\item \verb{$Take(i)}: return an \code{RecordBatch} with rows at positions given by
+integers (R vector or Array Array) \code{i}.
+\item \verb{$Filter(i, keep_na = TRUE)}: return an \code{RecordBatch} with rows at positions where logical
+vector (or Arrow boolean Array) \code{i} is \code{TRUE}.
+\item \verb{$SortIndices(names, descending = FALSE)}: return an \code{Array} of integer row
+positions that can be used to rearrange the \code{RecordBatch} in ascending or
+descending order by the first named column, breaking ties with further named
+columns. \code{descending} can be a logical vector of length one or of the same
+length as \code{names}.
+\item \verb{$serialize()}: Returns a raw vector suitable for interprocess communication
+\item \verb{$cast(target_schema, safe = TRUE, options = cast_options(safe))}: Alter
+the schema of the record batch.
+}
+
+There are also some active bindings
+\itemize{
+\item \verb{$num_columns}
+\item \verb{$num_rows}
+\item \verb{$schema}
+\item \verb{$metadata}: Returns the key-value metadata of the \code{Schema} as a named list.
+Modify or replace by assigning in (\code{batch$metadata <- new_metadata}).
+All list elements are coerced to string. See \code{schema()} for more information.
+\item \verb{$columns}: Returns a list of \code{Array}s
+}
+}
+
+\examples{
+\dontshow{if (arrow_available()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf}
+batch <- record_batch(name = rownames(mtcars), mtcars)
+dim(batch)
+dim(head(batch))
+names(batch)
+batch$mpg
+batch[["cyl"]]
+as.data.frame(batch[4:8, c("gear", "hp", "wt")])
+\dontshow{\}) # examplesIf}
+}