File references are strings of bytes that can be encountered in the file_reference fields of document and photo objects.
They must be cached by the client, along with the source where the document/photo object was found, in order to be refetched when the file reference expires.
Example implementations of a file reference database: MadelineProto, Android, Telegram Desktop, TDLib.
Implementation and maintenance of the file reference database may be fully automated by using the file reference database map file.
Latest JSON map file for the current layer » - The constructor predicate is serialized under the _ key, bytes fields are serialized as {"_": "bytes", "bytes": "<base64 encoded value>"}
Serialized TL version of the latest map file for the current layer »
See here » for the TL schema of the map file.
The map file is automatically generated and validated by the file reference generator tool »; this tool can be manually run on newer or experimental layers, to generate the file reference database map file for a given API schema: see here » for more info.
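As an illustration of the JSON convention described above, the following minimal sketch walks a parsed JSON document and turns every `{"_": "bytes", "bytes": "<base64 encoded value>"}` wrapper back into raw bytes. The function name `decode_bytes_fields` and the sample fragment are hypothetical and only meant to show the shape of the data; they are not taken from the real map file.

```python
import base64
import json
from typing import Any


def decode_bytes_fields(node: Any) -> Any:
    """Recursively replace {"_": "bytes", "bytes": "<base64>"} wrappers with bytes."""
    if isinstance(node, dict):
        if node.get("_") == "bytes" and isinstance(node.get("bytes"), str):
            return base64.b64decode(node["bytes"])
        return {key: decode_bytes_fields(value) for key, value in node.items()}
    if isinstance(node, list):
        return [decode_bytes_fields(item) for item in node]
    return node


# Hypothetical fragment shaped like an entry of the JSON map file.
raw = json.loads('{"_": "bytesLiteralOp", "value": {"_": "bytes", "bytes": "AQID"}}')
decoded = decode_bytes_fields(raw)
```

After this pass, the constructor predicate is still available under the `_` key, while bytes fields can be used directly.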
To implement it, start by setting up the following two tables:
The file reference table
The file source table
The map file contains and specifies, for the current layer:
TL schemas for:
File IDs
File sources
These TL schemas can be used as a guideline for the file reference and file source table DBMS schema, or by directly serializing file IDs and file sources if a simple string => string KV DB is used.
The following extractors:
Location extractors »
Sources »
Extractors should be used by hooking the TL serialization and deserialization logic in order to invoke the correct extractor(s) when a certain constructor is serialized and deserialized.
The extraction logic can also be implemented externally from the TL parser, or on the contrary built directly into it via codegen.
All actions »: actions are instructions used to refresh expired file references according to file sources » stored in the file source table
The file reference table can be represented by the following type:
HashMap<FileId, bytes>
This table maps a file ID » (which uniquely identifies a photo or a document) to a file reference ».
This table is populated by the location extractors » contained in the map file ».
The file reference can alternatively be stored, for example, in the document and photo DBs along with other metadata, or in any other way that uniquely associates a FileId to a file reference.
The file source table can be represented by the following type:
HashMap<FileId, Vector<FileSource>>
It maps a file ID » to one or more file sources ».
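The two tables can be sketched in memory as follows. This is only an illustration under assumptions: the class names, the `exampleSource` constructor name and the tuple-based parameter storage are hypothetical, and a real client would typically back these tables with its own database.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class FileId:
    constructor: str  # "fileIdPhoto" or "fileIdDocument"
    id: int


@dataclass(frozen=True)
class FileSource:
    constructor: str                        # a FileSource constructor name from db_schema
    params: Tuple[Tuple[str, object], ...]  # stored fields as (name, value) pairs


@dataclass
class FileReferenceDb:
    # HashMap<FileId, bytes>
    references: Dict[FileId, bytes] = field(default_factory=dict)
    # HashMap<FileId, Vector<FileSource>>
    sources: Dict[FileId, List[FileSource]] = field(default_factory=dict)

    def set_reference(self, file_id: FileId, reference: bytes) -> None:
        self.references[file_id] = reference

    def add_source(self, file_id: FileId, source: FileSource) -> None:
        known = self.sources.setdefault(file_id, [])
        if source not in known:  # avoid storing the same source twice
            known.append(source)


db = FileReferenceDb()
doc = FileId("fileIdDocument", 1001)
db.set_reference(doc, b"\x01\x02")
db.add_source(doc, FileSource("exampleSource", (("id", 7),)))
db.add_source(doc, FileSource("exampleSource", (("id", 7),)))  # duplicate, ignored
```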
Later, when a file is used (for example when sending or downloading a media file), and a FILE_REFERENCE_EXPIRED RPC error is received for a specific FileId, the client must fetch the List<FileSource> associated to the used FileId, and execute the method call associated to each FileSource, until either:
The file reference associated to the FileId is updated with a new value, in which case the failed method call is retried with the new file reference, OR
All file sources have been tried without obtaining a new file reference.
fileReferenceMap#4dba8c2c db_schema:string db_schema_json:string locations:Vector<Location> sources:Vector<Source> skipped:Vector<SkippedSource> actions:Vector<Action> = FileReferenceMap;
locationIncoming#e18770f4 predicate:string stored_constructor:string = Location;
locationOutgoing#e7740ae0 predicate:string stored_constructor:string = Location;
source#dd27c696 flags:# predicate:string is_constructor:flags.0?true stored_constructor:string stored_params:Vector<FieldExtractor> skipped_flags:Vector<string> needs_parent:flags.3?string parent_is_constructor:flags.4?true = Source;
skippedSource#1d53cd15 flags:# predicate:string is_constructor:flags.0?true why:string = SkippedSource;
action#9539f410 stored_constructor:string action:ActionOp = Action;
The map file is composed of a single fileReferenceMap constructor, which contains:
The db_schema and db_schema_json fields with the TL database schema of FileIds and FileSources.
db_schema is the text TL schema; db_schema_json is the same TL schema in the JSON format also used for the API schema ».
A list of Location constructors, containing the full list of constructors that have a file_reference field, and info on how to generate a FileId from them.
A list of source constructors, containing instructions on how to generate FileSources from received constructors or method responses.
A list of action constructors, containing the method call to invoke when refreshing a file reference using a previously stored FileId and FileSource.
A list of skippedSource constructors.
These constructors indicate that the source should be ignored completely (including during codegen): skippedSource sources are used internally to make sure all file reference paths are still covered in some way during validation, including paths for ephemeral media like inline results, or for media without any associated source (for example, media uploaded using messages.uploadMedia but not yet sent anywhere obviously does not have any associated source).
skippedSource.predicate - Indicates the name of the ignored constructor or method.
skippedSource.is_constructor - If set, predicate points to a constructor, otherwise it points to a method.
skippedSource.why - A human-readable reason as to why this source should be ignored.
Each skipped predicate can have one associated reason.
A file ID (in the context of the file reference DB) is an object that, when serialized to a string, can easily and uniquely identify a media file.
Currently, the following file ID objects are available (taken from the DB schema inside of the map file):
fileIdPhoto#47a0bd49 id:long = FileId;
fileIdDocument#461b1d89 id:long = FileId;
Location extractors » specify how to generate a file ID object from all media objects.
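When a simple string => string KV DB is used, a file ID can be turned into a key by serializing it with the usual TL binary layout: the 32-bit constructor ID followed by the 64-bit id, both little-endian. The sketch below assumes that layout and a hex-encoded key; the helper names are hypothetical.

```python
import struct

# Constructor IDs taken from the DB schema above.
FILE_ID_CONSTRUCTORS = {
    "fileIdPhoto": 0x47A0BD49,
    "fileIdDocument": 0x461B1D89,
}


def serialize_file_id(constructor: str, media_id: int) -> bytes:
    """TL-serialize a FileId: uint32 constructor ID + int64 id, little-endian."""
    return struct.pack("<Iq", FILE_ID_CONSTRUCTORS[constructor], media_id)


def file_id_key(constructor: str, media_id: int) -> str:
    """Hex string usable as a key in a string => string KV store."""
    return serialize_file_id(constructor, media_id).hex()


photo_key = file_id_key("fileIdPhoto", 1)
document_key = file_id_key("fileIdDocument", 1)
```

Because the constructor ID is part of the key, a photo and a document sharing the same numeric id never collide.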
A file reference is a string of bytes that can be encountered in the file_reference fields of:
Incoming media constructors, such as document and photo objects.
A file ID can be generated from all incoming media constructors, as specified by the location extractors ».
Outgoing media constructors like inputDocument, inputPhoto, etc, where it must be populated by the client in order to download or resend all media.
A file ID can be generated from all outgoing media constructors, as specified by the location objects ».
A file reference may expire, in which case it cannot be used in outgoing constructors: it must be refreshed by refetching the message, story, etc where the media last appeared, as specified below.
Location extractors » specify how to generate FileId constructors and... from both incoming and outgoing media objects.
Locations are used to populate the file reference dictionary », which maps a file ID to exactly one file reference.
locationIncoming#e18770f4 predicate:string stored_constructor:string = Location;
locationOutgoing#e7740ae0 predicate:string stored_constructor:string = Location;
Locations contain the full list of constructors (never methods) that have a file_reference and an id field.
The id field of predicate must be used to generate a FileId object of type stored_constructor.
locationIncoming references constructors that can only be received from the API.
locationOutgoing references constructors that can only be sent to the API.
Params:
predicate - The name of the API constructor.
stored_constructor - The name of the FileId object to be used for database reads/writes.
Each unique predicate can only have one stored_constructor.
For example, current API layers have the following locations:
fileIdPhoto (outgoing)
fileIdDocument (outgoing)
fileIdDocument (outgoing)
fileIdPhoto (outgoing)
fileIdDocument (incoming)
fileIdPhoto (incoming)
When encountering one of the objects on the left, use the id field to generate a FileId of the type on the right.
When refreshing references, replace the file_reference field of the object on the left.
(Legacy inputFileLocation constructors are ignored by the current map file).
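A minimal sketch of how locations might be applied, assuming TL objects are represented as dictionaries with the predicate under the `_` key. The four entries in `LOCATIONS` are illustrative only (they use constructors mentioned earlier in this document); the authoritative list should be read from the map file. All function names are hypothetical.

```python
# Illustrative entries only; the real list comes from fileReferenceMap.locations.
LOCATIONS = [
    {"_": "locationIncoming", "predicate": "document", "stored_constructor": "fileIdDocument"},
    {"_": "locationIncoming", "predicate": "photo", "stored_constructor": "fileIdPhoto"},
    {"_": "locationOutgoing", "predicate": "inputDocument", "stored_constructor": "fileIdDocument"},
    {"_": "locationOutgoing", "predicate": "inputPhoto", "stored_constructor": "fileIdPhoto"},
]

INCOMING = {l["predicate"]: l["stored_constructor"] for l in LOCATIONS if l["_"] == "locationIncoming"}
OUTGOING = {l["predicate"]: l["stored_constructor"] for l in LOCATIONS if l["_"] == "locationOutgoing"}


def record_incoming(obj: dict, references: dict):
    """Store the file reference of a received media object; returns its FileId or None."""
    stored = INCOMING.get(obj["_"])
    if stored is None:
        return None
    file_id = (stored, obj["id"])
    references[file_id] = obj["file_reference"]
    return file_id


def fill_outgoing(obj: dict, references: dict) -> dict:
    """Return a copy of an outgoing media object carrying the latest known file reference."""
    stored = OUTGOING.get(obj["_"])
    fresh = references.get((stored, obj["id"])) if stored else None
    return obj if fresh is None else {**obj, "file_reference": fresh}


references = {}
record_incoming({"_": "document", "id": 1001, "file_reference": b"new"}, references)
outgoing = fill_outgoing(
    {"_": "inputDocument", "id": 1001, "access_hash": 5, "file_reference": b"old"}, references
)
```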
A file source contains information that the client may use to re-fetch the document (and the new file reference), by invoking a specific action.
Each source type is associated to a specific action ».
Location extractors » and sources » are used together to populate the file source dictionary », which maps a file ID to one or more file sources.
Schema:
source#dd27c696 flags:# predicate:string is_constructor:flags.0?true stored_constructor:string stored_params:Vector<FieldExtractor> skipped_flags:Vector<string> needs_parent:flags.3?string parent_is_constructor:flags.4?true = Source;
A source contains instructions on how to extract and store file sources from an incoming method response or update.
Note: The following sections assume that all Updates constructors have already been converted to a vector of Update constructors, including short variants like updateShortMessage, updateShortSentMessage, updateShortChatMessage, which must be pre-converted to Update constructors by the client using information extracted from the method call.
While this operation could be done within the map file, it would needlessly increase the number of paths: given that most clients already convert short constructors to Update constructors, the map file only considers paths starting from the Update constructors.
Parameters:
predicate - Indicates the name of the constructor or of the method where extraction of the fields must start.
is_constructor - If set, predicate points to a constructor, otherwise it points to a method.
needs_parent - If set, contains the name of a constructor which needs to appear as a parent in the deserialized object or a method whose response we're deserializing, as it will be used by one or more of the paths in the action.
parent_is_constructor - If set, needs_parent points to a constructor; otherwise it points to a method.
stored_constructor - Contains the name of a FileSource constructor from the TL schema specified in fileReferenceMap.db_schema, which should be stored to the database after being populated by stored_params as specified below.
stored_params - Contains info on how to populate the FileSource constructor specified in stored_constructor.
skipped_flags - In some cases, some flag fields of stored_constructor aren't referenced by action: this field contains the full list of flags which are never referenced by the current source and must be unset.
Each unique predicate can have one or more stored_constructors with different params.
If the is_constructor flag is set, when deserializing a constructor with predicate equal to predicate, contained in an incoming Update or at any depth in a method call response, do the following:
1. When starting deserialization of the constructor, push a new FileSource of type stored_constructor to the file source stack.
Do not push the FileSource if needs_parent is set but we don't have a constructor or method of type needs_parent in our parents.
2. While deserializing the constructor, associate all encountered FileIds (generated according to the locations schema) to all sources of all types currently on the stack.
3. When finishing deserialization of the constructor, pop the FileSource of type stored_constructor, make sure it can be processed according to stored_params (all required flag fields are set, all paths can be correctly extracted), and if yes, populate the FileSource of type stored_constructor accordingly and commit the source to the database (using the FileIds associated at step 2 as keys, and stored_constructor as value).
Skip this step if needs_parent is set but we don't have a constructor or method of type needs_parent in our parents.
If the is_constructor flag is not set, when deserializing the response of the method with name equal to predicate, do the following:
1. When starting deserialization of the response, push a new FileSource of type stored_constructor to the file source stack.
2. While deserializing the response, associate all encountered FileIds (generated according to the locations schema) to all sources of all types currently on the stack.
3. When finishing deserialization of the response, pop the FileSource of type stored_constructor, make sure it can be processed according to stored_params (all required flag fields are set, all paths can be correctly extracted), and if yes, populate the FileSource of type stored_constructor accordingly and commit the source to the database (using the FileIds associated at step 2 as keys, and stored_constructor as value).
Note that method sources cannot make use of needs_parent.
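The stack-based procedure for constructor sources could look roughly like the sketch below. It is a simplification under several assumptions: TL objects are plain dictionaries with the predicate under `_`, the walk happens after deserialization instead of being hooked into the parser, `populate` is a hypothetical callback standing in for stored_params (returning None when extraction must be aborted), and `exampleMessageSource` is a made-up FileSource name.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

FileId = Tuple[str, int]
# Illustrative locations: incoming predicate -> FileId constructor.
LOCATIONS = {"document": "fileIdDocument", "photo": "fileIdPhoto"}


class Source:
    def __init__(self, predicate: str, stored_constructor: str,
                 populate: Callable[[dict], Optional[dict]],
                 needs_parent: Optional[str] = None) -> None:
        self.predicate = predicate
        self.stored_constructor = stored_constructor
        self.populate = populate
        self.needs_parent = needs_parent


def walk(node: Any, sources: Dict[str, List[Source]], db: Dict[FileId, list],
         parents: Tuple[str, ...] = (), stack: Optional[list] = None) -> None:
    stack = [] if stack is None else stack
    if isinstance(node, list):
        for item in node:
            walk(item, sources, db, parents, stack)
        return
    if not isinstance(node, dict):
        return
    predicate = node.get("_")
    # Step 1: push a pending source, unless needs_parent is set and not among the parents.
    pushed = 0
    for source in sources.get(predicate, []):
        if source.needs_parent is None or source.needs_parent in parents:
            stack.append((source, node, set()))
            pushed += 1
    # Step 2: associate a FileId found here with every source currently on the stack.
    if predicate in LOCATIONS and "id" in node:
        for _, _, file_ids in stack:
            file_ids.add((LOCATIONS[predicate], node["id"]))
    for value in node.values():
        walk(value, sources, db, parents + (predicate,), stack)
    # Step 3: pop what this node pushed, populate it and commit it under each FileId.
    for _ in range(pushed):
        source, root, file_ids = stack.pop()
        params = source.populate(root)
        if params is None:
            continue
        record = (source.stored_constructor, params)
        for file_id in sorted(file_ids):
            if record not in db.setdefault(file_id, []):
                db[file_id].append(record)


message_source = Source("message", "exampleMessageSource", lambda m: {"id": m["id"]})
update = {"_": "updateNewMessage", "message": {
    "_": "message", "id": 7,
    "media": {"_": "messageMediaDocument",
              "document": {"_": "document", "id": 1001, "file_reference": b"ref"}}}}
source_db: Dict[FileId, list] = {}
walk(update, {"message": [message_source]}, source_db)
```

Method sources would follow the same pattern, with the push happening when deserialization of the method response starts.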
// Extractor
extractAndStore#72069549 from:Path to:string = FieldExtractor;
extractInputStickerSetFromDocumentAttributesAndStore#369d8d14 from:Path to:string = FieldExtractor;
extractInputStickerSetFromStickerSetAndStore#c167d470 from:Path to:string = FieldExtractor;
extractPeerIdFromPeerAndStore#7d33019c from:Path to:string = FieldExtractor;
extractPeerIdFromInputPeerAndStore#a51acfb4 from:Path to:string = FieldExtractor;
extractChannelIdFromChannelAndStore#5675bc97 from:Path to:string = FieldExtractor;
extractChannelIdFromInputChannelAndStore#b662660e from:Path to:string = FieldExtractor;
extractUserIdFromUserAndStore#4778ec63 from:Path to:string = FieldExtractor;
extractUserIdFromInputUserAndStore#7720aa2e from:Path to:string = FieldExtractor;
// Paths
paramNotFlag#acd9d5cf = ParamFlag;
paramIsFlagAbortIfEmpty#f8fe9fee = ParamFlag;
paramIsFlagFallback#202b77a1 fallback:TypedOp = ParamFlag;
paramIsFlagPassthrough#1dc6e17d = ParamFlag;
pathPart#8dc6ff46 constructor:string param:string flag:ParamFlag = PathPart;
path#c3586a2 parts:Vector<PathPart> = Path;
pathParent#58f13684 parts:Vector<PathPart> = Path;
Field extractors are used to extract a parameter from one or more constructors, e.g. updateStory.story -> storyItem.media -> messageMediaDocument.document -> document.file_reference, and store it to the specified field of a FileSource.
extractAndStore#72069549 from:Path to:string = FieldExtractor;
extractInputStickerSetFromDocumentAttributesAndStore#369d8d14 from:Path to:string = FieldExtractor;
extractInputStickerSetFromStickerSetAndStore#c167d470 from:Path to:string = FieldExtractor;
extractPeerIdFromPeerAndStore#7d33019c from:Path to:string = FieldExtractor;
extractPeerIdFromInputPeerAndStore#a51acfb4 from:Path to:string = FieldExtractor;
extractChannelIdFromChannelAndStore#5675bc97 from:Path to:string = FieldExtractor;
extractChannelIdFromInputChannelAndStore#b662660e from:Path to:string = FieldExtractor;
extractUserIdFromUserAndStore#4778ec63 from:Path to:string = FieldExtractor;
extractUserIdFromInputUserAndStore#7720aa2e from:Path to:string = FieldExtractor;
extractAndStore - Extracts the value at the specified path and stores it to $FileSource.$to.
extractInputStickerSetFromDocumentAttributesAndStore - Takes the Vector<DocumentAttribute> at from, looks for a documentAttributeSticker, and stores the InputStickerSet contained in documentAttributeSticker.stickerset into $FileSource.$to.
Aborts if there is no attribute of type documentAttributeSticker in the passed vector.
extractInputStickerSetFromStickerSetAndStore - Takes the stickerSet at from, transforms it into an InputStickerSet, and stores it into $FileSource.$to.
extractPeerIdFromPeerAndStore - Takes the Peer at from, transforms it into a bot API peer ID (of type long) », and stores it into $FileSource.$to.
extractPeerIdFromInputPeerAndStore - Takes the InputPeer at from, transforms it into a bot API peer ID (of type long) », and stores it into $FileSource.$to.
If from points to an inputPeerSelf, stores the ID of the current user instead.
Aborts if from points to an inputPeerEmpty.
extractChannelIdFromChannelAndStore - Takes the channel at from, takes the id field and stores it into $FileSource.$to.
extractChannelIdFromInputChannelAndStore - Takes the InputChannel at from, takes the channel_id field and stores it into $FileSource.$to.
Aborts if from points to an inputChannelEmpty.
extractUserIdFromUserAndStore - Takes the user at from, takes the id field and stores it into $FileSource.$to.
extractUserIdFromInputUserAndStore - Takes the InputUser at from, takes the user_id field and stores it into $FileSource.$to.
If from points to an inputUserSelf, stores the ID of the current user instead.
Aborts if from points to an inputUserEmpty.
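Two of the extractors above can be sketched as follows, assuming the value at from has already been resolved and TL objects are dictionaries with the predicate under `_`. The function names and the `AbortExtraction` exception are hypothetical. The bot API peer ID conversion uses the commonly documented convention (user IDs unchanged, basic group IDs negated, channel IDs mapped to -(1000000000000 + id)); treat it as an assumption and check it against the linked peer ID documentation.

```python
class AbortExtraction(Exception):
    """Raised when an extractor must abort and the source is not stored."""


def extract_input_sticker_set_from_document_attributes(attributes: list, source: dict, to: str) -> None:
    """Store documentAttributeSticker.stickerset into source[to], or abort if absent."""
    for attribute in attributes:
        if attribute["_"] == "documentAttributeSticker":
            source[to] = attribute["stickerset"]
            return
    raise AbortExtraction("no documentAttributeSticker in the passed vector")


CHANNEL_ID_OFFSET = 1_000_000_000_000  # assumed bot API convention for channels


def extract_peer_id_from_input_peer(peer: dict, source: dict, to: str, self_id: int) -> None:
    """Store the bot API peer ID of an InputPeer into source[to]."""
    kind = peer["_"]
    if kind == "inputPeerEmpty":
        raise AbortExtraction("inputPeerEmpty cannot be stored")
    if kind == "inputPeerSelf":
        source[to] = self_id
    elif kind in ("inputPeerUser", "inputPeerUserFromMessage"):
        source[to] = peer["user_id"]
    elif kind == "inputPeerChat":
        source[to] = -peer["chat_id"]
    elif kind in ("inputPeerChannel", "inputPeerChannelFromMessage"):
        source[to] = -(CHANNEL_ID_OFFSET + peer["channel_id"])
    else:
        raise AbortExtraction(f"unsupported InputPeer constructor {kind}")


stored = {}
extract_input_sticker_set_from_document_attributes(
    [{"_": "documentAttributeFilename", "file_name": "a.webp"},
     {"_": "documentAttributeSticker", "alt": "x",
      "stickerset": {"_": "inputStickerSetID", "id": 5, "access_hash": 9}}],
    stored, "stickerset")
extract_peer_id_from_input_peer({"_": "inputPeerChannel", "channel_id": 123, "access_hash": 1},
                                stored, "peer", self_id=42)
```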
The first part of the path always points to:
source (source.predicate), for path
source.needs_parent, for pathParent
A path is composed of multiple pathParts.
Each pathPart contains the following fields, which describe how to extract the field.
constructor - Indicates the required constructor/method predicate. If a different constructor type is encountered (e.g. documentEmpty instead of document), abort extraction.
param - Indicates the required parameter; if it's an empty string, it indicates the return value of a method.
flag - Contains exactly one of the following constructors:
paramNotFlag - The current parameter is not a flag.
paramIsFlagAbortIfEmpty - The current parameter is a flag, and if it's not set, abort extraction.
paramIsFlagFallback - The current parameter is a flag, and if it's not set, use the specified TypedOp as fallback value. copyOp, getInputChannelByIdOp, getInputUserByIdOp and getInputPeerByIdOp are not allowed in this context.
paramIsFlagPassthrough - The current parameter is a flag, and its value should be copied/returned verbatim; can only be used on the last element of a path and within the arguments of a constructorOp/callOp/getMessageOp only if the argument that uses this path is a flag of the same type.
Actions are used to refresh an expired file reference of a media file according to the file sources » stored in the file source dictionary », associated to the FileId.
A file_reference field may appear along many paths, for example updateNewMessage.message -> message.media -> messageMediaDocument.document -> document.file_reference. For this path, the action is getMessage{peer: updateNewMessage.message.peer, id: updateNewMessage.message.id}, which can be used to refetch the document using either messages.getMessages or channels.getMessages depending on the type of the Peer.
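The pathPart rules (constructor check and flag handling) can be illustrated with the following sketch, which follows a path through a TL object represented as nested dictionaries. It is a simplified reading: only constructor parts are handled (not the empty-string param for method return values), fallbacks are limited to literal ops carrying a value, and the choice of flags in the example path is illustrative. `follow_path`, `literal_fallback` and `AbortExtraction` are hypothetical names.

```python
class AbortExtraction(Exception):
    """Raised when a path cannot be followed and extraction must be aborted."""


def literal_fallback(typed_op: dict):
    """Evaluate a fallback TypedOp; this sketch only supports literal ops with a value."""
    return typed_op["op"]["value"]


def follow_path(root: dict, parts: list):
    current = root
    for index, part in enumerate(parts):
        # A different constructor than the required one aborts extraction.
        if not isinstance(current, dict) or current.get("_") != part["constructor"]:
            raise AbortExtraction(f"expected {part['constructor']}")
        flag = part["flag"]
        if flag["_"] == "paramNotFlag":
            current = current[part["param"]]
        elif current.get(part["param"]) is not None:
            current = current[part["param"]]  # flag is set, keep walking
        elif flag["_"] == "paramIsFlagAbortIfEmpty":
            raise AbortExtraction(f"{part['param']} is not set")
        elif flag["_"] == "paramIsFlagFallback":
            current = literal_fallback(flag["fallback"])
        elif flag["_"] == "paramIsFlagPassthrough" and index == len(parts) - 1:
            current = None  # unset flag is passed through verbatim
        else:
            raise ValueError("unsupported flag usage")
    return current


NOT_FLAG = {"_": "paramNotFlag"}
ABORT = {"_": "paramIsFlagAbortIfEmpty"}
parts = [
    {"constructor": "updateNewMessage", "param": "message", "flag": NOT_FLAG},
    {"constructor": "message", "param": "media", "flag": ABORT},
    {"constructor": "messageMediaDocument", "param": "document", "flag": ABORT},
    {"constructor": "document", "param": "file_reference", "flag": NOT_FLAG},
]
update = {"_": "updateNewMessage", "message": {
    "_": "message", "id": 7,
    "media": {"_": "messageMediaDocument",
              "document": {"_": "document", "id": 1001, "file_reference": b"ref"}}}}
extracted = follow_path(update, parts)
```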
action#9539f410 stored_constructor:string action:ActionOp = Action;
// Actions
callOp#c2ff3383 method:string args:Vector<TypedOpArg> = ActionOp;
getMessageOp#237849e8 peer:TypedOp id:TypedOp from_scheduled:TypedOp quick_reply_shortcut_id:TypedOp = ActionOp;
// For string => TypedOp dictionaries
typedOpArg#3a2930c2 key:string value:TypedOp = TypedOpArg;
// Typed constructors, the type is specified to simplify codegen,
// but isn't strictly necessary as it can be inferred from the TypedOpOp.
// It is fully pre-validated during the generation of the definition file.
typedOp#705b10ec type:string op:TypedOpOp = TypedOp;
Actions are used to refresh an expired file reference of a media file according to the FileSources stored in the file source dictionary », associated to the FileId.
Params:
stored_constructor - The name of an object of type FileSource, specified in the db_schema.
action - The action to execute when refreshing a file source of type stored_constructor.
There can only be one action per constructor.
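Putting the file source table and actions together, the refresh procedure triggered by a FILE_REFERENCE_EXPIRED error might be sketched as below. This is an assumption-laden illustration: `FileReferenceExpired` is a stand-in for the client's RPC error type, `run_action` is a hypothetical callback that executes the action of a stored source (its response is expected to pass through the location extractors and update the reference table), and the loop simply gives up once every source has been tried.

```python
class FileReferenceExpired(Exception):
    """Stand-in for a FILE_REFERENCE_EXPIRED RPC error."""


def refresh_file_reference(file_id, references: dict, sources: dict, run_action):
    """Try each stored source until the reference for file_id changes; None if none worked."""
    previous = references.get(file_id)
    for source in list(sources.get(file_id, [])):
        run_action(source)
        current = references.get(file_id)
        if current is not None and current != previous:
            return current
    return None


def call_with_refresh(invoke, file_id, references: dict, sources: dict, run_action):
    """Invoke a call using the stored reference, refreshing and retrying once on expiry."""
    try:
        return invoke(references[file_id])
    except FileReferenceExpired:
        fresh = refresh_file_reference(file_id, references, sources, run_action)
        if fresh is None:
            raise
        return invoke(fresh)


# Toy setup: the single stored source yields a new reference when its action runs.
doc_id = ("fileIdDocument", 1001)
references = {doc_id: b"old"}
sources = {doc_id: [{"_": "exampleSource", "id": 7}]}
attempts = []


def run_action(source):
    references[doc_id] = b"new"


def download(reference):
    attempts.append(reference)
    if reference == b"old":
        raise FileReferenceExpired()
    return "ok"


result = call_with_refresh(download, doc_id, references, sources, run_action)
```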
The arguments of the action are composed of a set of typedOp constructors.
typedOp » is a wrapper for a TypedOpOp constructor which also contains the TL type of the associated TypedOpOp; this isn't strictly necessary for evaluation, but it can be useful during automatic code generation from the definition file.
callOp - A generic action which invokes the method specified in method with the arguments specified in args.
callOp.args will always contain at least all of the required parameters, and possibly some flagged parameters as well.
getMessageOp - A specialized action which invokes either messages.getMessages or channels.getMessages depending on the type of the peer (which is always a subtype of InputPeer), passing the id as the only element of the vector id parameter.
If the flag to which from_scheduled points is set, messages.getScheduledMessages should be executed instead.
If the flag to which quick_reply_shortcut_id points is set, messages.getQuickReplyMessages should be executed instead, passing the value of the quick_reply_shortcut_id flag to messages.getQuickReplyMessages.shortcut_id and the id as the only element of a vector, passed to messages.getQuickReplyMessages.id.
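The method selection performed by getMessageOp can be sketched as a small dispatch function. The sketch only decides which method to call and keeps the arguments abstract: converting the peer and id into the exact argument types of each method is left to the client. The precedence between the two flags when both are set is not specified above, so checking quick_reply_shortcut_id first is an arbitrary choice here, and `plan_get_message` is a hypothetical name.

```python
from typing import Optional, Tuple

# InputPeer constructors assumed to refer to channels/supergroups.
CHANNEL_PEERS = ("inputPeerChannel", "inputPeerChannelFromMessage")


def plan_get_message(peer: dict, msg_id: int,
                     from_scheduled: Optional[bool] = None,
                     quick_reply_shortcut_id: Optional[int] = None) -> Tuple[str, dict]:
    """Return (method name, abstract arguments) for a getMessageOp action."""
    if quick_reply_shortcut_id is not None:
        return "messages.getQuickReplyMessages", {
            "shortcut_id": quick_reply_shortcut_id, "id": [msg_id]}
    if from_scheduled:
        return "messages.getScheduledMessages", {"peer": peer, "id": [msg_id]}
    if peer["_"] in CHANNEL_PEERS:
        return "channels.getMessages", {"peer": peer, "id": [msg_id]}
    return "messages.getMessages", {"peer": peer, "id": [msg_id]}


user_peer = {"_": "inputPeerUser", "user_id": 42, "access_hash": 1}
channel_peer = {"_": "inputPeerChannel", "channel_id": 123, "access_hash": 1}
plain = plan_get_message(user_peer, 7)
channel = plan_get_message(channel_peer, 7)
scheduled = plan_get_message(user_peer, 7, from_scheduled=True)
quick = plan_get_message(user_peer, 7, quick_reply_shortcut_id=3)
```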
copyOp#f48f418f from:string = TypedOpOp;
getInputChannelByIdOp#3cb47531 from:string = TypedOpOp;
getInputUserByIdOp#c0ee4326 from:string = TypedOpOp;
getInputPeerByIdOp#19813750 from:string = TypedOpOp;
// Literals & constructors (methods not allowed or needed here)
constructorOp#107f8d8a constructor:string args:Vector<TypedOpArg> = TypedOpOp;
vectorOp#f8fb8f72 values:Vector<TypedOp> = TypedOpOp;
intLiteralOp#cbfabe7c value:int = TypedOpOp;
longLiteralOp#d08b8d3a value:long = TypedOpOp;
stringLiteralOp#2b56ea8e value:string = TypedOpOp;
bytesLiteralOp#fdb395a4 value:bytes = TypedOpOp;
boolLiteralOp#37e07911 value:Bool = TypedOpOp;
doubleLiteralOp#3651e3bf value:double = TypedOpOp;
themeFormatLiteralOp#8e4f9208 = TypedOpOp;
Action parameters are represented by TypedOpOp constructors.
copyOp - The most commonly used type, copies the value(s) from the stored $FileSource.$from.
getInputChannelByIdOp - Returns an InputChannel constructor from the client's peer database, based on the channel ID of type long specified in the stored $FileSource.$from.
getInputUserByIdOp - Returns an InputUser constructor from the client's peer database, based on the user ID of type long specified in the stored $FileSource.$from.
getInputPeerByIdOp - Returns an InputPeer constructor from the client's peer database, based on the bot API peer ID » of type long specified in the stored $FileSource.$from.
constructorOp - Constructs the constructor of type (predicate) constructor using the arguments specified in args.
vectorOp - Constructs a vector of the constructors passed in values.
intLiteralOp - Constructs a literal int with the value passed in value.
longLiteralOp - Constructs a literal long with the value passed in value.
stringLiteralOp - Constructs a literal string with the value passed in value.
bytesLiteralOp - Constructs a literal bytes with the value passed in value.
boolLiteralOp - Constructs a literal Bool with the value passed in value.
doubleLiteralOp - Constructs a literal double with the value passed in value.
themeFormatLiteralOp - Constructs a string, indicating the theming engines supported by the client (used when working with theme-related media, can be an empty string if the client doesn't support themes).
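A small interpreter for these operations, evaluated against a stored FileSource, might look like the sketch below. It assumes the JSON-like representation of the map file (predicate under `_`) and a stored source represented as a dictionary; `evaluate`, the `lookups` callbacks for the peer database and the `exampleSource` name are hypothetical.

```python
def evaluate(typed_op: dict, source: dict, lookups: dict, theme_format: str = ""):
    """Evaluate a typedOp against a stored FileSource and return the resulting value."""
    op = typed_op["op"]
    kind = op["_"]
    if kind == "copyOp":
        return source[op["from"]]
    if kind in ("getInputChannelByIdOp", "getInputUserByIdOp", "getInputPeerByIdOp"):
        # lookups maps the op name to a client-provided peer database lookup.
        return lookups[kind](source[op["from"]])
    if kind == "constructorOp":
        built = {"_": op["constructor"]}
        for arg in op["args"]:
            built[arg["key"]] = evaluate(arg["value"], source, lookups, theme_format)
        return built
    if kind == "vectorOp":
        return [evaluate(value, source, lookups, theme_format) for value in op["values"]]
    if kind == "themeFormatLiteralOp":
        return theme_format
    if kind.endswith("LiteralOp"):
        return op["value"]
    raise ValueError(f"unknown TypedOpOp {kind}")


def typed(type_name: str, op: dict) -> dict:
    """Helper building a typedOp wrapper."""
    return {"_": "typedOp", "type": type_name, "op": op}


# A vector holding one inputMessageID built from the stored id field.
arg = typed("Vector<InputMessage>", {"_": "vectorOp", "values": [
    typed("InputMessage", {"_": "constructorOp", "constructor": "inputMessageID", "args": [
        {"_": "typedOpArg", "key": "id",
         "value": typed("int", {"_": "copyOp", "from": "id"})}]})]})
stored_source = {"_": "exampleSource", "id": 7, "channel_id": 123}
value = evaluate(arg, stored_source, lookups={})
channel = evaluate(
    typed("InputChannel", {"_": "getInputChannelByIdOp", "from": "channel_id"}),
    stored_source,
    lookups={"getInputChannelByIdOp": lambda cid: {"_": "inputChannel", "channel_id": cid, "access_hash": 0}},
)
```

The type field of typedOp is ignored during evaluation here, in line with the note that it mainly helps code generation.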