Interface StorageTransportExtension

    • Method Summary

      Modifier and Type Method Description
      StorageTransportConfiguration getStorageConfiguration()
      void initialize​(java.lang.String jobId, org.apache.spark.SparkConf conf, boolean isOnDriver)
      Initializes the instance of this class after it has been created.
      void onAllObjectsPersisted​(long objectsCount, long rowCount, long elapsedMillis)
      Notifies the extension that all the objects have been persisted to the cloud storage successfully.
      void onImportFailed​(java.lang.String clusterId, java.lang.Throwable cause)
      Notifies the CoordinatedTransportExtension implementation that it failed to import objects into the cluster.
      void onImportSucceeded​(java.lang.String clusterId, long elapsedMillis)
      Notifies the CoordinatedTransportExtension implementation that all objects have been imported into the cluster.
      void onJobFailed​(long elapsedMillis, java.lang.Throwable throwable)
      Notifies the extension that the job has failed with exception throwable.
      void onJobSucceeded​(long elapsedMillis)
      Notifies the extension that the job has completed successfully.
      void onObjectApplied​(java.lang.String bucket, java.lang.String key, long sizeInBytes, long elapsedMillis)
      Notifies the extension that the object identified by the bucket and key has been applied, meaning the SSTables included in the object have been imported into Cassandra and satisfy the desired consistency level.
      void onObjectPersisted​(java.lang.String bucket, java.lang.String key, long sizeInBytes)
      Notifies the extension that the object identified by the bucket and key has been successfully persisted to the blob store.
      void onStageFailed​(java.lang.String clusterId, java.lang.Throwable cause)
      Notifies the CoordinatedTransportExtension implementation that it failed to stage objects on the cluster.
      void onStageSucceeded​(java.lang.String clusterId, long elapsedMillis)
      Notifies the CoordinatedTransportExtension implementation that all objects have been staged on the cluster.
      void onTransportStart​(long elapsedMillis)
      Notifies the extension that data transport has been started.
      void setCoordinationSignalListener​(CoordinationSignalListener listener)
      Sets the CoordinationSignalListener to receive coordination signals from the CoordinatedTransportExtension implementation.
      void setCredentialChangeListener​(CredentialChangeListener credentialChangeListener)
      Sets the CredentialChangeListener to listen for token changes.
      void setObjectFailureListener​(ObjectFailureListener objectFailureListener)
      Sets the ObjectFailureListener to listen for object failures.
    • Method Detail

      • initialize

        void initialize​(java.lang.String jobId,
                        org.apache.spark.SparkConf conf,
                        boolean isOnDriver)
        Initializes the instance of this class after it has been created. The implementation can behave differently depending on whether it is running on the Spark driver or an executor.
        Parameters:
        jobId - the unique identifier for the job. It is either supplied by the customer with WriterOptions.JOB_ID or, if no jobId is supplied, a unique id string generated by the job on startup.
        conf - the Spark configuration
        isOnDriver - indicates whether the runtime is the Spark driver or an executor
      • onObjectPersisted

        void onObjectPersisted​(java.lang.String bucket,
                               java.lang.String key,
                               long sizeInBytes)
        Notifies the extension that the object identified by the bucket and key has been successfully persisted to the blob store. This method will be called from each task during the job execution.
        Parameters:
        bucket - the bucket to which the file was written
        key - the key to the object written
        sizeInBytes - the size of the object, in bytes
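        Because this callback is invoked from each task, an implementation may see concurrent calls within one JVM. The following is a minimal sketch of a thread-safe tally, not part of the library: the class name PersistedObjectStats and its accessor methods are hypothetical, and the class only mirrors the onObjectPersisted signature rather than implementing the full interface. The totals are local to the JVM that receives the callbacks.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical helper, not part of the library. A real extension would declare
// "implements StorageTransportExtension" and provide the remaining methods.
class PersistedObjectStats {
    // LongAdder tolerates concurrent updates from several task threads.
    private final LongAdder objectCount = new LongAdder();
    private final LongAdder totalBytes = new LongAdder();

    // Mirrors the documented onObjectPersisted signature.
    public void onObjectPersisted(String bucket, String key, long sizeInBytes) {
        objectCount.increment();
        totalBytes.add(sizeInBytes);
    }

    // Number of objects reported to this instance so far.
    public long objectCount() {
        return objectCount.sum();
    }

    // Sum of the reported object sizes, in bytes.
    public long totalBytes() {
        return totalBytes.sum();
    }
}
```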
      • onTransportStart

        void onTransportStart​(long elapsedMillis)
        Notifies the extension that data transport has been started. This method will be called from the driver.
        Parameters:
        elapsedMillis - the elapsed time from the start of the bulk write run until this step for the job in milliseconds
      • onAllObjectsPersisted

        void onAllObjectsPersisted​(long objectsCount,
                                   long rowCount,
                                   long elapsedMillis)
        Notifies the extension that all the objects have been persisted to the cloud storage successfully. This method is called from the driver when all executor tasks complete.
        Parameters:
        objectsCount - the total count of objects persisted
        rowCount - the total count of rows persisted
        elapsedMillis - the elapsed time from the start of the bulk write run until this step for the job in milliseconds
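        As an illustration of how the three arguments relate, the sketch below derives two simple ratios from them. The class PersistSummary and its summarize method are hypothetical names used only for this example. Since elapsedMillis is measured from the start of the bulk write run, the rate covers the whole run up to this step, not only the upload phase.

```java
import java.util.Locale;

// Hypothetical helper, not part of the library.
class PersistSummary {
    // Builds a one-line summary from the onAllObjectsPersisted arguments.
    static String summarize(long objectsCount, long rowCount, long elapsedMillis) {
        // Guard against division by zero for empty or instantaneous runs.
        double rowsPerObject = objectsCount == 0 ? 0.0 : (double) rowCount / objectsCount;
        double rowsPerSecond = elapsedMillis == 0 ? 0.0 : rowCount * 1000.0 / elapsedMillis;
        return String.format(Locale.ROOT,
                "objects=%d rows=%d rowsPerObject=%.1f rowsPerSecond=%.1f",
                objectsCount, rowCount, rowsPerObject, rowsPerSecond);
    }
}
```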
      • onObjectApplied

        void onObjectApplied​(java.lang.String bucket,
                             java.lang.String key,
                             long sizeInBytes,
                             long elapsedMillis)
        Notifies the extension that the object identified by the bucket and key has been applied, meaning the SSTables included in the object have been imported into Cassandra and satisfy the desired consistency level.
        The notification is only emitted once per object and as soon as the consistency level is satisfied.
        Parameters:
        bucket - the bucket that the object belongs to
        key - the object key
        sizeInBytes - the size of the object in bytes
        elapsedMillis - the elapsed time from the start of the bulk write run until this step for the job in milliseconds
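        The once-per-object guarantee means an implementation can key its bookkeeping on the bucket and key pair. The sketch below records applied objects and counts any repeated notification as a defensive check. AppliedObjectTracker and its accessors are hypothetical names, and joining bucket and key with "/" is an assumption made for this example.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of the library.
class AppliedObjectTracker {
    // Concurrent set of "bucket/key" identifiers that have been applied.
    private final Set<String> applied = ConcurrentHashMap.newKeySet();
    private final AtomicInteger duplicates = new AtomicInteger();

    // Mirrors the documented onObjectApplied signature.
    public void onObjectApplied(String bucket, String key, long sizeInBytes, long elapsedMillis) {
        // Set.add returns false when the identifier was already present.
        if (!applied.add(bucket + "/" + key)) {
            duplicates.incrementAndGet();
        }
    }

    // Number of distinct objects applied so far.
    public int appliedCount() {
        return applied.size();
    }

    // Number of repeated notifications; expected to stay at zero.
    public int duplicateCount() {
        return duplicates.get();
    }
}
```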
      • onJobSucceeded

        void onJobSucceeded​(long elapsedMillis)
        Notifies the extension that the job has completed successfully. This method will be called from the driver at the end of the Spark Bulk Writer execution when the job succeeds.
        Parameters:
        elapsedMillis - the elapsed time from the start of the bulk write run until this step for the job in milliseconds
      • onJobFailed

        void onJobFailed​(long elapsedMillis,
                         java.lang.Throwable throwable)
        Notifies the extension that the job has failed with exception throwable. This method will be called from the driver at the end of the Spark Bulk Writer execution when the job fails.
        Parameters:
        elapsedMillis - the elapsed time from the start of the bulk write run until this step for the job in milliseconds
        throwable - the exception encountered by the job
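        onJobSucceeded and onJobFailed are the two terminal callbacks on the driver, so they are a natural place to record the job outcome. The following sketch stores a short report string; JobOutcomeReporter and lastReport are hypothetical names, and a real implementation might instead emit a metric or call an external service.

```java
import java.util.Locale;

// Hypothetical helper, not part of the library.
class JobOutcomeReporter {
    // Written by the driver-side callback, possibly read from another thread.
    private volatile String lastReport;

    // Mirrors the documented onJobSucceeded signature.
    public void onJobSucceeded(long elapsedMillis) {
        lastReport = "SUCCEEDED after " + seconds(elapsedMillis);
    }

    // Mirrors the documented onJobFailed signature.
    public void onJobFailed(long elapsedMillis, Throwable throwable) {
        lastReport = "FAILED after " + seconds(elapsedMillis) + ": " + throwable;
    }

    // Returns the most recent report, or null if no terminal callback has run.
    public String lastReport() {
        return lastReport;
    }

    // Formats milliseconds as seconds with three decimal places.
    private static String seconds(long elapsedMillis) {
        return String.format(Locale.ROOT, "%.3f s", elapsedMillis / 1000.0);
    }
}
```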
      • onStageSucceeded

        void onStageSucceeded​(java.lang.String clusterId,
                              long elapsedMillis)
        Notifies the CoordinatedTransportExtension implementation that all objects have been staged on the cluster. The callback should only be invoked once per cluster.
        Parameters:
        clusterId - identifies a Cassandra cluster
        elapsedMillis - the elapsed time from the start of the bulk write run in milliseconds
      • onStageFailed

        void onStageFailed​(java.lang.String clusterId,
                           java.lang.Throwable cause)
        Notifies the CoordinatedTransportExtension implementation that it failed to stage objects on the cluster. The callback should only be invoked once per cluster.
        Parameters:
        clusterId - identifies a Cassandra cluster
        cause - the cause of the failure
      • onImportSucceeded

        void onImportSucceeded​(java.lang.String clusterId,
                               long elapsedMillis)
        Notifies the CoordinatedTransportExtension implementation that all objects have been imported into the cluster. The callback should only be invoked once per cluster.
        Parameters:
        clusterId - identifies a Cassandra cluster
        elapsedMillis - the elapsed time from the start of the bulk write run in milliseconds
      • onImportFailed

        void onImportFailed​(java.lang.String clusterId,
                            java.lang.Throwable cause)
        Notifies the CoordinatedTransportExtension implementation that it failed to import objects into the cluster. The callback should only be invoked once per cluster.
        Parameters:
        clusterId - identifies a Cassandra cluster
        cause - the cause of the failure
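        The four per-cluster callbacks (onStageSucceeded, onStageFailed, onImportSucceeded, onImportFailed) can be combined into a small per-cluster state record. The sketch below is one possible approach under the assumption that staging is reported before import for a given cluster. ClusterProgressTracker, Phase, phaseOf and allImported are hypothetical names introduced for this example.

```java
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of the library.
class ClusterProgressTracker {
    enum Phase { STAGED, STAGE_FAILED, IMPORTED, IMPORT_FAILED }

    // Latest reported phase for each cluster id.
    private final Map<String, Phase> phases = new ConcurrentHashMap<>();

    // The four methods below mirror the documented callback signatures.
    public void onStageSucceeded(String clusterId, long elapsedMillis) {
        phases.put(clusterId, Phase.STAGED);
    }

    public void onStageFailed(String clusterId, Throwable cause) {
        phases.put(clusterId, Phase.STAGE_FAILED);
    }

    public void onImportSucceeded(String clusterId, long elapsedMillis) {
        phases.put(clusterId, Phase.IMPORTED);
    }

    public void onImportFailed(String clusterId, Throwable cause) {
        phases.put(clusterId, Phase.IMPORT_FAILED);
    }

    // Returns the latest phase for the cluster, or null if none was reported.
    public Phase phaseOf(String clusterId) {
        return phases.get(clusterId);
    }

    // True only when every listed cluster has reported a successful import.
    public boolean allImported(Collection<String> clusterIds) {
        return clusterIds.stream().allMatch(id -> phases.get(id) == Phase.IMPORTED);
    }
}
```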