Context-Aware Temporal Embeddings For Text And Video Data